This simple discussion of PERL4 describes the essential features needed to get going for general purpose programming
-
A few Perl5 points are made when appropriate
|
i.e. it does not describe the special concerns needed for systems programming but is aimed at what you need for writing CGI programs
|
We reference in detail the Llama book: Learning PERL by Randal L. Schwartz, published by O'Reilly and Associates. ISBN: 1-56592-042-2
|
More detailed is the recently updated Camel book: Programming PERL by Larry Wall, Tom Christiansen and Randal L. Schwartz and also published by O'Reilly and Associates. ISBN: 1-56592-149-6
-
This is one of the few authoritative Perl5 discussions
|
Another useful book which lies between Llama and Camel books in completeness is: PERL by Example by Ellie Quigley, Prentice Hall. ISBN 0-13-122839-0
|
Assignments are $Next_Course = "CPS615";
|
$Funding = $Funding + $Contract;
|
The latter can be written as in C as
|
$Funding += $Contract;
|
Similarly, for strings one can write:
|
$Name= "Geoffrey"; $Name .= " Fox"; # Sets $Name to "Geoffrey Fox"
|
Example: $A = 6; $B = ($A +=2); # sets $A = $B = 8
|
AutoIncrement and AutoDecrement: as in C
-
$a = $a + 1; $a +=1; and ++$a; # are the same and increment $a by 1
|
++ and -- are both allowed and can be used BEFORE (prefix) or AFTER (suffix) the variable (operand). Both forms change the operand in the same way, but in the suffix form the result, if used in an expression, is the value BEFORE the variable is incremented.
-
$a=3; $b = (++$a); # sets $a and $b to 4
-
$a=3; $b = ($a++); # sets $a to 4 and $b to 3
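
The assignment and increment rules above can be checked with a short script. This is a minimal sketch written in Perl5 style (the use of my and the variable names are assumptions for illustration, not part of the original foils):

```perl
use strict;
use warnings;

my $funding  = 100;
my $contract = 25;
$funding += $contract;      # same as $funding = $funding + $contract

my $name = "Geoffrey";
$name .= " Fox";            # string append

my $x   = 3;
my $pre = ++$x;             # prefix: $x is incremented first, so $pre is 4

my $y    = 3;
my $post = $y++;            # suffix: $post gets the old value 3, then $y becomes 4

print "$funding $name $pre $x $post $y\n";   # prints "125 Geoffrey Fox 4 4 3 4"
```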
|
We can use scalar variables in strings
-
$h= "World"; $hw= "Hello $h"; # sets $hw to "Hello World"
-
$h= "World"; $hw= "\UHello $h"; # sets $hw to "HELLO WORLD"
-
showing how \U and similarly \L operate on interpolated variables
|
As mentioned, there is NO interpolation for single quoted strings
|
Interpolation is also not applied recursively, as illustrated below:
|
$fred= "You over there"; $x= '$fred'; $y= "Hey $x"; # sets $y as "Hey $fred" with no interpolation
|
Use \$ to ensure no interpolation where you need real $ character
-
$fred= "You over there"; $y= "Hey \$fred"; # sets $y as "Hey $fred" with no interpolation whereas:
-
$fred= "You over there"; $y= "Hey $fred"; # sets $y as "Hey You over there" with interpolation used
|
Use ${var} to remove ambiguity as in
-
$y= "Hey ${fred}followed by more characters";
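
The interpolation rules above can be tried side by side. This is a small sketch in Perl5 style; the variable names other than $fred are made up for the example:

```perl
use strict;
use warnings;

my $fred    = "You over there";
my $single  = '$fred';            # single quotes: the literal characters $fred
my $once    = "Hey $single";      # interpolated once only, no second pass
my $escaped = "Hey \$fred";       # backslash gives a real dollar character
my $braces  = "Hey ${fred}s";     # braces separate the name fred from the following s
my $upper   = "\UHello $fred";    # \U upper-cases the interpolated text as well

print "$once\n$escaped\n$braces\n$upper\n";
```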
|
Logical and: && as in $x && $y;
|
If $x is true, evaluate $y and return $y
|
If $x is false, evaluate $x and return $x
|
Logical or: || as in $x || $y;
|
If $x is true, evaluate $x and return $x
|
If $x is false, evaluate $y and return $y
|
Logical Not: ! as in ! $x;
|
Return not $x
|
and is the same as &&, or is the same as || and not is the same as ! but each has lower (indeed the lowest) precedence
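
Because && and || return one of their operands rather than a plain true/false flag, they are often used to pick values. The following is a brief sketch of that behaviour; the variable names are illustrative assumptions:

```perl
use strict;
use warnings;

my $empty = "";       # the empty string counts as false
my $name  = "Fox";    # a non-empty string counts as true

my $or_result   = $empty || $name;      # $empty is false, so $name is returned
my $and_result  = $name  && "found";    # $name is true, so the right operand is returned
my $and_result2 = $empty && "found";    # $empty is false, so $empty itself is returned
my $negated     = !$empty;              # true

print "or: $or_result, and: $and_result\n";
```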
|
More complicatedly we can set constructed lists equal to each other
|
($a, @fred) = @fred; # sets $a to first element of @fred and removes this first element from @fred
|
($a,$b,$c) = (1,2,3); # sets $a=1, $b=2, $c=3
|
Curiously setting a scalar equal to an array returns length of array
|
$a = @fred; # returns $a as length of @fred whereas
|
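The function length returns number of characters in a string
|
$a = length ($fred[0]); # returns length in characters of first entry in @fred; note that in Perl5 length(@fred) evaluates @fred in scalar context and so gives the number of digits in the element count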
|
($a) = @fred; # defines two lists equal to each other but as LHS only has one element, this instruction sets $a to be first entry of @fred
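
The difference between scalar and list assignment is easiest to see with concrete values. This is a sketch in Perl5 style with made-up data:

```perl
use strict;
use warnings;

my @fred = ("alpha", "beta", "gamma");

my $count  = @fred;               # scalar context: number of elements (3)
my ($head) = @fred;               # list context: first element ("alpha")
my $chars  = length($fred[0]);    # characters in the first entry (5)

my $first;
($first, @fred) = @fred;          # $first takes "alpha"; @fred keeps the rest

print "$count $head $chars $first (@fred)\n";   # prints "3 alpha 5 alpha (beta gamma)"
```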
|
When variables are undefined or set to undefined as in
-
$a = $b ; # and $b has not been defined
|
They are given special value undef which typically behaves the same as null (character string) or zero (numeric) value
-
<STDIN> returns undef when End of File is hit
-
@fred = (0,1,2,3); $a = $fred[6]; # sets $a equal to undef
-
@fred = (0,1,2,3); $fred[6]=7; $a= $fred[5]; # leaves $a, $fred[4] and $fred[5] undefined and sets $fred[6] to 7
|
$index = $#fred; # sets $index as index value of last entry in @fred
-
$a=@fred; $b=$#fred; # imply that $b=$a-1
|
Useful functions defined() and exists() will be discussed in Perl5 notes -- they allow precise tests on defined variables
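
A short sketch of how undef and $#fred behave when an array is read or extended past its end. It uses defined(), which the notes defer to the Perl5 discussion, purely to make the undefined entries visible:

```perl
use strict;
use warnings;

my @fred = (0, 1, 2, 3);

my $missing = $fred[6];     # reading past the end gives undef
$fred[6] = 7;               # assigning past the end extends the array

my $last   = $#fred;        # index of last entry, now 6
my $length = @fred;         # number of entries, now 7

print defined($missing)  ? "defined" : "undef", "\n";   # prints "undef"
print defined($fred[5])  ? "defined" : "undef", "\n";   # prints "undef"
print "$last $length\n";                                # prints "6 7"
```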
|
push adds information at end of a list(array)
-
push(@stack,$new); # is equivalent to @stack = (@stack, $new);
-
One can also use a list for second argument(s) in push as in
-
push(@stack,6,"next",@anotherlist);
|
pop is inverse operator to push and removes the last element in argument as well as returning value of this last element
-
Note chop(@stack) removes last character of each entry of list -- not like pop which removes last entry of list
|
unshift is identical to push except that it works on left (lowest indices) of list -- not on end of list
|
shift is identical to pop except that it works on left (lowest indices) of list -- not on end of list
|
reverse(@list) leaves @list unaltered but returns reversed list
|
sort(@list) leaves @list unaltered but returns sorted list
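
The list functions above can be exercised on one small array. This is a sketch with illustrative values; the comments show the list contents after each step:

```perl
use strict;
use warnings;

my @stack = (1, 2);

push(@stack, 3, "next");            # (1, 2, 3, "next")
my $top = pop(@stack);              # returns "next", leaving (1, 2, 3)

unshift(@stack, 0);                 # (0, 1, 2, 3)
my $bottom = shift(@stack);         # returns 0, leaving (1, 2, 3)

my @backwards = reverse(@stack);    # (3, 2, 1); @stack itself is unchanged
my @sorted    = sort(@backwards);   # (1, 2, 3); @backwards itself is unchanged

print "@stack | @backwards | @sorted\n";   # prints "1 2 3 | 3 2 1 | 1 2 3"
```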
|
foreach is similar to statement by this name in C-Shell
|
foreach $index (@some_list returned by an expression perhaps) {
-
statement-block for each value of $index;
-
}
|
$index is local to this construct and is returned to any value it had before foreach loop executed
|
An example that also prints 1 to 10 is
|
@back= (10,9,8,7,6,5,4,3,2,1);
|
foreach $num (reverse(@back)) {
-
print $num,"\n";
-
}
|
In above case one can write more cryptically (a pathological addiction of UNIX programmers)
|
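foreach (sort { $a <=> $b } @back) { # numeric sort and reverse give same results here; a plain sort(@back) compares as strings and would place 10 directly after 1
-
print $_,"\n"; # If the expected variable ($num here) is omitted, the default variable $_ is used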
-
}
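
The following sketch collects the foreach points in one place: an explicit loop variable, the default variable $_, restoration of the loop variable afterwards, and the difference between string and numeric sorting of @back. The arrays @seen, @by_string and @by_number are assumptions added for the example:

```perl
use strict;
use warnings;

my @back = (10, 9, 8, 7, 6, 5, 4, 3, 2, 1);

my @seen;
foreach my $num (reverse(@back)) {      # explicit loop variable
    push(@seen, $num);
}

my @by_string = sort(@back);                # default sort compares as strings
my @by_number = sort { $a <=> $b } @back;   # numeric comparison

our $index = "before";
foreach $index (1 .. 3) { }                 # $index is local to the loop
print "$index\n";                           # prints "before"

foreach (@by_number) {                      # no variable named, so $_ is used
    print $_, "\n";
}
```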
|
An associative array is a "software implemented" associative memory where you can fetch values by names or attributes or technically keys
|
An associative array is a set of pairs (key,value).
|
The whole array is referred to as %dict and is typically set with instructions like
-
$dict{keyname} = value; # NOTE Curly braces {} to show array associative
|
The values can be used in ordinary arithmetic such as
-
$math{pi}=3.14; $math{pi} += .0016; # sets $math{pi}=3.1416;
-
pi or "pi" is allowed for specifying key
|
If key pimisspelt has not been defined then $math{pimisspelt} returns undef as value and so one can easily see if a particular key has been set.
|
Alternatively function exists($math{pimisspelt}) returns false unless key pimisspelt has been set
|
The order of storage of pairs in an associative array is arbitrary and nonreproducible.
-
one cannot push or pop an associative array
|
@listmime = %mime; # produces a list of form (key1,value1,key2,value2 ...)
-
This list can be manipulated like any list
-
One can also create an associative array by defining such a list where adjacent elements are paired so that in above example
|
%newmime = @listmime; # creates an associative array identical to %mime
|
One can delete specific pairs by delete command so for example:
-
%fred = (key1, "one", key2, "two"); # Quotes on key1 optional
-
delete $fred{key1}; # leaves %fred with one pair (key2,"two")
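
A compact sketch of the associative array operations described above, in Perl5 style. The names %copy, @flat and @left are illustrative assumptions:

```perl
use strict;
use warnings;

my %math;
$math{pi} = 3.14;
$math{pi} += .0016;                        # value is now approximately 3.1416

my $has_pi   = exists($math{pi})         ? 1 : 0;   # 1
my $has_typo = exists($math{pimisspelt}) ? 1 : 0;   # 0: key was never set

my %fred = ("key1", "one", "key2", "two");
my @flat = %fred;                          # (key, value, key, value) in arbitrary order
my %copy = @flat;                          # rebuilds an identical associative array

delete $fred{key1};                        # leaves only the pair (key2, "two")
my @left = keys %fred;

print "$has_pi $has_typo @left\n";         # prints "1 0 key2"
```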
|
Regular expressions should be familiar as they are used in many UNIX commands with grep as best known
-
grep pattern file; # Prints out each line of file containing pattern
|
The rules for pattern are rich and we will discuss later -- consider here the simple pattern Fox
|
Then we can write the PERL version of grep as follows:
|
$line =0;
|
while (<>) {
-
if( /Fox/ ) { # Generalize to /Pattern/ to test positive if Pattern in $_
-
print $line, "$_"; } # $_ is current line by default
-
$line++; }
|
Another familiar operator should be s in sed (the batch or stream line editor) where
|
s/Pattern1/Pattern2/; # substitutes Pattern1 by Pattern2 in each line
|
The same command can be used in PERL with again substitution occurring on $_
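
To try matching and substitution on $_ without an input file, the sketch below loops over an in-memory list of lines instead of reading with <>. The sample lines and the arrays @hits and @edited are made up for the example:

```perl
use strict;
use warnings;

my @lines = ("Geoffrey Fox\n", "Larry Wall\n", "Fox and Fox\n");

my $line = 0;
my @hits;
foreach (@lines) {                    # each entry becomes $_ in turn
    if (/Fox/) {                      # pattern is tested against $_
        push(@hits, "$line: $_");
    }
    $line++;
}
print @hits;

my @edited = @lines;                  # work on a copy, as s changes $_ in place
foreach (@edited) {
    s/Fox/Wolf/;                      # replaces only the first Fox in each line
}
print @edited;
```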
|
Sequence is c1c2c3.. -- a sequence of single characters
|
* or {0,} is "zero or more" of previous character
|
+ or {1,} is "one or more" of previous character
|
? or {0,1} is "zero or one" of previous character
|
All these multipliers are greedy -- they maximize the number of characters "eaten up", starting with the leftmost match
-
In Perl5 one can follow specification with ? to instruct Perl5 to find smallest match (first occurrence) so that
-
.*?: matches to first : in line while .*: matches to last : in line.
|
Curly Brace Notation:
|
c{n1,n2} means from n1 to n2 instances of character c
|
c{n1,} means n1 or more instances of character c
|
c{n1} means exactly n1 instances of character c
|
c{0,n2} means n2 or fewer instances of character c
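
The greedy and minimal forms, and the curly brace counts, can be compared directly. This sketch uses capturing parentheses, which are not introduced in the text above, only as a way to show what each pattern consumed; the Perl5 minimal form .*? is assumed:

```perl
use strict;
use warnings;

my $line = "key: value: more";

my ($greedy) = $line =~ /^(.*):/;     # .* runs to the last colon
my ($lazy)   = $line =~ /^(.*?):/;    # .*? stops at the first colon (Perl5)

my $three = ("aaa"  =~ /^a{2,3}$/) ? 1 : 0;   # 1: between 2 and 3 copies of a
my $four  = ("aaaa" =~ /^a{2,3}$/) ? 1 : 0;   # 0: too many copies of a

print "greedy=[$greedy] lazy=[$lazy] $three $four\n";
# prints "greedy=[key: value] lazy=[key] 1 0"
```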
|
We have finally finished study of regular expressions and have illustrated this for substitution operator (s) acting on default variable $_. We can generalize this operation in many ways
|
The result of ( Variable Name =~ /Regular Expression/ ) is true if and only if value of Variable Name matches Regular Expression. For example
|
if ( <STDIN> =~ /^(T|t)(O|o):/ ) { # the line read from <STDIN> plays the role of $_ here
-
..; # Process to: field of mail
-
} # matches if current input line contains to: with any case at start of line
|
There is an implied match operator above which we can make explicit with m
|
$line =~ m/^(T|t)(O|o):/; # and we can use m to change the delimiter from / to any other character, so that
|
$line =~ m%^(T|t)(O|o):%; # is equivalent to previous statement
|
Note m/^to:/i is equivalent to above as modifier i instructs pattern match to ignore case
|
Variables may be used in Regular expressions and are interpolated as in usual double quoted strings. Use \$ to represent a real dollar except at end of string when it safely represents end of string anchor.
|
In match /regexp/i , the i instructs one to ignore case in match
|
In substitution s/regexp1/regexp2/g, the g instructs substitution to occur at all possible places in string -- normally only the first match in a string is found
|
i and g can be used together
|
$line =~ s/regexp1/regexp2/ ; # Illustrates how we use substitution s on general variable
|
As with m, s can use any delimiter and so
|
$line =~ s#regexp1#regexp2# ; # is equivalent form
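
The sketch below applies m and s to named variables with =~, using alternative delimiters and the i and g modifiers. The sample strings are assumptions chosen for the example:

```perl
use strict;
use warnings;

my $header = "To: Geoffrey";

my $plain = ($header =~ /^(T|t)(O|o):/)  ? 1 : 0;   # implied match operator
my $delim = ($header =~ m%^(T|t)(O|o):%) ? 1 : 0;   # same match with % as delimiter
my $nocase = ($header =~ m/^to:/i)       ? 1 : 0;   # i ignores case

my $text = "fox Fox FOX";
(my $once = $text) =~ s/fox/wolf/;      # first match only
(my $all  = $text) =~ s/fox/wolf/gi;    # g and i together: every match, any case
(my $hash = $text) =~ s#fox#wolf#;      # same as $once with # as delimiter

print "$plain $delim $nocase | $once | $all\n";
# prints "1 1 1 | wolf Fox FOX | wolf wolf wolf"
```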
|
$loc=index($string,$substr); # returns in $loc the location(first character in $string is location 0) of first occurrence of $substr in $string.
-
If $substr is not located, return -1
|
$loc=index($string,$substr,$firstloc); # will return $loc which is at least as large as $firstloc
-
Use to find multiple occurrences, setting $firstloc as 1+ previously found location
|
rindex($string,$substr,$lastloc) is identical to index except scanning starts at right (end) of string and not at start. All locations still count from left but if you give a third argument $lastloc, the returned $loc will be at most $lastloc in value
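
A small sketch of index and rindex, including the loop suggested above for finding every occurrence. The string "banana" and the array @all are illustrative assumptions:

```perl
use strict;
use warnings;

my $string = "banana";

my $first  = index($string, "an");               # 1
my $second = index($string, "an", $first + 1);   # 3
my $none   = index($string, "xy");               # -1: not found

my $last   = rindex($string, "an");              # 3: scanning from the right
my $before = rindex($string, "an", 2);           # 1: at most location 2

my @all;
my $loc = index($string, "an");
while ($loc >= 0) {                              # collect every occurrence
    push(@all, $loc);
    $loc = index($string, "an", $loc + 1);
}

print "$first $second $none $last $before (@all)\n";   # prints "1 3 -1 3 1 (1 3)"
```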
|
The construct local defines variables which are local(private) to a particular function
|
For example the routine on following foil invoked by
|
@new = &bigger_than(100,@list);
|
Returns in @new all entries in @list which are bigger than 100.
|
local() is an executable statement -- not a declaration!
|
The first two statements in bigger_than can be replaced by:
|
local($test,@values) = @_; # local() returns an assignable list
|
In Perl5, my tends to replace local: the scope of my is confined to the routine (block) in which it appears, whereas local extends the scope to any function called from the block in which the local statement is executed
|
Note one can use my/local in any block (not just a function) enclosed in { ... } to define temporary variables of limited scope
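
The body of bigger_than is not reproduced in these notes, so the following is only a hypothetical reconstruction of what such a routine might look like, using the single-statement local() form quoted above. It deliberately omits use strict, since local works on global (package) variables:

```perl
# Hypothetical sketch of bigger_than; the original routine is not shown here.
sub bigger_than {
    local($test, @values) = @_;     # first argument is the threshold, the rest is the list
    local(@result);                 # private list of entries to return
    foreach (@values) {
        push(@result, $_) if $_ > $test;
    }
    @result;                        # value of last expression is returned
}

@list = (50, 150, 99, 101, 1000);
@new  = &bigger_than(100, @list);
print "@new\n";                     # prints "150 101 1000"
```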
|
sort() is a builtin PERL function with three modes:
|
@result = sort @array; # equivalent to sort { $a cmp $b} @array;
|
and sorts the entries of @array using stringwise comparisons, returning them in @result
|
@result = sort BLOCK @array; # where statement BLOCK enclosed in {} curly brackets returns -1 0 1 given values of $a $b
|
@result = sort { $age{$a} <=> $age{$b} } @array; # sorts by age if entries in @arrays are keys to associative array %age which holds numeric age for each key
|
@result = sort SUBNAME @array; # uses subroutine (which can be specified as value of scalar variable) to perform sorting
|
sub backsort { $b <=> $a; } # Reverse order for Integers
|
@result = sort backsort @array; # sorts in numerically decreasing order
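
The three modes of sort can be compared on small data sets. This is a sketch in Perl5 style; the contents of %age and @nums are invented for the example:

```perl
use strict;
use warnings;

my %age   = ("fred", 30, "jim", 25, "ann", 41);
my @names = keys %age;

my @by_name = sort @names;                              # stringwise: ann fred jim
my @by_age  = sort { $age{$a} <=> $age{$b} } @names;    # block form: jim fred ann

sub backsort { $b <=> $a; }                             # reverse numeric order

my @nums    = (10, 9, 100, 1);
my @desc    = sort backsort @nums;                      # subroutine form: 100 10 9 1
my @strings = sort @nums;                               # stringwise: 1 10 100 9

print "@by_name | @by_age | @desc | @strings\n";
```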
|
tr/ab/XY/ translates a to X and b to Y in string $_
|
As for m and s, one can apply tr to a general string with =~
|
$string =~ tr/a-z/A-Z/; # translates letters from lower to upper case in $string
|
Note use of - to specify range as in regular expressions although tr does NOT use regular expressions
|
tr can count and returns number of characters matched
|
$numatoz = tr/a-z//; # $numatoz holds number of lower case letters in $_
|
if final string empty no substitutions are made
|
if second string shorter than first, the last character in second string is repeated
|
tr/a-z/A?/; # replaces a by A and all other lower case letters by ?
|
if the d option used, unspecified translated characters are deleted
|
tr/a-z//d; # deletes all lower case letters
|
the c option complements characters in initial string
|
tr/a-zA-Z/_/c; # translates ALL nonletters into _
|
the s option squeezes multiple consecutive copies of any letter in final string and replaces them by a single copy.
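
Each tr variant above can be checked on a sample string. In this sketch the copies are made with (my $copy = $s) so that the original string stays unchanged; the sample strings are assumptions:

```perl
use strict;
use warnings;

my $s = "Hello World 42";

(my $upper = $s) =~ tr/a-z/A-Z/;          # "HELLO WORLD 42"
my $count = ($s =~ tr/a-z//);             # counts lower case letters, changes nothing
(my $deleted = $s) =~ tr/a-z//d;          # d deletes matched characters: "H W 42"
(my $nonletters = $s) =~ tr/a-zA-Z/_/c;   # c complements: non-letters become _
(my $padded = "abc") =~ tr/a-z/A?/;       # short second string: last character repeats
(my $squeezed = "aabbccdd") =~ tr/a-z//s; # s squeezes repeated letters: "abcd"

print "$upper|$count|$deleted|$nonletters|$padded|$squeezed\n";
# prints "HELLO WORLD 42|8|H W 42|Hello_World___|A??|abcd"
```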
|
There are a set of ways of doing simple tests which imply fewer curly braces and other punctuation
|
expr1 if testexp; # is equivalent to
|
if (testexp) {
|
expr1;
|
}
|
last, redo and next can be followed by such tests e.g.
|
last DOREALWORK if userendofinitializationhit ;
|
There are similar abbreviations for unless,while,until
|
dothisexpression unless conditionholds;
|
dostandardstuff while normalconditionholds;
|
dostandardstuff until specialconditionseen; # should be self explanatory
|
thatcommand if thiscondition; # is equivalent to
|
thiscondition && thatcommand;
|
because PERL will not continue with && (logical and) if it finds a false condition. So if thiscondition is false, thatcommand is not executed
|
Similarly:
|
thatcommand unless thiscondition; # is equivalent to
|
thiscondition || thatcommand;
|
Note can use and instead of && and or instead of ||
-
not (instead of !) and xor (logical exclusive or, for which there is no symbolic equivalent as ^ is the bitwise operator) also allowed
|
We can use a C like expression
|
expression ? Truecalc : Falsecalc; # which is equivalent to
|
if (expression) { Truecalc; } else { Falsecalc; }
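
The abbreviated forms above are sketched below with a flag and a counter. The names @log, $debug and $n are illustrative assumptions:

```perl
use strict;
use warnings;

my @log;
my $debug = 1;

push(@log, "if form") if $debug;            # executed, as $debug is true
$debug && push(@log, "&& form");            # equivalent short-circuit form
push(@log, "unless form") unless $debug;    # not executed
$debug || push(@log, "|| form");            # not executed either

my $n = 0;
$n++ while $n < 5;                          # leaves $n at 5
$n-- until $n <= 2;                         # leaves $n at 2

my $size = ($n > 1) ? "big" : "small";      # C-like conditional expression

print "@log | $n | $size\n";                # prints "if form && form | 2 | big"
```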
|
Filehandles are, like statement labels, designated by a string without a special initial character. It is recommended that you use all capitals in such names
|
STDIN STDOUT STDERR (and diamond <> null name) have been introduced and correspond to UNIX stdin, stdout and stderr (and concatenation of argument files if <> operator)
|
Filehandles allow you to address general files and the syntax is similar to UNIX standard I/O (stdio.h) support
-
open(FILEHANDLE,"unixname"); # opens file unixname for reading -- can also write "<unixname"
-
open(FILEHANDLE,">unixname"); # opens file unixname for writing
-
open(FILEHANDLE,">>unixname"); # opens file unixname in append mode
|
close(FILEHANDLE) closes file
|
Errors can be handled with die construct
|
open(FH,'>'.$criticalfile) || die("Print an error message if file can't be opened\n"); # Note how we prepend '>' (or '>>') to the file name stored in a Perl variable
|
As illustrated <FILEHANDLE> reads either single line or full file depending on whether one stores it in a scalar or an array
|
print FILEHANDLE list; # writes list onto FILEHANDLE and the simple
|
print list; # is equivalent to
|
print STDOUT list;
|
There is a whole set of test operators which are usually applied to file NAMES rather than FileHANDLES
|
-e $filename returns true if $filename EXISTS
|
-r $filename returns true if $filename is READABLE
|
-w $filename returns true if $filename is WRITABLE
|
-x $filename returns true if $filename is EXECUTABLE
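
The filehandle operations and file tests above can be combined in one round trip: write a file, append to it, read it back, then test for it. This is a sketch assuming a UNIX system with a writable /tmp; the scratch file name is hypothetical and unlink (not covered above) is used only to clean up:

```perl
use strict;
use warnings;

my $file = "/tmp/perl_notes_demo_$$.txt";     # hypothetical scratch file

open(OUT, ">$file") || die("Cannot open $file for writing: $!\n");
print OUT "line one\n";
close(OUT);

open(OUT, ">>$file") || die("Cannot open $file for appending: $!\n");
print OUT "line two\n", "line three\n";
close(OUT);

open(IN, $file) || die("Cannot open $file for reading: $!\n");
my $first = <IN>;                 # scalar: one line
my @rest  = <IN>;                 # array: all remaining lines
close(IN);

my $exists   = (-e $file) ? 1 : 0;
my $readable = (-r $file) ? 1 : 0;

unlink($file);                    # remove the scratch file
my $gone = (-e $file) ? 0 : 1;

print STDOUT $first, @rest;       # same as print $first, @rest;
```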
|
format FORMATNAME =
|
fieldline (called picture line in Perl Manual)
|
value1, value2, value3 ...
|
fieldline
|
value1, value2, value3 ...
|
etc
|
.
|
The terminal dot as first character of line terminates format definition
|
FORMATNAME is label of this format and in simplest case one uses a format label which is identical to that of FILEHANDLE on which we wish to output
|
fieldlines specify fixed text as well as places and formats to print data which are listed as Perl variable names on following valueline. Clearly white space is significant in fieldline but not associated value line.
|
$~ = "ADDRESSLABEL"; # sets format for current FILEHANDLE to ADDRESSLABEL
|
$FORMAT_NAME = "ADDRESSLABEL"; # if use English
|
FILEHANDLE->format_name("ADDRESSLABEL"); # if use FileHandle
|
format ADDRESSLABEL =
|
====================================
|
| @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
|
$name
|
| @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
|
$address
|
| @<<<<<<<<<<<<<<<<, @< @<<<<<<<<<<<<<< |
|
$city, $state,$zip
|
====================================
|
.
|
@ followed by N <'s specifies left justified field with N+1 characters in it.
|
write ; # outputs current values of $name,$address,$city,$state,$zip into 5 line template on currently selected file.
|
chdir($name); # transfers to directory specified in $name
|
mkdir($name, mode); # makes directory with given name $name and MODE (typically 3 octal characters such as 0755)
|
opendir(DIRHANDLE,$name); # opens directory with directory handle DIRHANDLE. Such names can be assigned independently of all other names and are in particular not connected with FILEHANDLEs
|
closedir(DIRHANDLE); # closes directory associated with handle DIRHANDLE
|
readdir(DIRHANDLE); # returns file names (including . and ..) in directory with handle DIRHANDLE
-
If scalar result, readdir returns "next" file name
-
If array result, readdir returns all file names in directory
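
A short sketch of the directory handle calls, listing the current directory and skipping the . and .. entries. The handle name DIR and the array names are assumptions for the example:

```perl
use strict;
use warnings;

opendir(DIR, ".") || die("Cannot open current directory: $!\n");
my @names = readdir(DIR);         # array result: every name in the directory
closedir(DIR);

my @visible;
foreach my $entry (sort @names) {
    next if $entry eq "." || $entry eq "..";   # skip the two special entries
    push(@visible, $entry);
}

print "$_\n" foreach @visible;
```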
|
system("shellscript"); # dispatches shellscript to be executed by /bin/sh and anything allowed by shell is allowed in argument
-
system returns code returned by shellscript
|
system("date > tempfil"); # executes UNIX command date returning standard output from date to file tempfil in current directory
|
system("rm *") && die ("not allowed\n"); # terminates if error in system call as shell programs return nonzero if failure (opposite of open and most PERL commands)
|
Variable Interpolation is done in double quoted arguments and so one can include Perl variables in arguments of system
|
$prog="nobel.c"; system("cc -c $prog"); # (I) is equivalent here to
|
$ccompiler="cc";
|
system($ccompiler,"-c","nobel.c"); # (II) but in general not identical as in first form (I) the shell interprets the command line but in second form (II) the arguments are handed directly to the command given in first entry of the list given to system
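
The return value of system and the difference between the string and list forms can be seen with harmless commands. This sketch assumes a UNIX system where true, false, echo and sh are available; the exit status of the command sits in the high byte of the returned value:

```perl
use strict;
use warnings;

my $ok   = system("true");                    # 0 on success
my $fail = system("false");                   # nonzero on failure

my $raw  = system("sh", "-c", "exit 3");      # list form: no shell parsing by Perl
my $code = $raw >> 8;                         # exit status proper, here 3

system("echo *");                             # (I) shell expands * to file names
system("echo", "*");                          # (II) echo receives a literal *

system("true") && die("not reached, as true returns zero\n");
print "ok=$ok code=$code\n";                  # prints "ok=0 code=3"
```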
|
%ENV is set to the shell environment in which the Perl program was invoked
|
Any UNIX processes invoked by system, fork, backquotes or open inherit an environment specified by %ENV at invocation of child process.
|
One can change %ENV in the same way as any associative array
|
%ENVIN = %ENV ; $oldpath = $ENV{"PATH"}; # saves input environment
|
$ENV{"PATH"} = $oldpath . ":/web/cgi"; # resets PATH to include an extra directory to be used by child process -- after the child has run we execute
|
%ENV=%ENVIN; # Restores original environment
|
One can see what has been passed in %ENV by using Perl keys function
|
foreach $key (sort keys %ENV ) {
-
print "$key=$ENV{$key}\n"; # both $key $ENV{} are interpolated
|
}
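
To confirm that a child process really sees the changed %ENV, the sketch below runs the UNIX env command through backquotes and picks out its PATH line. It assumes a UNIX system with env on the search path; the names %saved and $child_path are illustrative:

```perl
use strict;
use warnings;

my %saved   = %ENV;                     # save input environment
my $oldpath = $ENV{"PATH"};

$ENV{"PATH"} = $oldpath . ":/web/cgi";  # extra directory for the child

my $child_path = "";
foreach my $entry (`env`) {             # child inherits the modified %ENV
    chomp($entry);
    $child_path = $entry if $entry =~ /^PATH=/;
}

%ENV = %saved;                          # restore original environment

print "$child_path\n";                  # ends in ":/web/cgi"
```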
|
This is the most powerful method, with fork creating two identical copies of the program -- parent and child
|
unless (fork) { ;} # child indicated by fork=0
|
; # otherwise fork=child process number for parent
|
The child program typically invokes exec which replaces child original by the argument of exec. Meanwhile parent should wait until this exec is complete and child has gone away.
|
unless (fork) {
-
exec("date"); # child process becomes date command sharing environment with parent
|
}
|
wait; # parent process waits until date is complete
|
The child process need not terminate naturally as with exec() and if child code was for instance
|
print FILEHANDLE @hugefile; # in parallel with parent
|
exit; # is required else child will continue with parent's code whereas we wanted parent and child to work in parallel on separate jobs
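
A slightly fuller version of the fork, exec and wait pattern, with the error checks that the short form leaves out. This is a sketch assuming a UNIX system with the date command; the variable names are illustrative:

```perl
use strict;
use warnings;

my $pid = fork();
die("fork failed: $!\n") unless defined($pid);

if ($pid == 0) {
    # Child: replace this process by the date command
    exec("date");
    die("exec failed: $!\n");     # only reached if exec itself fails
}

# Parent: wait returns the process number of the finished child
my $finished = wait();
my $status   = $? >> 8;           # exit status of the child

print "child $finished finished with status $status\n";
```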
|
The associative array %SIG is used to define signal handlers (subroutines) used for various signals.
|
The keys of %SIG are the UNIX names with first SIG removed. For instance, to set handler() as routine that will handle SIGINT interrupts do something like:
|
$SIG{'INT'} = 'handler';
|
sub handler { # First argument is signal name
-
local($sig) = @_;
-
print("Signal $sig received -- shutting down\n");
-
exit(0);
|
}
|
kill $signum, $child1, $child2; # sends interrupt $signum to process numbers stored in $child1 and $child2
|
$signum is NUMERICAL label (2 for SIGINT) and $child1,2 the child process number as returned by fork or open(PROCESSHANDLE,..) to parent
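
A signal handler can be tested without a second process by sending the signal to the running script itself, whose process number is held in $$. In this sketch the handler records the signal instead of exiting, and kill is given the signal name 'INT' rather than a number, which Perl5 accepts:

```perl
use strict;
use warnings;

my $caught = "";

sub handler {                  # first argument is the signal name
    my ($sig) = @_;
    $caught = $sig;            # record the signal rather than shutting down
}

$SIG{'INT'} = 'handler';       # install handler for SIGINT

kill('INT', $$);               # send SIGINT to this process

print "Signal $caught received\n";
```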
|
As in many interpreters, PERL allows you to generate a line of code at run time and have the interpreter execute it using the eval function (JavaScript is similar)
|
Suppose you had two arrays @fred and @jim (with elements $fred[$index] and $jim[$index]) and you wanted to set an element given the value of $index and an ascii string $name (which could have been read in) taking value 'fred' or 'jim'. This can be achieved by:
|
eval('$' . $name . '[' . $index . '] = $value;'); # the whole assignment is built as a string, as the result of eval cannot be assigned to
|
eval returns result of evaluating(executing) argument as PERL script and continues
|
In this case, you can achieve the same results with indexed associative arrays:
|
$options[$index]{$name} = $value;
|
using the multidimensional array notation introduced in PERL5
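
Both approaches can be run side by side. In this sketch the assignment is built as a string and handed to eval, and $@ is checked afterwards for errors; the sample values are assumptions, and @fred and @jim are declared with our so that the evaluated code can find them under use strict:

```perl
use strict;
use warnings;

our (@fred, @jim);
my ($name, $index, $value) = ("fred", 2, "hello");

# Build the statement $fred[2] = $value; as a string and execute it
eval('$' . $name . '[' . $index . '] = $value;');
die("eval failed: $@") if $@;

# Perl5 alternative with no eval: an array of associative arrays
my @options;
$options[$index]{$name} = $value;

print "$fred[2] $options[2]{fred}\n";    # prints "hello hello"
```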
|