Full HTML for Basic Tutorial on PERL


Given by Geoffrey C. Fox, Nancy McCracken, and Tom Scavo at the Computational Science for Information Age Course CPS616 on Sept 20 97. Foils prepared Sept 20 97
Summary of Material

This simple discussion of PERL describes the essential features needed for general-purpose programming

It does not describe the special concerns needed for systems programming but is aimed at what you need for writing CGI programs

We reference in detail the Llama Book: Learning PERL (2nd ed) by Randal L. Schwartz and Tom Christiansen published by O'Reilly and Associates. ISBN: 1-56592-284-0

More detailed is the Camel book: Programming PERL (2nd ed) by Larry Wall, Tom Christiansen, and Randal L. Schwartz, also by O'Reilly. ISBN: 1-56592-149-6

This is one of the few authoritative Perl5 discussions

Another useful book which lies between the Llama and Camel books in completeness is: PERL by Example by Ellie Quigley, Prentice Hall. ISBN 0-13-122839-0

Table of Contents for full HTML of Tutorial on PERL

1. Overview of Perl -- Technologies for the Information Age
2. Abstract of PERL Overview
3. General Remarks on PERL I
4. General Remarks on PERL II
5. General Remarks on PERL III
6. Scalar Data I -- Numbers
7. Scalar Data II -- Single-Quoted Strings
8. Scalar Data III -- Double-Quoted Strings
9. Scalar Variables and Comments
10. Operators for Numbers and Strings I
11. Operators for Numbers and Strings II -- Comparison
12. Operators for Numbers and Strings III -- Binary Assignment
13. Interpolation of Scalars into Strings
14. Some Simple Scalar I/O Capabilities
15. Logical Operators
16. Arithmetic Operators
17. Bitwise Logical Operators
18. Arrays and Lists of Scalars I
19. Arrays and Lists of Scalars II -- Construction
20. Arrays and Lists of Scalars III -- Construction
21. Arrays and Lists of Scalars IV -- Element Access
22. Arrays and Lists of Scalars V -- Undefined
23. Arrays and Lists of Scalars VI -- Printing
24. Arrays and Lists of Scalars VII -- Operators on Arrays
25. Control Structures -- if,else,unless,elsif
26. Control Structures -- What is true and false?
27. Control Structures -- while,until Statements
28. Control Structures -- for Statement
29. Control Structures -- foreach Statement
30. Hashes -- Definition
31. Hashes -- Examples
32. Hashes -- Storage and Access
33. Hashes -- Operators: keys, values, each
34. Basic Input
35. Basic Output
36. Regular Expressions -- Analogy with grep
37. Regular Expressions -- Patterns
38. Backslash Escapes
39. Predefined Character Classes in Regular Expressions
40. Repetition in Regular Expressions
41. Anchoring and Alternation in Regular Expressions
42. Parentheses in Regular Expressions
43. Some Regex Examples
44. Matched Variables in Regular Expressions
45. More Regex Examples
46. The Matching Operator and Regular Expressions
47. The Substitution Operator and Regular Expressions
48. More Substitution Examples
49. Split and Join Operators
50. The index and rindex Functions
51. The substr Function
52. Functions and Subroutines I
53. Functions and Subroutines II
54. Functions and Subroutines III -- local and my
55. Functions and Subroutines IV -- An Example
56. Binary Equality Operators
57. Sorting with Various Criteria
58. The Translation Operator tr
59. Additional Control Flow Constructs I
60. Additional Control Flow Constructs II -- Statement Labels
61. Additional Control Flow Constructs III -- Accelerated Tests
62. Additional Control Flow Constructs IV
63. FileHandles
64. Using FileHandles and Testing Files
65. A Perl "Here" Document
66. Some Special Capabilities in Formatted Writes
67. Globbing
68. Directory Access
69. Execution of UNIX Commands -- system
70. Processing the Environment %ENV
71. Execution of UNIX Commands -- backquotes
72. Execution of UNIX Commands -- Filehandle Mechanism
73. Execution of UNIX Commands -- fork and exec
74. Signals, Interrupt Handlers, kill
75. The eval Function and Indexed Arrays of Hashes
76. What is CGI.pm?
77. CGI.pm Import Tags
78. CGI.pm Syntax
79. Using CGI.pm - Form Processing
80. Using CGI.pm - Form Generation

HTML version of Basic Foils prepared Sept 20 97

Foil 1 Overview of Perl -- Technologies for the Information Age

From Tutorial on PERL Computational Science for Information Age Course CPS616 -- Sept 20 97. *

Full HTML Index

Written by Geoffrey Fox

February 1995

Updated by Tom Scavo

September 1997

NPAC

111 College Place

Syracuse University

Syracuse, NY 13244-4100

http://www.npac.syr.edu/projects/tutorials/CGI/

Foil 2 Abstract of PERL Overview

This simple discussion of PERL describes the essential features needed for general-purpose programming

It does not describe the special concerns needed for systems programming but is aimed at what you need for writing CGI programs

We reference in detail the Llama Book: Learning PERL (2nd ed) by Randal L. Schwartz and Tom Christiansen published by O'Reilly and Associates. ISBN: 1-56592-284-0

More detailed is the Camel book: Programming PERL (2nd ed) by Larry Wall, Tom Christiansen, and Randal L. Schwartz, also by O'Reilly. ISBN: 1-56592-149-6

This is one of the few authoritative Perl5 discussions

Another useful book which lies between the Llama and Camel books in completeness is: PERL by Example by Ellie Quigley, Prentice Hall. ISBN 0-13-122839-0

Foil 3 General Remarks on PERL I

PERL is an interpreted language (written by Larry Wall) and is a cross between C, the UNIX shell, sed, and awk. I certainly consider it easier to use than all four of these, since PERL produces clearer code (especially compared with the UNIX shell) which is easier to write and debug.

In these notes, the emphasis is on PERL4, but some PERL5 features will be covered as needed.
Later we will describe PERL5 in detail; it was released in late 1994 and is analogous to C++ in the same way that PERL4 is analogous to C.

Note that C, being a compiled language, will be more efficient than PERL. We use PERL for those tedious high-level things that take a long time to write but don't take much execution time.

Computationally intensive loops should be coded in C (or equivalent) and called from PERL
Note that PERL is comparable to C for I/O and UNIX system calls but can be thousands of times slower than C for arithmetic

Foil 4 General Remarks on PERL II

PERL code is usually put in a file and executed at the command line with

% perl -w myfile # w = warning msgs

where myfile contains PERL source code.

PERL code may also be put in an executable file (chmod +x on UNIX) whose first line is

#!/usr/local/bin/perl -w

where the path depends on your installation. (CGI scripts written in PERL must have such a line.) In this case, simply type

% myfile

to compile and execute the code

Foil 5 General Remarks on PERL III

PERL may be run interactively from the command line. For example,

% perl -v

will report the version of PERL in use.

The PERL command

% perl -e 'print "@INC\n";'

prints the search path for included files.

The command

% perl -MLWP -e 'print "libwww-perl-$LWP::VERSION\n";'

prints the installed version of LWP.

The complex PERL command

% perl -pi.bak -e 's/str1/str2/gi' `find . -name \*.html -print`

performs a case-insensitive, global search-and-replace on all files whose names end in ".html" in the current directory and all its subdirectories, saving the original of each file with a .bak suffix.

Foil 6 Scalar Data I -- Numbers

Scalars are either numbers or strings of characters, as in C, although in both cases there are significant differences

Numbers and strings are not explicitly typed
Note Perl is "safer" than JavaScript in this regard and it is very rare that numbers and strings are confused

Numbers are stored internally as integers if this represents them adequately -- otherwise as double-precision numbers

Perl and the runtime system make certain that this is transparent to the user
Wolfram's SMP (the forerunner of Mathematica) at Caltech (which I worked on) made the more extreme choice of everything being double precision
For example 1, 5.0, 4.5E23, and 7.45E-15 are all numbers

Octal and Hexadecimal numbers are allowed with

0377 (initial zero) assumed to be octal and so is equal to 255 decimal
0X or 0x represents hexadecimal with letters A to F corresponding to numbers 10 to 15 as normal. 0XFF is hex FF or 255 decimal
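As a minimal sketch of the literals described above (the variable names are illustrative only), the following assigns octal, hexadecimal, and floating-point literals and prints their decimal values:

```perl
use strict;
use warnings;

my $octal = 0377;       # leading zero: octal literal
my $hex   = 0xFF;       # 0x prefix: hexadecimal literal
my $big   = 4.5E23;     # floating point with an exponent
my $small = 7.45E-15;   # no space is allowed before the E

print "0377 is $octal in decimal\n";
print "0xFF is $hex in decimal\n";
print "big = $big, small = $small\n";
```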

Foil 7 Scalar Data II -- Single-Quoted Strings

There are two types of strings defined by a stream of characters surrounded by either single quotes ' or double quotes "

Single-quoted strings are perhaps the simplest

Inside such strings ALL characters including newlines are treated literally except that ' must be represented as \' and \ as \\

In particular, \n does NOT represent a newline
Example: 'don\'t' is the five character word don't

Unlike C, PERL strings are not zero-byte terminated from the programmer's point of view: each string carries its own length and may itself contain zero bytes, so that

'' and "" are both the empty string, of length 0
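The single-quote rules can be checked with the built-in length function. This is a small sketch; the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $word  = 'don\'t';     # \' stands for a literal single quote
my $raw   = 'a\nb';       # \n is NOT a newline here: a, backslash, n, b
my $path  = 'C:\\temp';   # \\ stands for one backslash
my $empty = '';           # the empty string

print length($word), " ", length($raw), " ",
      length($path), " ", length($empty), "\n";
```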

Foil 8 Scalar Data III -- Double-Quoted Strings

Double-quoted strings are very similar to C with many special characters given in table 2-1 of Llama book and online PERL man page. (See later in foils)

Examples: \n is newline, \t is tab and \cC is Control-C
Example: 'Hello
World' (a single-quoted string containing a literal line break) is equivalent to "Hello\nWorld"
Note \L instructs PERL that all following characters until a \E are to be interpreted as lower case
\U ... \E is similar but intervening characters are upper case

A critical feature of double-quoted strings is that they "interpolate" variables (e.g., with $ as initial character). For this reason use \$ to denote a literal dollar sign in a double-quoted string.

Variables are NEVER interpolated in single-quoted strings
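The escapes mentioned above can be tried out as follows. This is only a sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $two_lines = 'Hello
World';                            # single quotes keep the literal line break
my $escaped   = "Hello\nWorld";    # double quotes turn \n into a newline
my $upper     = "\Uhello\E world"; # \U ... \E upper-cases the enclosed text
my $lower     = "\LHELLO\E World"; # \L ... \E lower-cases the enclosed text
my $price     = "Cost: \$5";       # \$ gives a literal dollar sign

print "same\n" if $two_lines eq $escaped;
print "$upper\n$lower\n$price\n";
```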

Foil 9 Scalar Variables and Comments

Scalar variables are named variables holding numeric or string scalars (or both). There are NO types (integer, float, char) of variables. The interpretation of a scalar is always determined by its context and not "type" of variable.

$cps616_num is a scalar variable representing number of students in CPS 616

Variables are set by conventional equal sign

$cps616_num = 15;

Note statements are ended by semicolons -- not by newlines and # can be used to denote a comment

$Instructor= "Fox"; # Not really true as Wojtek did all the work

Comments may appear after statements or on a line on their own. They are terminated by newlines.

Foil 10 Operators for Numbers and Strings I

Conventional arithmetic operators are available in PERL

+, -, *, /, ** (the last raises to a power) mean what you think
2+3 # is 5
10/3 is 3.33333 (and not 3) as numbers are floating point if necessary

One important string operator is . for concatenate

"Hello" . " World" is identical to "Hello World"

Less used is x (times) used to replicate string data

"hip " x 3 . "hurrah" # is "hip hip hip hurrah"
Note that (3+2) x (3+1) # is NOT a number but rather the string "5555"
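The arithmetic and string operators above can be exercised in a few lines. This is a sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $sum      = 2 + 3;                    # numeric addition
my $power    = 2 ** 10;                  # exponentiation
my $greeting = "Hello" . " World";       # string concatenation
my $cheer    = "hip " x 3 . "hurrah";    # x binds more tightly than .
my $fives    = (3 + 2) x (3 + 1);        # the string "5" repeated 4 times

print "$sum $power\n$greeting\n$cheer\n$fives\n";
```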

Foil 11 Operators for Numbers and Strings II -- Comparison

There are six basic comparison operators which are DIFFERENT for numeric and string comparisons. Here PERL is using operators to create context to define type
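In standard Perl the six numeric comparison operators are ==, !=, <, >, <= and >=, and the corresponding string operators are eq, ne, lt, gt, le and ge. The sketch below (variable names are illustrative) shows how the same operands can compare differently depending on which family is used:

```perl
use strict;
use warnings;

my $num_less  = (10 < 9)       ? "true" : "false";  # numeric comparison
my $str_less  = ("10" lt "9")  ? "true" : "false";  # string comparison: "1" sorts before "9"
my $num_equal = ("1.0" == 1)   ? "true" : "false";  # both operands converted to numbers
my $str_equal = ("1.0" eq "1") ? "true" : "false";  # compared character by character

print "10 < 9: $num_less, '10' lt '9': $str_less\n";
print "'1.0' == 1: $num_equal, '1.0' eq '1': $str_equal\n";
```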

There is a complex set of precedence rules, but I always use parentheses and do not remember the rules!

"CPS" . (12*50) # is string "CPS600"

Foil 12 Operators for Numbers and Strings III -- Binary Assignment

Assignments are $Next_Course = "CPS615";

$Funding = $Funding + $Contract;

For numeric quantities, the latter can be written as

$Funding += $Contract;

Similarly we can write for strings:

$Name= "Geoffrey"; $Name .= " Fox"; # Sets $Name to "Geoffrey Fox"

Example: $A = 6; $B = ($A +=2); # sets $A = $B = 8

Auto-increment and auto-decrement as in C

$a = $a + 1; $a +=1; and ++$a; # are the same and increment $a by 1

++ and -- are both allowed and can be used BEFORE (prefix) or AFTER (suffix) the variable (operand). Both forms change the operand in the same way, but in the suffix form the result used in the expression is the value BEFORE the variable is incremented.

$a=3; $b = (++$a); # sets $a and $b to 4
$a=3; $b = ($a++); # sets $a to 4 and $b to 3
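The assignment forms above can be verified with a short script. This is a sketch; the lower-case variable names are chosen here for illustration:

```perl
use strict;
use warnings;

my ($funding, $contract) = (100, 25);
$funding += $contract;              # same as $funding = $funding + $contract

my $name = "Geoffrey";
$name .= " Fox";                    # append to a string

my $n = 6;
my $m = ($n += 2);                  # an assignment has a value: both are now 8

my $i = 3; my $pre  = ++$i;         # prefix: increment first, then use
my $j = 3; my $post = $j++;         # suffix: use old value, then increment

print "$funding, $name, $n $m, $i $pre, $j $post\n";
```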

Foil 13 Interpolation of Scalars into Strings

We can use scalar variables in strings

$h= "World"; $hw= "Hello $h"; # sets $hw to "Hello World"
$h= "World"; $hw= "\UHello $h"; # sets $hw to "HELLO WORLD"
showing how \U and similarly \L operate on interpolated variables

Remember, there is NO interpolation for single-quoted strings!

There is also no recursion as illustrated below:

$fred= "You over there"; $x= '$fred'; $y= "Hey $x"; # sets $y as 'Hey $fred' with no interpolation

Use \$ to ensure no interpolation where you need literal $ character

$fred= "You over there"; $y= "Hey \$fred"; # sets $y as "Hey $fred" with no interpolation whereas:
$fred= "You over there"; $y= "Hey $fred"; # sets $y as "Hey You over there" with interpolation

Use ${var} to remove ambiguity as in

$y= "Hey ${fred}followed by more characters";
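The interpolation rules above can be collected into one runnable sketch (the extra variable names are illustrative):

```perl
use strict;
use warnings;

my $h       = "World";
my $hw      = "Hello $h";          # interpolated
my $loud    = "\UHello $h";        # \U also applies to the interpolated value
my $fred    = "You over there";
my $x       = '$fred';             # single quotes: no interpolation
my $y       = "Hey $x";            # the value of $x is not re-scanned
my $literal = "Hey \$fred";        # escaped dollar sign
my $plural  = "${h}s apart";       # braces mark where the name ends

print "$hw\n$loud\n$y\n$literal\n$plural\n";
```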

Foil 14 Some Simple Scalar I/O Capabilities

STDIN is a file handle (or pointer) and <STDIN> is a scalar representing the next line from the input stream

$line = <STDIN>; # sets $line to next line read from standard input

Unusually, this line includes the terminating newline, and so we have a special function to remove terminating newlines

chomp($line); removes the trailing newline from $line, if one is present, and returns the number of characters removed

$n = chomp($line); # removes the newline from $line and sets $n to 1 (or to 0 if there was none)
chop is similar but removes the last character whatever it is (not just a newline) and returns that character

We can also print scalars with print $line;

print is more powerful and we will learn about it later as argument can be a scalar but is normally a list or array
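To try chomp and chop without typing input, the sketch below uses a string that stands in for a line read from <STDIN> (the variable names are illustrative):

```perl
use strict;
use warnings;

my $line    = "some input\n";   # stands in for a line read from <STDIN>
my $removed = chomp($line);     # newline removed; returns the count removed
my $again   = chomp($line);     # nothing left to remove this time

my $word = "hello";
my $last = chop($word);         # removes and returns the final character

print "line='$line' removed=$removed again=$again\n";
print "word='$word' last='$last'\n";
```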

Foil 15 Logical Operators

Logical and: && as in $x && $y;

If $x is true, evaluate $y and return $y

If $x is false, don't evaluate $y and return $x (a false value)

Logical or: || as in $x || $y;

If $x is true, don't evaluate $y and return $x (a true value)

If $x is false, evaluate $y and return $y

Logical Not: ! as in ! $x;

Return not $x

and is the same as &&, or is the same as ||, and not is the same as !, but with lower (indeed the lowest) precedence (see Table 2-3 in Llama Book)
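Short-circuit evaluation can be observed with a helper that records whether it was called. In this sketch, bump and the variable names are hypothetical and exist only for the illustration:

```perl
use strict;
use warnings;

my $calls = 0;
sub bump { $calls++; return "bumped"; }   # hypothetical helper: counts its calls

my ($false, $true) = (0, 1);

my $and_skip = $false && bump();   # bump() is NOT called
my $or_skip  = $true  || bump();   # bump() is NOT called
my $and_run  = $true  && bump();   # bump() is called; its value is returned
my $or_run   = $false || bump();   # bump() is called; its value is returned

my $input = "";
my $name  = $input || "anonymous"; # common idiom: fall back to a default

print "calls=$calls and_run=$and_run or_run=$or_run name=$name\n";
```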

Foil 16 Arithmetic Operators

+ Addition as in $x + $y;

- Subtraction as in $x - $y;

* Multiplication as in $x * $y;

/ Division as in $x / $y;

% Modulus as in $x % $y; # 10%3 is 1 etc.

** Power as in $x**$y;

Foil 17 Bitwise Logical Operators

Definitions and Examples

Truth Table for Bitwise Operators
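The definitions and truth table on this foil did not survive as text. As a stand-in, the following sketch applies Perl's standard bitwise operators (&, |, ^, ~, << and >>) to two small integers; the operand values are chosen here purely for illustration:

```perl
use strict;
use warnings;

my $x = 12;                 # binary 1100
my $y = 10;                 # binary 1010

my $and   = $x & $y;        # bits set in both:        1000
my $or    = $x | $y;        # bits set in either:      1110
my $xor   = $x ^ $y;        # bits set in exactly one: 0110
my $left  = $x << 2;        # shift left by two bits
my $right = $x >> 2;        # shift right by two bits
my $low4  = ~$x & 0xF;      # complement, keeping only the low four bits

print "$and $or $xor $left $right $low4\n";
```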

Foil 18 Arrays and Lists of Scalars I

PERL has four types of variables distinguished by their initial character

Scalars with $ as initial character
File handles with nothing special as their initial characters and conventionally represented by names that are all capitalized
Arrays with @ as the initial character
Hashes (associative arrays, dictionaries) with % as initial character

Arrays (for example, @fred) may be initialized with a comma-separated list of scalars

@fred = (1, "second entry", $hw); # is an array with three entries

Array entries can include scalar variables and, more generally, expressions. These are evaluated when the assignment is executed, so the array stores the resulting VALUES and does not track later changes to the variables (unlike formulas in a spreadsheet); the list is simply re-evaluated each time the statement runs

@fred= (1, $hw . " more", $a+$b); # is an example of this
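A short experiment shows when the expressions in a list are evaluated. This is a sketch; $p and $q are illustrative stand-ins for the $a and $b used above:

```perl
use strict;
use warnings;

my ($p, $q) = (1, 2);
my $hw      = "Hello World";

my @fred = (1, $hw . " more", $p + $q);   # expressions evaluated right here

$p = 100;                                 # changing $p afterwards ...
print "third entry is still $fred[2]\n";  # ... does not change the stored value
```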

Foil 19 Arrays and Lists of Scalars II -- Construction

There is a list construction operator .. which provides a list of values incremented by 1

@fred = (1..4); # is list (1,2,3,4) and
@fred = (2,3, $a..$b); # uses the CURRENT values of $a, $b
@jeff = @fred[1..3]; # a slice holding elements 1 to 3 of @fred; with @fred = (0,1,2,3,4) it creates @jeff = (1,2,3)

We can also use assignment operator = in flexible ways

@fred = @jeff; # sets two lists equal to each other while
@fred= (4,5,@jeff); # defines a list @fred with two more entries than @jeff
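The range operator, slices, and list assignment can be combined as in this sketch ($lo, $hi and the array names other than @fred and @jeff are illustrative):

```perl
use strict;
use warnings;

my @fred = (1..4);                  # (1,2,3,4)
my ($lo, $hi) = (7, 9);
my @range = (2, 3, $lo..$hi);       # uses the current values of $lo and $hi
my @jeff  = @fred[1..3];            # slice: elements with indices 1, 2 and 3
my @more  = (4, 5, @jeff);          # @jeff is flattened into the new list

print "@range\n@jeff\n@more\n";
```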

Foil 20 Arrays and Lists of Scalars III -- Construction

More complicated lists may be constructed:

($a, @fred) = @fred; # sets $a to first element of @fred and then removes this first element from @fred

($a,$b,$c) = (1,2,3); # sets $a=1, $b=2, $c=3

Curiously, assigning an array to a scalar yields the length of the array

$a = @fred; # sets $a to the number of elements in @fred

The function length returns the number of characters in a string

$a = length($fred[0]); # returns length in characters of first entry in @fred
Note that length(@fred) does NOT do this: length evaluates its argument in scalar context and so sees the element count

($a) = @fred; # sets $a to be first entry of @fred
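The difference between scalar and list assignment is easiest to see side by side. This is a sketch with illustrative data and variable names:

```perl
use strict;
use warnings;

my @fred = ("alpha", "beta", "gamma");

my ($first, @rest) = @fred;          # first element, then everything else
my $count  = @fred;                  # scalar assignment: number of elements
my ($head) = @fred;                  # list assignment: first element
my $chars  = length($fred[0]);       # characters in the first element

print "first=$first rest=@rest count=$count head=$head chars=$chars\n";
```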

Foil 21 Arrays and Lists of Scalars IV -- Element Access

Note that @fred (an array) and $fred (a scalar) are totally different variables

The elements of @fred are indexed starting at 0 (not 1, as in Fortran)

Elements are referenced by $ NOT @

$a = $fred[0]; # is first element in @fred

PERL interpolates arrays as well as scalars in double-quoted strings, so a literal @ must be written as \@:

$fred[0]= "First element of \@fred";

Indices may be arbitrary integer expressions:

@fred = (0..10); $a = 2;

$b = $fred[$a-1]; # sets $b equal to 1

Foil 22 Arrays and Lists of Scalars V -- Undefined

When variables are undefined or set to undefined as in

$a = $b ; # $b not defined

they are given the special value undef, which typically behaves the same as the null (empty) character string or a zero (numeric) value

<STDIN> returns undef when EOF is reached
@fred = (0,1,2,3); $a = $fred[6]; # sets $a equal to undef
@fred = (0,1,2,3); $fred[6] = 7; $a = $fred[5]; # leaves $a, $fred[4], and $fred[5] undefined; $fred[6] is 7

$index = $#fred; # sets $index to index value of last entry in @fred

$a = @fred; $b = $#fred; # the expression $b == $a - 1 is true

Useful functions defined() and exists() are available
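The behaviour of undef, $#fred, and defined() can be checked directly. This is a sketch with illustrative variable names:

```perl
use strict;
use warnings;

my @fred = (0, 1, 2, 3);
my $missing = $fred[6];              # reading past the end gives undef

$fred[6] = 7;                        # assigning past the end extends the array
my $last_index = $#fred;             # index of the last element
my $count      = @fred;              # number of elements

my $gap = defined($fred[5]) ? "defined" : "undefined";
my $end = defined($fred[6]) ? "defined" : "undefined";

print "last=$last_index count=$count gap=$gap end=$end\n";
```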

Foil 23 Arrays and Lists of Scalars VI -- Printing

The argument of print is a list (array)

print "Hello"," Rest", " of World";
@fred = ("Hello"," Rest", " of World");
print @fred; # equivalent to the previous print

One can input an entire file to a list where each entry of list is one line of file. For example,

@file = <STDIN>; # later we will see how to do for arbitrary files

As mentioned earlier, double-quoted strings perform variable interpolation for arrays as well as scalars

$string = "This is a full list @fred \n";
$string = "First value in list fred is $fred[0] \n";
Slices and variable indices can also be interpolated

Foil 24 Arrays and Lists of Scalars VII -- Operators on Arrays

push adds elements at end of a list (array). For example,

push(@stack, $new); # equivalent to @stack = (@stack, $new);

One can also use a list as the second arg in push:

push(@stack, 6, "next", @anotherlist);

pop, the inverse operator to push, removes and returns the last element in the list

Note chop(@stack) removes last character of each entry of @stack

unshift is similar to push, except it works on left (lowest indices) of list. For example,

unshift(@INC, $dir);

modifies the path for included files on-the-fly

shift is to pop as unshift is to push

reverse(@list) and sort(@list) leave @list unaltered but return the reversed and sorted lists, respectively
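The four end-of-list operators and reverse can be traced step by step. This is a sketch with made-up data:

```perl
use strict;
use warnings;

my @stack = (1, 2);
push(@stack, 3, "next");           # add at the right-hand end
my $top = pop(@stack);             # remove from the right-hand end

unshift(@stack, 0);                # add at the left-hand end
my $bottom = shift(@stack);        # remove from the left-hand end

my @backwards = reverse(@stack);   # @stack itself is left unaltered

print "stack=@stack top=$top bottom=$bottom backwards=@backwards\n";
```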

Foil 25 Control Structures -- if,else,unless,elsif

Statement blocks are sets of semicolon-separated statements enclosed in curly braces {} as in C

if ( TESTEXPRESSION ) {
    statement block-true;
} else {
    statement block-false;
}

One can optionally leave off the else part, of course

unless ( TESTEXPRESSION ) {
    statement block-false;
} else {
    statement block-true;
}

where again else is optional

Both if and unless constructs can use elsif constructs between if/unless and else blocks -- note spelling of elsif!
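A concrete instance of these constructs might look as follows. This is a sketch; the variable names and thresholds are invented for the example:

```perl
use strict;
use warnings;

my $age = 15;
my ($group, $may_vote);

if ( $age < 13 ) {
    $group = "child";
} elsif ( $age < 18 ) {            # note the spelling: elsif, not elseif
    $group = "teenager";
} else {
    $group = "adult";
}

unless ( $age >= 18 ) {            # runs when the test is FALSE
    $may_vote = "no";
} else {
    $may_vote = "yes";
}

print "group=$group may_vote=$may_vote\n";
```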

Foil 26 Control Structures -- What is true and false?

In PERL all TESTEXPRESSIONs are converted to strings

Either the null string or string '0' (same as "0") is evaluated as FALSE

Everything else evaluates as TRUE

Results of comparison operators are what you expect

if ( $age < 18 ) evaluates as TRUE iff the numeric value of $age is less than 18

Note the numeric number 0 is converted to "0" and is FALSE as is numeric computation 1-1

The string "0.000" evaluates as TRUE
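The truth rules above can be tabulated by testing a handful of values. This is a sketch; the list of test values is chosen here for illustration:

```perl
use strict;
use warnings;

my @labels;
foreach my $value ("", "0", 0, 1 - 1, "0.000", "00", " ") {
    my $verdict = $value ? "TRUE" : "FALSE";   # the same test that if() applies
    push(@labels, "'$value' is $verdict");
}

foreach my $label (@labels) {
    print "$label\n";
}
```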

Foil 27 Control Structures -- while,until Statements

The simplest iterations are while and until

while ( TESTEXPRESSION ) {
    some statement block; # executed while TESTEXPRESSION is true
}

For example:

while ( <STDIN> ) {
    Process Current Input line; # until EOF in <STDIN>
}

Conversely, we can wait for a signal to stop:

until ( TESTEXPRESSION ) {
    some statement block; # executed while TESTEXPRESSION is false
}
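Both loops can be run without any input, as in this sketch (the counters and names are illustrative):

```perl
use strict;
use warnings;

my ($n, $sum) = (0, 0);
while ( $n < 5 ) {                 # repeats while the test is true
    $sum += $n;
    $n++;
}

my $countdown = 3;
my @ticks;
until ( $countdown == 0 ) {        # repeats until the test becomes true
    push(@ticks, $countdown);
    $countdown--;
}

print "sum=$sum ticks=@ticks\n";
```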

Foil 28 Control Structures -- for Statement

for ( beginning-expression; endtest; doeachloop ) {
    FOR statement-block;
}

The above loop is just like C and equivalent to:

beginning-expression;
while ( endtest ) {
    FOR statement-block;
    doeachloop;
}

For example, we can print the digits with:

for ( $i=0; $i <10; $i++ ) {
    print "$i", "\n";
}

Foil 29 Control Structures -- foreach Statement

foreach is similar to the statement of the same name in the C shell

foreach $element (@some_list) {
    # loop body executes for each value of $element
}

$element is local to the foreach loop and is afterwards restored to the value it had before the loop executed

An example that prints the numbers 1 to 10 is

@back = (10,9,8,7,6,5,4,3,2,1);
foreach $num ( reverse(@back) ) {
    print $num, "\n";
}

One can write more cryptically (a pathological addiction of UNIX programmers):

foreach (sort { $a <=> $b } @back) { # numeric sort; a plain sort(@back) compares as strings and would put 10 before 2
    print $_, "\n"; # if the loop variable is omitted, PERL uses $_ by default
}
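Because sort compares its arguments as strings unless told otherwise, it is worth comparing the default ordering with a numeric one. This is a sketch; the array names other than @back are illustrative:

```perl
use strict;
use warnings;

my @back = (10, 9, 8, 7, 6, 5, 4, 3, 2, 1);

my @as_strings = sort @back;                  # default: string comparison
my @as_numbers = sort { $a <=> $b } @back;    # explicit numeric comparison
my @reversed   = reverse(@back);

print "string sort:  @as_strings\n";
print "numeric sort: @as_numbers\n";
print "reversed:     @reversed\n";
```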

Foil 30 Hashes -- Definition

A hash (sometimes called an associative array) is a "software implemented" associative memory where values are fetched by names or attributes called keys

A hash is a set of pairs (key, value)

The entire hash is referred to as %dict (for example):

$dict{key} = value; # NOTE curly braces {} to denote hash

The values can be used in ordinary arithmetic such as

$math{pi} = 3.14; $math{pi} += .0016; # sets $math{pi} = 3.1416;
either pi or 'pi' is allowed for specifying the key

If the key pimisspelt has never been stored, then $math{pimisspelt} returns undef, and so one can easily see if a particular key has a defined value

Alternatively, the function exists($math{pimisspelt}) returns false unless the key pimisspelt is present in the hash (even if its value is undef)
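The distinction between a missing key and a key whose value is undef can be seen in a short sketch (the key nothing is added here purely for illustration):

```perl
use strict;
use warnings;

my %math;
$math{pi} = 3.14;
$math{pi} += .0016;                 # hash values work in ordinary arithmetic
$math{nothing} = undef;             # key is present, value is undefined

my $has_pi      = exists($math{pi})         ? "yes" : "no";
my $has_nothing = exists($math{nothing})    ? "yes" : "no";
my $def_nothing = defined($math{nothing})   ? "yes" : "no";
my $has_typo    = exists($math{pimisspelt}) ? "yes" : "no";

print "pi=$math{pi} $has_pi $has_nothing $def_nothing $has_typo\n";
```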

Foil 31 Hashes -- Examples

One can think of a hash as a simple relational database with two columns and multiple rows labelled by keys

For example, hashes can be used to store data defined by MIME or HTTP as these protocols are specified in terms of a set of header statements

Content-type: text/plain # corresponds to
$mime{'Content-type'} = "text/plain"; # and so on (the quotes are needed because the key contains a hyphen)

Similarly this data type can be used to store values read as arguments of a UNIX command as these are either of form

-keyname value # or
-keyname # just to indicate option set (value = yes or no)

Foil 32 Hashes -- Storage and Access

The order of storage of pairs in a hash is arbitrary and nonreproducible

one cannot push or pop an associative array

@listmime = %mime; # produces a list of form (key1,value1,key2,value2 ...)

This list can be manipulated like any list
One can also create a hash by defining such a list where adjacent elements are paired so that in above example

%newmime = @listmime; # creates a hash identical to %mime

One can delete specific pairs by delete command so for example:

%fred = (key1, "one", key2, "two"); # Quotes on key1 optional
delete $fred{key1}; # leaves %fred with one pair (key2,"two")
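Converting a hash to a list and back, and deleting a pair, can be sketched as follows (the header names and values are sample data, not taken from a real message):

```perl
use strict;
use warnings;

my %mime = ('Content-type', "text/plain", 'Content-length', 42);

my @listmime = %mime;               # (key, value, key, value) in arbitrary order
my %newmime  = @listmime;           # adjacent elements are paired up again

delete $newmime{'Content-length'};  # remove one pair from the copy only

my $list_size = @listmime;
my $left      = keys(%newmime);     # keys in scalar context: number of pairs
print "list has $list_size entries; copy now has $left pair(s)\n";
```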

Foil 33 Hashes -- Operators: keys, values, each

keys(%dict) returns a list (array) of keys in %dict (in arbitrary order). This can be used with foreach construct:

foreach (keys(%mime)) { # $_ is default loop variable

print "\$mime{$_} = $mime{$_}\n";

}

values(%dict) is typically less useful. It returns an unordered list of values (with repetition) in hash %dict

each(%dict) returns a single, two-element list containing the "next" key-value pair in %dict. For example,

while ( ($key,$val) = each(%dict) ) { ... }

Every call to each(%dict) returns a new pair until it finally returns an empty list. The next call to each() starts the cycle again...
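The three operators can be compared on a small hash. This is a sketch; the data and variable names are illustrative, and the keys are sorted only to make the output reproducible:

```perl
use strict;
use warnings;

my %dict = ("one", 1, "two", 2, "three", 3);

my @names = sort(keys(%dict));        # keys come back in arbitrary order, so sort them

my $total = 0;
foreach my $val (values(%dict)) {     # values alone, in arbitrary order
    $total += $val;
}

my $pairs = 0;
while ( my ($key, $val) = each(%dict) ) {   # one (key, value) pair per call
    $pairs++;
}

print "names=@names total=$total pairs=$pairs\n";
```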

Foil 34 Basic Input

We've already seen how to read from standard input with

$line = <STDIN>; # return next line INCLUDING terminal newline
@file = <STDIN>; # return whole file with one line stored in each element of list @file

We can also easily access the command-line arguments of a PERL program. Suppose we invoke a PERL program called makePHD at the command line with

% makePHD file1 file2 file3

We will see later how to process these arguments using standard argument-handling conventions in UNIX

However, the Perl diamond operator <> returns the concatenation of these three files. For example,

@files = <>; # concatenate file arguments and store in @files

Foil 35 Basic Output

We have already seen how to use print to output lists:

print @argfiles; # or you can use parentheses

print (@argfiles);

As in C, one can obtain format control with printf, whose first argument is a format string made up of special-purpose format specifiers:

printf ("%10s %6d %10.2f\n", $string, $decimal, $float);

prints three variables with
    $string in a 10 character string field,
    $decimal in a 6 character integer format and
    $float in a ten character field with two decimal places
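The same format string can be tried with sprintf, the standard companion of printf that returns the formatted string instead of printing it. The sample values below are invented for the sketch:

```perl
use strict;
use warnings;

my ($string, $decimal, $float) = ("abc", 42, 3.14159);

# Same format as above: right-justified fields of width 10, 6 and 10
my $formatted = sprintf("%10s %6d %10.2f\n", $string, $decimal, $float);

print $formatted;
print "total length: ", length($formatted), "\n";
```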

Foil 36 Regular Expressions -- Analogy with grep

Regular expressions should be familiar as they are used in many UNIX commands with grep being the best known

% grep pattern file; # Print each line of a file containing pattern

Consider the simple pattern /Fox/. We can write the PERL version of grep as follows:

$line = 0;
while (<>) {
    $line++; # count lines starting from 1
    if ( /Fox/ ) {
        print "$line: $_";
    }
}

Another familiar operator is s in sed (the batch or stream line editor) where

s/Pat1/Pat2/; # substitutes Pat1 by Pat2 in each line

We'll have more to say about the s operator later

Foil 37 Regular Expressions -- Patterns

Simple single-character patterns include:

Single explicit character, e.g., a
Dot . matches ANY character except newline \n

Character class is a single-character pattern represented as a set [c1c2c3...cN] which matches any one of the listed characters

[ABCDE] matches A B C D or E
[0-9] is same as [0123456789]
[a-zA-Z] matches any lower or uppercase letter

Negated character class is represented by a caret ^ after left [ square bracket

[^0-9] matches any character which is NOT a digit
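These single-character patterns can be tested against a few sample words. This is a sketch; the words and array names are made up, and each pattern is matched against the default variable $_:

```perl
use strict;
use warnings;

my (@has_digit, @has_non_digit, @has_a_to_e);
foreach ("Camel", "llama2", "1997") {
    if ( /[0-9]/ )  { push(@has_digit, $_); }      # contains a digit
    if ( /[^0-9]/ ) { push(@has_non_digit, $_); }  # contains a non-digit
    if ( /[A-E]/ )  { push(@has_a_to_e, $_); }     # contains A, B, C, D or E
}

print "digit: @has_digit\n";
print "non-digit: @has_non_digit\n";
print "A-E: @has_a_to_e\n";
```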

Foil 38 Backslash Escapes

Foil 39 Predefined Character Classes in Regular Expressions

\d digits [0-9]

\D NOT digits [^0-9]

\w word characters [a-zA-Z0-9_]

Really anything legal in a PERL variable name
Note \b indicates break between \w and \W

\W NOT word chars [^a-zA-Z0-9_]

\s Whitespace [ \r\t\n\f]

\S NOT whitespace [^ \r\t\n\f]
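
A short sketch of the predefined classes at work. The sample text is an assumption made up for illustration; a match with the /g modifier in list context returns every match:

```perl
use strict;
use warnings;

my $text = "Room 616, CPS_course\tfall 97";   # hypothetical input

my @numbers = $text =~ /\d+/g;        # runs of digits
my @words   = $text =~ /\w+/g;        # runs of word characters (note the _)
my @fields  = split(/\s+/, $text);    # pieces separated by whitespace

print "numbers: @numbers\n";
print "words:   @words\n";
print "fields:  @fields\n";
```

Note that the comma stays attached to 616 in @fields because a comma is neither whitespace nor a word character.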

Foil 40 Repetition in Regular Expressions

Sequence is c1c2c3.. -- a sequence of single characters

c* means "zero or more" instances of character c

c+ means "one or more" instances of character c

c? means "zero or one" instances of character c

By default all matching is greedy -- each repetition "eats up" the maximum number of chars, starting with the leftmost character at which a match is possible

In Perl5, append ? to a repetition operator to override greedy matching
.*?: matches up to the first : in the line while .*: matches up to the last : in the line.

Curly brace notation:

c{n1,n2} means from n1 to n2 instances of character c

c{n1,} means n1 or more instances of character c

c{n1} means exactly n1 instances of character c

c{0,n2} means n2 or fewer instances of character c
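
The difference between greedy and non-greedy repetition, and the curly brace notation, can be sketched as follows. The colon-separated sample line is an assumption for illustration:

```perl
use strict;
use warnings;

my $line = "root:x:0:0:root";          # hypothetical colon-separated line

my ($greedy) = $line =~ /^(.*):/;      # .*  runs to the LAST colon
my ($lazy)   = $line =~ /^(.*?):/;     # .*? stops at the FIRST colon

# Curly braces: between 2 and 3 instances of the character a
my $three = ("aaa"  =~ /^a{2,3}$/) ? 1 : 0;
my $four  = ("aaaa" =~ /^a{2,3}$/) ? 1 : 0;

print "greedy=$greedy lazy=$lazy three=$three four=$four\n";
```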

Foil 41 Anchoring and Alternation in Regular Expressions

In regular expressions, the pipe | (read: or) denotes alternation:

a|b|c is equivalent to the character class [abc]
Fox|Furmanski # alternation is more general than a character class
Fox|Furmansk(i|y|ie) # if we don't know how to spell Polish!

Patterns can be anchored in four ways:

/^regex/ matches regex at the beginning of the string only -- ^ has this special meaning only at the start of the regular expression
/regex$/ matches regex at the end of the string only -- $ has this meaning only at the end of the regular expression
\b matches a word boundary so that
- /Variable\b/ matches Variable but not Variables (in a character class, \b denotes a backspace)
\B matches NOT a word boundary so that
- /Variable\B/ matches Variables but not Variable

Foil 42 Parentheses in Regular Expressions

Parentheses can be used as "memory" for relating different parts of a match or for substitution

If subexpressions are enclosed in parentheses, the text matched by each one is saved in the temporary variables \1, \2, etc.

s/Geoffrey(.*)Fox/Geoffrey ($1) Fox/
when applied to 'Geoffrey Charles Fox' saves ' Charles ' as the first remembered value, which is transferred to the substitution string giving the result 'Geoffrey ( Charles ) Fox'
Note: Use \1, \2, etc. inside the pattern only; use $1, $2, etc. outside the pattern, including in the replacement part of a substitution

Parentheses are also used to clarify the meaning of a regular expression. For instance,

/(a|b)*/ is different from /a|(b*)/

In regular expressions, variables are interpolated as in double-quoted strings. Use \$ to represent a literal dollar sign; an unescaped $ at the end of the pattern is the end-of-string anchor.
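
A minimal sketch of parentheses used as memory. It applies the substitution discussed above to a copy of the string, and then uses \1 inside a pattern to find a doubled letter; the second sample string is an assumption:

```perl
use strict;
use warnings;

my $name = "Geoffrey Charles Fox";

# Copy $name into $marked, then substitute in the copy.
(my $marked = $name) =~ s/Geoffrey(.*)Fox/Geoffrey ($1) Fox/;

# \1 inside the pattern refers back to whatever (.) just matched.
my ($letter) = "hello wood" =~ /(.)\1/;

print "$marked\n";
print "first doubled letter: $letter\n";
```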

Foil 43 Some Regex Examples

An integer with optional plus or minus sign

^(\+|-)?[0-9]+$ (may use [-+]? as well)

Double-quoted strings, with no nested double quotes

"[^"]*" (matches the empty string "")

Double-quoted strings, with Fortran-like nested quotes

"([^"]|"")*"

Double-quoted strings, with C-like nested quotes

"([^"]|\\")*" (bugged! Why?)

"(\\"|[^"])*" (better, but still flawed)

"(\\"|[^"\\])*" (works, but inefficient)

"([^"\\]|\\")*" (works efficiently!)

Double-quoted strings, with other escaped characters

"([^"\\]|\\.)*" (but not escaped newlines)

"([^"\\]|\\(.|\n))*"
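
To see why the escaped-quote patterns matter, the sketch below compares the simple pattern with the one that allows escaped characters. The sample source line is an assumption; it is written in single quotes so that each backslash is a literal character:

```perl
use strict;
use warnings;

# Hypothetical line of C-like source containing escaped quotes.
my $source = 'say "a \"quoted\" word" now';

# No nested quotes allowed: stops at the first " it meets.
my ($naive)   = $source =~ /("[^"]*")/;

# Escaped characters allowed: \" is consumed as a unit.
my ($escaped) = $source =~ /("([^"\\]|\\.)*")/;

print "naive:   $naive\n";
print "escaped: $escaped\n";
```

The outer parentheses are added here only to capture the whole match without using $&.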

Foil 44 Matched Variables in Regular Expressions

The variables \1, \2, \3 etc. correspond to parentheses inside the regex. Outside the regular expression use the PERL variables $1, $2, $3 etc.

In string matching, we identify three parts:

$` is variable holding substring BEFORE matched part
$& is variable holding substring matched by regular expression
$' is variable holding substring AFTER matched part

So original string is the concatenation $` . $& . $'

Note, however, any script that uses these variables suffers a significant performance hit!
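
A small sketch of the three parts of a match. The sample string is an assumption for illustration:

```perl
use strict;
use warnings;

my $string = "CPS616 Tutorial on PERL";   # hypothetical input
my ($before, $matched, $after);

if ($string =~ /Tutorial/) {
    # Copy the special variables at once; the next match overwrites them.
    ($before, $matched, $after) = ($`, $&, $');
}

print "before=[$before] matched=[$matched] after=[$after]\n";
```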

Foil 45 More Regex Examples

/\s0(1+)/ matches a "whitespace" character followed by a zero and one or more 1s -- the set of ones is stored in \1 (and $1)

/[0-9]\.0\D/ matches "the answer is 1.0 exactly" but not "the answer is 1.00" because of \D

In first case, $` is "the answer is ", $& is "1.0 " and $' is "exactly"

/a.*c.*d/ matches "axxxxcxxxxcdxxxxd" with

$` and $' as null and $& as full string (because greedy)

/(a.*b)c.*d/ matches "axxxxbcxxxxbd" with

\1 as "axxxxb" -- note backtracking as greedy (a.*b) first matches to "axxxxbcxxxxb" but then tries again when following c.*d fails to match
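
The examples above can be checked with a short script. The string "x 0111 y" used for the first pattern is an assumption; the other strings are the ones quoted above, with an extra pair of parentheses added where the whole match is wanted:

```perl
use strict;
use warnings;

my ($ones)   = "x 0111 y" =~ /\s0(1+)/;
my $exact    = ("the answer is 1.0 exactly" =~ /[0-9]\.0\D/) ? 1 : 0;
my $padded   = ("the answer is 1.00"        =~ /[0-9]\.0\D/) ? 1 : 0;
my ($greedy) = "axxxxcxxxxcdxxxxd" =~ /(a.*c.*d)/;   # greedy: whole string
my ($memory) = "axxxxbcxxxxbd"     =~ /(a.*b)c.*d/;  # needs backtracking

print "ones=$ones exact=$exact padded=$padded\n";
print "greedy=$greedy memory=$memory\n";
```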

Foil 46 The Matching Operator and Regular Expressions

The result of the expression

$string =~ m/$regex/

is true if and only if the value of $string matches $regex.

For example,

if ( <STDIN> =~ m/^[Tt][Oo]:/ ) { ... }

matches if current input line starts with to: (any case)

Note: m/^to:/i is equivalent to above expression since modifier /i instructs pattern matcher to ignore case

Any delimiter may be used in place of the slash

m%^[Tt][Oo]:% # equivalent to previous expression

The m operator may be omitted, but then slash delimiters are required

Foil 47 The Substitution Operator and Regular Expressions

The substitution operator s has the form:

$line =~ s/regex/replacement/ ;

where regex is a regular expression and replacement is an ordinary (interpolated) string

As with m, the operator s can use any delimiter and so

$line =~ s#regex#replacement# ;

is an equivalent form

In the substitution s/regex/replacement/g the /g causes substitution to occur at all possible places in the string (normally only the first match is replaced)

Note that /i and /g can be used together

In an HTML doc, replace 2x2 with <NOBR>2 x 2</NOBR>

Search: (\d+)x(\d+)

Replace: <NOBR>\1 x \2</NOBR>

PERL: s|(\d+)x(\d+)|<NOBR>$1 x $2</NOBR>|i
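
A sketch of the /i and /g modifiers used together on the substitution above. The sample sentence is an assumption; s returns the number of substitutions made:

```perl
use strict;
use warnings;

my $html  = "A 2x2 matrix and a 10X20 grid";   # hypothetical text
my $first = $html;                             # copy for the one-shot form

# Without /g only the first match is replaced.
$first =~ s|(\d+)x(\d+)|<NOBR>$1 x $2</NOBR>|i;

# With /g every match is replaced; /i lets x also match X.
my $count = ($html =~ s|(\d+)x(\d+)|<NOBR>$1 x $2</NOBR>|gi);

print "$first\n$html\n$count substitutions\n";
```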

Foil 48 More Substitution Examples

In code, replace an array subscript [...] with [n++]

Search: \[[^]]+\]

Replace: [n++]

PERL: s/\[[^]]+\]/[n++]/g

In an HTML doc, replace certain file references in URLs

Search: ys97_(\d\d)/

Replace: ys97_\1/index.html

PERL: s#ys97_(\d\d)/#ys97_$1/index.html#

Again in an HTML doc, replace certain paths in URLs

Search: ([^/])\.\./graphics

Replace: \1../../latex-graphics

PERL: s%([^/])\.\./graphics%$1../../latex-graphics%
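
The three substitutions above can be tried on sample strings. All three inputs below are assumptions invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical inputs.
my $code = "a[i] = b[j+1];";
my $link = 'href="ys97_03/"';
my $path = 'src="x../graphics/fig1.gif"';

$code =~ s/\[[^]]+\]/[n++]/g;                          # every subscript
$link =~ s#ys97_(\d\d)/#ys97_$1/index.html#;           # add file name
$path =~ s%([^/])\.\./graphics%$1../../latex-graphics%; # move up one level

print "$code\n$link\n$path\n";
```

In [^]] the ] directly after the caret is taken literally, so the class means "any character except ]".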

Foil 49 Split and Join Operators

split takes a string and splits it into parts separated by a delimiter, which may be any regular expression. E.g.,

@fields = split(/\s+/, $line); # splits string $line into several fields stored in $fields[0], $fields[1] etc., where these fields were separated by whitespace (\s) in $line

join inverts the operation, although the join string must now be an ordinary single or double-quoted string and not a regular expression, as no matching is occurring!

$line = join( " \t", @fields); # rebuilds $line with space and tab as separator.
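
A minimal round trip through split and join. The sample line is an assumption:

```perl
use strict;
use warnings;

my $line = "alpha  beta\tgamma";         # hypothetical input: mixed whitespace

my @fields  = split(/\s+/, $line);       # any run of whitespace separates fields
my $rebuilt = join(" \t", @fields);      # separator is a plain string

print scalar(@fields), " fields\n";
print "$rebuilt\n";
```

The rebuilt line is not identical to the original, because join always inserts the same separator while split accepted any run of whitespace.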

Foil 50 The index and rindex Functions

$loc = index($string, $substr); # returns in $loc the location (first character in $string is location 0) of first occurrence of $substr in $string.

If $substr is not found, index returns -1

$loc = index($string, $substr, $firstloc); # returns $loc which is at least as large as $firstloc

Use to find multiple occurrences, setting $firstloc as 1+ previously found location

rindex($string, $substr, $lastloc) is identical to index except scanning starts at right (end) of string and not at start. All locations still count from left but if you give a third argument $lastloc, the returned $loc will be at most $lastloc
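
Finding multiple occurrences with the three-argument form of index can be sketched as below. The subroutine name find_all and the sample string are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: return every location of $substr in $string.
sub find_all {
    my ($string, $substr) = @_;
    my @locs;
    my $loc = index($string, $substr);
    while ($loc != -1) {
        push(@locs, $loc);
        $loc = index($string, $substr, $loc + 1);   # restart just past last hit
    }
    return @locs;
}

my @locs      = find_all("banana", "an");
my $last      = rindex("banana", "an");        # scan from the right
my $last_by_2 = rindex("banana", "an", 2);     # result is at most 2

print "locs=@locs last=$last last_by_2=$last_by_2\n";
```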

Foil 51 The substr Function

$partstring = substr($string, $start, $length); # returns in $partstring the partial string starting at position $start of at most $length characters

Missing $length or a huge value for $length returns all characters from starting position to end of $string
Negative values of $start count backwards from end character in $string
$endchar = substr($string,-1,1); # returns last character in $string

substr($string, $start, $length) = $new; # replaces extracted substring with characters in $new which need not be of same length as original

$class = "CPS600";
substr($class, 3) = "616"; # leads to $class = "CPS616"
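
The substr forms above in one short script, using the same course string:

```perl
use strict;
use warnings;

my $class = "CPS600";

my $dept    = substr($class, 0, 3);    # first three characters
my $endchar = substr($class, -1, 1);   # negative start counts from the end
my $tail    = substr($class, 3);       # missing length: rest of the string

substr($class, 3) = "616";             # assign to the extracted part

print "dept=$dept endchar=$endchar tail=$tail class=$class\n";
```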

Foil 52 Functions and Subroutines I

Functions are defined by the sub construct

sub subname {
    statements;
    expression defining returned result;
}

They are invoked as follows:

$sum = add(); # a simple routine with no args
sub add {
    $a1+$a2+$a3; # Sum three global vars
}

You can also use the older syntax:

$sum = &add;

Foil 53 Functions and Subroutines II

A comma-separated list of arguments may be used in the calling sequence

One can write for subroutines or functions (replace subname by any Perl function)

subname LIST; or subname(LIST); # Parentheses are optional

This list can be accessed in function using array @_ with elements $_[0],$_[1] etc.

$sum = add($a1, $a2, $a3); # a similar routine with arguments (can be variable in number)
sub add {
    $_[0] + $_[1] + $_[2]; # sum three arguments
}
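
Since @_ holds however many arguments were passed, the same routine can be sketched for a variable number of arguments. This variant is an illustration, not code from the foils:

```perl
use strict;
use warnings;

# Sum any number of arguments by looping over @_.
sub add {
    my $sum = 0;
    foreach my $term (@_) {
        $sum += $term;
    }
    return $sum;
}

my $three = add(1, 2, 3);
my $five  = add(1, 1, 1, 1, 1);
my $none  = add();                 # empty argument list

print "$three $five $none\n";
```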

Foil 54 Functions and Subroutines III -- local and my

The local construct defines dynamically-scoped variables as in Common Lisp

A call to local does not create a new variable! In effect, it saves the current value of an existing global variable and gives the variable a temporary value; the original value is restored when the block is finished executing

local extends its scope to any function called from the block in which the local statement is defined

Perl5 introduced my, which creates true local variables whose scope is confined to the block in which they are defined

Note: one can use my or local in any block (not just a function) enclosed in { ... } to define temporary variables of limited scope
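
The difference between the two constructs shows up when another subroutine is called from inside the block. In this sketch all names ($mode, show_mode, with_local, with_my) are hypothetical:

```perl
use strict;
use warnings;

our $mode = "global";               # a global (package) variable

sub show_mode { return $mode; }     # always reads the global $mode

sub with_local {
    local $mode = "local";          # temporary value of the SAME global
    return show_mode();             # callee sees the temporary value
}

sub with_my {
    my $mode = "my";                # new variable, visible in this block only
    return show_mode();             # callee still sees the global
}

my $seen_local = with_local();
my $seen_my    = with_my();
print "local: $seen_local, my: $seen_my, afterwards: $mode\n";
```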

Foil 55 Functions and Subroutines IV -- An Example

sub bigger_than {
    my ($test, @values) = @_;     # Split argument list into local variables
    my @result;                   # A place to store result
    foreach my $val (@values) {   # Step through arg list
        if ( $val > $test ) {     # Should we add this value?
            push(@result, $val);  # add to result list
        }
    }
    @result;   # Value of the last expression is what is returned
}   # Could be pedantic and write return @result

Foil 56 Binary Equality Operators

We have already seen equality operators

== ,!= for numerically equal, unequal
eq , ne for stringwise equal, not equal

$a <=> $b returns -1, 0, or 1 according to whether $a is numerically less than, equal to, or greater than $b

$a cmp $b returns -1, 0, or 1 according to whether $a is stringwise less than, equal to, or greater than $b

Foil 57 Sorting with Various Criteria

sort() is a builtin PERL function with three modes:

@result = sort @array; # equivalent to sort { $a cmp $b} @array;

which sorts the variables in @array using stringwise comparisons, returning them in @result

@result = sort BLOCK @array; # where statement BLOCK enclosed in {} curly brackets returns -1, 0, 1 given values of $a, $b

@result = sort { $age{$a} <=> $age{$b} } @array; # sorts by age if entries in @array are keys to hash %age, which holds numeric age for each key

@result = sort SUBNAME @array; # uses subroutine (which can be specified as value of scalar variable) to perform sorting

sub backsort { $b <=> $a; } # Reverse order for integers

@result = sort backsort @array; # sorts in numerically decreasing order
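
The three modes side by side, as a sketch. The hash %age and its contents are made-up sample data:

```perl
use strict;
use warnings;

my %age  = (alice => 30, bob => 25, carol => 40);   # hypothetical data
my @nums = (10, 9, 100);

my @by_name  = sort keys %age;                      # default: stringwise
my @names    = keys %age;
my @by_age   = sort { $age{$a} <=> $age{$b} } @names;   # BLOCK form
my @as_text  = sort @nums;                          # "10" lt "100" lt "9"

sub backsort { $b <=> $a; }                         # SUBNAME form
my @decreasing = sort backsort @nums;

print "@by_name\n@by_age\n@as_text\n@decreasing\n";
```

The default stringwise order puts 100 before 9, which is why numeric data needs the <=> operator.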

Foil 58 The Translation Operator tr

tr/ab/XY/ translates a to X and b to Y in string $_

As for m and s, one can apply tr to a general string with =~

$string =~ tr/a-z/A-Z/; # translates letters from lower to upper case in $string

Note use of - to specify range as in regular expressions, although tr does NOT use regular expressions

tr can count and return number of characters matched

$numatoz = tr/a-z//; # $numatoz holds number of lower case letters in $_

If the second string is empty, no substitutions are made

If the second string is shorter than the first, the last character of the second string is repeated

tr/a-z/A?/; # replaces a by A and all other lower case letters by ?

If the /d option is used, matched characters with no counterpart in the second string are deleted

tr/a-z//d; # deletes all lower case letters

The /c option complements the set of characters in the first string

tr/a-zA-Z/_/c; # translates ALL nonletters into _

The /s option squeezes each run of the same translated character into a single copy
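
The tr options can be sketched on copies of one string. The sample strings are assumptions; the form (my $copy = $s) =~ tr/.../.../ copies first and then translates the copy:

```perl
use strict;
use warnings;

my $s = "Hello World";                       # hypothetical input

(my $upper   = $s) =~ tr/a-z/A-Z/;           # lower to upper case
my $count    = ($s =~ tr/a-z//);             # count only, $s is unchanged
(my $padded  = "abc") =~ tr/a-z/A?/;         # last character ? is repeated
(my $nolower = $s) =~ tr/a-z//d;             # /d deletes matched characters
(my $squeeze = "a,b;;c") =~ tr/a-zA-Z/_/cs;  # /c complements, /s squeezes

print "$upper $count $padded [$nolower] $squeeze\n";
```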

Foil 59 Additional Control Flow Constructs I

last, next and redo provide ways to alter execution flow in loops:

while (something is tested) {
    # redo goes to here
    somecalc1;
    if (somecondition) {
        somecalc2;
    }
    somecalc3;
    # next goes to here
}
# last jumps to here

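last jumps out of the loop

next skips the rest of the current iteration and goes on to the next test of the loop condition

redo jumps back to the start of the loop body without testing the loop condition again

These commands control the innermost enclosing for, foreach, or while loop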
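
A small sketch of next and last in a foreach loop. The numbers are arbitrary sample data:

```perl
use strict;
use warnings;

my @kept;
foreach my $n (1 .. 10) {
    next if $n % 2;      # odd number: skip the rest of this iteration
    last if $n > 6;      # past 6: leave the loop altogether
    push(@kept, $n);
}

print "kept: @kept\n";
```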

Foil 60 Additional Control Flow Constructs II -- Statement Labels

next, redo and last can jump to labelled statements with commands such as

next LABEL1; # or

last LABEL2; # or

redo LABEL3;

LABEL1: This is start of any statement block; # is typical statement label

This is typically used to allow you to jump out over several nests of loops
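
Jumping out over nested loops can be sketched as below. The label ROW and the search for a product of 4 are assumptions chosen for illustration:

```perl
use strict;
use warnings;

my @found;
ROW: foreach my $i (1 .. 3) {
    foreach my $j (1 .. 3) {
        next ROW if $j > $i;          # abandon this row, go to the next $i
        if ($i * $j == 4) {
            @found = ($i, $j);
            last ROW;                 # leave BOTH loops at once
        }
    }
}

print "found: @found\n";
```

Without the label, last would leave only the inner loop and the outer loop would carry on with the next value of $i.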

Foil 61 Additional Control Flow Constructs III -- Accelerated Tests

There are ways of performing simple tests that require fewer curly braces and other punctuation

expr1 if testexp; # is equivalent to

if (testexp) {

expr1;

}

last, redo, and next can be followed by such tests e.g.

last DOREALWORK if userendofinitializationhit ;

There are similar abbreviations for unless,while,until

dothisexpression unless conditionholds;

dostandardstuff while normalconditionholds;

dostandardstuff until specialconditionseen;

Foil 62 Additional Control Flow Constructs IV

thatcommand if thiscondition; # is equivalent to

thiscondition && thatcommand;

PERL will not continue with && (logical and) if it finds a false condition. So if thiscondition is false, thatcommand is not executed

Similarly:

thatcommand unless thiscondition; # is equivalent to

thiscondition || thatcommand;

Note that one can use and instead of && and or instead of ||

not (instead of !) and xor (instead of ^) also allowed

We can use a C-like if-expression

expression ? Truecalc : Falsecalc; # which is equivalent to

if (expression) { Truecalc; } else { Falsecalc; }
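
The equivalences above can be checked by recording which commands actually run. The variable names are hypothetical:

```perl
use strict;
use warnings;

my @trace;
my $thiscondition = 0;                              # a false condition

push(@trace, "if")     if     $thiscondition;       # skipped
$thiscondition && push(@trace, "&&");               # skipped: && stops at false
push(@trace, "unless") unless $thiscondition;       # runs
$thiscondition || push(@trace, "||");               # runs: || goes on after false

my $label = $thiscondition ? "true" : "false";      # C-like if-expression

print "@trace $label\n";
```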

Foil 63 FileHandles

Filehandles, like statement labels, are names written without a special initial character. It is recommended that you use all capitals in such names

STDIN, STDOUT and STDERR (and the null name in the diamond operator <>) have been introduced already; they correspond to UNIX stdin, stdout and stderr (and, for <>, to the concatenation of the argument files)

Filehandles allow you to address general files, and the syntax is similar to the UNIX standard I/O (stdio.h) support

open(FILEHANDLE, "unixname"); # opens file unixname for reading -- can also write "<unixname"
open(FILEHANDLE, ">unixname"); # opens file unixname for writing
open(FILEHANDLE, ">>unixname"); # opens file unixname in append mode

close(FILEHANDLE); # closes file

Errors are handled with die construct:

open(FH, '>' . $criticalfile) || die("Print an error message if file can't be opened\n"); # Note how we add '>' (or '>>') to file name stored in Perl variable
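
A sketch of the three open modes on a scratch file. The file name is an assumption (built in the system temporary directory with the core File::Spec module), and $! is used to include the system's reason in the error message:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical scratch file; $$ is the process number.
my $criticalfile = File::Spec->catfile(File::Spec->tmpdir(), "perl_tutorial_$$.txt");

open(FH, '>' . $criticalfile) || die("Cannot write $criticalfile: $!\n");
print FH "first line\n";
close(FH);

open(FH, '>>' . $criticalfile) || die("Cannot append to $criticalfile: $!\n");
print FH "second line\n";
close(FH);

open(FH, $criticalfile) || die("Cannot read $criticalfile: $!\n");
my @lines = <FH>;                 # array context: read the whole file
close(FH);

unlink($criticalfile);            # remove the scratch file
print @lines;
```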

Foil 64 Using FileHandles and Testing Files

As illustrated, <FILEHANDLE> reads either a single line or the full file depending on whether one stores the result in a scalar or an array variable

print FILEHANDLE list; # writes list onto FILEHANDLE

print list; # is equivalent to

print STDOUT list;

There is a whole set of test operators which act on file NAMES, not file HANDLES

-e $filename returns true if $filename EXISTS

-r $filename returns true if $filename is READABLE

-w $filename returns true if $filename is WRITABLE

-x $filename returns true if $filename is EXECUTABLE

Foil 65 A Perl "Here" Document

Very convenient for text output is "here" document:

print FILEHANDLE <<EOF;
<html>
<head><title>$title</title></head>
<body bgcolor="$bgcolor">
<h1>$title</h1>
</body></html>
EOF

Here EOF is an arbitrary string to denote end of data

Note that variables are interpolated in this syntax, which is equivalent to the double-quoted "" form below but much clearer

print FILEHANDLE "<html>\n<head><title>$title</title></head>\n"; # etc.

Foil 66 Some Special Capabilities in Formatted Writes

Note $| (or $OUTPUT_AUTOFLUSH with use English), if nonzero, forces a flush after every write or print on the current output channel. Default is 0.

Note $: (or $FORMAT_LINE_BREAK_CHARACTERS) is the set of characters on which to break when processing filled continuation lines (caret ^ format)

default is " \n-" to break on a space, newline, or hyphen

Related is $/ (or $INPUT_RECORD_SEPARATOR or $RS), which defaults to newline. It is very useful when processing HTML, where newlines are irrelevant: set $/ to, say, < or > to scan to the next tag or end of tag and ignore newlines

This setting applies to the conventional <FILEHANDLE> syntax
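
A sketch of reading HTML tag by tag with $/ set to >. The HTML fragment is an assumption, and it is read through an in-memory filehandle (opening a reference to a string, available in Perl 5.8 and later) so that no real file is needed:

```perl
use strict;
use warnings;

my $html = "<html>\n<head><title>Hi</title>\n</head>";   # hypothetical input

open(my $fh, '<', \$html) || die("Cannot open in-memory file: $!\n");
my @chunks;
{
    local $/ = ">";                  # records now end at > instead of newline
    while (my $chunk = <$fh>) {
        push(@chunks, $chunk);
    }
}                                    # $/ is restored here
close($fh);

print scalar(@chunks), " chunks\n";
```

Each chunk ends with the > that closes a tag, and newlines are treated as ordinary characters inside a chunk.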

Foil 67 Globbing

We use * notation in the shell to match sets of files -- this is NOT the same as a regular expression, since the shell's * is equivalent to the regular expression .* except that files beginning with . are normally not matched by a simple glob

Presumably glob is "short" for globalize, not globular

@a = <file name with globbing>; returns a list (one per element of @a) of files matching globbed specification

For example @a = <*cps616*>; returns all files in the current directory with the string cps616 somewhere in their name.

Variable interpolation is allowed in globbing, e.g.

$home = "~gcf"; # gcf's home directory is ~gcf

@a = <$home/*>; # returns all non initial . files in gcf's home directory
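
A sketch of globbing in a scratch directory. The file names are assumptions, and the directory is created with the core File::Temp module so that the example does not depend on what is in the current directory:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);          # removed when the script exits
chdir($dir) || die("Cannot chdir to $dir: $!\n");

# Hypothetical files; the last one starts with a dot.
foreach my $name ("cps616_a.txt", "notes_cps616.txt", "other.txt", ".cps616_hidden") {
    open(F, ">$name") || die("Cannot create $name: $!\n");
    close(F);
}

my @a = <*cps616*>;                         # glob in array context
@a = sort @a;
chdir($start) || die("Cannot chdir back to $start: $!\n");

print "@a\n";
```

The file .cps616_hidden is not returned, because a simple glob does not match names beginning with a dot.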

Foil 68 Directory Access

chdir($name); transfers to directory specified in $name

mkdir($name, $mode); # makes directory with given name $name and mode $mode (typically an octal number such as 0755)

opendir(DIRHANDLE, $name); # opens directory with directory handle DIRHANDLE. Such names can be assigned independently of all other names and are in particular not connected with FILEHANDLEs

closedir(DIRHANDLE); # closes directory associated with handle DIRHANDLE

readdir(DIRHANDLE); # returns file names (including . and ..) in directory with handle DIRHANDLE

If scalar result, readdir returns "next" file name
If array result, readdir returns all file names in directory
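
A sketch of the directory functions on a scratch directory. The directory comes from the core File::Temp module, and the names sub and a.txt are assumptions:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);                  # removed when the script exits

mkdir("$dir/sub", 0755) || die("Cannot mkdir: $!\n");
open(F, ">$dir/a.txt")  || die("Cannot create file: $!\n");
close(F);

opendir(DIRHANDLE, $dir) || die("Cannot opendir $dir: $!\n");
my @names = readdir(DIRHANDLE);                   # array result: all names
closedir(DIRHANDLE);

@names = sort @names;
my @visible = grep { !/^\.\.?$/ } @names;         # drop . and ..

print "all: @names\n";
print "visible: @visible\n";
```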

Foil 69 Execution of UNIX Commands -- system

system("shellscript"); # dispatches shellscript to be executed by /bin/sh, and anything allowed by the shell is allowed in the argument

system returns the status returned by shellscript (a wait status: the exit code is the returned value shifted right by 8 bits)

system("date > tempfil"); # executes UNIX command date, sending standard output from date to file tempfil in current directory

system("rm *") && die ("not allowed\n"); # terminates if error in system call, as shell programs return nonzero on failure (opposite of open and most PERL commands)

Variable interpolation is done in double-quoted arguments:

$prog = "nobel.c"; system("cc -c $prog"); # (I) is equivalent here to

$ccompiler="cc";

system($ccompiler,"-c","nobel.c"); # (II) but in general the two are not identical: in the first form (I) the shell interprets the command line, whereas in the second form (II) the arguments are handed directly to the command given as the first entry in the list given to system
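
A sketch of the list form of system and of decoding its return value. To avoid depending on any particular UNIX command, the child here is the Perl interpreter itself (its path is held in the special variable $^X), which is an assumption made for portability:

```perl
use strict;
use warnings;

# List form: no shell is involved, arguments go straight to the command.
my $status    = system($^X, "-e", "exit 3");
my $exit_code = $status >> 8;            # exit code is in the high byte

my $ok = system($^X, "-e", "exit 0");    # 0 means success

print "status=$status exit_code=$exit_code ok=$ok\n";
```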

Foil 70 Processing the Environment %ENV

%ENV is set as the shell environment in which the Perl program was invoked

Any UNIX processes invoked by system, fork, backquotes, or open inherit an environment specified by %ENV at invocation of the child process

One can change %ENV in the same way as any hash:

%ENVIN = %ENV ; $oldpath = $ENV{"PATH"}; # saves input environment

$ENV{"PATH"} = $oldpath . ":/web/cgi"; # resets PATH to include an extra directory to be used by child process -- later we run

%ENV=%ENVIN; # Restores original environment

One can see what has been passed in %ENV by using Perl keys function

foreach $key (sort keys %ENV ) {

print "$key = $ENV{$key}\n"; # both $key, $ENV{} are interpolated

}
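
A sketch showing that a child process sees changes made to %ENV. The variable name TUTORIAL_DEMO is hypothetical, and the child is the Perl interpreter itself so that the example does not rely on a particular UNIX command:

```perl
use strict;
use warnings;

my %ENVIN = %ENV;                              # save input environment
$ENV{"TUTORIAL_DEMO"} = "cps616";              # hypothetical variable

my $perl  = $^X;                               # path of the running perl
my $child = q{print $ENV{TUTORIAL_DEMO}};      # program text for the child
my $seen  = `$perl -e '$child'`;               # child inherits current %ENV

%ENV = %ENVIN;                                 # restore original environment

print "child saw: $seen\n";
```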

Foil 71 Execution of UNIX Commands -- backquotes

$now= "Today's date: " . `date`; # sets $now to be the specified label followed by the result of shell's date command

`who` would naturally return a set of lines and the result can be stored in an array -- one array entry for each output line

Both system and the backquote mechanism invoke a shell command which normally shares standard input, standard output and standard error with the Perl program

This can be reset as for instance in

`rm fred 2>&1`; # using shell syntax to send standard error to same place as standard output
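
A sketch of backquotes in scalar and array context, and of the 2>&1 redirection. The commands are small Perl one-liners run through the interpreter named in $^X, an assumption made so the output is predictable:

```perl
use strict;
use warnings;

my $perl   = $^X;                                # path of the running perl
my $script = q{print "line1\nline2\n"};          # child program text
my $warn   = q{warn "oops\n"};                   # child writes to stderr

my $all   = `$perl -e '$script'`;                # scalar: whole output
my @lines = `$perl -e '$script'`;                # array: one entry per line
my $err   = `$perl -e '$warn' 2>&1`;             # stderr joined to stdout

print scalar(@lines), " lines; stderr text: $err";
```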

Foil 72 Execution of UNIX Commands -- Filehandle Mechanism

open(WHOHANDLE, "who|"); # opens WHOHANDLE for reading output of system call to who

the | at right means we will be able to treat output of who as though we were reading it from a file

@whosaid = <WHOHANDLE> ; # defines an array whosaid holding output of who command

open(LPRHANDLE, "|lpr -Pgcf"); # with | at left opens lpr process so that if we write to filehandle LPRHANDLE it is as though we handed file to input of lpr

print LPRHANDLE "This is a test\n"; # for example

close(LPRHANDLE); # waits until lpr command has finished and closes handle
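
Both directions of the pipe mechanism can be sketched with stand-in commands. Instead of who and lpr, the children here are Perl one-liners, and the scratch file name is hypothetical:

```perl
use strict;
use warnings;
use File::Spec;

my $perl = $^X;                                        # path of the running perl
my $file = File::Spec->catfile(File::Spec->tmpdir(), "perl_pipe_$$.txt");

# | at right: read the child's output as though it were a file.
my $producer = q{print "alice\nbob\n"};
open(WHOHANDLE, "$perl -e '$producer' |") || die("Cannot start reader: $!\n");
my @whosaid = <WHOHANDLE>;
close(WHOHANDLE);

# | at left: what we print becomes the child's standard input.
my $upcase = q{print uc($_) while <STDIN>};
open(UPHANDLE, "| $perl -e '$upcase' > '$file'") || die("Cannot start writer: $!\n");
print UPHANDLE "This is a test\n";
close(UPHANDLE);                                       # waits for the child

open(IN, $file) || die("Cannot read $file: $!\n");
my $result = <IN>;
close(IN);
unlink($file);

print @whosaid, $result;
```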

Foil 73 Execution of UNIX Commands -- fork and exec

This is the most powerful method, with fork creating two identical copies of the program -- parent and child

unless (fork) { ; } # child code goes here: in the child, fork returns 0

; # parent code goes here: in the parent, fork returns the child's process number

The child program typically invokes exec, which replaces the child program by the command given as the argument of exec. Meanwhile the parent should wait until this command is complete and the child has gone away.

unless (fork) {

exec("date"); # child process becomes date command sharing environment with parent

}

wait; # parent process waits until date is complete

The child process need not finish by calling exec(). If the child code was, for instance,

print FILEHANDLE @hugefile; # in parallel with parent

exit; # then an explicit exit is required, else the child will continue with the parent's code, whereas we wanted parent and child to work in parallel on separate jobs
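
A sketch of a child that does not exec but ends with an explicit exit. The exit code 7 is an arbitrary value chosen so that the parent can recognize it:

```perl
use strict;
use warnings;

my $pid = fork;
die("fork failed: $!\n") unless defined $pid;

if ($pid == 0) {
    # Child: do some separate job here, then stop explicitly.
    exit(7);
}

# Parent: wait returns the process number of the finished child.
my $finished   = wait;
my $child_code = $? >> 8;        # exit code of the child

print "child $finished exited with code $child_code\n";
```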

Foil 74 Signals, Interrupt Handlers, kill

The hash %SIG is used to define signal handlers (subroutines) used for various signals

The keys of %SIG are the UNIX names with first SIG removed. For instance, to set handler() as routine that will handle SIGINT interrupts do something like:

$SIG{'INT'} = 'handler';

sub handler { # First argument is signal name

my($sig) = @_;
print("Signal $sig received -- shutting down\n");
exit(0);

}

kill $signum, $child1, $child2; # sends signal $signum to the processes whose numbers are stored in $child1 and $child2

$signum is the NUMERICAL label of the signal (2 for SIGINT); a signal name such as 'INT' is also accepted. $child1 and $child2 are child process numbers as returned to the parent by fork or open(PROCESSHANDLE,..)
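
A sketch of installing a handler and triggering it. To keep the example self-contained, the process sends the signal to itself ($$ is its own process number), and it uses USR1 rather than INT; both choices are assumptions for illustration and presume a UNIX system:

```perl
use strict;
use warnings;

my $got = "";

sub handler {                    # first argument is the signal name
    my ($sig) = @_;
    $got = $sig;                 # just record it instead of exiting
}

$SIG{'USR1'} = \&handler;        # a name string such as 'handler' also works

kill 'USR1', $$;                 # send SIGUSR1 to this very process
sleep 1 unless $got;             # allow time for delivery if needed

print "Signal $got received\n";
```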

Foil 75 The eval Function and Indexed Arrays of Hashes

As in other interpreters (JavaScript, e.g.), PERL allows you to execute a string using the eval function

Suppose you have two arrays, @fred and @jim, and you want to assign $value to element $index of the array whose name ('fred' or 'jim') is held in the ascii string $name (which could be input). This can be achieved by:

eval('$' . $name . '[' . $index . '] = $value;');

eval returns the result of evaluating its argument

In this case, you can achieve the same results with an indexed array of hashes:

$options[$index]{$name} = $value;

using multidimensional array notation
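
Both approaches can be sketched together. The values given to $name, $index and $value are assumptions; $@ holds the error message if the evaluated string fails:

```perl
use strict;
use warnings;

our (@fred, @jim);                                   # the two named arrays
my ($name, $index, $value) = ("fred", 2, "hello");   # hypothetical input

# Build the statement  $fred[2] = $value;  as a string and execute it.
eval('$' . $name . '[' . $index . '] = $value;');
die("eval failed: $@") if $@;

# Same effect without eval: an array whose elements are hashes.
my @options;
$options[$index]{$name} = $value;

print "fred[2]=$fred[2] options[2]{fred}=$options[2]{fred}\n";
```

The second form is usually preferable when $name comes from input, since eval would execute whatever text it is given.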

Foil 76 What is CGI.pm?

CGI.pm, a Perl5 module (by Lincoln Stein) used to write CGI scripts, is documented in Ch. 19 of Learning Perl (second edition). See the CGI.pm man page for details:

% man CGI # read CGI man page

CGI.pm is compatible with the Perl4 library cgi-lib.pl:

require "cgi-lib.pl";

&ReadParse; # initialize global hash %in

print "The value of 'foo' is $in{foo}.\n";

is equivalent to

use CGI qw(:cgi-lib);

&ReadParse; # initialize global hash %in

print "The value of 'foo' is $in{foo}.\n";

Other cgi-lib.pl functions available in CGI.pm:

PrintHeader() HtmlTop() HtmlBot()

SplitParam() MethGet() MethPost()

Foil 77 CGI.pm Import Tags

Groups of CGI.pm methods are loaded via import tags:

:cgi argument-handling methods such as param()

:form HTML form tags such as textfield()

:html2 all HTML 2.0 tags

:html3 all HTML 3.0 tags, including <TABLE>

:netscape Netscape HTML extensions

:shortcuts equivalent to qw(:html2 :html3 :netscape)

:standard equivalent to qw(:html2 :form :cgi)

:all all of the above

Examples:

use CGI; # must use object-oriented syntax!

use CGI qw(:standard);

use CGI qw(:standard :html3);

Foil 78 CGI.pm Syntax

Use object-oriented Perl5 syntax:

use CGI;

my $query = CGI->new(); # new query object

my $val = $query->param('foo');

or ordinary Perl4 function calls:

use CGI qw(:standard);

my $val = param('foo');

With CGI.pm, it's easy to write self-contained CGI scripts:

use CGI qw(:standard);

if ( param() ) {

# process data from HTML form

} else {

# generate HTML form

}

Foil 79 Using CGI.pm - Form Processing

Here's an example of how to use CGI.pm to process form data:

print header(-type=>'text/html'); # MIME header
print start_html( # first few lines of HTML
    -title=>'Pizza Order Form',
    -BGCOLOR=>'#ffffff'
);
print h1( 'Pizza Order' );
print h3( "$TOPPING pizza" ); # $TOPPING = param('topping');
print p( "Deliver to: <B>$ADDR</B>" ); # $ADDR = param('address');
print p( "Telephone: <B>$PHONE</B>" ); # $PHONE = param('phone');
my $date = `date`; chomp($date);
print p( "Order came in at $date" );
print hr();
# Print a link:
print 'Return to ';
print a({href=>"fill-out-form.pl"}, # an anonymous hash
    'HTML form'
);
print end_html(); # last few lines of HTML

Foil 80 Using CGI.pm - Form Generation

Here's an example of a form generated by CGI.pm function calls:

print start_form( # <FORM> tag
    -method=>'POST', # default
    -action=>'fill-out-form.pl'
);
print p( "Type in your street address:\n",
    textfield( # a textfield
        -name=>'address', -size=>36
    )
);
print p( 'What kind of pizza would you like?' );
print blockquote( # requires :shortcuts
    radio_group(
        -name=>'topping',
        -values=>['Pepperoni','Sausage','Anchovy']
    ) # an anonymous array
);
print p( 'To place your order, click here:',
    submit('Order Pizza'), # submit button
    reset('Clear') # reset button
);
print end_form(); # </FORM> tag

© Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

Page produced by wwwfoil on Sun Sep 21 1997