Welcome to my CPS 600 Tutorial Project -- PERL




Table of Contents


How to read this tutorial Perl project?
  1. Introduction to Perl
    1. What is Perl?
    2. Who created Perl?
    3. Perl's license
    4. Which platforms can Perl run on?
    5. Where can I get Perl's information?
    6. How can I begin a perl program?
    7. Is it a Perl program or a Perl script?
    8. Is Perl difficult to learn?
    9. Should I program everything in Perl?
  2. Scalar Data
  3. Array and List Data and Associative Arrays
  4. Control Structures
  5. Basic I/O
  6. Formats
  7. Regular Expressions
  8. Functions
  9. Filehandles and File Tests
  10. File and Directory Manipulation
  11. Converting Other Languages to Perl
  12. Glossaries and/or man pages
  13. Examples
  14. Difference between Perl4 and Perl5
  15. WWW sites for Perl
  16. References

How to read this tutorial Perl project?



  1. Introduction to Perl

    1. What is Perl?
       Perl is short for "Practical Extraction and Report Language". Perl integrates the best features of shell programming, C, and the UNIX utilities -- grep, sed, awk and sh.

    2. Who created Perl?
       The inventor of Perl is Larry Wall.

    3. Perl's license
       Perl is distributed under the GNU General Public License (Perl 5 may alternatively be used under the Artistic License). That means it is essentially FREE.

    4. Which platforms can Perl run on?
       Perl runs on various platforms such as UNIX, UNIX-like systems, Amiga, Macintosh, VMS, OS/2, and even MS-DOS, with maybe more in the near future.

    5. Where can I get Perl's information?
       You can get information from the USENET newsgroup "comp.lang.perl", including how to obtain Perl and how to solve problems with it. Many experts monitor this newsgroup, including the inventor, Larry Wall, so you might get a response within minutes.

    6. How can I begin a perl program?
       The simplest way is to include the following line at the beginning of your Perl file:

          #!/bin/perl

       Use the path where Perl is located on your system. Some systems have:

          #!/usr/local/bin/perl

          your program
            :
            :
            :

       With this line, the system knows where to look for Perl to run the program.

    7. Is it a Perl program or a Perl script?
       It's up to you. If you prefer to call it a Perl program, call it a program. If you like script more, call it a script.

    8. Is Perl difficult to learn?
       No, it's not. Many people (including myself) think Perl is easy to learn. There are some reasons:

    9. Should I program everything in Perl?
       As a matter of fact, you can do anything in Perl. You should not, however. Why? You should use the most appropriate tool for your job. Some people use Perl for shell programming, and some use Perl to replace C programs. Perl, however, is not a good choice for very complex data structures.


  2. Scalar Data

    A scalar is the simplest kind of data that Perl manipulates. A scalar can be either a number or a string of characters.

    Numbers
    Although we can specify integers, floating-point numbers, and so on, internally Perl computes only with double-precision floating-point values.

    Strings
    • Single-Quoted Strings
      A single-quoted string is a sequence of characters enclosed in single quotes.
      Note that special sequences such as \n (newline) are not interpreted within a single-quoted string; only \' and \\ are. For example:
         'hello'	# 5 characters
         'don\'t'	# 5 characters
         'hello\n'	# 7 characters
      
    • Double-Quoted Strings
      A double-quoted string is much like a C string: backslash escapes work within it. The double-quoted string escapes are listed below:
      Escape	| Meaning
      --------+------------------------------------------
      \n	| Newline
      \r      | Return
      \t	| Tab
      \f	| Formfeed
      \b	| Backspace
      \v	| Vertical tab (not recognized in Perl 5 strings)
      \a	| Bell
      \e	| Escape
      \cC	| CTRL + C
      \\	| Backslash
      \"	| Double quote
      \l	| Lowercase next letter
      \L	| Lowercase all following letters until \E
      \u	| Uppercase next letter
      \U	| Uppercase all following letters until \E
      \E	| Terminate \L or \U
      
      Another feature of double-quoted strings is variable interpolation: variable names within the string are replaced by their current values when the string is used.
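
The difference between the two quoting styles can be seen side by side. This is a minimal sketch; the variable names ($name, $single, $double, $shout) are made up for the illustration.

```perl
# Hypothetical variable names, chosen only for this illustration.
$name   = "Neon";
$single = 'Hi $name\n';       # no interpolation; \n stays as two characters
$double = "Hi $name\n";       # $name is replaced; \n becomes a newline
$shout  = "\Uhi\E $name";     # \U uppercases everything up to \E

print length($single), "\n";  # counts the characters of the single-quoted form
print $double;
print "$shout\n";
```

The single-quoted string keeps the dollar sign and the backslash literally, so it is longer than the double-quoted one even though both were typed the same way.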

    Operators
    An operator generates a new value from one or more values. Perl's operators and expressions are generally a superset of those in most programming languages such as C. One thing you might find interesting is that numbers and strings use different comparison operators. The comparison is listed below:
    Comparison		|    Numeric	|     String
    ------------------------+---------------+---------------
    Equal			|	==	|	eq
    Not Equal		|	!=	|	ne
    Less Than		|	<	|	lt
    Greater Than		|	>	|	gt
    Less than or Equal To	|	<=	|	le
    Greater Than or Equal To|	>=	|	ge
    ------------------------+---------------+---------------
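
Why the two operator families matter can be shown by comparing the same pair of values both ways. This is only a sketch with made-up variable names.

```perl
# The same pair of values compared numerically and as strings.
$x = "10";
$y = "9";

$numeric = ($x > $y)  ? "greater" : "not greater";   # 10 > 9 as numbers
$string  = ($x gt $y) ? "greater" : "not greater";   # "1" sorts before "9"

print "numeric: 10 is $numeric than 9\n";
print "string:  \"10\" is $string than \"9\"\n";
```

Choosing the wrong family does not produce an error; it silently gives the other kind of comparison.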
    

    Let's talk more about the scalar data from now on.


  3. Array and List Data and Associative Arrays

    An array is an ordered list of scalar data. Each element of the array is a separate scalar variable with its own value.

    Array Variables

    An array variable holds a single array value (zero or more scalar values). Array variable names are similar to scalar variable names except for the leading character: a scalar variable name begins with a dollar sign ($), but an array variable name begins with an at sign (@), as in @test1.
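
A minimal sketch of an array variable in use follows. The array name @fruit and its contents are invented for the example; push and $#array are standard Perl.

```perl
@fruit = ("apple", "pear");   # an array variable, with the leading @
$first = $fruit[0];           # a single element is a scalar, so it uses $
push(@fruit, "plum");         # append an element to the end
$count = @fruit;              # an array in scalar context gives its length
$last  = $fruit[$#fruit];     # $#fruit is the index of the last element

print "first=$first last=$last count=$count\n";
```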

    Operators for a List Array:


    Associative Arrays

    An associative array is much like a list array. The difference is that a list array uses non-negative integers as index values, while an associative array uses arbitrary scalars. These scalars, called keys, are used to retrieve the corresponding values from the associative array.

    Note that the elements of an associative array have no particular order. Whenever we want a specific value, we use its key to find it. We do not have to worry about how the value is located, because Perl maintains its own internal order for this.

    Most of the time, people want to access individual elements of an associative array rather than the entire array, and the keys are what make this possible. The associative array as a whole is represented as:

        %test
    
    and an element of an associative array is represented as:
        # Notice! The leading character is a dollar sign ($), and the
        # key used here is a scalar variable too.
        $test{$key}
    
    How do we create and/or update an associative array? Here are some examples:
        $test{1} = "Hello";		# creates key 1 and value "Hello"
        $test{2} = 100;		# creates key 2 and value 100
    
    We can also assign the key-value pairs to a list array.
        @test1 = %test;		# @test1 is either (1, "Hello", 2, 100)
        				# or (2, 100, 1, "Hello")
    
    The order of the key-value pairs is arbitrary and cannot be controlled; Perl chooses it for efficient access. A list array can, of course, copy its values back to an associative array.
        %test2 = @test1;		# %test2 is just like %test now.
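
The sketch below builds the same associative array and walks it in a predictable order. It uses the standard built-ins keys, sort, and delete; whether these are the operators the section heading below meant to list is an assumption.

```perl
%test = (1, "Hello", 2, 100);      # the same pairs as above, given as a list

$test{3} = "World";                # assigning to a new key adds a pair
foreach $key (sort keys %test) {   # sorting the keys gives a stable order
    print "$key => $test{$key}\n";
}

delete $test{2};                   # remove a key together with its value
$remaining = join(",", sort keys %test);
print "keys left: $remaining\n";
```

Sorting the keys is the usual way around the arbitrary internal order when output has to be reproducible.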
    

    Operators for Associative Arrays:


  4. Control Structures

    Perl provides several control statements.
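
The tutorial does not list the statements here, so the following is only a sketch of three standard ones (if/else, foreach, while), with invented variable names. It is not a reconstruction of the author's original list.

```perl
@numbers = (3, 8, 5);
$total = 0;

foreach $n (@numbers) {          # visit each element in turn
    if ($n > 4) {                # the body of an if is always a braced block
        $total += $n;
    } else {
        print "skipping $n\n";
    }
}

$countdown = 3;
while ($countdown > 0) {         # repeat while the condition is true
    $countdown--;
}
print "total=$total countdown=$countdown\n";
```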


  5. Basic I/O


  6. Formats

    Perl also provides the notion of a simple report template, called a format. A format in Perl contains two parts: a constant part (the column headers, labels, fixed text, ...) and a variable part (the current data).

    Using a format consists of doing three things:

    1. Defining a format.
    2. Loading up the data to be printed into the variable portions of the format (fields).
    3. Invoking the format.

    Usually, the first step is done once and the other two are done repeatedly.
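
The three steps above can be sketched as follows. The data (%extension and its contents) and the variable names are made up for the illustration; format, write, and the STDOUT_TOP header format are standard Perl.

```perl
# Hypothetical data: names and phone extensions invented for this sketch.
%extension = ("Andrew", 4080, "Neon", 2894);

# Step 1: define the formats (done once). STDOUT_TOP is the constant
# part printed at the top of the page; STDOUT holds the fields.
format STDOUT_TOP =
Name       Extension
---------- ---------
.

format STDOUT =
@<<<<<<<<< @>>>>>>>>
$name,     $ext
.

# Steps 2 and 3, repeated: load the variables, then invoke the format.
foreach $key (sort keys %extension) {
    $name = $key;
    $ext  = $extension{$key};
    write;                        # prints one line using format STDOUT
}
```

In a picture line, @<<< marks a left-justified field and @>>> a right-justified one; the line after it names the variables that fill the fields.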


  7. Regular Expressions

    A regular expression is a pattern, a template to be matched against a string. Regular expressions are used frequently by many UNIX programs, such as awk, ed, emacs, grep, sed, vi, and the various shells. Perl is a semantic superset of all of these tools: any regular expression that can be described in one of the UNIX tools can also be written in Perl, but not necessarily using exactly the same characters.

    In Perl, we can treat the string test as a regular expression by enclosing it in slashes.

        while (<>) {
           if (/test/) {		# braces are required around the body
              print "$_";
           }
        }
    
    What if we are not sure how many e's there are between "t" and "s"? We can do the following:
        while (<>) {
           if (/te*st/) {
              print "$_";
           }
        }
    
    This means "t" is followed by zero or more e's and then followed by "s" and "t".

    We now introduce a simple regular expression operator: substitute. It replaces the part of a string that matches the regular expression with another string. It looks like the s command in sed, consisting of the letter s, a slash, a regular expression, a slash, a replacement string, and a final slash:

        s/te*st/result/;
    
    Here, again, the $_ variable is matched against the regular expression. If it matches, the matching part of the string is replaced by the replacement string ("result"). Otherwise, nothing happens.
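
To try the match and the substitution without typing input, the sketch below runs them over a small list of strings. The sample strings and the names @lines and @changed are invented for the illustration.

```perl
@lines = ("a tst line", "the test file", "teeest again", "nothing here");

foreach $line (@lines) {
    $_ = $line;                  # patterns match against $_ by default
    if (/te*st/) {               # t, zero or more e's, then s and t
        s/te*st/result/;         # replace the first match only
        push(@changed, $_);
    }
}
print join("\n", @changed), "\n";
```

The last string contains no "t" followed by optional e's and "st", so it is skipped and left untouched.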

    Pattern

    A regular expression is a pattern.

    Selecting a Different Target

    Sometimes we do not want to match patterns against the $_ variable. Perl provides the =~ operator for this: the string on its left becomes the target of the match.
        $test = "Good morning!";
        $test =~ /o*/;		# true
        $test =~ /^Go+/;		# also true
    
    The target can also be any expression that yields a scalar, such as a line read from <STDIN>. Note that in that case we never store the input in a variable, so we cannot match the same input again later. This idiom is nevertheless common.

    Ignoring Case

    Sometimes we want a pattern to match both uppercase and lowercase. Some versions of grep provide the -i flag, meaning "ignore case", and Perl has a similar option: append a lowercase "i" to the closing slash, as in /patterns/i.
        <STDIN> =~ /^y/i;		# accepts both "Y" and "y"
    

    Using a Different Delimiter

    We may, sometimes, meet such a situation:
        $tmp =~ /\/etc\/fstab/;
    
    As we know, to include slash characters in the regular expression we need a backslash in front of each one, which looks funny and unclear. Perl allows you to specify a different delimiter character: write an "m" followed by any nonalphanumeric character, and use that character in place of the slashes.
        m@/etc/fstab@		# using @ as a delimiter
        m#/etc/fstab#		# using # as a delimiter
    

    Special Read-Only Variables

    There are three special read-only variables:

    1. $&, which is the part of the string that matched the regular expression.
    2. $`, which is the part of the string before the part that matched.
    3. $', which is the part of the string after the part that matched.

    For example:
        $_ = "God bless you.";
        /bless/;
        # $` is "God " now
        # $& is "bless" now
        # $' is " you." now
    

    Substitutions

    We have seen the simple form of the substitution operator: s/old_regular_expr/replacement_string/. We now introduce something different. If you want to replace all possible matches instead of just the first match, you can append a g to the closing slash.
        $_ = "feet feel sleep";
        s/ee/oo/g;			# $_ becomes "foot fool sloop"
    
    You can also use a scalar variable as a replacement string.
        $_ = "Say Hi to Neon!";
        $new = "Hello";
        s/Hi/$new/;			# $_ becomes "Say Hello to Neon!"
    
    You can also specify an alternate target with the =~ operator.
        $test = "This is a book.";
        $test =~ s/book/desk/;	# $test becomes "This is a desk."
    

    The split() and join() operators

    In Perl, two operators break apart and combine strings: split() breaks a string into a list of fields wherever a regular expression matches, and join() glues a list of values together into one string.
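
A short sketch of both operators follows. The record is a made-up line in the style of a password file, and the variable names are chosen only for the example.

```perl
$record = "root:x:0:0:Super User:/root:/bin/sh";   # hypothetical input line

@fields = split(/:/, $record);        # break the string at each colon
$login  = $fields[0];
$shell  = $fields[$#fields];

$joined = join("-", @fields[0, 2]);   # glue two selected values with a dash
@words  = split(/\s+/, "Say  Hi to   Neon");   # runs of whitespace separate words

print "$login uses $shell\n";
print "$joined\n";
print scalar(@words), " words\n";
```

Note that the first argument of split() is a regular expression, while the first argument of join() is an ordinary string.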


  8. Functions

    We have seen some system functions such as print, split, join, sort, reverse, and so on. Let's take a look at user-defined functions.

    Defining a User Function

    A user function, usually called a subroutine or sub, is defined like:
        sub subname {
            statement 1;
            statement 2;
            statement 3;
            statement 4;
                :
                :
                :
        }
    
    The subname is the name of the subroutine; it can be any name. The statements inside the block are the body of the subroutine. When a subroutine is called, the block of statements is executed and any return value is returned to the caller. Subroutine definitions can be put anywhere in the program; they are skipped during execution. Subroutine definitions are global; there are no local subroutines. If you happen to have two subroutine definitions with the same name, the later one overwrites the earlier one without warning.

    Invoking a User Function

    How can we call a subroutine? We precede the subroutine name with an ampersand (&) when invoking it. (Perl 5 also accepts the form say_hi() without the ampersand.)
        &say_hi;
    
        sub say_hi {
           print "Say Hi to Neon!";
        }
    
    The result of this call will display "Say Hi to Neon!" on screen.

    A subroutine can call another subroutine, and that subroutine can call another, and so on until memory runs out.

    Return Values

    As in C, a subroutine invocation is always part of some expression. The value of the subroutine invocation is called the return value. The return value of a subroutine is the value of the last expression evaluated within the body of the subroutine on each invocation.
        $a = 5;
        $b = 5;
        $c = &sumab;		# $c is 10
        $d = 5 + &sumab;		# $d is 15
    
        sub sumab {
            $a + $b;
        }
    
    A subroutine can also return a list of values when evaluated in an array context.
        $a = 3;
        $b = 8;
        @c = &listab;		# @c is (3, 8)
    
        sub listab {
    	($a, $b);
        }
    
    "The last expression evaluated" means the last expression actually evaluated at run time, not the last expression written in the subroutine. In the following example, the subroutine returns $a if $a > $b; otherwise, it returns $b.
        sub choose_older {
            if ($a > $b) {
                print "Choose a\n";
                $a;
            } else {
                print "Choose b\n";
                $b;
            }
        }
    

    Arguments

    Subroutines are more useful if we can pass arguments to them. In Perl, if the subroutine invocation is followed by a list within parentheses, the list is automatically assigned to a special variable @_ for the duration of the subroutine. The subroutine can determine the number of arguments and the values of those arguments.
        &say_hi_to("Neon");		# display "Say Hi to Neon!"
        print &sum(3,8);		# display 11
        $test = &sum(4,9);		# $test is 13
    
        sub say_hi_to {
    	print "Say Hi to $_[0]!\n";
        }
    
        sub sum {
    	$_[0] + $_[1];
        }
    
    Excess parameters are ignored.
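
How a subroutine can find out the number of arguments, and what happens to excess ones, can be sketched as follows. The subroutine names count_args and sum_two are hypothetical.

```perl
sub count_args {
    scalar(@_);              # @_ in scalar context is the argument count
}

sub sum_two {
    $_[0] + $_[1];           # only the first two arguments are used
}

$how_many = &count_args("a", "b", "c");
$result   = &sum_two(3, 8, 100);     # the 100 is silently ignored

print "$how_many arguments, sum is $result\n";
```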

    What if we want to add all of the elements in the list? Here is an example:

        print &sum(1,2,3,4,5);	# display 15
        print &sum(1,3,5,7,9);	# display 25
        print &sum(1..10);		# display 55 since 1..10 is expanded
    
        sub sum {
            $total = 0;
            foreach $_ (@_) {
                $total += $_;
            }
            $total;                 # last expression evaluated
        }
    

    Local Variables in Functions

    We have seen how to use @_ to access the arguments in a subroutine. You may also want variables that are local to the subroutine. The local() operator creates local versions of a list of variable names. Here is the sum subroutine with the local() operator:
        sub sum {
            local($total);          # let $total be a local variable
            $total = 0;
            foreach $_ (@_) {
                $total += $_;
            }
            $total;                 # last expression evaluated
        }
    
    When the first body statement is executed, any current value of the global variable $total is saved away and a new variable $total is created with an undef value. When the subroutine exits, Perl discards the local variable and restores the previous global value. Here is another example:
        sub larger_than {
    	local($n, @list);
    	($n, @list) = @_;
    	local(@result);
    	foreach $_ (@list) {
    	    if ($_ > $n) {
    		push(@result, $_);
    	    }
    	}
    	@result;
        }
    
        @test1 = &larger_than(25, 24, 43, 18, 27, 36);
        # @test1 gets (43,27,36)
        @test2 = &larger_than(12, 22, 33, 44, 11, 55, 3, 8);
        # @test2 gets (22,33,44,55)
    
    We can also combine the first two lines of the above subroutine.
        local($n, @list) = @_;
    
    This is, in fact, a common Perl idiom. A tip about using the local() operator: put all of your local() operators at the beginning of the subroutine definition, before the main body of the subroutine.
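
The save-and-restore behaviour of local() described above can be observed directly. In this sketch the subroutine names show_total and with_local are hypothetical. Note that a subroutine called from inside with_local sees the local value; Perl 5 additionally offers my(), which creates variables that called subroutines cannot see.

```perl
$total = 100;                 # a global variable

sub show_total {
    "total is $total";        # reports whichever $total is current
}

sub with_local {
    local($total) = 5;        # saves the global value, installs a new one
    &show_total;              # the called subroutine sees the local value
}

$inside = &with_local;
$after  = &show_total;        # the global value has been restored
print "$inside\n$after\n";
```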


  9. Filehandles and File Tests

    What is a Filehandle?

    A filehandle is the name in a Perl program for an I/O connection between your Perl process and the outside world. Like block labels, filehandles are used without a special prefix character, so they might be confused with reserved words. Therefore the inventor of Perl, Larry Wall, suggests using all UPPERCASE letters for filehandle names.

    Opening and Closing a Filehandle

    Most of the time, we want to know whether a file was opened successfully. We can use the die() operator to report a failed open. Usually, we use the following:
       open(FILEHANDLE,"test") || die "Sorry! Cannot open the file \"test\".\n";
    

    Using Filehandles

    Once a filehandle is opened for reading, you can read lines from it just as you read lines from <STDIN>. As with <STDIN>, the newly opened filehandle must be placed in angle brackets. Here is an example that copies one file to another:
        open(FILE1,$test1) || die "Cannot open $test1 for reading";
        open(FILE2,">$test2") || die "Cannot create $test2";
        while (<FILE1>) {		# read a line from file $test1 into $_
           print FILE2 $_;		# write the line into file $test2
        }
        }
        close(FILE1);
        close(FILE2);
    

    File Tests

    Sometimes we want to know whether the file we are going to process exists, or is readable or writable. File tests help here. The following table lists the file tests and their meanings:
    File Test | Meaning
    ----------+--------------------------------------------------
        -r    | File or directory is readable
        -w    | File or directory is writable
        -x    | File or directory is executable
        -o    | File or directory is owned by user
        -R    | File or directory is readable by real user
        -W    | File or directory is writable by real user
        -X    | File or directory is executable by real user
        -O    | File or directory is owned by real user
        -e    | File or directory exists
        -z    | File exists and has zero size
        -s    | File or directory exists and has nonzero size
        -f    | Entry is a plain file
        -d    | Entry is a directory
        -l    | Entry is a symlink
        -S    | Entry is a socket
        -p    | Entry is a named pipe (a "fifo")
        -b    | Entry is a block-special file (a mountable disk)
        -c    | Entry is a character-special file (an I/O device)
        -u    | File or directory is setuid
        -g    | File or directory is setgid
        -k    | File or directory has the sticky bit set
        -t    | isatty() on the filehandle is true
        -T    | File is "Text"
        -B    | File is "Binary"
        -M    | Modification age in days
        -A    | Access age in days
        -C    | Inode-modification age in days
    
    You can check a list of filenames to see whether they exist as follows:
        foreach (@list_of_filenames) {
           print "$_ exists\n" if -e;	# same as -e $_
        }
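
A few of the tests from the table can be tried on a scratch file. The file name below is hypothetical (it embeds the process ID, $$, to avoid clobbering a real file), and the sketch assumes the current directory is writable.

```perl
$file = "filetest_demo.$$";                 # hypothetical scratch file name
open(OUT, ">$file") || die "Cannot create $file: $!";
print OUT "hello\n";
close(OUT);

$exists = (-e $file) ? "yes" : "no";        # the file is there now
$size   = -s $file;                         # nonzero size, in bytes
$is_dir = (-d $file) ? "yes" : "no";        # a plain file is not a directory

unlink($file);
$gone   = (-e $file) ? "no" : "yes";        # after unlink, -e fails

print "exists=$exists size=$size dir=$is_dir gone=$gone\n";
```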
    


  10. File and Directory Manipulation

    Removing a File

    Perl uses unlink() to delete files. Here are some examples:
        unlink("test");		# delete the file "test"
        unlink("test1","test2");	# delete 2 files "test1" and "test2"
        unlink(<*.ps>);		# delete all .ps files, like "rm *.ps"
        				# in the shell
    
    You can also let the user choose the file to delete.
    
        print "Input the filename you want to delete: ";
        chop($filename = <STDIN>);
        unlink($filename);
    

    Renaming a File

    We use mv to rename files in the shell. In Perl, we use rename($old, $new). For example:
        $old = "test1";
        $new = "test2";
        rename($old, $new);		# "test1" is changed to "test2"
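
Both rename() and unlink() report what happened through their return values, which is worth checking. The following is a sketch with hypothetical scratch file names; it assumes the current directory is writable.

```perl
$old = "rename_demo_old.$$";                # hypothetical scratch names
$new = "rename_demo_new.$$";

open(TMP, ">$old") || die "Cannot create $old: $!";
close(TMP);

rename($old, $new) || die "Cannot rename $old to $new: $!";
$moved = (-e $new && !-e $old);             # true when the rename took effect

# unlink() returns the number of files actually removed; the second
# name does not exist, so it is not counted.
$deleted = unlink($new, "no_such_file.$$");
print "moved ok, $deleted file deleted\n" if $moved;
```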
    

    Creating Alternate Names for a File (Linking)

    Making and Removing Directories

    In the shell, we use the mkdir command to make a directory. Perl similarly provides the mkdir() operator, which can also set the permissions at the same time. It takes two arguments: the directory name and the permission. For example:
        mkdir("test", 0755);	# creates a directory called "test"
        				# with permission "drwxr-xr-x"
    
    You can use rmdir(directory_name) to remove a directory, just like rmdir directory_name in the shell.
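
Making and removing a directory can be sketched as below, checking each return value. The directory name is hypothetical. Keep in mind that the permission given to mkdir() is further restricted by the umask of the process.

```perl
$dir = "mkdir_demo.$$";                     # hypothetical scratch directory

mkdir($dir, 0755) || die "Cannot make $dir: $!";
$made = (-d $dir) ? 1 : 0;                  # -d confirms it is a directory

rmdir($dir) || die "Cannot remove $dir: $!";   # works only on empty directories
$removed = (-e $dir) ? 0 : 1;

print "made=$made removed=$removed\n";
```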

    Modifying Permissions

    Just like the chmod command in the shell, Perl has chmod(). Its arguments come in two parts: first the permission number (0644, 0755, ...), then a list of filenames. For example:
        chmod(0644,"test1");	# change the permission of "test1" to
        				# "-rw-r--r--"
        chmod(0644,"test1","test2");	# change the permission of both files
        				# to "-rw-r--r--"
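
To confirm that chmod() did what was asked, the resulting mode can be read back with stat(). This sketch assumes a UNIX-like file system and uses a hypothetical scratch file name.

```perl
$file = "chmod_demo.$$";                    # hypothetical scratch file
open(TMP, ">$file") || die "Cannot create $file: $!";
close(TMP);

$changed = chmod(0644, $file);              # returns the number of files changed
$mode    = (stat($file))[2] & 07777;        # keep only the permission bits

printf("%d file changed, mode is now %04o\n", $changed, $mode);
unlink($file);
```

The leading zero matters: 0644 is an octal number, whereas 644 would be read as decimal and set quite different permissions.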
    

    Modifying Ownership

    Like chown in the shell, Perl has a chown() operator. It takes a user ID number (UID), a group ID number (GID), and a list of filenames. For example:
        # Assume the UID of user "test" is 1234 and its GID is 56.
    
        chown(1234, 56, "test1", "test2");
        # make test1 and test2 belong to test and its default group.
    


  11. Converting Other Languages to Perl

    One of the great things about Perl is that there are programs that convert code written in other languages to Perl.


  12. Glossaries and/or man pages

Please let me know what you think after you take a look at my Perl tutorial page. Just click on my email address below to send me mail. Thank you!


Name: Chang-An Hsiao (Andrew)
Email: chsiao@nj37bf3s.bns.att.com
Address: 613 Center Street Apt#104, Herndon VA 22070-5010
Phone (Home): (703)742-4080
Phone (Work): (703)713-2894
Fax: (703)713-2597