Fox Presentation Spring 96
CPS 616 Computational Science Track on Base Technology for the Information Age: Perl5 and Perl Extensions
Instructors: Geoffrey Fox, Wojtek Furmanski
Updated: September 1997
Syracuse University, 111 College Place, Syracuse, New York 13244-4100

Abstract of Perl5 and Extensions
See the Perl Home Page http://www.perl.com/ for background information and resources such as the manual!
This foilset mainly extends the previous Perl Overview with a discussion of some key Perl5 capabilities. However, some topics may be advanced Perl4 features.
We give an initial summary of Perl5 changes and then discuss:
  Some old and new functions in Perl
  Regular expression enhancements
  New syntax, especially -> and =>
  New subroutine calling and declaration syntax
  Hard (address) and soft (symbol table) references
  General data structures, including multidimensional arrays
  Object-oriented features: packages, classes, and methods

Some Key Advances in Perl5 -- I
  Arbitrary multidimensional arrays using list [] and hash {} subscripts
  Modules allow a convenient library structure
  Functional but slightly ad hoc object-oriented class structure hacked onto existing Perl4
  C/C++ can be called from Perl and vice versa
  tie/untie allow more general database interfaces
  AUTOLOAD allows arbitrary action for undefined subroutines
  Subroutines can now be predeclared before implementation

Some Key Advances in Perl5 -- II
  Significant regular expression enhancements
  Parentheses implemented more cleanly: the backreference ($n) and grouping functions are now separated
  Minimal as well as default maximal (greedy) matching allowed
  Support for comments in regular expressions
  Full support for pointers (references)
  Various useful new functions and operators such as my, qq, qw, q, quotemeta, lc, uc, lcfirst, ucfirst
  Many new pragmas: (use English; use strict qw(vars refs subs);)

Some Advanced Syntax in Perl
New operator => for specifying keyword-value pairs:
    %hash = ( 'key1', 'value1', 'key2', 'value2' );
# is equivalent to
    %hash = (
        'key1' => 'value1', 'key2' => 'value2' );
New operator -> is the dereferencing operator:
  hard/soft reference to array: reference->[index];
  hard/soft reference to hash: reference->{'key'};
  class or object -> method;
In hash subscripts, quotes are now optional if unambiguous, i.e. if the key could not be an expression
$days{'Feb'} and $days{Feb} are the same!

Perl Modules -- Packages - I
    package fred;
    $var = 3.14159;
# defines fred to be a package and var to be a variable in this package, so that from outside the package the variable should be accessed by
    $globalaccess = $fred::var;
Package names can be nested using ::
    package fred;
    ....................
    package fred::jim;
    $var = 3.14159;
# and now we use syntax
    $globalaccess = $fred::jim::var;
Note the nesting is in the name only: package jim; on its own would define $jim::var

Perl Modules -- Packages - II
Modules are packages used as libraries
To reference package HTML::FormatPS, you typically define a file called FormatPS.pm -- note .pm NOT .pl -- in directory HTML
Note the use of UNIX directory structure and file names to support the logical object structure of software -- we saw this quaint (but universal) convention when discussing Java

Example of Module HTML::FormatPS
The file FormatPS.pm starts with the following lines
    package HTML::FormatPS;
    $DEFAULT_PAGESIZE = "A4"; # A package variable
# Followed by pod (plain old documentation) syntax for automatic generation of documentation
=head1 NAME
# This and following lines are JUST documentation and ignored by the interpreter
# This is the title of the documentation
HTML::FormatPS - Format HTML as postscript
# Continued on the next foil ......

Pod Syntax Explaining Use of Module HTML::FormatPS
=head1 SYNOPSIS
    require HTML::FormatPS;
# This part of the documentation defines use of the module
    $html = HTML::Parse::parse_htmlfile("test.html");
# accesses function parse_htmlfile in file Parse.pm in directory HTML
Now define an object of class FormatPS which holds parameters of relevance -- $formatter will hold a pointer to the new object
    $formatter = new HTML::FormatPS FontFamily => 'Helvetica', PaperSize => 'Letter';
# Initial parameters in this invocation of the HTML::FormatPS package
    print $formatter->format($html);
# runs the format method in class FormatPS to produce postscript output

An Aside on Perl Pod Notation
The start of pod information is recognized by a command starting with = in column 1
We give approximate HTML equivalents to give intuition!
=head1 heading Roughly equivalent to

<H1>heading</H1>
=head2 heading Roughly equivalent to
<H2>heading</H2>
=over N Indent by N characters
=item text Roughly
<LI>text
=back Ends the indentation started by =over (roughly the end of the list)
=cut End pod sequence
One uses I<text> to get italic, i.e. <I>text</I> in HTML, B<text> for bold, L<name> for a link, etc.
See the perlpod manual page and look at Perl library code which uses pod notation to generate manual pages

require and use in Perl5
    require Cwd; # Makes notation Cwd:: accessible
    $here = Cwd::getcwd(); # correctly accesses function getcwd in module Cwd
    $here = getcwd(); # looks for getcwd in the current package and probably fails!
On the other hand:
    use Cwd; # Actually imports names (symbol table entries) from Cwd and
    $here = getcwd(); # is equivalent to $here = Cwd::getcwd();
One can also use require with Perl program files -- not just packages
    require "fred.pl"; # reads this file and thereby includes any functions defined in it
This acts like a library call in that it will not read fred.pl again if already done!

Overview of References in Perl5
Perl5 has a rich and at times rather confusing syntax for references or pointers
This new feature is used to allow Perl variables to hold "handles" to objects and so implement an object-oriented environment. So in this sense pointers in Perl combine and do not "properly" distinguish classic pointers and objects.
There are hard references and soft or symbol table references
Hard references are new to Perl5 and much more powerful

Soft Symbol Table References - I
One of Perl's "problems" (also its strength if you are knowledgeable) is that one often needs to understand implementation issues to use it effectively
Every package has a symbol table (i.e. a hash of the symbols used) named after the package followed by ::, so that the main symbol table is %main:: and variable $var in main has symbol table entry $main::{'var'}
*var is equivalent to $main::{'var'}
If the symbol $original exists, we can set
    *var = *original;
# and then $var is another 'name' for $original and @var is another name for @original, etc.
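
The typeglob aliasing described above can be tried in a few lines. This is only a minimal sketch: the variable names are illustrative, and it assumes a more recent Perl 5 than these foils, since it declares the package variables with our so that it runs under use strict.

```perl
use strict;
use warnings;

# Package (symbol table) variables; typeglob aliasing does not apply to my() variables.
our ($original, @original, $var, @var);

$original = "scalar value";
@original = (1, 2, 3);

*var = *original;      # alias the whole symbol table entry

$var = "changed";      # writes through to $original
push @var, 4;          # writes through to @original

print "$original\n";   # prints: changed
print "@original\n";   # prints: 1 2 3 4
```

Assigning to $var or @var after the glob assignment changes the original variables, which is what makes the typeglob usable as a crude "pointer".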
That is, $var, @var and %var share the same symbol table entry but have different hard references

Soft Symbol Table References - II
We can more miraculously set
    $name = "foo"; # define an innocent ascii string
    ${$name} = 6; # sets $foo=6 as $name is treated as a symbolic reference
    $$name = 6; # also sets $foo=6
    $name->[0] = 4; # sets $foo[0] = 4
    ${$name x 2} = 6; # sets $foofoo = 6; remember definition of x for strings
    @$name = (); # sets @foo to null list
while
    &$name(arguments); # calls subroutine foo with given arguments!
    use strict 'refs'; # FORBIDS symbolic references and the above syntax will lead to error messages
    *PI = \3.14159; # ensures that $PI is set in a way that you cannot override, i.e.
    $PI = 3; # generates an error

Hard References in Perl5 - I
Hard references are more powerful than typeglobs and symbolic references and in some cases supersede them
    $scalarref = \$foo; # is a pointer to a scalar
    $getit = $$scalarref; # is the same as $getit = $foo
This is called dereferencing or going from pointer (reference) to value
    $arrayref = \@array;
    $hashref = \%hash;
used as in
    $$hashref{"key"} = "value";

Hard References in Perl5 - II
    $coderef = \&subroutine; # pointer to a subroutine!
accessed by
    &$coderef(arguments);
    $globref = \*STDOUT;
used as in
    print $globref "We used a pointer to file handle\n";
Note in dereferencing, one can use curly braces {} either to disambiguate or to replace the scalar holding the hard reference by a BLOCK returning a reference of the correct type
Thus $$scalarref is equivalent to ${$scalarref}; $$hashref{"key"} is equivalent to ${$hashref}{"key"} or $hashref->{"key"}

Anonymous Data Structures and Subroutines
Often one wishes to construct "unnamed" data structures or subroutines where one keeps track of them by reference as opposed to name
This is natural with subroutines which return either a data structure or a subroutine
    $arrayref = [1, 2, ['a', 'b', 'c'] ]; # $arrayref is a hard reference to a nested array with 5 defined scalar elements
$arrayref->[2][1] will give value 'b' as will $$arrayref[2][1]
$arrayref->[0] is 1 (the first two elements are plain scalars, not references, so they take only one index); $arrayref->[2][3] is undef
    $secretsub = sub { print "Support ECS\n" };
executed by
    &$secretsub;

Data Structures Arrays of Arrays in Perl5 - I
Read Chapter 4 of the Camel Book (second edition) or the manual pages perllol (List of Lists) and perldsc (Data Structures Cookbook) for many excellent examples
    $LoL3D[$x][$y][$z] = scalar func($x,$y,$z); # scalar forces scalar context
The above is a classic (Fortran-like) 3D array except that it need NOT be predeclared, there are no fixed dimensions, and Perl arrays can be ragged with $x=1 having different $y, $z ranges from $x=2, etc.
In ragged arrays, missing elements return undef

Data Structures Arrays of Arrays in Perl5 - II
We can also define $indexed2Dhash[$x][$y]{$z} which should be thought of as a hash labelled by two-dimensional indices
$hashof2Darray{$x}[$y][$z] should be thought of as a hash whose value is a 2D array
One can freely use such data structures as long as one uses the "full" number of indices
Issues that require understanding of the implementation occur when you need to manipulate a structure "as a whole" with fewer than the full number of indices

Implementation Issues - I
All
multi-dimensional data structures are implemented as arrays of references
    for $i (1..10) {
        @list = somefunc($i); # grab a list labelled by $i
        # Store the number of elements in @list:
        $LoL[$i] = scalar @list;
        # Create a fresh row of the 2D array for each $i:
        $LoL2D[$i] = [ @list ]; # use array constructor [ ]
    } # End for loop
Inside the loop one could instead write
    my(@list) = somefunc($i); # my() creates a fresh instance each time
    $LoL2D[$i] = \@list; # also works but is perhaps less clear

Implementation Issues - II
Note my() can occur inside any block { } (not just at the start of a subroutine) and defines variables local to the block
Without my(), the line
    $LoL2D[$i] = \@list;
also creates a 2D array, but \@list is the same location each time and so $LoL2D[$i][$j] gives the same answer (i.e., the final @list returned) regardless of the value of $i
In $LoL2D[$x][$y] one stores an array labelled by $x of hard references
Each hard reference is to an anonymous 1D array whose elements are accessed by $y

The -> Pointer Notation -- I
$LoL2D[$i][$j] can be written equivalently as $LoL2D[$i]->[$j] but NOT $LoL2D->[$i]->[$j] or $LoL2D->[$i][$j] as the left hand side of -> MUST be a reference and NOT an array or hash
    $ref_to_LoL2D = \@LoL2D; # is allowed and now access by
$ref_to_LoL2D->[$i][$j] or $ref_to_LoL2D->[$i]->[$j]
Note [ .. ] or { .. } create anonymous arrays or hashes respectively which can be assigned to a reference and then dereferenced by ->
( ..
) constructs a list which can be assigned to an array or hash

The -> Pointer Notation -- II
    @LoL2D = ( [1,2], [1,2,3] ); # Constructs a 2D array
    $ref_to_LoL2D = [ [1,2], [1,2,3] ]; # creates a pointer to a 2D array
    $arraypt = \@{$LoL2D[$i]}; # extracts a reference to the $i'th row of @LoL2D
$arraypt->[$j] (or $$arraypt[$j]) is equivalent to $LoL2D[$i][$j]
Perl5 arrays of arrays are used much like Fortran multidimensional arrays, but note that it is the right-most index which labels the elements stored together in a single row array (as in C rather than Fortran)
If one has defined attributes for students with components such as $student{"grade"}, then $student[$classmember]{"grade"} is a way to address a class of students

Some Remarks about Subroutines - I
Subroutines must be predeclared or predefined (with the sub command) if they are to be called without parentheses as in
    subname list;
With parentheses, subname(list); can be used anywhere, or one can use
    &subname(list); # essentially equivalent to subname(list)
so that the & notation is typically unnecessary
    use packagename qw( NAME1 NAME2 NAME3 ); # imports routines NAME1 NAME2 NAME3 from package packagename
Notice qw() is a new Perl5 operator which quotes space-separated words
    qw(args); # equivalent to split(' ', q(args));

Some Remarks about Subroutines - II
One can predeclare subroutines with
    sub name; # may be used before implementation appears
Subroutines may be defined anonymously:
    sub newprint {
        my $x = shift;
        # return anonymous subroutine:
        return sub { my $y = shift; print "$x $y!\n"; };
    }
    $h = newprint("Howdy"); # store anonymous subroutine
    &$h("World"); # call anonymous subroutine, which prints "Howdy World!"
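
The closure behaviour of newprint can be checked with a small runnable sketch. It follows the foil's example but is an adaptation rather than the original code: the anonymous subroutine returns its string instead of printing it, purely so the result is easy to inspect, and the names $h and $g are illustrative.

```perl
use strict;
use warnings;

# newprint returns an anonymous subroutine (a closure) that keeps its own copy of $x.
sub newprint {
    my $x = shift;
    return sub {
        my $y = shift;
        return "$x $y!\n";
    };
}

my $h = newprint("Howdy");   # closure holding "Howdy"
my $g = newprint("Hello");   # second closure holding "Hello"

print $h->("World");         # prints: Howdy World!
print $g->("World");         # prints: Hello World!
```

$h->("World") is the arrow form of &$h("World"); both call the subroutine through its reference.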
Note the $x in the anonymous subroutine is private and retains value "Howdy" in $h even when newprint is called again:
    $g = newprint("Hello"); # $g has a separate instance of $x

Some Remarks about Subroutines - III
Note the differences between my() and local()
    my($x); # declares $x to be private to the block
    local($x); # gives the global $x a temporary value known to this block and all routines invoked within the block
A typeglob or symbolic reference can be used to pass arguments by reference and not by value
This has the usual advantage that the subroutine alters the "global" and not a "local" copy -- especially relevant for complex data structures where you do not want the expense of copying
Scalars are always passed by reference (not by value): the elements of @_ are aliases, so by explicitly changing $_[0], $_[1], ... you can affect the caller's scalars

Some Remarks about Subroutines - IV -- Call by Reference
    sub doublearray {
        local(*arraypointer) = @_; # typeglobs must be localized with local(), not my()
        foreach $elem (@arraypointer) { $elem *= 2; }
    } # End routine to double elements of an array
# Suppose @foo and @bar are arrays:
    doublearray(*foo); # doubles elements of @foo
    doublearray(*bar); # doubles elements of @bar

Some Remarks about Subroutines - V -- Separating Arguments
There is a well-known problem with ordinary ways of using Perl subroutine arguments
If one has an argument list such as (@list1, @list2, .... ), then the subroutine sees a single list (array), which is the concatenation of the component lists
This can be avoided using hard references with the \ operator.
For example:
    @tailings = popmany( \@a, \@b, \@c, \@d );
See next foil for code of popmany

Some Remarks about Subroutines - VI -- Separating Arguments
    sub popmany { # See previous foil for use
        my $aref; # A local scalar to hold pointer to array
        my @retlist = (); # An array to hold returned list
        # Pop last element in each input array:
        foreach $aref ( @_ ) { # loop over arguments
            # @$aref is the caller's array pointed to by the argument:
            push(@retlist, pop(@$aref) );
        }
        # @retlist holds last element of each array passed
        return @retlist;
    }

Some Remarks about Subroutines - VII -- AUTOLOAD
One can define a default function AUTOLOAD to resolve unsatisfied subroutine references in a given (set of) packages
You set up AUTOLOAD to deal with this case in whatever way you want!
AUTOLOAD is passed the arguments that were passed to the called subroutine and the name of the unsatisfied external is in variable $AUTOLOAD
    sub AUTOLOAD { # Call UNIX for unsatisfied externals
        my $program = $AUTOLOAD;
        $program =~ s/.*:://; # remove any package prefix
        system($program, @_);
    }
    date(); # will be executed correctly by above AUTOLOAD

Perl5 Object Model -- I
The object model in Perl5 is not as clear as in Java as the concepts are mixed up with the implementation. We see the same flaws in JavaScript where we "violate" modular programming principles by mixing concept and implementation in the very technology which is designed to help programmers keep these separate in their own programs!
Objects are references -- not directly variables -- they are typically references returned from subroutines as anonymous data structures
Objects (references) must be "blessed" so they remember what class (module or package) they come from
    bless ($self, $class);
    return $self;
# is the classic way for a constructor subroutine new to end, with $self the data structure you wish to be stored in the object

Perl5 Object Model -- II
Further properties of objects:
Objects are created by a "constructor" which is any suitable function in the given module, but CONVENTION says one should call this constructor new, so new is an (arbitrary) name for a subroutine and not a language construct
A method is a conventional Perl5 subroutine defined in the given class (package) which expects its first argument to be either an object reference or, for static methods (independent of object instance), the class name
The class name IS the package name

Constructor for class HTML::FormatPS - I
    $formatter = HTML::FormatPS->new(FontFamily => 'Helvetica', PaperSize => 'Letter');
# Perl calls subroutine new in package HTML::FormatPS with the class name as first argument; this creates an instance of the class given in the first argument, with the following arguments overriding default parameters
$formatter holds a reference to a blessed hash
Remember => is just a comma and the arguments to a Perl subroutine are just a single list -- here of 5 entities
In package HTML::FormatPS subroutine new looks like this (continued on next foil)
    sub new {
        my $class = shift; # set $class to the package name and remove it from the argument list using the shift function, which takes @_ as default argument in a subroutine

Constructor for class HTML::FormatPS - II
        # Set up defaults in hash $self which is blessed
        my $self = bless {
            family => "Times",
            mH => mm(40), # mm is local subroutine converting millimetres to points
            mW => mm(20),
            printpageno => 1,
            fontscale => 1,
            leading => 0.1,
        }, $class; # second argument to bless is class name
        $self->papersize($DEFAULT_PAGESIZE);
# To be continued on next foil

Constructor for class HTML::FormatPS - III
        # Parse constructor arguments (might
        # override defaults)
        while (($key, $val) = splice(@_, 0, 2)) { # get the next two elements from @_ into $key, $val and remove them from @_
            # Here process key value pair and set $self hash as appropriate
            # See original for details which are irrelevant here
        }
        return($self); # return data structure
    }

splice function in Perl
    splice ARRAY,OFFSET,LENGTH,LIST
removes LENGTH elements starting at position OFFSET in ARRAY and replaces them by the elements (if any) in LIST
shift(@a) is equivalent to splice(@a,0,1)

A Hash of Arrays in class HTML::FormatPS
# We show this example of $PaperSizes{}[] which is a hash whose values are references to 1D arrays
    %PaperSizes = (
        A3 => [mm(297), mm(420)], # mm is a subroutine local to the module
        A4 => [mm(210), mm(297)], # which converts millimetres to points
    );
so $PaperSizes{'A4'} returns a reference to a two-element array (width, height)

Example of Method in class HTML::FormatPS
$self is of course a pointer to the data structure and so the following code will alter the object passed in the first argument
    sub papersize {
        my($self, $val) = @_; # $self is reference to object
        $val = "\u\L$val"; # Uppercase first, lowercase following letters in string $val
        my($width, $height) = @{$PaperSizes{$val}};
        return 0 unless defined $width;
        $self->{papersize} = $val; # reset object attributes
        $self->{paperwidth} = $width;
        $self->{paperheight} = $height;
        1; # return 1 (or 0 if error) BUT not $self
    }

Inheritance in Perl5
This is implemented "by hand" using @ISA, which can be defined for any package and contains the list of packages to be searched for methods not found in the package itself
    package Fred;
    require Exporter; # Make package Exporter available to Fred
    @ISA = qw(Exporter); # Exporter is to be searched for unsatisfied method calls
# See Exporter manual page for more details
Of course the AUTOLOAD mechanism kicks in as the technique of last resort if a subroutine cannot be found anywhere else

Some Predefined Variables in Perl - I
There are the original cryptic two character names and those with a more mnemonic value which are accessible if one invokes
    use English; #
pragma
Here are a few examples -- we gave some of the predefined variables defining formatted output in the first Perl foilset
$ARG or $_ is the default variable when nothing is specified
    s/rubbish//; # is equivalent to $_ =~ s/rubbish//;
    chomp(); # is equivalent to chomp($_); # etc.

Some Predefined Variables in Perl - II
A match m/regexp/; or equivalently an s/// pattern match sets
  $MATCH or $& to the matched string
  $PREMATCH or $` to the string before the matched string
  $POSTMATCH or $' to the string after the matched string
  $LAST_PAREN_MATCH or $+ to the material in the last parenthesis matched -- useful when | syntax in a regular expression makes it unclear which $n is set for the last parenthesis

Some functions equivalent to Different Quotes
q(string) or qDstringD for any delimiter D interprets string as a literal
This delimiter use is like mDregexpD or sDregexp1Dregexp2D
The one q in q() denotes single quotes: q(string) is equivalent to 'string' except that it works even if there is an unprotected ' in string
qq(string) or qqDstringD is similarly equivalent to "string" except you do NOT need to protect " inside it
However $variables and \n etc. are interpolated inside the string
The two q's in qq() denote double quotes
qx(string) is similarly equivalent to `string` -- the x in qx stands for execute

Quotemeta() and \Q .. \E Construct
quotemeta("string") protects all regular expression metacharacters
[ab] matches a or b in a regular expression
quotemeta("[ab]") becomes \[ab\] and matches the string [ab]
quotemeta("string") is equivalent to "\Qstring\E" where "string" is interpolated before protection

(?..) Constructs in Regular Expressions - I
(?#comment) is a comment in a regular expression
    m/[aA][bB](?#matches aB Ab ab AB)/; # is an example
The /x modifier treats whitespace in a regular expression as being there for readability and not as "real" characters -- the Perl manual gives an example, shown below, which removes /* .. */ comments from C programs
    $program =~ s{
        /\* # match opening delimiter /*
        .*?
            # minimal match to anything
        \*/ # closing delimiter */
    }[]gsx;
# Replace with nothing (i.e. remove) and specify modifiers g, s and x to be operational

(?..) Constructs in Regular Expressions - II
Modifier g ensures we match and remove all /* .. */ strings
Modifier s lets . match newlines, so that comments spanning several lines are matched
Modifier x means that whitespace is ignored and # starts a comment as in conventional Perl
Note the use of syntax s{old}[new] as equivalent to s%old%new%
One can use any type of brackets such as s(old){new} etc.

Minimal Matching in Regular Expressions
The default pattern matching in Perl is greedy or maximal size matching
There is now the ? suffix to select the match of minimum size
  *? represents minimal match 0 or more times
  +? represents minimal match 1 or more times
  ?? represents minimal match 0 or 1 times
  {n}? minimal match exactly n times
  {n,}? minimal match at least n times
  {n,m}? minimal match at least n but not more than m times

(?..) Constructs in Regular Expressions - III
(?:regexp) means a simple grouping of regexp as a unit -- equivalent to (regexp) except it does not generate a $n reference
regexp1(?=regexp2) matches regexp1 followed by regexp2 but regexp2 is not considered part of the match, i.e. regexp1 is in $MATCH and regexp2 is at the start of $POSTMATCH
regexp1(?!regexp2) matches regexp1 NOT followed by regexp2
(?i) (?m) (?imsx) etc. are equivalent to specifying modifiers i, m or i and m and s and x respectively

Some Further Perl4 and Perl5 Functions -- do and glob!
    do "filename.pl";
or more generally
    do EXPR; # where EXPR returns a string which is taken as a filename
Perl executes the contents of the file specified by the string
This is a good way of loading in a block of subroutines
glob EXPR returns the value of EXPR with conventional UNIX shell filename expansion
glob 'string' is equivalent to <string> where string has UNIX * wildcarding

Upper and Lower case Functions
We have learnt about \L (lower case characters until \E) and \l (lower case next character) and the corresponding upper case \U \u so that "\u\LfOX\E" is "Fox" etc. (\E optional here)
There is a set of functions implementing these so that
  lc(STRING) converts STRING to lower case
  lcfirst(STRING) converts the first character in STRING to lower case
  uc and ucfirst play the same role for upper case
  ucfirst(lc 'fOX') returns 'Fox'

The defined undef and exists functions
    defined(expr); # where expr is typically a variable such as $list[7]
returns true if expr is defined (i.e. not equal to undef)
    undef $scalar; undef @list; undef %hash;
# makes the variable passed as argument undefined (arrays and hashes are emptied) -- the argument can also be things like $hash{key}
undef on its own represents the undefined value for returning from subroutines etc.
undef can be very useful -- for instance you may wish to reuse a hash %parms and execute undef %parms before re-use.
    exists($hash{place});
returns true if place has been stored as a key of %hash -- note this tests the existence of the associative array key -- the value $hash{place} may still be undefined!
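
The difference between exists and defined, and the effect of undef on a whole hash, can be seen in a short sketch. The hash name %parms comes from the foil; the keys and values are made up for illustration.

```perl
use strict;
use warnings;

my %parms = (place => undef, name => 'Fox');

my $exists_place  = exists  $parms{place} ? 1 : 0;   # key is present ...
my $defined_place = defined $parms{place} ? 1 : 0;   # ... but its value is undef
my $exists_other  = exists  $parms{other} ? 1 : 0;   # key was never stored

undef %parms;                                        # empty the hash before re-use
my $keys_left = scalar keys %parms;

# prints: exists=1 defined=0 other=0 left=0
print "exists=$exists_place defined=$defined_place other=$exists_other left=$keys_left\n";
```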

The map and grep functions
    @chars = map( expr , @nums);
Here expr is some expression or BLOCK which accesses variable $_
map sets $_ to successive values in list @nums and returns the successive results of evaluating expr with each value of $_
These results are evaluated in list context and each may contribute zero, one or more entries to the total list @chars
A similar construct is grep, which acts like map but returns a list containing just the entries in @nums for which expr is TRUE
This is clearly like UNIX grep as
    grep( /regexp/, @listoflines ); # returns just those lines matching regexp
$_ is set by reference to $listoflines[0...] and so altering $_ will alter the original @listoflines

pack and unpack Functions -- I
These are generalized tr-like functions which convert a list to a string (pack) or a string to a list/array (unpack) according to a template
We can illustrate with a fragment from a CGI script that reads and interprets data sent to a server in the peculiar application/x-www-form-urlencoded coding scheme
    $value =~ tr/+/ /; # Convert + back to blanks
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;

pack and unpack Functions -- II
The last line finds encoded hex characters %XY with X,Y in set A-F,a-f,0-9 and replaces them by the "real" character representation
$1 is of course the matched XY pair and "C" in the first argument of pack tells pack to generate a character
hex() is a Perl built-in function to convert hex strings to numbers
General syntax is pack(TEMPLATE,LIST) where TEMPLATE is a string of identifiers specifying the output style of successive items (A is ASCII string, l signed long, p pointer, f float etc.)
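
The decoding fragment can be wrapped in a small subroutine and run on a sample string. This is a sketch only: the name url_decode and the test string are hypothetical, and a real CGI script would normally use a library routine rather than hand-written decoding.

```perl
use strict;
use warnings;

# Decode one application/x-www-form-urlencoded value.
sub url_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;                                     # + back to blank
    $value =~ s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;   # %XY back to a single character
    return $value;
}

my $decoded = url_decode("Syracuse+University%2C+NY%21");
print "$decoded\n";   # prints: Syracuse University, NY!
```

The tr step has to come first: a literal plus sign arrives encoded as %2B, and decoding it before the tr would wrongly turn it into a blank.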

ref and scalar functions
ref EXPR returns FALSE unless EXPR is a reference (pointer)
If it is a reference, then ref returns
  the package name if EXPR is blessed
  REF if a reference to another reference
  CODE if a subroutine
  GLOB if a typeglob (e.g. a filehandle)
  SCALAR, ARRAY or HASH if one of the three basic types
scalar EXPR forces EXPR to be evaluated in scalar (as opposed to list) context and returns the scalar result
We do not need a "list" command as [ ] constructs anonymous arrays and ( ... ) generates a list trivially

tie() and untie() Perl Functions - I
tie() and untie() are described in the perltie manual page
These generate "enchanted" variables (magical blessing!) which allow one to access what seems to be an ordinary variable in a Perl program but, behind the scenes, the implementation of the variable does a lot of work which could include computation, data access etc.
Examples in perltie include:
  tie a scalar $priority which returns process or user priority by accessing the UNIX system
  tie a hash to a database so that $tiedhash{lookupkey} returns the value of lookupkey in the database
This latter tie is most powerful and could involve SQL access to the database server when the tied hash is accessed

tie() and untie() Perl Functions - II
    tie VARIABLE, CLASSNAME, LIST
VARIABLE is a scalar, array or hash to be enchanted
Note a given type of tie can have several variables tied, which differ by their initial conditions; these are specified in LIST, which is handed to the constructor
The CLASSNAME is a module which must have some special ENCHANTED routines defined -- these are constructors TIESCALAR(), TIEARRAY() or TIEHASH() for the three variable types respectively and further functions to define operational access to variables, which are listed on the following foil

tie() and untie() Perl Functions - III
  FETCH -- get variable value
  STORE -- store variable value
  DESTROY -- destroy variable
Tied hashes should also provide
  EXISTS (implements exists)
  DELETE (delete one key)
  CLEAR (clear out all keys)
  FIRSTKEY(), with which, together with NEXTKEY(), the system implements keys()
and each()
Note the user provides these functions as they depend on the particular behind-the-scenes manipulations
Note a package can also specify routines BEGIN and END, which Perl runs as soon as they have been compiled and as the interpreter exits, respectively

Interfacing Perl with C
One can interface Perl5 code with C using an interface constructor called xsubpp
This processes a description of an existing C library and allows it to be accessed through a designated Perl module
Simple datatypes can be handled automatically but the user must manipulate complex C data structures in special xsub code; see the perlxs and perlxstut manual entries
Also the perlcall and perlembed manual pages describe how to call Perl from C
One can add a Perl interpreter to your C program and execute a Perl statement such as a pattern match or execute a full Perl subroutine
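
Returning to the tie() foils above, the list of enchanted routines is easier to follow with a concrete class. The sketch below is an assumption-laden illustration, not code from perltie: UpperHash is a hypothetical class that folds every key to upper case, and it keeps its data in an ordinary in-memory hash where a real application might talk to a database.

```perl
use strict;
use warnings;

# UpperHash: hypothetical tied-hash class that folds all keys to upper case.
package UpperHash;

sub TIEHASH {
    my ($class, %init) = @_;                 # LIST from tie() arrives here
    my $self = bless { data => {} }, $class;
    $self->STORE($_, $init{$_}) for keys %init;
    return $self;
}
sub STORE    { my ($self, $key, $val) = @_; $self->{data}{uc $key} = $val }
sub FETCH    { my ($self, $key) = @_; return $self->{data}{uc $key} }
sub EXISTS   { my ($self, $key) = @_; return exists $self->{data}{uc $key} }
sub DELETE   { my ($self, $key) = @_; return delete $self->{data}{uc $key} }
sub CLEAR    { my ($self) = @_; %{ $self->{data} } = () }
sub FIRSTKEY {
    my ($self) = @_;
    my $reset = keys %{ $self->{data} };     # calling keys resets the each() iterator
    return each %{ $self->{data} };          # scalar context: returns the key only
}
sub NEXTKEY  { my ($self) = @_; return each %{ $self->{data} } }

package main;

tie my %city, 'UpperHash', syracuse => 'New York';   # calls TIEHASH
$city{boston} = 'Massachusetts';                     # calls STORE

my $state   = $city{Syracuse};                       # calls FETCH
my @keys    = sort keys %city;                       # calls FIRSTKEY / NEXTKEY
my $has_bos = exists $city{BOSTON} ? 1 : 0;          # calls EXISTS

print "$state; @keys; $has_bos\n";   # prints: New York; BOSTON SYRACUSE; 1
untie %city;
```

The program only ever uses ordinary hash syntax on %city; every access is routed to the corresponding method of the class named in the tie call.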