Given by Nancy J. McCracken at ECS400fall96 Senior Undergraduate Course on Fall Semester 96. Foils prepared 10 Sept 1996
CGI, the Common Gateway Interface, is the scheme for interfacing other programs and systems to the HTTP Web protocol, using the same data protocols as HTTP clients and servers.
Nancy McCracken
ECS 400, Software Technologies for the WWW
3-234 Center for Science and Technology
Syracuse University
111 College Place
Syracuse NY 13244-4100
May 28, 1996
The client sends a request, conforming to the URL standard and formatted with a MIME header, to the server.
The server parses the request and decides what to do:
The CGI program parses the input from the server and MUST generate a response. Even if there is no data to send back, the CGI program must send an error or an empty message, because the HTTP connection is still open and must be closed by the server. The CGI program will send a header to the server:
When the CGI program terminates, the server closes the connection.
This example consists of a simple form with just a submit button to activate the CGI program. Note that no data is sent from the form to the CGI program in this simple example.
The Perl program returns output which is properly formatted HTML. The server returns it to the browser, which displays it as a page.
Returning the HTML output is simple, because the server and browser handle the encoding and decoding of the MIME-formatted message. The complications arise in sending text from the form to the CGI program: there are several ways to do it, and the CGI program must decode the message.
Two environment variables, QUERY_STRING and PATH_INFO, are used to pass data to the CGI program, and there are several ways to set them.
Using a normal HTML link to pass data:
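As a rough illustration of passing data through a link, the sketch below reports the two variables. The link shown in the comment and the subroutine name describe_request are made up for this example: extra path text after the script name would arrive in PATH_INFO, and anything after the "?" in QUERY_STRING.

```perl
#!/usr/local/bin/perl
# Sketch only. For a hypothetical link such as
#   <a href="http://www.some.box/name.pl/reports/1996?sort=date">
# the server would set PATH_INFO to "/reports/1996" and
# QUERY_STRING to "sort=date" before running name.pl.

# Build a plain-text report of the two values (hypothetical helper).
sub describe_request {
    my ($path_info, $query) = @_;
    $path_info = "" unless defined $path_info;
    $query     = "" unless defined $query;
    return "PATH_INFO is \"$path_info\"\nQUERY_STRING is \"$query\"\n";
}

print "Content-type: text/plain\n\n";
print describe_request($ENV{PATH_INFO}, $ENV{QUERY_STRING});
```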
When a form uses the method "get" to pass data, all the input that the user types is put in the QUERY_STRING variable:
<form method=get action="http://www.some.box/name.pl">
Type your first name <input name="First Name"><br>
Type your last name <input name="Last Name"><br>
<input type=submit value="submit">
</form>
If the user types "Winona" and "Ryder" as values, the QUERY_STRING environment variable would have the encoded value "First+Name=Winona&Last+Name=Ryder".
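To make the encoding concrete, here is a minimal sketch of what the browser does to the field names and values before sending them. The subroutine names encode_field and encode_form are invented for this illustration, and the set of characters left unescaped is a simplifying assumption.

```perl
#!/usr/local/bin/perl
# Sketch of form (URL) encoding; helper names are hypothetical.

# Encode one name or value: special characters become %XX, spaces become "+".
sub encode_field {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.\- ])/sprintf("%%%02X", ord($1))/ge;
    $text =~ tr/ /+/;
    return $text;
}

# Encode a flat list (name, value, name, value, ...) into one string,
# joining each pair with "=" and the pairs with "&".
sub encode_form {
    my @pairs = @_;
    my @fields;
    while (@pairs) {
        my $name  = shift @pairs;
        my $value = shift @pairs;
        push @fields, encode_field($name) . "=" . encode_field($value);
    }
    return join("&", @fields);
}

print encode_form("First Name", "Winona", "Last Name", "Ryder"), "\n";
# prints First+Name=Winona&Last+Name=Ryder
```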
Here is a Perl program that would print that string in the web client's window (without regular HTML tags):
#!/usr/local/bin/perl
# The header line and the blank line after it must come first.
print "Content-type: text/html\n\n";
# Echo the encoded form data exactly as the server passed it.
print "You typed \"$ENV{QUERY_STRING}\" in the input boxes\n";
METHOD=GET is NOT RECOMMENDED, as input that is too long can be lost!
The web server also makes available information about the user and the server, including such things as what type of browser made the request.
A list of environment variables available to the CGI program:
GATEWAY_INTERFACE    REMOTE_HOST
SERVER_NAME          REMOTE_ADDR
SERVER_SOFTWARE      AUTH_TYPE
SERVER_PROTOCOL      REMOTE_USER
SERVER_PORT          REMOTE_IDENT
REQUEST_METHOD       CONTENT_TYPE
PATH_INFO            CONTENT_LENGTH
PATH_TRANSLATED      HTTP_FROM
SCRIPT_NAME          HTTP_ACCEPT
DOCUMENT_ROOT        HTTP_USER_AGENT (browser)
QUERY_STRING         HTTP_REFERER
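A CGI program reads these variables through Perl's %ENV hash. The following sketch prints a handful of them as plain text; the subroutine name env_report and the choice of variables are only for illustration, and a given server may leave some of them unset.

```perl
#!/usr/local/bin/perl
# Sketch: report selected CGI environment variables (helper name is hypothetical).

# Takes a reference to an environment hash and a list of variable names.
sub env_report {
    my ($env, @names) = @_;
    my $report = "";
    foreach my $name (@names) {
        my $value = defined $env->{$name} ? $env->{$name} : "(not set)";
        $report .= "$name = $value\n";
    }
    return $report;
}

print "Content-type: text/plain\n\n";
print env_report(\%ENV, "REQUEST_METHOD", "REMOTE_HOST",
                 "HTTP_USER_AGENT", "SERVER_NAME");
```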
It is recommended to use a form with METHOD=POST, which safely passes any amount of data to the CGI program through STDIN.
This subroutine works with either the GET or POST method, reading the user input string from the form into a scalar variable "$in". It then splits this string into the array "@in", where each element contains the encoded string for one field.
For each field string, the subroutine decodes all the encoding symbols. It then creates an associative array "%in" with one keyword,value pair for each field of the web form.
This subroutine can be used without change in any Perl CGI program.
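The description above can be sketched roughly as follows. This is not the course's own subroutine; the names read_form_data and parse_form_data are invented here, and joining repeated fields with a "\0" character is one common convention, assumed for the example.

```perl
#!/usr/local/bin/perl
# Sketch of a form-reading routine; subroutine names are hypothetical.

# Fetch the raw form data: QUERY_STRING for GET, STDIN for POST.
sub read_form_data {
    my $in = "";
    my $method = $ENV{REQUEST_METHOD} || "";
    if ($method eq "GET") {
        $in = defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : "";
    } elsif ($method eq "POST") {
        read(STDIN, $in, $ENV{CONTENT_LENGTH} || 0);
    }
    return $in;
}

# Split the encoded string into fields and decode each name and value.
sub parse_form_data {
    my ($in) = @_;
    my %in;
    foreach my $field (split(/&/, $in)) {
        next if $field eq "";
        my ($key, $val) = split(/=/, $field, 2);
        $val = "" unless defined $val;
        foreach ($key, $val) {
            tr/+/ /;                               # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX encodes one character
        }
        # Assumed convention: repeated fields are joined with "\0".
        $in{$key} = exists $in{$key} ? join("\0", $in{$key}, $val) : $val;
    }
    return %in;
}

my %in = parse_form_data(read_form_data());
print "Content-type: text/plain\n\n";
foreach my $key (sort keys %in) {
    print "$key: $in{$key}\n";
}
```

With the form shown earlier, $in{"First Name"} would hold "Winona" and $in{"Last Name"} would hold "Ryder".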
All output written by the CGI program to STDOUT is taken by the server to process. The output should start with a header in one of three types:
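As a sketch, and assuming the three header types meant here are the ones defined by the CGI interface (Content-type, Location, and Status), each could be produced as shown below. The subroutine names and the example URL are made up for illustration.

```perl
#!/usr/local/bin/perl
# Sketch of the three CGI header types; helper names are hypothetical.

# Content-type: the program returns a document of the given MIME type.
sub content_header {
    my ($type) = @_;
    return "Content-type: $type\n\n";
}

# Location: the server should return (or redirect to) another document.
sub location_header {
    my ($url) = @_;
    return "Location: $url\n\n";
}

# Status: an explicit HTTP status line; more header lines may follow it,
# so no blank line is added here.
sub status_header {
    my ($code, $reason) = @_;
    return "Status: $code $reason\n";
}

# A normal reply: the header, a blank line, then the document.
print content_header("text/html");
print "<html><body>Hello from a CGI program</body></html>\n";

# Alternatives (use one header style per reply):
#   print location_header("http://www.some.box/moved.html");
#   print status_header(404, "Not Found"), content_header("text/html");
```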
On the web server where you are doing CGI programming, put the HTML pages with forms in a directory somewhere under the server's "document root" and the CGI program somewhere under the server's "cgi-bin". The CGI program must have its permissions set so that it can be executed by the server. Furthermore, if the CGI program reads or writes other files, then the server must have permission to do so.
You can first debug your Perl program by executing it directly in the cgi-bin directory and providing test input in a file:
prog.pl < input.data
When a CGI program crashes, an error should show up in the server's error_log file.
Your Perl script is run with the cgi-bin directory in which it resides as its current directory, so the path names in any file system accesses that you program in Perl are evaluated relative to that directory. Suppose that you have the file system structure in the example below; then to open file1.txt or file2.txt from prog.pl, use:
open (FILE1, "file1.txt"); and open (FILE2, "../../htdoc/njm/file2.txt");
Many servers, including NCSA, use an Access Control File (ACF) to configure basic authorization for access to all web documents in a directory.
The global ACF is named access.conf in the server's configuration directory.
Any directory in the server's document space can have a local ACF named .htaccess. Here is the format of an NCSA ACF:
An example of a .htaccess file
The password file can be created with a program called htpasswd (which creates a file called .htpasswd) that is distributed with the server. The password file should not be kept in a publicly accessible directory.
A web server can return a sequence of replies to the web browser by running a Non-Parsed Header (NPH) CGI script, usually of the form:
It also uses a special MIME type, which allows each reply to replace the previous one on the same browser page.
The main part of the document is a container, which has boundary strings between the individual entities, each starting with 2 dashes. The final boundary string also ends with 2 dashes to terminate the entire container.
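The container layout can be sketched as a small Perl subroutine. The MIME type multipart/x-mixed-replace is assumed here to be the one used for server push; the boundary string, the subroutine name build_multipart, and the use of text/plain parts are arbitrary choices for the example.

```perl
#!/usr/local/bin/perl
# Sketch of a multipart container; build_multipart is a hypothetical helper.

sub build_multipart {
    my ($boundary, @parts) = @_;
    # Container header naming the boundary string.
    my $stream = "Content-type: multipart/x-mixed-replace;boundary=$boundary\n\n";
    foreach my $part (@parts) {
        $stream .= "--$boundary\n";                 # boundary starts with 2 dashes
        $stream .= "Content-type: text/plain\n\n";  # each entity has its own header
        $stream .= "$part\n";
    }
    $stream .= "--$boundary--\n";                   # final boundary also ends with 2 dashes
    return $stream;
}

print build_multipart("ThisRandomString", "first reply", "second reply");
```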
Suppose we have a set of gif files that make up an animation sequence: m01.gif, m02.gif, . . . The Perl program reads the list of file names from a text file and sends them as parts in a multipart MIME stream.
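A rough sketch of such a program follows. It is not the course's original script: the list file name frames.txt, the boundary string, the one-second delay, and the subroutine names are all assumptions made for illustration, and the script is assumed to be installed as an NPH script so that it writes its own HTTP status line.

```perl
#!/usr/local/bin/perl
# Sketch of a server-push animation script; names and values are hypothetical.
$| = 1;                          # send each frame as soon as it is printed

my $boundary = "EndOfFrame";     # arbitrary boundary string
my $list     = "frames.txt";     # assumed file: one gif name per line

# Read the frame file names, skipping blank lines.
sub read_frame_names {
    my ($list_file) = @_;
    my @names;
    open(my $fh, "<", $list_file) or return ();
    while (my $line = <$fh>) {
        chomp $line;
        push @names, $line if $line ne "";
    }
    close($fh);
    return @names;
}

# Send one gif as a part of the multipart stream; returns 0 if unreadable.
sub send_frame {
    my ($file, $boundary) = @_;
    open(my $gif, "<", $file) or return 0;
    binmode($gif);
    binmode(STDOUT);
    print STDOUT "--$boundary\n";
    print STDOUT "Content-type: image/gif\n\n";
    my $buffer;
    while (read($gif, $buffer, 4096)) {
        print STDOUT $buffer;
    }
    close($gif);
    print STDOUT "\n";
    return 1;
}

print "HTTP/1.0 200 OK\n";
print "Content-type: multipart/x-mixed-replace;boundary=$boundary\n\n";
foreach my $frame (read_frame_names($list)) {
    send_frame($frame, $boundary);
    sleep(1);                    # assumed delay between frames
}
print "--$boundary--\n";
```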
The GIF89a specification allows a GIF file to have multiple blocks of data, each with a different image. The images can be set to display one after another or as overlays that partially replace sections of the preceding image. They can be set to have time delays and also to loop.
Multiple-Block GIFs can be created by a program called Construction Set, from Alchemy Mindworks:
The web server can send an HTTP response with a header of type Refresh. This causes the browser to request a new page automatically after some time.
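A CGI program could ask for this behavior along the following lines. The sketch assumes the common "Refresh: seconds; URL=address" form of the header; the subroutine name refresh_header, the 10-second interval, and the URL are illustrative only.

```perl
#!/usr/local/bin/perl
# Sketch of Client Pull; refresh_header is a hypothetical helper.

# Build a Refresh header line; without a URL the same page is reloaded.
sub refresh_header {
    my ($seconds, $url) = @_;
    my $header = "Refresh: $seconds";
    $header .= "; URL=$url" if defined $url;
    return "$header\n";
}

print refresh_header(10, "http://www.some.box/name.pl");
print "Content-type: text/html\n\n";
print "<html><body>This page asks to be replaced in 10 seconds.</body></html>\n";
```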
In Server Push, the HTTP connection is kept open between the browser and the server for the duration of the responses. In Client Pull, the connection is closed and reopened each time the browser sends another request for a refresh. Thus Server Push is best for small files that are sent at short intervals, like animations of small images, and Client Pull is best for larger or longer-interval transmissions, such as stock ticker updates.