WWW: Beyond the Basics

12. Common Gateway Interface

12.2 Creating CGI Applications

There are many things that one must consider when designing CGI applications (CGI-apps). This section covers the basic requirements of such applications, including the input and output needed to make a CGI-app work, and how side effects can be used to advantage. Section 12.3 discusses some more general issues that are important to effective CGI-app creation.

The output from a CGI application contains two sections: the header and the body. The header is always the first output that a CGI-app generates. A blank line immediately follows the header information, and the body follows this blank line. Generally, the header includes information about the data contained in the body.

12.2.1. Header Output

Usually, a CGI application doesn't need to produce much header information. When a WWW server returns a (static or dynamic) object to a client, it includes information about the object in the header, such as the time the object was last modified (see Object Header lines in HTTP). The server provides this information to the client whether or not the page is created by a CGI-app; if it is, the server merges this information with the header information that the CGI-app produces. There are three main pieces of information that a CGI application can include in its header:

12.2.2. Body

The body of the CGI-app output follows the blank line which separates it from the header. The body contains all of the data which is to be displayed by the WWW client. If the content type specified by the header is text/html, then the body should contain the HTML code that the CGI application generates. If a CGI-app generates a GIF image, then the body of the output should contain the bytes that make up a valid GIF image.
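The header, blank line, and body layout described above can be sketched in a few lines of Python. This is only an illustration: the function name build_response and the sample page are hypothetical, and the only header line shown is the content type mentioned in the text.

```python
import sys


def build_response(body, content_type="text/html"):
    """Return complete CGI output: header, blank line, then body."""
    header = "Content-type: " + content_type + "\n"
    # The empty line ("\n") after the header marks the start of the body.
    return header + "\n" + body


def main(out=sys.stdout):
    # Hypothetical page content, used only for illustration.
    page = "<html><body><h1>Hello from a CGI-app</h1></body></html>\n"
    out.write(build_response(page))


if __name__ == "__main__":
    main()
```

A CGI-app that generated a GIF image would follow the same pattern, but with a content type of image/gif and the raw image bytes written as the body.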

12.2.3. Handling Input

When a client asks a server to launch a CGI application, it can give the server some input to provide to the CGI-app. There are three important ways to give input from a client to a CGI application:
  1. Through a query string
  2. Through the command line
  3. Through standard input

12.2.3.1. Query String

A WWW client can send what's called a query to a CGI application by appending a '?' followed by a query string to the URL for the CGI application. This query string is composed of name-value pairs in the form of name=value, where name is the name of a variable and value is the value assigned to it. Each name-value pair in the query string is separated by a '&'. The query string is sent to the CGI application through the environment variable QUERY_STRING.
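As a rough sketch of how a CGI-app might dissect a query string, the Python fragment below splits the contents of QUERY_STRING into name-value pairs. The function names are hypothetical, and the values returned are still URL encoded.

```python
import os


def split_query(query_string):
    """Split 'a=1&b=2' into a list of (name, value) pairs, still URL encoded."""
    pairs = []
    for item in query_string.split("&"):
        if not item:
            continue  # skip empty items, such as those left by a trailing '&'
        # partition() splits on the first '=' only; a missing '=' gives value "".
        name, _, value = item.partition("=")
        pairs.append((name, value))
    return pairs


def query_from_environment(environ=os.environ):
    """Read the query string the server placed in QUERY_STRING."""
    return split_query(environ.get("QUERY_STRING", ""))
```

For example, split_query("name=Pat&food=chicken") returns the pairs ("name", "Pat") and ("food", "chicken").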

The most common way to send a query string to a CGI application is by setting up an HTML form that uses the HTTP GET method (see Predefined Methods). Each item in the form that can take on a value is given a unique name. When a user fills out the form, the user's WWW client sends back the names of the items and the values the user gave them, in the form of a query string. The CGI application to which the client sends this information can then dissect the query string to find the information provided by the client.

Since the query is appended to the URL for the CGI application, it is "URL encoded," and must be decoded. Some characters have special purposes in a URL, such as the colon (':') and forward slash ('/'). Others, such as spaces, are not allowed to appear in URLs at all. For this reason, URLs are encoded in a special way: each space is replaced by a plus sign ('+'), and each special character is replaced by %xx, where xx is the hexadecimal representation of that character's ASCII value. All URL encoded strings must be decoded before they can be used properly.
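The decoding rules just described can be written out by hand, as in the following Python sketch. The function name url_decode is hypothetical, and the sketch handles only the ASCII case discussed here; it leaves a malformed '%' sequence untouched rather than rejecting it.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def url_decode(text):
    """Turn '+' into a space and %xx into the character with that ASCII value."""
    result = []
    i = 0
    while i < len(text):
        ch = text[i]
        hex_pair = text[i + 1:i + 3]  # the two characters after a possible '%'
        if ch == "+":
            result.append(" ")
            i += 1
        elif ch == "%" and len(hex_pair) == 2 and all(c in HEX_DIGITS for c in hex_pair):
            result.append(chr(int(hex_pair, 16)))
            i += 3
        else:
            result.append(ch)  # ordinary character, or a malformed '%' kept as-is
            i += 1
    return "".join(result)
```

Decoding is best applied to each name and each value after the query string has been split on '&' and '=', since an encoded '%26' or '%3D' inside a value would otherwise be mistaken for a separator. In practice, the Python standard library function urllib.parse.unquote_plus performs the same kind of decoding.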

12.2.3.2. Command Line Parameters

Passing parameters to a program on the command line is an easy way to provide information to an application. This method, however, is rarely used for CGI applications. It is useful, though, when the author of a CGI application wants to pass a single parameter to the application without having to parse a query string. This method was originally designed for use with the ISINDEX tag (see Document Structure), but it may be used in other ways.

Let's say that there is a server called "www.nowhere.com", and on that server in a directory called "cgi-bin" exists a CGI application called "foo.cgi". This application may be accessed through the following URL:

http://www.nowhere.com/cgi-bin/foo.cgi

If a query sent to this CGI application has no equals sign ('=') in its query string, then the query string is passed to the CGI application through the command line. If an equals sign is present in the query string after the question mark ('?') (see above, section 12.2.3.1), then the entire query string is provided through the QUERY_STRING environment variable. So, if the following URL is accessed:

http://www.nowhere.com/cgi-bin/foo.cgi?chicken

then the string "chicken" would be the first parameter passed to foo.cgi when the server started its execution. If the following URL is accessed:

http://www.nowhere.com/cgi-bin/foo.cgi?yummy=chicken

then the string "yummy=chicken" would be placed in the QUERY_STRING environment variable; foo.cgi should ignore anything in the list of command line parameters passed to it upon execution.
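The rule illustrated by these two URLs might be applied inside a CGI-app as in the Python sketch below. The function name read_input and its return format are hypothetical; the argument list and environment are passed in so that the logic is easy to try out.

```python
import os
import sys


def read_input(argv, environ):
    """Decide whether input arrived as a query string or on the command line.

    Returns ("query", string) when the query string contains '=',
    and ("argv", list_of_parameters) otherwise.
    """
    query = environ.get("QUERY_STRING", "")
    if "=" in query:
        # Name-value pairs: ignore the command line entirely.
        return ("query", query)
    # No '=': the parameters follow the program name in argv.
    return ("argv", argv[1:])


if __name__ == "__main__":
    kind, value = read_input(sys.argv, os.environ)
```

With this sketch, a request for foo.cgi?chicken would yield ("argv", ["chicken"]), while foo.cgi?yummy=chicken would yield ("query", "yummy=chicken").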

12.2.3.3. Standard Input

Information is given to a CGI application through standard input only when the CGI application is accessed with the POST or PUT method (see Predefined Methods). The body of the POST or PUT request (sent by the client to the server) is used as the standard input to the CGI application. The information in the body transferred by the POST method is URL encoded.

For example, if a user fills out a World Wide Web form, the WWW client can send the user's form data to a CGI-app using the POST method. The CGI application which receives this data will have the form information provided to it through standard input. In fact, the format of the information will be the same as if it were provided in the QUERY_STRING environment variable via the GET method.
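A CGI-app might read its standard input along the lines of the Python sketch below. Note the assumptions: the text above does not say how the application learns the size of the request body or which method was used, so this sketch relies on the CONTENT_LENGTH and REQUEST_METHOD environment variables defined by the CGI interface. The function name read_post_body and the simulated request are hypothetical.

```python
import io


def read_post_body(environ, stream):
    """Read CONTENT_LENGTH bytes of request body from a binary stream."""
    try:
        length = int(environ.get("CONTENT_LENGTH", "0"))
    except ValueError:
        length = 0  # treat a malformed length as "no body"
    if environ.get("REQUEST_METHOD") not in ("POST", "PUT") or length <= 0:
        return ""
    # URL encoded form data is plain ASCII, so decode it as such.
    return stream.read(length).decode("ascii", errors="replace")


# Simulated request, standing in for what a server would provide.
body = b"yummy=chicken"
fake_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
print(read_post_body(fake_env, io.BytesIO(body)))  # prints: yummy=chicken
```

In a real CGI-app the call would be read_post_body(os.environ, sys.stdin.buffer), after importing os and sys. The string returned has the same name=value&name=value format as a query string, so it can be split and decoded in the same way.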

12.2.4. Side Effects

In programming contexts, "a side effect of a function ... occurs when a function changes either one of its parameters or a global variable." (Sebesta) In some cases, such a modification may not be a desirable occurrence. There are many situations, however, where global information may need to be modified, especially on a World Wide Web server. In fact, such "side effects" may not be unwanted at all -- they may be the main purpose of a CGI application.

Whenever a World Wide Web server uses the Common Gateway Interface to allow people to access the contents of a database, the server makes that database information global. In one sense, a CGI application is essentially a function; it takes input and produces output. If a CGI application that is acting as a gateway to a database modifies the data within that database, then this is technically a side effect of the CGI application; it is acting as a function that has modified global variables. Whenever another user accesses the modified information, that user will see the changes made by the earlier run of the CGI application. This is the intended function of the CGI-app, but it is technically a side effect.

Most World Wide Web servers are capable of serving more than one request at a time; that is, a server can operate with some concurrency. This can cause problems when designing CGI applications that modify a global information space: two copies of a CGI-app running at the same time may each read the same data, and the change written by one may then be silently overwritten by the other. (For more information on concurrency issues, consult an Operating Systems text, such as Deitel.)
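One common way to guard against such problems is to take an exclusive lock on the shared data for as long as it is being read and modified. The Python sketch below does this for a simple counter kept in a file. It is only an illustration under stated assumptions: the function name increment_counter is hypothetical, the fcntl module is available only on Unix-like systems, and the lock is advisory, so it protects the file only from other programs that also request the lock.

```python
import fcntl
import os


def increment_counter(path):
    """Add one to the integer stored in path while holding an exclusive lock."""
    # Create the file if it does not exist yet, without truncating it.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # wait until no other process holds the lock
        text = f.read().strip()
        count = int(text) if text else 0
        count += 1
        f.seek(0)
        f.truncate()
        f.write(str(count))
        f.flush()
        # Closing the file at the end of the with block releases the lock.
    return count
```

Because each copy of the CGI-app must wait for the lock before reading the old value, two simultaneous requests cannot both read the same count and lose one of the updates.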

12.2.5. Making CGI Applications Accessible

When a CGI application is ready to be made available to users, it must be stored in a location that can be accessed by the World Wide Web server running on the host machine. This way, when a user tries to access the URL that points to that CGI-app, the server will have the ability to execute it. The server must, however, know that the file to which the user's URL points is a CGI application and not a static document, such as a text or HTML file. There are usually two ways of accomplishing this:


Copyright © 1996 J. Patrick Van Metre, All Rights Reserved

J. Patrick Van Metre <vanmetre@csgrad.cs.vt.edu>
Last modified: Sat Oct 26 13:26:04 1996