CGI basics

The Common Gateway Interface (CGI) is an interface between your Netscape FastTrack or Enterprise server and the programs you write. CGI lets those programs process HTML forms or other data coming from clients and then send a response back to the client. The response can be an HTML document, a GIF file, a video clip, or any other data the client browser can display. This makes your Web pages interactive.

The word gateway can be used in two ways. The original meaning describes a connection to other services through the HTTP server (some of these other services might be Archie/Prospero, WAIS, and so on). Currently, you can't access these services directly through the World Wide Web, but you can reach them through a CGI program and an HTTP server.

Another interpretation defines gateway as a connection between programs and your HTTP server. This is the definition used for CGI. Like a gate in a fence between two fields, CGI rests between your program and the server.

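The rest of this chapter describes:

  • The CGI request process in detail
  • Security concerns about CGI
  • Accessing CGI programs through URLs
  • Accepting user input from URLs and other sources
  • Data the server sends to the CGI program
  • Sending output from CGI programs
  • Configuring your server to use CGI programs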

The CGI request process in detail

This section provides an overview of how CGI fits into the interaction of client software like Netscape Navigator and HTTP servers like the Netscape FastTrack and Enterprise servers. When a client requests a document from a server, the server finds the file and sends it to the client (this is a simplification of the process). However, if a client requests a CGI program, the server simply acts as a middle-man between the client and the CGI program.

How CGI programs relate to HTTP clients and servers

The following steps are a very simplified overview of what happens when a client requests a CGI process (in-depth details appear in the following sections). Figure 2.1 is a high-level graphical overview of this process.

  1. The client (Netscape Navigator) sends a request to the server for a document. If it can, the server responds to the request directly by sending the document.
  2. If the server determines the request isn't for a document it can simply deliver, the server creates a CGI process.
  3. The CGI process turns the request information into environment variables. Next, it establishes a current working directory for the child process. Finally, it establishes pipes (data pathways) between the server and an external CGI program.
  4. After the external CGI program processes the request, it uses the data pathway to send a response back to the server, which in turn sends the response back to the client.

In reality, the behavior of the CGI process is a bit more complicated. The details of each stage are described in the next few sections.

The client sends the request

The navigation software (the client) sends a request to the Netscape server running on your machine. The request might be for a document, or it might be something else, like the contents of an HTML form. If the request is for a regular document (such as an HTML document or a .GIF file), the server sends that document directly back to the navigator.

If the request is data intended for an external application, then the server needs to use CGI to run that application. For example, the client's request might be to search a database. The CGI application takes the search criteria, searches the database, then sends the results back to the client.

In fact, there is a pool of server threads running on your machine. The number of threads in the pool is determined by directives in the magnus.conf file. All of these threads await requests from the port (also specified in the magnus.conf file). Only one of the threads gains ownership of any specific arriving request, as shown in Figure 2.2.

The server thread is one of the threads your server already has allocated with the MaxProcs directive. You should make sure you have enough threads dedicated to the server to ensure your machine doesn't run out of threads.

The server creates the CGI process

When a server receives a request that must be handled by an external application (a CGI request), that server creates a copy of itself (in Unix terms, it forks a process). This second process is called the CGI process because it is the process in which the CGI program will run. The CGI process has all the same communication pathways that the server process has. Its only purpose is to set up communications between the CGI program and the server.

Be careful about creating CGI programs that can run indefinitely, because as long as the program runs, your server has one less thread for servicing requests.

Because it is a copy of the server, the CGI process has access to information about the CGI request. For example, the CGI process knows

The server assigns variables and opens data paths

The CGI process takes the data the server has about the current request and puts it into environment variables. The CGI process enables the client to pass data to the CGI program as standard input.

The CGI process also creates data pathways between your CGI program and the server. This means the server can send to the program any encoded form data the client submitted, and the external CGI program can send a reply back to the client via the server.

The CGI process then executes the CGI program; that is, in Unix terms, it execs the CGI program, so that the code and data in this process are now that of the CGI program, not the copy of the server program.
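
The sequence described above (request data into environment variables, a working directory, pipes, then executing the program) can be imitated from the server's side with a short Python sketch. This is only an illustration under stated assumptions: it targets a Unix-like system, the names run_cgi and CGI_PROGRAM are hypothetical, and the tiny inline program stands in for a real CGI program. It is not how the Netscape server is implemented. It also copies the whole parent environment for simplicity, whereas a real server passes only selected variables.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a CGI program: it reads the request method from
# the environment and the form data from standard input, then writes a CGI
# header, a blank line, and a body to standard output.
CGI_PROGRAM = r'''
import os, sys
length = int(os.environ.get("CONTENT_LENGTH", "0"))
data = sys.stdin.read(length)
sys.stdout.write("Content-type: text/plain\n\n")
sys.stdout.write(os.environ["REQUEST_METHOD"] + " " + data)
'''


def run_cgi(program_source, request_env, body, working_dir):
    """Run the program as a child process and return its raw output."""
    env = dict(os.environ)       # simplification: start from our own environment
    env.update(request_env)      # request information becomes environment variables
    result = subprocess.run(
        [sys.executable, "-c", program_source],
        env=env,
        cwd=working_dir,         # current working directory for the child
        input=body,              # pipe from server to program (standard input)
        stdout=subprocess.PIPE,  # pipe from program to server (standard output)
        check=True,
    )
    return result.stdout


output = run_cgi(
    CGI_PROGRAM,
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "9"},
    b"name=test",
    os.getcwd(),
)
# Split the program's reply at the blank line that ends the CGI header.
header, _, document = output.partition(b"\n\n")
```

After the call, header holds the CGI header the program printed and document holds the data that a server would forward to the client.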

The CGI program sends the response to the client

The CGI program takes the data that the server provides through environment variables, standard input, or command-line arguments. It processes the data, contacts any external services it needs to (such as Archie or WAIS), and then sends a response to the server by way of the data pathways using standard output. The server then takes the program's response, prepends any necessary protocol headers to the output, and sends it back to the client software. Your program can output any type of data it needs to, including HTML, GIFs, or JPEGs.

You can also use nonparsed headers to bypass the server and send data directly to the client browser. See "Bypassing the server: nonparsed headers" on page 43 for more information.

Security concerns about CGI

CGI is a powerful tool for interacting with users, but it can also be a potential security problem. You should follow these guidelines when implementing CGI on your server:

Accessing CGI programs through URLs

When responding to a request from a client, the server must figure out if it can handle the request itself or if it must create a CGI process. To determine whether a URL that the client requests refers to a CGI program, the Netscape server can use two methods. You can configure the server to use either method, but neither is active by default.

If you try to access a CGI program and you get a server error accompanied by a message such as

no way to service request for /some/stuff.cgi

in your error log, or if the text of your program appears in the client's window, then you have not properly activated CGI in that directory. See page 48 for instructions on configuring your server to use CGI.

Embedding information in URLs

The method of CGI activation you choose determines only part of the URL used to access your program. URLs to CGI programs can be split into three different parts, shown here in brackets:

[virtual path][extra path information]?[query string]

Extra path information is useful in two ways. First, you can use it to convey constant information to your CGI programs independent of the information the client sends.

Second, you can use it to access the server's virtual-to-physical path translation. You can send a virtual path as extra path information, so your CGI program can use the path information to access a file on your server machine. This means you don't have to embed file paths inside your CGI URLs. The server provides the physical path name corresponding to that virtual path in the environment variables by using path translation. For example, if your document root directory is /usr/docs/ and a request comes in with extra path information like /test/test.html, the server translates the path to /usr/docs/test/test.html and stores that translated path in an environment variable.
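
The path translation in the example above can be sketched in a few lines of Python. The function name translate_path is hypothetical, and the normalization step is an assumption added for safety; the source does not say how the server treats ".." segments.

```python
import posixpath


def translate_path(document_root, extra_path):
    """Map a virtual path to a physical path under the document root."""
    # Normalize the virtual path first so ".." segments cannot climb
    # above the document root (an assumption, not documented behavior).
    normalized = posixpath.normpath("/" + extra_path.lstrip("/"))
    return posixpath.join(document_root, normalized.lstrip("/"))


print(translate_path("/usr/docs/", "/test/test.html"))
# prints /usr/docs/test/test.html
```

In a real CGI program you would not compute this yourself; the server supplies the translated path in an environment variable.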

  • The query string is an optional part of the URL. It can be explicitly entered in your hypertext anchor, it can come from a user typing into a search dialog box for an HTML document with the ISINDEX tag, or it can come from HTML forms (see page 32).

    Table 2.1 on page 32 shows some sample CGI URLs. In all of these examples, the CGI program name is /misc/search.cgi, and the document root is the physical directory /Netscape/docs.

    | CGI URL | Description |
    |---|---|
    | http://mysrvr/misc/search.cgi | This is a simple URL to the CGI program located at /Netscape/docs/misc/search.cgi, with no extra path information and no search query. |
    | http://mysrvr/misc/search.cgi/type=minimal | This example embeds the extra path information /type=minimal in the URL. It serves the same purpose as sending this information as a query string, although the program receives it as extra path information. |
    | http://mysrvr/misc/search.cgi/misc/movies.mdb | This URL specifies the extra path information /misc/movies.mdb. The search script could then use the path information to find and search the database called /Netscape/docs/misc/movies.mdb. |
    | http://mysrvr/misc/search.cgi?netscape | This could be a hyperlink that automatically searches for the word "netscape" without the user having to type anything. |
    | http://mysrvr/misc/search.cgi/misc/movies.mdb?netscape | This hyperlink would automatically return the results of a search for "netscape" in the movies database /Netscape/docs/misc/movies.mdb. |
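
To make the three URL parts concrete, here is a minimal Python sketch that splits the sample URLs into virtual path, extra path information, and query string. The function split_cgi_url is hypothetical. It assumes the caller already knows the program's virtual path; a real server works that out from its own configuration.

```python
from urllib.parse import urlsplit


def split_cgi_url(url, script_name):
    """Return (virtual path, extra path information, query string)."""
    parts = urlsplit(url)
    if not parts.path.startswith(script_name):
        raise ValueError("URL does not refer to " + script_name)
    # Whatever follows the program name in the path is extra path information.
    extra_path = parts.path[len(script_name):]
    return script_name, extra_path, parts.query


print(split_cgi_url("http://mysrvr/misc/search.cgi/misc/movies.mdb?netscape",
                    "/misc/search.cgi"))
# prints ('/misc/search.cgi', '/misc/movies.mdb', 'netscape')
```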

    Accepting user input from URLs and other sources

    There are three types of information the client can send to a CGI program:

  • HTML form data
  • Input to an ISINDEX search dialog
  • ISMAP or imagemap coordinates

    When the data is typed as form text, the data is encoded using URL encoding. In URL encoding, there are two rules:

  • Spaces are converted to + signs.
  • Any character can be "escaped" by changing it into a sequence of the form %xx, where xx is the character's two-digit hexadecimal code. For example, %21 is an exclamation point. See the Installation and Reference Guide for a list of escaped characters.
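
As a quick illustration of the two rules, the Python standard library functions quote_plus and unquote_plus apply this encoding and decoding. This is a sketch for experimentation only; the sample strings are made up.

```python
from urllib.parse import quote_plus, unquote_plus

# Encoding: the space becomes "+", and "!" is escaped as %21.
encoded = quote_plus("two words!")

# Decoding reverses both rules.
decoded = unquote_plus("two+words%21")

print(encoded)  # prints two+words%21
print(decoded)  # prints two words!
```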

    HTML form data

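    If the data comes from an HTML form, then the location of the data depends on the METHOD attribute specified with the FORM tag in your HTML document: with the GET method, the data arrives in the QUERY_STRING environment variable; with the POST method, it arrives on standard input.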

    Wherever the data comes from, it appears in this form:

     	name1=value1&name2=value2 ... &nameN=valueN
    
    If there are any equals signs (=) or ampersands (&) in the data, they're encoded using URL encoding. This avoids ambiguity when your program translates the form data. To properly decode this data, your CGI program should first split it into name-value pairs (eliminating the ampersands), then split each pair into a name and a value, and then apply URL decoding to the name portion and to the value portion of the pair.

    The Perl and C examples later in the chapter have functions that do this decoding.
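
The decoding order described above (split at ampersands, split each pair at the equals sign, then URL-decode each half) can be sketched in Python as follows. The function name parse_form_data is hypothetical, and this is a minimal illustration, not the chapter's own example code.

```python
from urllib.parse import unquote_plus


def parse_form_data(encoded):
    """Decode name1=value1&name2=value2 into a list of (name, value) pairs."""
    pairs = []
    for item in encoded.split("&"):          # step 1: split into pairs
        if not item:
            continue                         # skip empty segments such as a trailing "&"
        name, _, value = item.partition("=") # step 2: split the pair
        # Step 3: decode only after splitting, so an encoded "=" or "&"
        # inside a value is not mistaken for a separator.
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


print(parse_form_data("name=John+Doe&note=a%3Db%26c"))
# prints [('name', 'John Doe'), ('note', 'a=b&c')]
```

A list of pairs is used rather than a dictionary so that repeated names and the order of arrival are both preserved. Python's standard library offers urllib.parse.parse_qsl for the same job.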

    When a user submits a form, your CGI program usually receives the name-value pairs in the order in which the form items appear in the document. However, you should not depend on this behavior. The various form elements have their own rules for determining what value is associated with the name they are given:

  • All of the text input areas use the user's typed input as the value.
  • Radio buttons use the value of whichever button is selected.
  • If checkboxes are unchecked, either they use an empty value string, or their name doesn't appear in the encoded form data at all.
  • Hidden form elements can send constant or per-document data to your script without the user's knowledge or intervention.

    The example program documented on page 57 demonstrates decoding form data.

    Input to an ISINDEX search dialog

    When the data comes from a search dialog resulting from an ISINDEX tag, the escaped character decoding is done by the CGI process. Your CGI program receives this information fully translated as command-line arguments. This is handy if you want to avoid the hassle of performing this translation yourself.

    ISMAP or imagemaps

    In the case of clickable images, the data you receive from the client software is sent as a query string that takes the form xx,yy where xx and yy are the coordinates of where the user clicked the image. Coordinates are measured as the number of pixels from the upper left corner of the image the user clicked. This lets your CGI program respond differently depending on where the user clicked the image.
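
A CGI program might turn the xx,yy query string into numbers like this. The sketch is in Python; parse_click and region_for_click are hypothetical names, and the left/right rule is an invented example of responding differently by position.

```python
def parse_click(query_string):
    """Turn an imagemap query string of the form xx,yy into two integers."""
    x_text, separator, y_text = query_string.partition(",")
    if not separator:
        raise ValueError("expected a query string of the form xx,yy")
    return int(x_text), int(y_text)


def region_for_click(query_string, image_width):
    """Hypothetical rule: report which half of the image was clicked."""
    x, _ = parse_click(query_string)
    return "left" if x < image_width // 2 else "right"


print(parse_click("120,45"))            # prints (120, 45)
print(region_for_click("120,45", 400))  # prints left
```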

    Data the server sends to the CGI program

    The server sends data to the CGI program in three ways: environment variables, standard input, and command-line arguments.

    Environment variables are the most common method used to pass data about a request to your CGI program. This data comes from the server software itself, from the network socket connecting the client to the server, and from the URL that was used to access the CGI program (for example, when using the GET method, the data is sent to the QUERY_STRING variable).

    This section describes how to obtain the information stored in environment variables, standard input, and command-line arguments.

    Accessing environment variables

    There are several ways programs can access environment variables. The method you use depends on the programming or scripting language you use. Environment variables are identified by character strings and have character string values. This section lists some examples of the different methods programming languages use to access environment variables.

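    Using Java

    In Java, a standalone program (not an applet running in the browser) can read environment variables with the System.getenv method. The System.getProperty method returns such a value only if it was passed to the Java runtime as a system property when the program was launched.

    String rhost = System.getenv("REMOTE_HOST");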
    

    Using C or C++

    In C or C++, you can use the getenv library call to access the environment variables.

    #include <stdlib.h>
    ...
    /* getenv returns NULL if the variable is not set */
    char *rhost = getenv("REMOTE_HOST");
    

    Using Perl

    In Perl, environment variables are accessed through the %ENV associative array (hash).

    $rhost = $ENV{'REMOTE_HOST'};
    

    Using the Bourne shell

    In the Bourne shell (/bin/sh), environment variables are accessed just like normal shell variables.

    RHOST=$REMOTE_HOST
    

    Using the C shell

    The C shell is similar to the Bourne shell, but it needs the keyword set before any variable assignment.

    set RHOST = $REMOTE_HOST
    

    Environment variables and their formats

    This section lists all of the environment variables and their formats. The Netscape servers pass only these environment variables to the CGI, in order to save storage space and improve security.

    SERVER_SOFTWARE
    This environment variable contains the name and version of the software that your program is running under.

    FORMAT
    name/version

    EXAMPLE
    Netscape-FastTrack/2.0

    Netscape-Enterprise/2.0

    SERVER_NAME
    This environment variable contains the domain name or IP address of the server machine.

    FORMAT
    A fully-qualified domain name or IP address

    EXAMPLE
    198.93.93.10 or www.netscape.com

    SERVER_URL
    This environment variable contains the URL that individuals should use to access this server. This variable is not supported by revision 1.1 of the CGI interface. It is only available using Netscape server software.

    FORMAT
    protocol://hostname[:port]

    If the server is running on a protocol's default port, the :port section won't be present.

    EXAMPLE
    http://www.netscape.com:8081

    GATEWAY_INTERFACE
    This environment variable contains the revision of the CGI specification supported by the server software.

    FORMAT
    CGI/n.n

    n.n is the numerical revision.

    EXAMPLE
    CGI/1.1

    SERVER_PROTOCOL
    This environment variable contains the name and revision of the protocol being used by the client and server.

    FORMAT
    name/version

    EXAMPLE
    HTTP/1.0

    SERVER_PORT
    This environment variable contains the number of the port to which this request was sent.

    FORMAT
    A number between 1 and 65,535

    EXAMPLE
    80

    REQUEST_METHOD
    This environment variable contains the name of the method (defined in the HTTP protocol) to be used when accessing URLs on the server. When a hyperlink is clicked, the GET method is used.

    When a form is submitted, the method used is determined by the METHOD attribute to the FORM tag. (See page 33 for more information.)

    CGI programs do not have to deal with the HEAD method directly and can treat it just like the GET method.

    FORMAT
    method

    EXAMPLES
    GET, HEAD, POST

    PATH_INFO
    This environment variable contains the extra path information that the server derives from the URL that was used to access the CGI program.

    FORMAT
    /dir1/dir2...

    EXAMPLE
    /html/graphics/doc1.gif

    PATH_TRANSLATED
    This environment variable contains the actual fully-qualified file name that was translated from the URL. The Netscape server distinguishes between path names used in URLs, and file system path names. It is often useful to make your PATH_INFO a virtual path so that the server provides a physical path name in this variable. This way, you can avoid giving file system path names to remote client software.

    FORMAT
    /dir1/dir2...

    EXAMPLE
    /Netscape/docs/doc1.html

    SCRIPT_NAME
    This environment variable contains the name of the virtual path to your program. If your program needs to refer the remote client back to itself, or needs to construct anchors in HTML referring to itself, you can use this variable.

    FORMAT
    /dir1/dir2/progname

    EXAMPLES
    /orders/tickets.cgi, /cgi-bin/order-tickets

    QUERY_STRING
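    This environment variable contains information passed from an HTML page to your script in these three instances:

  • A hyperlink whose URL includes a query string
  • A form submitted with the GET method
  • A query entered in a document that contains the ISINDEX tag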

    The information sent by the server is encoded using the URL encoding rules described earlier.

    FORMAT
    varies

    EXAMPLE
    With links, you can get play or view returned from the following:

    I want to <A HREF="multimed.cgi?play">play some music!</A>
    I want to <A HREF="multimed.cgi?view">view a graphic!</A>
    
    From a form, you might get button1=on&button2=off, or from a document that contains the ISINDEX tag you might get two+words.

    REMOTE_HOST
    This environment variable contains the host name of the remote client software. This is a fully-qualified domain name such as www.netscape.com (instead of just www).

    FORMAT
    machine.subdomain.domain

    EXAMPLE
    www.netscape.com

    If no host name information is available, your script should rely on the REMOTE_ADDR variable instead.
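
A script can apply this fallback in one line. The Python sketch below uses a hypothetical helper named client_identity. It accepts any dictionary-like environment so that it can be tried without a server.

```python
import os


def client_identity(environ=None):
    """Prefer the client's host name; fall back to its IP address."""
    environ = os.environ if environ is None else environ
    # An unset or empty REMOTE_HOST both fall through to REMOTE_ADDR.
    return environ.get("REMOTE_HOST") or environ.get("REMOTE_ADDR", "")


print(client_identity({"REMOTE_ADDR": "198.93.93.10"}))
# prints 198.93.93.10
```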

    REMOTE_ADDR
    This environment variable contains the IP address of the remote host. This information is guaranteed to be present.

    FORMAT
    n.n.n.n

    Each n is a number between 0 and 255.

    EXAMPLE
    198.93.93.10

    AUTH_TYPE
    If the CGI script is protected by any type of authorization, this environment variable contains the authorization type. The Netscape server supports HTTP basic access authorization; it will probably support additional types of authorization in the future as standards develop.

    EXAMPLE
    basic

    REMOTE_USER
    This environment variable is set only if HTTP access authorization has been activated for this script's URL. It contains the name of the local HTTP user under which the person using the navigation software authenticated. Note that this is not a way to determine the user name of just any person accessing your program.

    EXAMPLE
    jdoe

    CONTENT_TYPE
    If a form is submitted with the POST method, then this environment variable contains the type of data being sent by the client. Note that while clients currently only send application/x-www-form-urlencoded, this variable can contain any MIME type. Future systems might use this method to transfer data back and forth.

    FORMAT
    type/subtype

    CONTENT_LENGTH
    This environment variable contains the number of bytes being sent by the client. You use this variable to determine the number of bytes you need to read.

    EXAMPLE
    64

    Secure server variable formats

    The Netscape FastTrack and Enterprise servers define the following additional environment variables to describe the security status of the server and the client.

    HTTPS
    This environment variable has the value on or off, depending on whether security is active on the server.

    HTTPS_KEYSIZE
    When security is on, this environment variable contains the number of bits in the session key used to encrypt the session.

    HTTPS_SECRETKEYSIZE
    This environment variable contains the number of bits used to generate the server's private key.

    HTTP headers as environment variables

    In addition to the other environment variables, if the client sends any HTTP headers along with its request, then these headers are also placed into the environment. The only exception is the Authorization header.

    The names of the environment variables are the names of the HTTP headers, and are prefixed with HTTP_. All letters in the name are changed to upper case. All hyphens are changed to underscore characters. Examples of these HTTP headers are described in the following sections.
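
The naming rule can be written out directly. In this Python sketch, header_to_env_name and headers_to_environment are hypothetical helpers that mirror the rule as stated here; they are not server code.

```python
def header_to_env_name(header_name):
    """Prefix with HTTP_, upper-case the letters, turn hyphens into underscores."""
    return "HTTP_" + header_name.upper().replace("-", "_")


def headers_to_environment(headers):
    """Build environment entries for every header except Authorization."""
    env = {}
    for name, value in headers.items():
        if name.lower() == "authorization":
            continue  # the one header that is not placed into the environment
        env[header_to_env_name(name)] = value
    return env


print(header_to_env_name("If-Modified-Since"))
# prints HTTP_IF_MODIFIED_SINCE
```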

    HTTP_ACCEPT
    This environment variable enumerates the types of data the client can accept. For most client software, this protocol feature has become a bit convoluted and the information isn't always useful.

    FORMAT
    type/subtype[, type/subtype]...

    EXAMPLE
    image/gif, image/jpeg, */*

    HTTP_USER_AGENT
    This environment variable identifies the browser software being used to access your program.

    FORMAT
    varies

    EXAMPLE
    Mozilla/1.1N (Windows)

    HTTP_IF_MODIFIED_SINCE
    This environment variable contains a date, expressed in Greenwich Mean Time (GMT). It enables a client to request that the program's response be sent only if the data has been modified since the given date.

    FORMAT
    Weekday, dd-mon-yy hh:mm:ss GMT

    The Weekday specifies the full name of the day, such as Thursday or Friday. The dd specifies the number of the day of the month. The mon specifies the three-letter abbreviation of the month. The yy specifies the year within the century. The hh:mm:ss gives the time in 24-hour format.

    EXAMPLE
    Saturday, 12-Nov-94 14:05:51 GMT
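
A date in this format can be parsed with Python's datetime.strptime, as in the sketch below. It assumes English day and month names (the default locale). Note that Python interprets two-digit years 69 to 99 as 1969 to 1999 and 00 to 68 as 2000 to 2068. The names parse_http_date and should_send are hypothetical.

```python
from datetime import datetime, timezone

# Weekday, dd-mon-yy hh:mm:ss GMT
DATE_FORMAT = "%A, %d-%b-%y %H:%M:%S GMT"


def parse_http_date(text):
    """Parse the date format described above into a UTC datetime."""
    return datetime.strptime(text, DATE_FORMAT).replace(tzinfo=timezone.utc)


def should_send(last_modified, header_value):
    """Send the response only if the data changed after the given date."""
    return last_modified > parse_http_date(header_value)


print(parse_http_date("Saturday, 12-Nov-94 14:05:51 GMT").isoformat())
# prints 1994-11-12T14:05:51+00:00
```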

    Using standard input to get information

    HTML forms that use the POST method send their encoded information using the standard input. You can use the CONTENT_LENGTH environment variable to determine the number of bytes to read in.

    Although the only content-type currently used is application/x-www-form-urlencoded, a custom navigator can use the standard input to send other types of data to your programs. For example, a user could send scientific data to your CGI program, which then saves the data to a file on your server machine.
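
Reading POST data then comes down to reading exactly CONTENT_LENGTH bytes. The Python sketch below uses a hypothetical function, read_request_body, and an in-memory stream in place of the real standard input so that it can be run on its own.

```python
import io


def read_request_body(environ, stream):
    """Read exactly CONTENT_LENGTH bytes of request data from the stream."""
    length_text = environ.get("CONTENT_LENGTH", "")
    # Treat a missing or non-numeric length as "no data".
    length = int(length_text) if length_text.isdigit() else 0
    return stream.read(length)


# Stand-in for standard input: only the first 10 bytes belong to the request.
sample_input = io.BytesIO(b"name=value&bytes-beyond-the-length")
print(read_request_body({"CONTENT_LENGTH": "10"}, sample_input))
# prints b'name=value'
```

In an actual CGI program you would pass os.environ and sys.stdin.buffer instead of the sample values.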

    Sending command-line arguments to your CGI program

    Command-line arguments are only used with ISINDEX queries. The query string is split and placed on the command line only if the string doesn't contain any nonencoded = characters.

    Command-line arguments are not used with an HTML form or any undefined query types.

    The server searches the query information for a nonencoded = character to determine whether to use the command line. This means client applications must encode the = sign in ISINDEX queries if the CGI program is to use the command-line arguments.

    If the server finds no nonencoded = characters, it decodes the query information by first splitting it at the plus signs in the URL. It then performs additional character decoding on each piece before passing the results as command-line arguments. For example, the information name+date is split and sent to the CGI program as the command-line arguments name and date.

    Note:
    If the server cannot send the string due to internal limitations (such as exec() or /bin/sh command-line restrictions) the server sends no command-line information and provides the nondecoded query information in the environment variable QUERY_STRING.
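
The decision the server makes can be modeled in a few lines of Python. This is a sketch of the rule as described in this section, not the server's code. The function name isindex_arguments is hypothetical, and returning None for an empty query is an assumption.

```python
from urllib.parse import unquote


def isindex_arguments(query_string):
    """Return the argument list for an ISINDEX query, or None otherwise."""
    if not query_string or "=" in query_string:
        # A nonencoded "=" marks form-style data: no command-line arguments.
        return None
    # Split at the plus signs first, then decode each piece separately.
    return [unquote(word) for word in query_string.split("+")]


print(isindex_arguments("name+date"))   # prints ['name', 'date']
print(isindex_arguments("name=value"))  # prints None
```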

    Sending output from CGI programs

    Usually the CGI program's output goes to the client through the server-spawned CGI process. This means your CGI program doesn't have to worry about protocol-specific headers and such. This can make your CGI programs simpler and guarantees that they can take advantage of newer revisions of the protocol with little or no changes to your program code.

    However, if you know enough about the HTTP protocol to code the protocol and send it directly back to the client, you can use the nonparsed header feature.

    Bypassing the server: nonparsed headers

    Nonparsed headers let you bypass the server and send your CGI output directly to the client. As of CGI version 1.1, the only reason to use this feature is that your program outputs a very large amount of data and you want to sidestep the server's buffering of your output (not all HTTP servers buffer the output).

    To activate the feature, make sure your CGI program names start with the characters "nph-". The server makes the standard output a direct copy of the socket to the client. Once you activate this feature, your CGI program is responsible for any protocol-related response headers or messages, including the following:

    HTTP/1.0 200 OK 
    Date: DayOfWeek, DD-Mon-YY nn:nn:nn GMT 
    Server: Netscape-FastTrack/2.0 
    MIME-version: 1.0
    
    Although the nonparsed header feature gives you full control over the response, in most cases you should avoid it. When you don't use nonparsed headers, your CGI program must print a valid CGI header on the standard output in order for the server to accept the response and send it to the client. In the Netscape server, the standard output and the standard error file streams are directed to the same place: back into the server.

    This means that errors your program generates or system utilities your program calls can interfere with your header. Similarly, if your program is abnormally terminated (through a bug or some other disaster), the server will send a server error to the client and describe the error in the server's error log file. Because of this, you should print your header as early as possible in your program.

    CGI generic headers

    When the CGI program sends its output to the server, it begins the text with generic headers. A CGI header consists of several text lines in this format:

    name: value
    
    The end of the header is signalled by a single blank line. After the blank line, the server stops parsing your program's header and sends the rest of your data untouched to the client. This means that your program can output any type of data it needs to, including HTML, GIFs, or JPEGs.
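
The layout of header lines, blank line, then data can be assembled as in this Python sketch. The helper build_cgi_response is hypothetical and only shows the byte layout.

```python
def build_cgi_response(headers, body):
    """Join name: value header lines, a blank line, and the untouched body."""
    lines = [f"{name}: {value}" for name, value in headers]
    head = "\n".join(lines) + "\n\n"  # the blank line ends the header
    return head.encode("ascii") + body


response = build_cgi_response(
    [("Content-type", "text/html")],
    b"<title>My little document</title>\n",
)
print(response)
# prints b'Content-type: text/html\n\n<title>My little document</title>\n'
```

A CGI program would write the result to standard output, for example with sys.stdout.buffer.write(response), and should do so as early as possible.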

    Each name-value pair is an HTTP protocol header. You can output any header you want and the server sends it to the client. However, if the server detects odd header lines, the server logs a 500 error and doesn't return any data to the client.

    Some of the commonly used HTTP headers are described here. When you output any of these headers, the server doesn't alter their values or their output.

    Content-length
    This header reports the length of the data in bytes, not including the header.

    Content-type
    This header reports the type of data your program is returning. This is a valid MIME type in the format type/subtype. This header should always be sent from any CGI program.

    EXAMPLE
    text/html, text/plain, image/gif, image/jpeg, audio/basic
    
    Expires
    This header reports the date on which this file should be considered outdated by the client. The date format is the same as the format for the if-modified-since header.

    EXAMPLE
    Saturday, 12-Nov-94 14:05:51 GMT
    
    Content-encoding
    This header reports that the data is the given content-type, but it is compressed. Current values that can be used are

    CGI-specific headers

    The headers listed in this section are special to CGI and make the server act on your program's behalf.

    Location
    The Location header reports the location of a new file for the server or client to retrieve. This header must be in one of two forms: a complete URL or a virtual path.

    If the value is a full URL, such as http://mysrvr/misc/file.html then the server redirects the client to the new URL (this is transparent to the user). The client then acts as if it had originally requested that redirected URL, so all relative links in the document of the URL are resolved from the directory specified in that URL. For example, if the URL points to an HTML file with relative paths to graphic files, the client locates the files from the directory where the HTML file is.

    Note:
    You do not necessarily have to redirect to an HTTP URL; you can redirect to a Gopher, news, FTP, or any other valid URL.

    If the location is a virtual path, such as

    /misc/file.html
    
    then the server restarts the request using the virtual path, for example

    http://mysrvr/misc/file.html.
    
    However, the client isn't informed of the new location, so any relative links in the document are resolved from the directory of your CGI program, not of the document that is actually being returned. This means images referenced in the document might not work because the client might be looking for them in the wrong directory.

    Status
    The Status header carries the status code that is returned for every HTTP request. The status code tells the client whether the request succeeded. If the request was unsuccessful, one of several error codes tells the client what happened. If the request was successful, the status code either reports success or asks the client for further action. If no Status header line is provided in the CGI header, the default is assumed to be 200 OK unless a Location header with a full URL is present. If such a Location header is present, the default is 302 Found.

    The status line has the form nnn reason, where nnn is the three-digit code for the request, and reason is a short string describing the result. The following codes and reasons are currently recognized by Netscape Navigator.

    | Status code | Description |
    |---|---|
    | 200 OK | The request finished normally. |
    | 204 No response | The request was understood and processed, but there is no new document to be loaded by the client. |
    | 302 Found | The client should look for data at a new URL, given by a Location header. |
    | 304 Use local copy | The client sent a request with an If-Modified-Since header, and the requested data hasn't been modified since the given date. |
    | 400 Bad request | The request had illegal or unintelligible HTTP inside. |
    | 401 Unauthorized | If access authorization is enabled, the request could not be fulfilled because the user did not provide the proper authorization to access the area. With current authorization schemes, a WWW-Authenticate header must be provided to give the client instructions on how to complete the request with the proper authorization. |
    | 403 Forbidden | The client is not allowed to access what it requested. |
    | 404 Not found | The client asked for something the server couldn't find. |
    | 500 Server error | This is a catch-all error code that indicates something went wrong in the server or the CGI program, and the problem stopped the request from being completed. |
    | 501 Not implemented | The client asked the server to perform an action that the server knows about, but can't do. |
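
The default-status rule described for the Status header can be expressed as a small function. This Python sketch models the rule only. The name effective_status is hypothetical, and treating any value with a URL scheme as a "full URL" is an assumption.

```python
from urllib.parse import urlsplit


def effective_status(headers):
    """Return the status line implied by a CGI program's headers."""
    by_name = {name.lower(): value for name, value in headers.items()}
    if "status" in by_name:
        return by_name["status"]          # the program chose a status itself
    location = by_name.get("location", "")
    if urlsplit(location).scheme:         # full URL, not a virtual path
        return "302 Found"
    return "200 OK"


print(effective_status({"Location": "http://mysrvr/misc/file.html"}))
# prints 302 Found
```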

    Sample program output

    The following CGI program output sends an HTML document back to the client:

    Content-type: text/html
     
    <title>My little document</title>
    This is my own little document. Do you like it?

    Notice the blank line after the header. It tells the server where the header ends and where the document begins.
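
    To see why the blank line matters, consider how a reader of this output might separate the header from the document. The following Python sketch is an assumption about that step, not the server's actual implementation; the function name split_cgi_output is hypothetical, and a line containing only whitespace is treated as blank.

```python
def split_cgi_output(raw):
    """Split raw CGI output into (headers, body) at the first blank line."""
    lines = raw.splitlines(keepends=True)
    headers = {}
    for index, line in enumerate(lines):
        if line.strip() == "":
            # Everything after the blank line is the document itself.
            return headers, "".join(lines[index + 1:])
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    raise ValueError("no blank line found after the header")

sample = (
    "Content-type: text/html\n"
    "\n"
    "<title>My little document</title>\n"
    "This is my own little document. Do you like it?\n"
)
headers, body = split_cgi_output(sample)
```

    Without the blank line, the sketch cannot tell where the header stops, which is the same problem the server would face.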
    
    The following output instructs the client to retrieve a different URL. The small HTML fragment at the bottom allows any navigation software that doesn't support redirection to retrieve the given URL.

    Location: http://www.sample.org/abc/afile 
     
    This document can be accessed at the following <a href="http://www.sample.org/abc/afile">location</a>.
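
    A CGI program might produce redirect output like this with a small helper. The Python sketch below is one possible approach; the function name redirect_output is hypothetical, and escaping the URL before placing it in the link is an added precaution that the sample output does not show.

```python
import html
import sys

def redirect_output(url):
    """Build CGI output that redirects the client to url."""
    safe = html.escape(url, quote=True)
    return (
        f"Location: {url}\n"
        "\n"  # the blank line ends the header
        "This document can be accessed at the following "
        f'<a href="{safe}">location</a>.\n'
    )

if __name__ == "__main__":
    # A CGI program writes its output to standard output.
    sys.stdout.write(redirect_output("http://www.sample.org/abc/afile"))
```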
    

    Configuring your server to use CGI programs

    Before your server can use CGI programs, you must either specify that all files in designated directories are CGI programs or activate CGI as a MIME type.

    Specifying a CGI directory

    You can specify a directory that contains only CGI programs. You can also edit and remove current CGI directory mappings.

    CGI programs are an excellent way to extend the capabilities of your server. However, they can also be a security risk, so you might want to keep all CGI programs in one directory where you can monitor them.

    You can activate CGI as a file type for part of your server instead of selecting a specific directory. If you do this, any file with the .cgi extension is interpreted by the server as a CGI program. This lets you keep your CGI programs next to the HTML files they affect.

    Click the link (under the Programs tab) for specifying a CGI directory, then enter a URL prefix that you want to map to a CGI directory. The prefix is used in URLs to indicate the URL specifies a CGI program. For example, you can use the URL prefix cgi-bin, which means your URLs will be of the form http://www.acme.com/cgi-bin/program.

    Next, you specify the directory you want that URL prefix to map to. It should be a full system path.
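
    Conceptually, the mapping translates the part of the URL after the prefix into a file under the CGI directory. The Python sketch below illustrates the idea under stated assumptions: the function name map_cgi_url and the directory /opt/server/cgi are hypothetical, and the check that rejects paths climbing out of the directory is an added precaution, not documented server behavior.

```python
import posixpath

def map_cgi_url(url_path, prefix, directory):
    """Return the program path for url_path, or None if the prefix does not match.

    prefix    -- URL prefix such as "cgi-bin"
    directory -- full system path that the prefix maps to
    """
    wanted = "/" + prefix.strip("/") + "/"
    if not url_path.startswith(wanted):
        return None
    remainder = posixpath.normpath(url_path[len(wanted):].lstrip("/"))
    # Refuse empty names and paths that would leave the CGI directory.
    if remainder in ("", ".", "..") or remainder.startswith("../"):
        return None
    return posixpath.join(directory, remainder)
```

    For example, with the prefix cgi-bin mapped to /opt/server/cgi, the URL path /cgi-bin/program resolves to /opt/server/cgi/program.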

    Activating CGI as a file type for parts of the server

    Sometimes, it makes more sense to keep CGI programs near the HTML files that refer to them. You also might not want to keep the programs in a central directory (for example, if many people control their own directories in the document root).

    When CGI is active as a file type, any file with the .cgi extension is interpreted as a CGI program.

    To activate CGI, in the Server Manager, click the link to activate CGI as a file type. Use the buttons at the top of the page to select a resource (the entire server, a directory, or a wildcard pattern of directories--such as */user-cgi/*), and then check the option that activates the CGI file type.
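
    The combined effect of the resource pattern and the .cgi extension can be approximated in Python. This is only a sketch: the function name is_cgi_file is hypothetical, and Python's fnmatch rules are assumed to be close enough to the server's wildcard syntax for simple patterns such as */user-cgi/*, which may not hold for more complex patterns.

```python
from fnmatch import fnmatchcase

def is_cgi_file(path, resource_pattern="*"):
    """Return True if path ends in .cgi and falls under the selected resource."""
    return path.endswith(".cgi") and fnmatchcase(path, resource_pattern)
```

    Under these assumptions, /docs/home/user-cgi/form.cgi is treated as a CGI program for the pattern */user-cgi/*, while /docs/home/other/form.cgi is not.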

    Setting up a default query handler

    The HTML ISINDEX tag is an easy way for a client to send queries to the server. When the server receives a request from an ISINDEX tag, it sends that request to a CGI program that acts as a query handler. You can set a default query handler for part of the server.

    See "Sending command-line arguments to your CGI program" on page 42 for more information on ISINDEX.

    1. In the Server Manager, click the link to set a default query handler.
    2. Use the buttons at the top of the page to select a resource. If you choose a directory, then the default query handler you specify runs only when the server receives a URL for that directory or any file in that directory.
    3. Finally, enter the full path name for the CGI program you want to use as the default for the resource you chose.
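
    A query handler usually starts by recovering the search terms. By CGI convention, an ISINDEX query arrives as words separated by plus signs, with no unencoded equal sign in the query string. The Python sketch below shows one way to decode such a query; the function name isindex_terms is hypothetical.

```python
from urllib.parse import unquote

def isindex_terms(query_string):
    """Return the search terms of an ISINDEX-style query string.

    A query that contains an unencoded "=" is treated as form data rather
    than an ISINDEX query, so it yields no terms.
    """
    if not query_string or "=" in query_string:
        return []
    # Split on "+" first, then decode each term, so an encoded plus sign
    # (%2B) inside a term survives.
    return [unquote(term) for term in query_string.split("+")]
```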

    Customizing server-parsed HTML

    Normally, HTML documents are sent to the client exactly as they are stored on disk. However, sometimes you might find it useful to have the server parse these files and insert request-specific information or files into the document. You can do this through server-parsed HTML.

    To customize server-parsed HTML, follow these steps:

    1. In the Server Manager, click the link to the form for customizing server-parsed HTML. Use the buttons at the top of the form to select the directory where you want to set up server-parsed HTML.
    2. Choose whether you want to activate the exec command. The exec command lets an HTML file run an arbitrary program on the server (you might want to deactivate the exec command for security or performance reasons).
    3. Choose a method the server uses to determine which files should be parsed. The usual method is to use a file extension of .shtml for server-parsed HTML files. If you don't want to use a different file name extension, the server offers two other methods.

      The server can also parse only files with the Unix file permissions set so the execute bit is on. This is often unreliable because some documents have the execute bit set even though they aren't really executable.

      The server can also look at every HTML file on the server. This can be a large performance hit because the server must look at every single HTML file it sends back from the parsed directory. (If the directory isn't that large, this may not be that much of a problem.)
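
    The three methods can be compared side by side in a short Python sketch. The function name should_parse and the method labels "extension", "execute-bit", and "all-html" are hypothetical, and testing the owner's execute bit is an assumption about which permission bit counts.

```python
import os
import stat

def should_parse(path, method):
    """Decide whether the file at path should be parsed by the server."""
    if method == "extension":
        return path.endswith(".shtml")
    if method == "execute-bit":
        # Assumption: only the owner's execute bit is examined.
        mode = os.stat(path).st_mode
        return bool(mode & stat.S_IXUSR)
    if method == "all-html":
        # Every HTML file qualifies, which is why this method costs the most.
        return path.endswith(".html")
    raise ValueError("unknown method: %r" % method)
```

    The "extension" and "all-html" checks look only at the file name, while "execute-bit" needs a call to the file system for each request.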

    Server-parsed HTML commands

    The files the server parses should contain commands the server recognizes. The server replaces the commands with data defined by the commands and their related attributes.

    The format for the commands is:

    <!--#command attribute1 attribute2 ... -->
    
    Note that the command must be in lower case. The format for each attribute is the typical name-value pair:

    name="value"
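
    The command format is regular enough to recognize with two patterns. The Python sketch below is an illustration, not the server's parser; the names COMMAND_RE, ATTRIBUTE_RE, and parse_commands are hypothetical, and folding attribute names to lower case is an assumption.

```python
import re

# The command name must be lower case, so only a-z is accepted.
COMMAND_RE = re.compile(r"<!--#([a-z]+)\s*(.*?)\s*-->", re.DOTALL)
ATTRIBUTE_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_commands(text):
    """Return a list of (command, attributes) pairs found in text."""
    found = []
    for match in COMMAND_RE.finditer(text):
        # Assumption: attribute names are compared without regard to case.
        attributes = {name.lower(): value
                      for name, value in ATTRIBUTE_RE.findall(match.group(2))}
        found.append((match.group(1), attributes))
    return found
```

    A command written with an upper-case name, such as ECHO, does not match COMMAND_RE and is left in the document untouched.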

    The config command
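    The config command controls aspects of file parsing, such as the file size format (the sizefmt attribute) and the date format (the timefmt attribute).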

    The include command
    The include command inserts a text file into the parsed file (this can't be a CGI program). You can chain files by referencing another parsed file, which then includes another file, and so on. The user requesting the parsed document must also have access to the included file if your server uses access control for the directories those files appear in.

    The echo command
    The echo command sends the data in an environment variable or special variable. You use the attribute var to specify the variable to echo.

    Example
    <!--#echo VAR="DATE_GMT"-->
    
    The fsize command
    The fsize command sends the size of a file. The attributes are the same as those for the include command. The file size format is determined by the sizefmt attribute in the config command.

    Example
    <!--#fsize FILE="bottle.gif"-->
    
    The flastmod command
    The flastmod command prints the date a file was last modified. The attributes are the same as those for the include command. The date format is determined by the timefmt attribute in the config command.

    Example
    <!--#flastmod FILE="bottle.gif"-->
    
    The exec command
    The exec command runs a shell command or CGI program.

    Environment variables that affect parsing

    In addition to the normal set of environment variables, you can include the following variables in your parsed commands:

    The DOCUMENT_NAME environment variable contains the file name of the parsed file.

    The DOCUMENT_URI environment variable contains the virtual path to the parsed file (for example, /user/test.shtml).

    The QUERY_STRING_UNESCAPED environment variable contains the unescaped version of any search query the client sent, with all shell-special characters escaped with the \ character.

    The DATE_LOCAL environment variable contains the current date and local time.

    The DATE_GMT environment variable is similar to DATE_LOCAL but is expressed in Greenwich Mean Time (GMT).

    The LAST_MODIFIED environment variable contains the date the file was last modified.
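
    To make the relationship between these variables concrete, the Python sketch below computes most of them for a given file. It is an approximation only: the function name parse_variables is hypothetical, the date layout is an assumption (the server's default format may differ), and QUERY_STRING_UNESCAPED is omitted.

```python
import os
import time

def parse_variables(path, virtual_path, now=None):
    """Build the parse-time variables for the file at path."""
    # Assumed date layout; not taken from the server documentation.
    fmt = "%A, %d-%b-%Y %H:%M:%S %Z"
    now = time.time() if now is None else now
    modified = os.path.getmtime(path)
    return {
        "DOCUMENT_NAME": os.path.basename(path),
        "DOCUMENT_URI": virtual_path,
        "DATE_LOCAL": time.strftime(fmt, time.localtime(now)),
        "DATE_GMT": time.strftime(fmt, time.gmtime(now)),
        "LAST_MODIFIED": time.strftime(fmt, time.localtime(modified)),
    }
```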

    Adding signatures (trailers) to files

    You can append a custom signature, or "trailer," to documents in a certain part of the server without using server-parsed HTML. Server-parsed HTML can also add a trailer to an HTML file, but trailers avoid the overhead of parsing each file.

    In the Server Manager, click the link (under Content Management) to add customized signatures. Use the buttons at the top of the form to select the directories of files you want to add custom trailers to. The trailers are added when someone requests the file from your server.

    Next, choose what file type to add the trailer to. Type a wildcard pattern such as *.html.

    Choose what format the last modification date should have. You can choose from the list of formats given, or you can specify your own using the strftime format. See your system's documentation about the strftime function for details on that format.
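
    If you want to preview a custom strftime format before entering it, Python exposes the same format codes through time.strftime. The sketch below is a convenience for experimenting, not part of the server; the function name format_last_modified is hypothetical, and the time is rendered in GMT so the result does not depend on the local time zone.

```python
import time

def format_last_modified(mtime, fmt):
    """Format a modification time (seconds since the epoch) with a strftime format."""
    return time.strftime(fmt, time.gmtime(mtime))
```

    Month and weekday names produced by codes such as %B and %A depend on the system locale, so numeric codes such as %Y-%m-%d give the most predictable results.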

    Finally, you can type your text trailer using HTML tags and entity encoding. You can use up to 254 characters. Any existing trailer appears in the box.

    Note:
    Any entities you type in the trailer are decoded if you later edit the trailer. Be sure to re-encode the entities before submitting the form again! For example, if you use "&amp;" in your trailer, when you later edit the trailer you'll see simply "&". You need to change "&" to "&amp;" before submitting the form.
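
    The round trip described in this note can be reproduced with Python's html module. The trailer text below is a made-up example, and the sketch only mirrors the decoding and re-encoding steps; it does not interact with the server form.

```python
import html

# Trailer text as originally submitted (hypothetical example).
stored = "Questions &amp; comments"

# What the form shows when you edit the trailer later: entities are decoded.
shown_in_form = html.unescape(stored)

# What you need to submit again: the entities re-encoded.
resubmitted = html.escape(shown_in_form, quote=False)
```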

    The HTTP_FROM environment variable is not actually implemented. Apparently, few browsers or other clients support it.