The main purpose of a World Wide Web server is to provide many people with the ability to read certain documents. A server must be set up properly to avoid allowing others to read files on the host computer that were not meant to be accessible, such as system configuration or password files. If we assume that this is the case, then there are really only two ways to limit access to the public documents -- through server controls, and through CGI application-based authorization.
Most WWW servers are designed to allow server administrators to restrict access to the documents that the system provides. For example, a server may grant access only to client requests coming from within a certain internet domain. Or, a server may only grant access to users who authenticate themselves (see also Chapter 18). CGI authorization relies on some means of authentication, and is discussed in section 12.3.1.2.
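The domain restriction mentioned above can be illustrated with a short sketch. This is only an illustration in Python of the kind of check a server might perform, not the configuration mechanism of any particular server; the function name host_in_domain and the domain example.edu are hypothetical.

```python
def host_in_domain(client_host: str, allowed_domain: str) -> bool:
    """Return True if client_host is allowed_domain or a subdomain of it."""
    # Normalize case and drop a trailing dot from a fully qualified name.
    host = client_host.strip().lower().rstrip(".")
    domain = allowed_domain.strip().lower().strip(".")
    if not host or not domain:
        return False
    # Compare on a label boundary, so "evilexample.edu" does not
    # pass as a member of "example.edu".
    return host == domain or host.endswith("." + domain)
```

A call such as host_in_domain("www.example.edu", "example.edu") accepts the request, while a host name that merely ends in the same characters is refused.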
Some WWW servers allow clients to write directly to the document space that the server makes available. This allows a great deal of flexibility, yet provides the potential for some sticky security problems. Write access is provided by servers in the form of the PUT and DELETE methods of HTTP/1.1, and the multipart/form-data encoding type, which allows file uploading to take place within a WWW form (see Extensions to HTML 3.0).
It is up to the server to ensure that no one is allowed to maliciously alter the document space using the PUT and DELETE methods. But it is also up to the server administrator to ensure that any files uploaded to the host machine are appropriate and that they don't contain any viruses. The use of forms for file uploading requires the use of a CGI application to process the completed form that is POSTed to the server. This CGI-app can take the responsibility of placing any uploaded documents in a safe place until they can be reviewed, or can automatically perform virus checking on the documents. In this way, CGI-apps can provide some safeguards against attacks while still providing flexibility to users.
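The idea of holding uploaded files in a safe place until they are reviewed might look roughly like the following. This is a minimal sketch in Python under stated assumptions: the function name quarantine_upload and the "pending-" prefix are hypothetical, and the holding directory is assumed to lie outside the document space the server publishes. Virus checking is not shown.

```python
import os
import re
import tempfile


def quarantine_upload(data: bytes, client_filename: str, holding_dir: str) -> str:
    """Store uploaded bytes in holding_dir under a sanitized name; return the path."""
    # Keep only the final path component, whichever separator the client used,
    # so a name like "../../etc/passwd" cannot escape the holding directory.
    base = re.split(r"[\\/]", client_filename)[-1]
    # Replace anything outside a conservative character set.
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".") or "upload"
    os.makedirs(holding_dir, exist_ok=True)
    # mkstemp always creates a new, uniquely named file (owner-only
    # permissions on POSIX systems), so an upload never overwrites
    # an existing document.
    fd, path = tempfile.mkstemp(prefix="pending-", suffix="-" + safe, dir=holding_dir)
    with os.fdopen(fd, "wb") as out:
        out.write(data)
    return path
```

The client-supplied file name is kept only as a sanitized suffix for the reviewer's convenience; it plays no part in deciding where the file is written.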
The greatest potential for security problems occurs when a server allows users to execute programs on the host computer. Consider this: a person can't get a virus from reading a plain-text email message (reading such a message only involves viewing text characters), but if a person executes a program attached to an email message, and that program has a virus, then the person's system will probably become infected. A virus can only spread when its code is executed by the computer.
When a CGI application is used as a gateway to another application, the CGI-app will often make system calls -- that is, it will execute other programs on the host system. Frequently, the CGI-app will pass command-line parameters in these system calls, and the information provided in these parameters will come from a query string sent by a WWW client. If a user crafts a clever query string, and the CGI-app does not check the string before using it, the user can gain the ability to issue commands on the host system as if logged onto it. Sometimes, servers will execute CGI applications with administrative access privileges -- which may give the intruding user full control over the system!
A common illustration of this potential security hole is a "finger" gateway. The "finger" utility is a way of finding out a little bit of information about users of a system -- information that the users provide about themselves to the public. For example, if one were to issue the command "finger vanmetre@csgrad.cs.vt.edu" on a Unix or similar system, one could find out some more about the author of this section. Common "finger gateways" are CGI applications that accept a user name as a query string from a WWW client, perform a system call to the finger utility, and return the results to the client. So, if a client were to access a CGI application called finger.cgi with the URL
http://www.nowhere.com/cgi-bin/finger.cgi?vanmetre
the CGI-app would issue the system command
finger vanmetre
and return the results to the WWW client. Now suppose a user submitted the URL
http://www.nowhere.com/cgi-bin/finger.cgi?vanmetre;rm+-rf+%2F
After the CGI decodes this, it will issue the system command
finger vanmetre;rm -rf /
Because the command shell treats the semicolon as a command separator, it runs the two commands one after the other. If the CGI-app was executed with full system privileges, this will not only return the finger information for user vanmetre, but it will also erase the contents of the hard drives on the system! Even if the CGI-app was not executed with full privileges, a user could create a query string that could mail off a password file, open a remote telnet connection, or do something else that would aid a break-in to the system.
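One common defense against the attack described above is to check the decoded query string against a strict pattern and to run the utility without going through a command shell at all. The following is a minimal sketch in Python, not the code of any actual finger gateway; the names USERNAME_RE, finger_argv and run_finger, and the user-name policy itself, are assumptions made for illustration.

```python
import re
import subprocess
from urllib.parse import unquote_plus

# Hypothetical policy: a user name is 1-32 letters, digits, '_', '.' or '-',
# and may not begin with '-' (which finger could read as an option).
USERNAME_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,31}")


def finger_argv(query_string: str) -> list:
    """Decode a query string and return an argument list for finger.

    Raises ValueError if the decoded value is not a plain user name.
    """
    user = unquote_plus(query_string)  # '+' becomes a space, %2F becomes '/'
    if not USERNAME_RE.fullmatch(user):
        raise ValueError("rejected query: %r" % user)
    return ["finger", user]


def run_finger(query_string: str) -> str:
    """Run finger directly, with no shell, and return its output."""
    # Passing a list (and not using shell=True) means the user name reaches
    # finger as a single argument; ';' is never interpreted as a separator.
    result = subprocess.run(finger_argv(query_string),
                            capture_output=True, text=True, timeout=10)
    return result.stdout
```

With this check, the query string vanmetre;rm+-rf+%2F decodes to "vanmetre;rm -rf /" and is rejected before any program is started, while the query string vanmetre yields the argument list ["finger", "vanmetre"]. Either measure alone would stop this particular attack; using both guards against a mistake in one of them.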
Often, in the context of the World Wide Web, authentication is provided by a simple name/password combination. When a user enters a protected site, the server asks the user's WWW client for authentication. The WWW client then asks the user for a name/password combination, which it then sends back to the server. If the combination is a valid one, the user is authenticated, and can then be authorized to access the server's documents. Each time the authenticated user makes a request for a protected document, the name/password combination is resubmitted by the client to the server.
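The text does not name a specific scheme, but the exchange it describes matches HTTP Basic authentication, in which the client joins the name and password with a colon, base64-encodes the result, and sends it in an Authorization header with each request. The following is a minimal sketch in Python assuming that scheme; the function names are illustrative.

```python
import base64


def basic_auth_header(name: str, password: str) -> str:
    """Build the Authorization header value a client sends with each request."""
    token = base64.b64encode(f"{name}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


def parse_basic_auth(header_value: str):
    """Recover (name, password) on the server side; raise ValueError if malformed."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    # Split on the first colon only: the password may itself contain colons.
    name, sep, password = decoded.partition(":")
    if not sep:
        raise ValueError("malformed credential")
    return name, password
```

Note that base64 is an encoding, not encryption: anyone who can observe the request can recover the password, so under this scheme the combination is only as private as the connection that carries it.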
Servers can also provide authorization protection for specific documents or for sets of documents within the server document space. For example, all of the contents of a directory on a server can be assigned a list of authorized users, who must properly be authenticated before they can access the directory; all users who fail authentication will be denied access.
To provide a bit more flexibility with document control, authorization can be left up to a CGI application. When an authenticated user accesses a CGI application, an environment variable is set which provides the name of the user to the CGI-app; the CGI-app can decide whether or not to provide access to the user. In addition, a CGI application that generates complex WWW pages can present different pages to different users, as long as they are authenticated. A WWW site can keep a list of preferences or configuration details for each user and apply these preferences to the page generation methods of the site.
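In the CGI interface, the environment variable that carries the authenticated user's name is REMOTE_USER. A CGI application that makes its own authorization decision and tailors its output to the user might be sketched as follows. This is an illustration in Python only; the PREFERENCES table, its contents, and the function name page_for are hypothetical.

```python
import os
import sys

# Hypothetical per-user preferences kept by the site. A user with no
# entry here is treated as not authorized for this page.
PREFERENCES = {
    "vanmetre": {"layout": "plain"},
}


def page_for(environ) -> str:
    """Return a CGI response (headers plus body) for the authenticated user."""
    # The server sets REMOTE_USER only after it has authenticated the client.
    user = environ.get("REMOTE_USER", "")
    if user not in PREFERENCES:
        return ("Status: 403 Forbidden\r\n"
                "Content-Type: text/plain\r\n\r\n"
                "Access denied.\n")
    layout = PREFERENCES[user]["layout"]
    return ("Content-Type: text/plain\r\n\r\n"
            "Welcome, %s. Layout: %s\n" % (user, layout))


if __name__ == "__main__":
    sys.stdout.write(page_for(os.environ))
```

Because the function takes the environment as a parameter, it can be exercised with an ordinary dictionary, for example page_for({"REMOTE_USER": "vanmetre"}), without a running server.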
12.3.2. Language
There are many points to consider when choosing a language to use when developing CGI applications. Here is a list of some of them:
There are many interpreted languages, such as PERL, that have become popular among CGI-app developers, and there are many compiled languages, including C, that are popular, too. There are advantages and disadvantages to both; these must be considered before developing CGI applications.
Programs written in compiled languages generally run faster than equivalent programs written in interpreted languages. However, interpreted languages can be more flexible, and can make prototyping much easier. If any changes need to be made to a CGI application, they can be made to an interpreted version much more quickly than to a compiled version, because the compiled version must be recompiled before the changes can take effect. Once a CGI application is in place, though, a user may not see much of a difference between an interpreted CGI-app and a compiled equivalent.
When developers choose a language to use for a CGI application, they must always think of the long term. They must consider how the site will operate in several years to make sure that the CGI application will not only meet the needs of the site today, but also in the future. The World Wide Web is changing extremely rapidly, so it is often difficult to keep the future in perspective.
The needs of a WWW site are always changing. As server technology advances, it is likely that a site will undergo some hardware and/or software modifications, while the contents of the site must stay the same. To make such transitions as smooth as possible, one must consider the portability and modifiability of CGI applications. If a language is chosen for a site that uses proprietary hardware or software, then the CGI-apps developed with that language may not be usable should that hardware or software change. One should choose a stable, flexible language (such as C) that will persist while needs and resources may fluctuate and evolve.
Quite often, a WWW site needs to be constructed very quickly. If CGI applications are needed at that site, then it may be necessary to choose a language that allows rapid development, yet may be weaker in other areas. CGI creation often takes the greatest proportion of development time, so any steps to minimize that development time may weigh heavily when a language is to be chosen. For a site that will not exist for very long (for example, a site that provides up-to-date Olympics results), development time is much more important than preparing for the future of the site. Interpreted languages such as PERL are often used for rapid development.
When a CGI-app is providing an interface to a second application, the choice of language is often dictated by that second application. If a developer is creating a CGI-based Web interface to a proprietary database, then the CGI applications will have to access the database's programming interface, which may be very limiting. Many CGI applications are forced to make system calls (run programs from the command line) to interface with other applications; in this case, the secondary application may not have as much influence on language choice.
Performance is a particular concern on WWW servers that provide many dynamically produced documents. Serving one client request through a CGI application takes much more time and effort than simply reading a file and returning it. When running a CGI application, the system needs to start a new process, execute the code (which could be complex), collect the results, and then return them. If the CGI-app is a gateway to a second application, say, a database, then the second application may spend considerable time providing the data needed by the CGI-app. While CGIs increase the flexibility of a server, they can significantly decrease the server's performance.
Copyright © 1996 J. Patrick Van Metre, All Rights Reserved
J. Patrick Van Metre
<vanmetre@csgrad.cs.vt.edu>
Last modified: Sat Oct 26 13:26:04 1996