Given by Nancy J. McCracken at ECS400 Senior Undergraduate Course on Spring Semester 1997. Foils prepared 11 May 1997
Summary of Material
MIME stands for Multipurpose Internet Mail Extensions and is the developing standard for the contents of all messages passed over the Internet. |
HTTP is the Hypertext Transfer Protocol, the protocol underlying the World Wide Web: it transmits multimedia documents across the Internet. HTTPD is the daemon running the HTTP Web server. |
URL stands for Uniform Resource Locator and is the universal addressing scheme for all documents (multimedia) on the WWW. |
CGI is the Common Gateway Interface and is the scheme to interface other programs and systems to the HTTP Web protocol, using the same data protocols as the HTTP clients and servers. |
References:
|
Wojtek Furmanski, Nancy McCracken |
NPAC |
Syracuse University |
111 College Place |
Syracuse NY 13244-4100 |
January 31, 1996 |
updated September 1996 |
Some material presented here comes from Internet documents. Here is a summary of various document formats you may find. |
Internet Drafts
|
Internet Memos
|
Internet Standards
|
Here are a few sample Internet documents relevant for this part of the course. |
RFC-822: Crocker, D., "Standard for the Format of ARPA Internet Text Messages", STD 11, RFC 822, UDEL, August 1982. |
RFC-1036: M. Horton and R. Adams, "Standard for Interchange of USENET Messages", RFC 1036, AT&T, December 1987. |
RFC-1521: Borenstein, N. and Freed, N., "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1521, Bellcore, September 1993. |
RFC-1524: Borenstein, N. "A User Agent Configuration Mechanism for Multimedia Mail Format Information", RFC 1524, Bellcore, September 1993. |
Internet Draft: Tim Berners-Lee, "Basic HTTP", CERN, 1992/3. |
We all know and use it, but here is a formal specification. |
Each message is a stream of 7-bit ASCII characters consisting of a header and an optional body, separated from the header by a blank line. |
The header consists of a set of entries, one per line, each a colon-separated key:value pair. |
Key contains no spaces or tabs and cannot exceed 63 chars. |
Body is a fully unstructured sequence of ASCII chars. |
There is a finite set of standard keys and an extension mechanism via the "X"-prefix. The standard set (as used by MH) is: |
Date Bcc Resent-Date Resent-Fcc |
From Fcc Resent-From resent- |
Sender Message-ID Resent-To Message-Id |
To Subject Resent-cc Forwarded |
cc In-Reply-To Resent-Bcc Replied |
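The header and body layout described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: one entry per line, as stated on the foil, with no handling of folded (continued) header lines. The function name `parse_message` and the sample addresses are made up for the example.

```python
def parse_message(raw: str):
    """Split an RFC 822 style message into a list of (key, value) pairs and a body."""
    # The first blank line separates the header from the (optional) body.
    head, _, body = raw.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        # Split on the first colon only, so values may contain colons (e.g. times).
        key, _, value = line.partition(":")
        headers.append((key.strip(), value.strip()))
    return headers, body


sample = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: Lunch at 12:30\n"
    "X-Mood: hungry\n"          # extension key using the "X" prefix
    "\n"
    "Noon at the usual place?\n"
)

headers, body = parse_message(sample)
extension_keys = [k for k, _ in headers if k.upper().startswith("X-")]
```

Here `headers` holds the four key:value pairs in order, `extension_keys` picks out the non-standard entries, and `body` is the unstructured text after the blank line.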
Goals
|
History:
|
Retain RFC-822 header+body format |
Add new header fields |
Allow for multipart multimedia bodies |
Include media type and encoding information in new header fields such as: Content-Type, Content-Description, Content-Transfer-Encoding, Content-ID |
Retain 7-bit ASCII for all valid encoding schemes |
Implement multi-component bodies via a special 'magic type' Content-Type: multipart |
Provide natural support required for large multimedia message files such as remote references (similar to hyperlinks but NOT the URL model) and file fragmentation by further specification of the 'multipart' type. |
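As a rough illustration of these design points, Python's standard `email` package can build a two-part message and show the new header fields it generates. The addresses, subject, and file name are invented, and the attachment bytes are a placeholder rather than a valid image.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.org"
msg["Subject"] = "Report with picture"

# First component: plain text body.
msg.set_content("See the attached image.")

# Second component: binary data, which turns the message into a multipart one.
placeholder_bytes = b"GIF89a placeholder, not a real image"
msg.add_attachment(placeholder_bytes, maintype="image", subtype="gif",
                   filename="plot.gif")

top_level_type = msg.get_content_type()
parts = list(msg.iter_parts())
part_types = [p.get_content_type() for p in parts]
attachment_encoding = parts[1]["Content-Transfer-Encoding"]

# The serialized form keeps the RFC 822 header+body layout.
wire_text = msg.as_string()
```

The top-level Content-Type becomes a multipart type, each component carries its own Content-Type, and the binary component is given a transfer encoding so that the message on the wire stays 7-bit ASCII.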
Two level hierarchical typing scheme adopted of the form: basetype/subtype |
Seven base media types are defined. This minimal set is enforced, i.e. all extensions must pass the whole ID->RFC->STD process. |
Allow for less restrictive subtyping of the base types, for example:
|
Some standard subtypes are specified and many more are expected. New subtypes must be registered with the IANA (Internet Assigned Numbers Authority). |
Private experimental subtypes prefixed with "X-" may be used freely and without registration. |
Seven base types are: text, image, audio, video, multipart, message, application. |
text
|
image
|
audio
|
video
|
multipart
|
message
|
application
|
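The two-level basetype/subtype scheme can be checked mechanically. The following is a minimal sketch; the function name `classify` is an assumption for illustration, and the "X-" test is applied case-insensitively here by lowercasing the whole type first.

```python
# The seven base types listed on the foil.
BASE_TYPES = {"text", "image", "audio", "video",
              "multipart", "message", "application"}


def classify(media_type: str):
    """Return (basetype, subtype, is_experimental) for a 'basetype/subtype' string."""
    base, sep, sub = media_type.strip().lower().partition("/")
    if not sep or not sub:
        raise ValueError(f"not of the form basetype/subtype: {media_type!r}")
    if base not in BASE_TYPES:
        raise ValueError(f"unknown base type: {base!r}")
    # Private experimental subtypes carry the "X-" prefix and need no registration.
    return base, sub, sub.startswith("x-")


examples = [classify(t) for t in ("text/html", "image/x-rgb", "application/postscript")]
```

A base type outside the fixed set is rejected, while any subtype is accepted and merely flagged when it is experimental.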
3+ public domain implementations available |
The most popular, Metamail (Bellcore), is a MIME transition tool (backward compatible with most current mailing systems) |
Some existing implementations: PMDF, IMAP2, C-Client, Mail-Manager, MH-MIME, Z-Mail, Andrew, Pine, Elm, Unix System 5 4.3, STI Document Browser, Servicemail, MIXMH |
MIME support in progress by key vendors on most platforms |
ATOMICMAIL (also called "computational mail" or "active mail") project at Bellcore towards interactive extensions of MIME |
HTTP provides an upper level to the Internet, that is, it is built on top of a back-bone network with all the packets flowing from client to server and vice versa using the standard TCP/IP protocol. |
It uses MIME formats and concepts, but does not fully conform to MIME as the WWW is not a mail system. |
The HTTP protocol is compatible with other network services such as FTP (File Transfer Protocol) and NNTP (Network News Transfer Protocol).
|
By convention the HTTP service is assigned to port 80. Its connections are much shorter-lived than those of the other services. |
The HTTP daemon is the server which responds to the Internet service requests on standard port 80 (or on another custom port). The server program is available from NCSA and is easily installed by editing a set of configuration files which give directory locations for documents, cgi scripts, error messages and icons, and which allows for options regarding path names, domain access, and so on. |
A URL has the standard form
|
HTML hyperlinks typically use the service http for linking to other documents and media files. Some other internet services can also be used such as
|
In this way, a Web server can provide other Internet services through the browser interface. |
The machine is an Internet address and can either be a symbolic name provided by the Domain Name Service (DNS) or the IP numbers. |
If the port is not specified, it defaults to 80. |
The file.file-extension is given by any Unix path name starting from the directory known to the server as "document root". Which path names are valid is one of the options of the server - whether "public_html" is automatically put into the path name and whether paths starting with "~username" are allowed. |
In the http service, the file-extension is used to tell the browser what helper application to use to view the file. Typical file extensions are html, gif, jpeg, mpeg, au, ram, etc. |
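The parts of a URL named above (service, machine, port, path, file extension) can be pulled apart with Python's standard `urllib.parse` module. This is a sketch only: the function name `describe` and the example URLs are invented, and the default of 80 is applied regardless of service, which is only right for http.

```python
from urllib.parse import urlsplit


def describe(url: str):
    """Return (service, machine, port, path, extension) for a URL."""
    parts = urlsplit(url)
    # If no port is given, fall back to the HTTP default of 80.
    port = parts.port if parts.port is not None else 80
    # The extension is whatever follows the last dot of the final path component.
    name = parts.path.rsplit("/", 1)[-1]
    extension = name.rsplit(".", 1)[1] if "." in name else ""
    return parts.scheme, parts.hostname, port, parts.path, extension


default_port = describe("http://www.example.org/~user/index.html")
custom_port = describe("http://www.example.org:8080/pics/logo.gif")
```

The machine may equally be given as IP numbers; `urlsplit` returns it unchanged in `hostname`.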
On each hyperlink click, the browser (client) initiates a connection with the server at the "machine" (e.g. using UNIX BSD connect call on the default port 80, or a custom user-defined port) |
A request is sent to the server, formatted as a MIME-like message. |
The server replies with another MIME-like message which is received by the browser and either formatted in the browser window or viewed with a helper application. |
The connection is closed on both sides. (The exception to this is the "server push" connection.) |
GET /document.html HTTP/1.0 |
Accept: www/source |
Accept: text/html |
Accept: image/gif |
User-Agent: Lynx/2.2 libwww/2.14 |
From: mnotulli@ukonaix.cc.ukans.edu
|
First line syntax is always: METHOD URL ProtocolVersion |
The following lines form a header of an (extended) MIME message |
"User-Agent" specifies the browser type |
"Accept" specifies MIME types recognized by the browser |
The server is expected to provide the requested data in one of these acceptable formats. |
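A request of this shape is easy to assemble by hand. The sketch below builds the text of a GET request with one Accept line per MIME type, as in the example; the function name `build_get` is an assumption. On the wire, HTTP lines end with CR LF and an empty line closes the header.

```python
def build_get(path: str, accept: list, user_agent: str) -> str:
    """Build the text of an HTTP/1.0 GET request (header only, no body)."""
    lines = [f"GET {path} HTTP/1.0"]             # METHOD URL ProtocolVersion
    lines += [f"Accept: {mime_type}" for mime_type in accept]
    lines.append(f"User-Agent: {user_agent}")
    # Each line ends with CR LF; a blank line terminates the header.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_get("/document.html",
                    ["text/html", "image/gif"],
                    "Lynx/2.2 libwww/2.14")
```

A browser would write these bytes to the connection it opened to the server and then read the reply.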
HTTP/1.0 200 OK |
Date: Wednesday, 02-Feb-95 23:04:12 GMT |
Server: NCSA/1.1 |
MIME-version: 1.0 |
Last-modified: Monday, 15-Nov-94 23:33:16 GMT |
Content-type: text/html |
Content-length: 2345
|
<HTML><HEAD> |
<TITLE> Document Title </TITLE> |
. . . |
This message contains both a header and a body |
Some replies contain only a header (e.g. error reports, such as HTTP/1.0 404 Not Found) |
The GET request above also contains only a header, whereas a POST request (see next example) contains both a header and a body |
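On the client side, a reply can be split into status line, header, and body much like a mail message. The following is a minimal sketch, assuming the whole reply has already been read into memory; `parse_response` is a made-up name, and header keys are lowercased for lookup.

```python
def parse_response(raw: bytes):
    """Split an HTTP/1.0 reply into (version, code, reason, headers, body)."""
    # A blank line (CR LF CR LF) separates the header from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    # Status line: ProtocolVersion StatusCode ReasonPhrase
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


ok_reply = (b"HTTP/1.0 200 OK\r\n"
            b"Server: NCSA/1.1\r\n"
            b"Content-type: text/html\r\n"
            b"\r\n"
            b"<HTML><HEAD>")
error_reply = b"HTTP/1.0 404 Not Found\r\nServer: NCSA/1.1\r\n\r\n"

ok = parse_response(ok_reply)
err = parse_response(error_reply)
```

The first reply yields a header and a body; the second yields a header and an empty body, matching the header-only case.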
POST /cgi-bin/post-query HTTP/1.0 |
Accept: www/source |
Accept: text/html |
Accept: video/mpeg |
Accept: image/x-rgb |
Accept: application/postscript |
User-Agent: Lynx/2.2 libwww/2.14 |
From: grobe@unanaix.cc.ukans.edu |
Content-type: application/x-www-form-urlencoded |
Content-length: 150
|
org=Academic%20Computing%20Services |
&users=10000 |
&browser=lynx |
&contact=Michael%20Grobe%20grobe@kuhbuh.cc.ukans.edu |
POST requests contain both a header and a body; the body is typically used to pass the contents of a form to the server. |
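The body format named in the Content-type line, application/x-www-form-urlencoded, can be produced and decoded with Python's standard `urllib.parse` module. This sketch uses a few of the field names from the example; passing `quote_via=quote` makes spaces appear as %20, as they do above, rather than as "+".

```python
from urllib.parse import parse_qs, quote, urlencode

form = {
    "org": "Academic Computing Services",
    "users": "10000",
    "browser": "lynx",
}

# Encode the form fields as key=value pairs joined by "&".
body = urlencode(form, quote_via=quote)

# The header announces the body's type and its length in bytes.
header_lines = [
    "Content-type: application/x-www-form-urlencoded",
    f"Content-length: {len(body.encode('ascii'))}",
]

# The receiving side reverses the encoding; each key maps to a list of values.
decoded = parse_qs(body)
```

The Content-length value is computed from the encoded body, so it always matches what is actually sent.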
CGI is an interface for running programs on the server at the request of the client. |
The client look-and-feel for accessing CGI programs is identical to conventional static HTML, but the server side implementation is different. When the user clicks on a CGI link, the server calls the corresponding process and returns its output, not the data/file/code associated with the process. |
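What the server-side process does can be sketched as follows. This is not the course's Perl example but a small Python stand-in. It relies on the usual CGI conventions: the server passes request details in the environment variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH, feeds a POST body on standard input, and expects a header, a blank line, and a body on standard output. The function name `handle` and the field name `browser` are chosen only for illustration.

```python
import io
import os
import sys
from urllib.parse import parse_qs


def handle(environ, stdin) -> str:
    """Produce the CGI output (header, blank line, body) for one request."""
    if environ.get("REQUEST_METHOD") == "POST":
        # POST: form contents arrive on standard input, CONTENT_LENGTH characters long.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        query = stdin.read(length)
    else:
        # GET: form contents arrive in the query string.
        query = environ.get("QUERY_STRING", "")
    fields = parse_qs(query)
    browser = fields.get("browser", ["unknown"])[0]
    # The server relays this output to the client, not the program's own source.
    return "Content-type: text/plain\n\n" + f"browser = {browser}\n"


if __name__ == "__main__":
    # When run by the server as a CGI program, use the real environment and stdin.
    sys.stdout.write(handle(os.environ, sys.stdin))
```

Keeping the logic in `handle` lets it be exercised without a server, for example `handle({"REQUEST_METHOD": "GET", "QUERY_STRING": "browser=lynx"}, io.StringIO(""))`.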
Typical Applications
|
Look at a simple example of an HTML form with its CGI Perl program. |