MIME stands for Multipurpose Internet Mail Extensions and is the developing standard for the contents of all messages passed over the Internet.
|
HTTP is the Hypertext Transfer Protocol, the protocol that provides the basis of the World Wide Web: transmitting multimedia documents across the Internet. HTTPD is the daemon running the HTTP Web server.
|
URL stands for Uniform Resource Locator and is the universal addressing scheme for all documents (multimedia) on the WWW.
|
CGI is the Common Gateway Interface and is the scheme to interface other programs and systems to the HTTP Web protocol, using the same data protocols as the HTTP clients and servers.
|
References:
-
HTML and CGI Unleashed, John December and Mark Ginsburg, chapters 19 and 20.
-
Innumerable web documents.
|
Here are a few sample Internet documents relevant for this part of the course.
|
RFC-822: Crocker, D., "Standard for the Format of ARPA Internet Text Messages", STD 11, RFC 822, UDEL, 1982.
|
RFC-1036: M. Horton and R. Adams, "Standard for Interchange of USENET Messages", RFC 1036, AT&T, December 1987.
|
RFC-1521: Borenstein, N. and Freed, N., "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1521, Bellcore, September 1993.
|
RFC-1524: Borenstein, N. "A User Agent Configuration Mechanism for Multimedia Mail Format Information", RFC 1524, Bellcore, September 1993.
|
Internet Draft: Tim Berners-Lee, "Basic HTTP", CERN, 1992/3.
|
We all know and use it, but here is a formal specification.
|
Each message is a stream of 7-bit ASCII characters consisting of a header and an optional body, which is separated from the header by a blank line.
|
Header consists of a set of entries with one entry per line given by a colon separated key:value pair.
|
Key contains no spaces or tabs and cannot exceed 63 chars.
|
Body is a fully unstructured sequence of ASCII chars.
|
There is a finite set of standard keys and an extension mechanism via the "X"-prefix. The standard set (as used by MH) is:
|
Date Bcc Resent-Date Resent-Fcc
|
From Fcc Resent-From resent-
|
Sender Message-ID Resent-To Message-Id
|
To Subject Resent-cc Forwarded
|
cc In-Reply-To Resent-Bcc Replied
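|
The header/body layout described above can be illustrated with Python's standard `email` package. This is only a minimal sketch: the sample message, its addresses, and the `X-Priority-Note` extension field are made up for illustration and do not come from any real mailbox.

```python
from email import message_from_string

# Hypothetical sample message: header entries, one blank line, then the body.
RAW_MESSAGE = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: Lunch\n"
    "X-Priority-Note: extension field using the X- prefix\n"
    "\n"
    "Shall we meet at noon?\n"
)


def split_message(raw):
    """Return (headers, body) for an RFC 822 style text message."""
    msg = message_from_string(raw)
    # Each header entry is a key: value pair, one per line.
    headers = {key: value for key, value in msg.items()}
    # For a non-multipart message the payload is the unstructured body text.
    return headers, msg.get_payload()


headers, body = split_message(RAW_MESSAGE)
print(headers["Subject"])
print(body)
```

Everything before the first blank line is treated as the header and everything after it as the body, which is the only structure RFC 822 imposes on a message.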
|
Retain RFC-822 header+body format
|
Add new header fields
|
Allow for multipart multimedia bodies
|
Include media type and encoding information in new header fields such as: Content-Type, Content-Description, Content-Transfer-Encoding, Content-ID
|
Retain 7-bit ASCII for all valid encoding schemes
|
Implement multi-component bodies via a special 'magic type' Content-Type: multipart
|
Provide the support required for large multimedia messages, such as remote references (similar to hyperlinks but NOT the URL model) and file fragmentation, through dedicated subtypes of the 'message' type.
|
Two level hierarchical typing scheme adopted of the form: basetype/subtype
|
Seven base media types are defined, and this minimal set is enforced, i.e. all extensions must pass the whole ID->RFC->STD process.
|
Allow for less restrictive subtyping of the base types, for example:
-
Content-Type: text/plain
-
Content-Type: text/richtext
|
Some standard subtypes are specified and many more are expected. New subtypes must be registered with the IANA (Internet Assigned Numbers Authority).
|
Private experimental subtypes prefixed with "X-" may be used freely and without registration.
|
Seven base types are: text, image, audio, video, multipart, message, application.
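|
The two-level typing scheme can be sketched as a small parser. The function name `parse_content_type` and the decision to reject unknown base types are assumptions made for this illustration; real mail software is usually more lenient.

```python
# The seven base types listed above.
BASE_TYPES = {"text", "image", "audio", "video",
              "multipart", "message", "application"}


def parse_content_type(value):
    """Split 'basetype/subtype; params' into (basetype, subtype, experimental).

    'experimental' is True when the subtype carries the private "X-" prefix.
    """
    # Drop any parameters after ';' and normalise case (types are case-insensitive).
    media_type = value.split(";", 1)[0].strip().lower()
    base, sep, subtype = media_type.partition("/")
    if not sep or not subtype or base not in BASE_TYPES:
        raise ValueError(f"not a valid MIME content type: {value!r}")
    return base, subtype, subtype.startswith("x-")


print(parse_content_type("text/plain"))
print(parse_content_type("image/x-rgb"))
```

Note that a value such as `www/source`, which appears in the HTTP examples, would be rejected here because `www` is not one of the seven MIME base types.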
|
multipart
-
Specifies a MIME message composed of several parts with possible different Content-Type fields.
-
Parts are separated by a boundary string, specified in the multipart header entry
-
Subtypes: mixed (serial combination of media), parallel (for parallel presentation if possible), alternative (multiple representations of the same data) and digest (all parts are messages)
|
message
-
Subtypes: rfc822 (standard ARPA e-mail format), partial (a single chunk of a larger message, chopped into pieces for transmission and then reassembled), external-body (pointer to remote data, similar to a hyperlink/URL but with a different representation)
|
application
-
Current subtypes: postscript, ODA
-
Placeholder for "anything else" - several interactive/custom/creative extensions expected here
-
Already registered: Andrew-inset, ATOMICMAIL (Bellcore)
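|
A multipart body and its boundary string can be demonstrated with Python's standard `email.mime` classes. This is a sketch under stated assumptions: the helper name `build_alternative`, the boundary value, and the sample texts are all invented for the example.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_alternative(plain, rich, boundary="example-boundary"):
    """Build a multipart/alternative message holding two text representations."""
    msg = MIMEMultipart("alternative", boundary=boundary)
    msg["Subject"] = "Two representations of the same text"
    # Each part carries its own Content-Type header.
    msg.attach(MIMEText(plain, "plain", "us-ascii"))
    msg.attach(MIMEText(rich, "richtext", "us-ascii"))
    return msg.as_string()


raw = build_alternative("Hello", "<bold>Hello</bold>")

# Parse the text back and list the Content-Type of every part.
parsed = message_from_string(raw)
part_types = [part.get_content_type() for part in parsed.get_payload()]
print(parsed.get_content_type(), part_types)
```

In the serialised text, each part is introduced by a line consisting of `--` followed by the boundary string named in the multipart header entry.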
|
3+ public domain implementations available
|
The most popular, Metamail (Bellcore), is a MIME transition tool (backward compatible with most current mailing systems)
|
Some existing implementations: PMDF, IMAP2, C-Client, Mail-Manager, MH-MIME, Z-Mail, Andrew, Pine, Elm, Unix System 5 4.3, STI Document Browser, Servicemail, MIXMH
|
MIME support in progress by key vendors on most platforms
|
ATOMICMAIL (also called "computational mail" or "active mail") project at Bellcore towards interactive extensions of MIME
|
HTTP provides an upper level to the Internet, that is, it is built on top of a back-bone network with all the packets flowing from client to server and vice versa using the standard TCP/IP protocol.
|
It uses MIME formats and concepts, but does not fully conform to MIME as the WWW is not a mail system.
|
The HTTP protocol is compatible with other network services such as FTP (File Transfer Protocol) and NNTP (Network News Transfer Protocol).
-
On a UNIX-based machine, the basic services are enumerated in the file /etc/services. Each service corresponds to a standard port. For example, telnet is mapped to port 23, and FTP is mapped to port 21. All ports below 1024 are privileged: only the system administrator can determine their use.
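|
The service-to-port mapping can be sketched by parsing text in the /etc/services layout. The excerpt below is a hand-written sample rather than a copy of any particular system's file, and the names `parse_services` and `is_privileged` are chosen for this illustration only.

```python
# Hand-written sample in the /etc/services layout (an assumption for this sketch).
SAMPLE_SERVICES = """\
# service-name  port/protocol  aliases
ftp             21/tcp
telnet          23/tcp
http            80/tcp          www
nntp            119/tcp         readnews untp
"""


def parse_services(text):
    """Map (service name or alias, protocol) to a port number."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, port_proto, *aliases = line.split()
        port, proto = port_proto.split("/")
        for service in (name, *aliases):
            table[(service, proto)] = int(port)
    return table


def is_privileged(port):
    """Ports below 1024 are reserved for the system administrator."""
    return port < 1024


PORTS = parse_services(SAMPLE_SERVICES)
print(PORTS[("http", "tcp")], is_privileged(PORTS[("http", "tcp")]))
```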
|
The HTTP service is assigned to port 80 by default. Its connections are much shorter-lived than those of the other services, typically lasting for a single request and reply.
|
A URL has the standard form
-
service://machine:port/file.file-extension
|
HTML hyperlinks typically use the service http for linking to other documents and media files. Some other internet services can also be used such as
-
ftp://machine/file.file-extension.
|
In this way, a Web server can provide other Internet services through the browser interface.
|
The machine is an Internet address and can either be a symbolic name provided by the Domain Name Service (DNS) or the IP numbers.
|
If the port is not specified, it defaults to the standard port for the service (80 for http).
|
The file.file-extension is given by any Unix path name starting from the directory known to the server as "document root". Which path names are valid is one of the options of the server - whether "public_html" is automatically put into the path name and whether paths starting with "~username" are allowed.
|
In the http service, the file-extension is used to tell the browser what helper application to use to view the file. Typical file extensions are html, gif, jpeg, mpeg, au, ram, etc.
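|
The parts of a URL named above can be pulled apart with `urllib.parse.urlsplit` from the Python standard library. This is a minimal sketch: the `describe_url` helper, the small default-port table, and the example.org addresses are assumptions for illustration.

```python
from urllib.parse import urlsplit

# Default ports for the two services mentioned in the text.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def describe_url(url):
    """Break service://machine:port/file.file-extension into its parts."""
    parts = urlsplit(url)
    # Fall back to the service's standard port when none is given.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    filename = parts.path.rsplit("/", 1)[-1]
    extension = filename.rsplit(".", 1)[-1] if "." in filename else ""
    return {
        "service": parts.scheme,
        "machine": parts.hostname,
        "port": port,
        "path": parts.path,
        "extension": extension,
    }


print(describe_url("http://www.example.org:8080/docs/intro.html"))
print(describe_url("http://www.example.org/~user/pic.gif"))
```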
|
GET /document.html HTTP/1.0
|
Accept: www/source
|
Accept: text/html
|
Accept: image/gif
|
User-Agent: Lynx/2.2 libwww/2.14
|
From: mnotulli@ukonaix.cc.ukans.edu
-
-- blank-line-terminating-the-request --
|
First line syntax is always: METHOD URL ProtocolVersion
|
The following lines form a header of an (extended) MIME message
|
"User-Agent" specifies the browser type
|
"Accept" specifies MIME types recognized by the browser
|
The server is expected to provide the requested data in one of these acceptable formats.
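|
The request format described above (a request line, MIME-style header lines, and a terminating blank line) can be assembled as follows. This is a sketch only: `build_request` is a hypothetical helper, and it produces the bytes of a request without sending them anywhere.

```python
def build_request(method, path, headers, body=""):
    """Assemble an HTTP/1.0 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.0"]
    # Headers are passed as a list of pairs so a key such as Accept can repeat.
    lines += [f"{name}: {value}" for name, value in headers]
    head = "\r\n".join(lines) + "\r\n\r\n"  # the blank line ends the header
    return (head + body).encode("ascii")


request = build_request(
    "GET",
    "/document.html",
    [
        ("Accept", "text/html"),
        ("Accept", "image/gif"),
        ("User-Agent", "Lynx/2.2 libwww/2.14"),
    ],
)
print(request.decode("ascii"))
```

Lines are joined with CR LF, the line terminator used on the wire by HTTP, even though the examples in these notes show one header per plain text line.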
|
HTTP/1.0 200 OK
|
Date: Wednesday, 02-Feb-95 23:04:12 GMT
|
Server: NCSA/1.1
|
MIME-version: 1.0
|
Last-modified: Monday, 15-Nov-94 23:33:16 GMT
|
Content-type: text/html
|
Content-length: 2345
-
-- blank-line-separating-header-and-body--
|
<HTML><HEAD>
|
<TITLE> Document Title </TITLE>
|
. . .
|
This message contains both header and body
|
Some replies contain only header (e.g. error reports, such as HTTP/1.0 404 Not Found)
|
The GET request shown earlier also contained only a header, whereas a POST request (see next example) contains both header and body
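|
A reply can be taken apart in the same way: a status line, header entries, a blank line, and an optional body. The sketch below uses a hypothetical `parse_response` helper and shortened sample replies; it ignores details such as repeated or continued header lines.

```python
def parse_response(raw):
    """Split an HTTP/1.0 reply into status line fields, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")  # blank line separates header and body
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"version": version, "status": int(code), "reason": reason,
            "headers": headers, "body": body}


ok = parse_response(
    "HTTP/1.0 200 OK\r\n"
    "Server: NCSA/1.1\r\n"
    "Content-type: text/html\r\n"
    "\r\n"
    "<HTML><HEAD><TITLE> Document Title </TITLE></HEAD></HTML>"
)
missing = parse_response("HTTP/1.0 404 Not Found\r\n\r\n")
print(ok["status"], ok["headers"]["content-type"])
print(missing["status"], repr(missing["body"]))
```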
|
POST /cgi-bin/post-query HTTP/1.0
|
Accept: www/source
|
Accept: text/html
|
Accept: video/mpeg
|
Accept: image/x-rgb
|
Accept: application/postscript
|
User-Agent: Lynx/2.2 libwww/2.14
|
From: grobe@unanaix.cc.ukans.edu
|
Content-type: application/x-www-form-urlencoded
|
Content-length: 150
-
--blank-line-separating-header-and-body---
|
org=Academic%20Computing%20Services
|
&users=10000
|
&browser=lynx
|
&contact=Michael%20Grobe%20grobe@kuhbuh.cc.ukans.edu
|
Both header and body are present in POST requests; the body is typically used to pass the contents of a form to the server.
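|
The body of the POST example uses the application/x-www-form-urlencoded format, which can be produced and decoded with `urllib.parse`. This is a minimal sketch: the helper names `encode_form` and `decode_form` are invented, and spaces are encoded as `%20` to match the example above (browsers often send `+` instead).

```python
from urllib.parse import parse_qs, quote, urlencode


def encode_form(fields):
    """Encode (name, value) pairs as key=value pairs joined by '&'."""
    # quote_via=quote writes a space as %20, as in the POST example above.
    return urlencode(fields, quote_via=quote)


def decode_form(body):
    """Decode a form body back into a dict, keeping the first value per name."""
    return {name: values[0] for name, values in parse_qs(body).items()}


body = encode_form([
    ("org", "Academic Computing Services"),
    ("users", "10000"),
    ("browser", "lynx"),
])
print(body)
print(len(body))  # the value a client would send as Content-length
print(decode_form(body))
```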
|
CGI is an interface for running programs on the server at the request of the client.
|
The client look-and-feel for accessing CGI programs is identical to conventional static HTML, but the server side implementation is different. When the user clicks on a CGI link, the server calls the corresponding process and returns its output, not the data/file/code associated with the process.
|
Typical Applications
-
Support for dynamic generation of HTML documents, such as on-the-fly conversions from other formats.
-
Interfacing with other (non-HTTP) remote services such as databases (WAIS, RDBMS), video-on-demand, simulations, etc.
-
Support for two-way interactivity between clients and servers (to be achieved by building some intelligence and multiple choice/response capabilities into the CGI programs).
-
Interface to and integration with Forms/GUI area of HTML - submitted forms are handled by suitable CGI processes.
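|
The way a CGI program receives a submitted form and returns a generated document can be sketched as below. The environment variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH are part of the CGI convention; the function `run_cgi` and the HTML it emits are assumptions for this illustration, and a real program would read `os.environ` and `sys.stdin` and print the result.

```python
import io
from html import escape
from urllib.parse import parse_qs


def run_cgi(environ, stdin):
    """Produce the output a CGI program would write for a submitted form."""
    if environ.get("REQUEST_METHOD") == "POST":
        # POST: the form contents arrive in the request body on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        query = stdin.read(length)
    else:
        # GET: the form contents arrive in the QUERY_STRING variable.
        query = environ.get("QUERY_STRING", "")
    fields = parse_qs(query)
    rows = "".join(
        f"<LI>{escape(name)} = {escape(values[0])}</LI>"
        for name, values in fields.items()
    )
    # A CGI program emits its own header, a blank line, then the document.
    return "Content-type: text/html\n\n" + f"<HTML><BODY><UL>{rows}</UL></BODY></HTML>\n"


form = "org=ACS&browser=lynx"
output = run_cgi(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(form))},
    io.StringIO(form),
)
print(output)
```

The server passes this output back to the client, so the user sees a generated HTML page rather than the program file itself.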
|
Look at a simple example of an HTML form with its CGI Perl program.
|