Given by Nancy J. McCracken at the ECS400 Senior Undergraduate Course, Spring Semester 1997. Foils prepared 11 May 1997
Outside Index
Summary of Material
The World Wide Web is a world-wide repository of linked information, called hypertext or hypermedia. It consists of
|
In this talk, we give a brief background on the Internet and its services (telnet, ftp, news and mail), Client/Server Architectures, Networking, and several prominent Web technologies. |
This is an introductory talk intended for people of any background who have used the Web, but wish to know more about how it works and what capabilities are possible. |
Dr. Nancy McCracken |
NPAC |
Syracuse University |
December 9, 1996 |
The Internet is a loose federation of networks. |
Cooperative organization - no central administration, no fees. Protocols and standards evolve through the IETF (Internet Engineering Task Force). |
Most national and international networks are members: NSFNET, ESNET, ARPANET, BITNET |
All these networks are packet-switched systems based on TCP/IP. Together these two protocols allow communication over a wide variety of network technologies. Machines called gateways connect the networks. |
Standard domain name system - names are looked up by a name server to obtain the numeric Internet address used for routing.
|
1969 The first locations commissioned by DOD (ARPA) |
1971 # host computers = 23 |
1982 Standards for TCP and IP established. |
1983-4 Name server and domain name server developed. |
1984 #host computers > 1,000 |
1986 NSFNET backbone established, 56Kbps |
1987 #host computers > 10,000 |
1989 NSFNET backbone upgraded to T1 (1.544Mbps)
|
1992 Internet Society is chartered, World Wide Web released by CERN
|
1993 NSF experiments with 600 Megabit backbone
|
1990 Tim Berners-Lee at CERN in Geneva implements a hypertext system to provide efficient information access to the members of the international high-energy physics community. |
1993 Marc Andreessen at NCSA at the University of Illinois develops Mosaic, a browser with a graphical user interface. |
1994 Web Servers increase by 10% per month. |
1994 World Wide Web Consortium formed to guide the technical development of standards. The Consortium is run by the Laboratory for Computer Science at MIT, CERN, and INRIA (the French Research Institute). |
1995 Netscape Communications Corp., co-founded by Marc Andreessen, offers many extensions in its browser. |
1995 Commercial interest in the Web grows. Prodigy, CompuServe and America Online offer Web access to the public. |
1996 The Web is integrated with other computing technologies such as databases. Secure Web commerce has not yet arrived. |
Server: A program in charge of a resource or information.
|
Client: Any program that makes a request for service from the server. |
Clients and servers send their messages over a network connection. |
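The client/server exchange above can be sketched in a few lines. This is a minimal illustration, not a real Web server: the names serve_once, request_resource and demo are hypothetical, the "request" is just a file name, and both ends run on the local machine (127.0.0.1).

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    """Server: accept one client, read its request, send back a reply."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode("ascii")
        conn.sendall(f"You asked for: {request}".encode("ascii"))

def request_resource(port: int, name: str) -> str:
    """Client: connect to the server, send a request, return the reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(name.encode("ascii"))
        return sock.recv(1024).decode("ascii")

def demo() -> str:
    """Run one server and one client over a loopback network connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the system pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return request_resource(port, "index.html")
    finally:
        worker.join()
        listener.close()
```

Calling demo() starts the server in a background thread, sends it one request from the client, and returns the server's reply.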
Web servers provide access to a collection of files containing hyperlinked information
|
Browsers provide an easy graphical interface for users to request information. The client machine also provides viewers for a standard set of image and video formats. |
The interface is kept very simple to run on all networks and most machines. |
The Internet is a packet-switched network. Each message (or document) is broken up into a number of packets. Each packet carries the address of its destination. A computer called a router sits on the local network and decides where to send each packet first on the way to its final address. Each computer along the network connection examines the packets that come in and either keeps them or forwards them along their way. The message is reassembled at the other end. |
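A small sketch of the break-up and reassembly steps. The packet layout here (a dictionary with "to", "seq" and "data" fields) and the address 192.0.2.10 are assumptions made for illustration; real IP packets have a binary header, and the sequence number is added so the receiver can restore the order.

```python
import random

def to_packets(message: str, address: str, size: int = 8) -> list[dict]:
    """Break a message into numbered packets, each carrying the destination address."""
    return [
        {"to": address, "seq": i, "data": message[start:start + size]}
        for i, start in enumerate(range(0, len(message), size))
    ]

def reassemble(packets: list[dict]) -> str:
    """Put packets back in sequence order and join their payloads."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return "".join(p["data"] for p in ordered)

message = "Each message is broken up into a number of packets."
packets = to_packets(message, address="192.0.2.10")
random.shuffle(packets)                 # packets may arrive in any order
print(reassemble(packets) == message)   # True
```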
Performance of network delivery depends on the size of the message, the capacity of the various pieces of network that the message may travel along and the congestion of the network. |
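As a rough worked example of the first two factors, the sketch below estimates a best-case delivery time from message size and link capacity, using the 56 Kbps and T1 (1.544 Mbps) figures from the history foil. It assumes the slowest link on the path sets the pace and ignores congestion and latency; the function names and the 100,000-byte image are hypothetical.

```python
def transfer_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Best-case time to push a message through one link."""
    return size_bytes * 8 / bits_per_second

def path_seconds(size_bytes: int, link_speeds: list[float]) -> float:
    """Approximate the whole path by its slowest link."""
    return transfer_seconds(size_bytes, min(link_speeds))

image_bytes = 100_000                       # hypothetical image file
t1, slow_link = 1_544_000, 56_000           # bits per second

print(round(transfer_seconds(image_bytes, t1), 2))            # 0.52
print(round(path_seconds(image_bytes, [t1, slow_link]), 1))   # 14.3
```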
Telnet basically allows you to log in to a system over a network just as though you were logging in from a terminal attached to the system or from a dial-up modem. |
You may use telnet from a command line such as:
|
where you give the internet name of the machine that you wish to connect to. The telnet service will proceed to ask you for a name and password just as if you were logging in. |
Or you may have a telnet program which prompts you for the same information. |
Between two Unix systems, you can use the rlogin command instead. |
In most cases, you must already have an account on the machine to log in. There are a few publicly available telnet machines, such as the FAA Flight Service at duats.gtefsd.com, where student pilots can log in to get the latest weather data. |
FTP (File Transfer Protocol) is the way that people transfer files from one internet machine to another. |
You can use the ftp protocol directly from Unix machines using a command line:
|
where it will prompt you for an account login name and password. You will then be connected to the home directory of that account and can use commands to move around the directory structure (cd and ls) and the commands get and put to copy a file to or from your local machine, respectively. |
Other ftp interfaces may be provided by your telnet program, or by other software programs such as Fetch. |
FTP will transfer files of all types and formats. Files that are not plain text, such as images, should be transferred in binary mode (the default is ASCII), since ASCII mode may alter their contents. |
Some machines may provide a special ftp account called "anonymous". You use your ftp program as usual, except that the login name is "anonymous". The password can be anything, but netiquette obliges you to give your email address. The directory that you are connected to is a public directory provided by the host machine. |
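The same anonymous session can be scripted. The sketch below uses Python's standard ftplib module; the function name fetch_anonymous and its parameters are hypothetical, and no particular host is assumed, so supply one you know offers anonymous FTP.

```python
from ftplib import FTP

def fetch_anonymous(host: str, directory: str, filename: str, email: str) -> None:
    """Download one file from a public FTP directory in binary mode."""
    with FTP(host) as ftp:                          # connect to the FTP server
        ftp.login(user="anonymous", passwd=email)   # netiquette: email as password
        ftp.cwd(directory)                          # like the cd command
        print(ftp.nlst())                           # like the ls command
        with open(filename, "wb") as out:
            # RETR is the protocol command behind "get";
            # retrbinary transfers in binary mode, which is safe for images.
            ftp.retrbinary(f"RETR {filename}", out.write)
```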
Usenet newsgroups provide discussion forums on a wide range of topics. You can read the forums from a news server installed at your site. |
The topics are organized into hierarchies. Some of the main categories are
|
Subtopic names are always shown as part of the hierarchy
|
People participate in newsgroups by contributing messages, called "posting", which everyone else reading the group can see. |
Some newsgroups are moderated, which means that posted messages are scanned by a human for appropriate content and style before being made public. |
Many software packages include news readers, among them the Netscape web browsers - just ask your systems administrator which news server to use. |
Other discussion forums on interesting topics are provided through mail lists. The discussion is delivered through your regular email. |
In this case, the discussion is again provided through messages. But instead of posting the message through special software (as is the case with news readers), the message is sent to an email address, and then forwarded to everyone in the group. |
Mail list addresses
|
Mail lists may also be moderated. |
The World Wide Web is a collection of documents located all over the world, which can link to one another and to images, motion videos and audio files. |
Links use Web addresses called URLs (Uniform Resource Locators) which have the form
|
where
|
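To see the parts of a URL in practice, the sketch below splits one with Python's standard urllib.parse module. The URL is the JavaScript reference cited later in these foils; the part names (scheme, netloc, path) are those used by the library, not necessarily the terms used on the original foil.

```python
from urllib.parse import urlsplit

url = "http://home.netscape.com/eng/mozilla/Gold/handbook/javascript/"
parts = urlsplit(url)

print(parts.scheme)   # http  (the service or protocol)
print(parts.netloc)   # home.netscape.com  (the Internet name of the server)
print(parts.path)     # /eng/mozilla/Gold/handbook/javascript/  (the document)
```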
Types of files follow the standard MIME (Multipurpose Internet Mail Extensions) originally developed to include multimedia and multi-part content with electronic mail messages. |
File extensions on the server tell which MIME format the file is in. |
The browser is configured to have a set of helper applications or "plug-ins" to appropriately display or play files in various MIME formats. |
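The mapping from file extension to MIME type can be tried with Python's standard mimetypes module. This is only an illustration of the idea: the function name mime_type_for and the file names are hypothetical, and a real server uses its own configuration table.

```python
import mimetypes

def mime_type_for(filename: str) -> str:
    """Guess the MIME type from the file extension, with a generic fallback."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime or "application/octet-stream"

for name in ["index.html", "logo.gif", "photo.jpeg"]:
    print(name, "->", mime_type_for(name))
```

A browser would then use the reported type, not the file name, to pick a built-in viewer, helper application or plug-in.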
Web servers provide what is called HTTP service (for HyperText Transfer Protocol), but links can also direct connections to other Internet services. |
For other services, the Web server transfers the connection to the appropriate server. |
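HTTP itself is a simple text protocol. The sketch below builds the request a browser might send for a page and parses the first line of a server's reply. It assumes HTTP/1.0 and shows only the minimum; the helper names build_get_request and parse_status_line are hypothetical.

```python
from urllib.parse import urlsplit

def build_get_request(url: str) -> bytes:
    """Build a minimal HTTP/1.0 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    lines = [f"GET {path} HTTP/1.0", f"Host: {parts.netloc}", "", ""]
    return "\r\n".join(lines).encode("ascii")   # blank line ends the request

def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split a reply's first line, e.g. 'HTTP/1.0 200 OK', into its parts."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason

print(build_get_request("http://www.javasoft.com"))
print(parse_status_line("HTTP/1.0 404 Not Found\r\n"))
```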
Image types:
|
Audio types:
|
Video types:
|
Forms are used to allow the user to send information from the browser back to the server. |
The server must provide a program, called a CGI script, that will process the user information and provide an appropriate response.
|
The CGI program parses the input from the server and performs any number of computing and data access functions:
|
When the CGI program terminates, the server closes the connection. |
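A minimal sketch of what a CGI program does with form input, assuming the form was submitted with the GET method so that the server passes the data in the QUERY_STRING environment variable. The form field "name" and the function handle_form are hypothetical.

```python
import os
from html import escape
from urllib.parse import parse_qs

def handle_form(query_string: str) -> str:
    """Parse the form data and build the reply: a header, a blank line, then HTML."""
    fields = parse_qs(query_string)                  # e.g. {"name": ["Nancy"]}
    name = fields.get("name", ["stranger"])[0]
    body = f"<html><body><p>Hello, {escape(name)}!</p></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    # The server sets QUERY_STRING and sends whatever we print back to the browser.
    print(handle_form(os.environ.get("QUERY_STRING", "")), end="")
```

The call to escape keeps user input from being interpreted as HTML in the reply.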
Search Engines enable users to look up text documents stored on the Web, usually by one or more keywords appearing in the document. |
Information gathering and filtering
|
Indexing: the information gathered by the robots is organized into an indexing database at the search server.
|
Searching: the indexing database allows (keyword) searches by the user.
|
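The indexing and searching steps can be sketched with an inverted index: a table from each word to the set of pages containing it. This is a toy version under simple assumptions (whole-word matching, all keywords required); the page URLs and texts are made up for the example.

```python
from collections import defaultdict

def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Indexing: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?\"'()")].add(url)
    return index

def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Searching: return the URLs that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

pages = {   # hypothetical pages, as if gathered by a robot
    "http://example.org/a": "The Web is a repository of linked information.",
    "http://example.org/b": "Java applets run in Web browsers.",
}
index = build_index(pages)
print(search(index, "web java"))   # {'http://example.org/b'}
```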
User Interface
|
Challenge of WWW search:
|
Many useful Web applications provide a web page interface to a commercial product database of information. |
This is currently done through CGI scripting. |
The database must have a programmable interface (in addition to an interactive interface). For relational databases, this has been standardized in the query language SQL. |
Web queries to the database are taken from an HTML form. The form information is passed to the CGI script, which makes the appropriate SQL queries to the database. The results of the database query can then be formatted as HTML and returned to the browser as a web page. |
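The form-to-SQL-to-HTML path can be sketched with Python's built-in sqlite3 module standing in for a commercial database. Everything specific here is an assumption for illustration: the products table, the max_price form field, and the function names.

```python
import sqlite3
from html import escape
from urllib.parse import parse_qs

def make_demo_db() -> sqlite3.Connection:
    """Create a small in-memory product database (hypothetical data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE products (name TEXT, price REAL)")
    db.executemany("INSERT INTO products VALUES (?, ?)",
                   [("modem", 99.0), ("router", 450.0), ("monitor", 300.0)])
    return db

def products_page(db: sqlite3.Connection, query_string: str) -> str:
    """Turn form input into an SQL query and format the rows as HTML."""
    fields = parse_qs(query_string)
    max_price = float(fields.get("max_price", ["1e9"])[0])
    rows = db.execute(
        "SELECT name, price FROM products WHERE price <= ? ORDER BY price",
        (max_price,),          # '?' placeholder keeps user input out of the SQL text
    ).fetchall()
    items = "".join(f"<li>{escape(name)}: {price:.2f}</li>" for name, price in rows)
    return f"<html><body><ul>{items}</ul></body></html>"

print(products_page(make_demo_db(), "max_price=300"))
```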
JavaScript is a scripting language developed by Netscape from its HTML scripting language LiveScript. It includes some features of Java and allows HTML authors to have more control over the behavior of the browser. |
JavaScript is text embedded in an HTML document using the <SCRIPT> tags, which a JavaScript browser will interpret (and other browsers ignore). |
JavaScript can perform animations, respond to buttons and other forms of user input, and allow the author more control over the appearance of the Web Page. |
JavaScript can also provide an object-oriented view of other browser plug-in programs. |
Reference: JavaScript Authoring Guide at http://home.netscape.com/eng/mozilla/Gold/handbook/javascript/ |
Java is a general-purpose object-oriented language developed by Sun with the capability of providing distributed computing through the Web (http://www.javasoft.com). |
Browsers that support Java (HotJava, Netscape 2.0/3.0 ..) allow arbitrarily sophisticated dynamic multimedia applications, called Applets and written in Java, to be embedded in regular HTML pages and activated each time a given page is displayed. |
VRML is a computer graphics language for describing 3-dimensional scenes. It was developed as a standard for the WWW, based on SGI's Open Inventor. |
VRML includes language elements for creating simple shapes, various lighting effects, applying textures to shapes, and various points of view (referred to as cameras). |
A VRML-enabled browser will recognize VRML files of the form file.wrl, and create an interface where the user has controls to fly through space and examine objects. |
Objects within a VRML scene may be configured as URL links to other Web pages of any document type. |
VRML documents are huge. The most serious current drawback to using VRML more widely on the Web is the slow download time. |
New versions of VRML include motion in the scenes. |