The World Wide Web is a world-wide repository of linked information, called hypertext or hypermedia. It consists of
-
A user interface consistent across many computers
-
A set of open standards that enables the interface to access a variety of document types and information protocols.
-
A provision for universal access, based on the Internet domain name schemes.
|
In this talk, we give a brief background on the Internet and its services (telnet, ftp, news and mail), Client/Server Architectures, Networking, and several prominent Web technologies.
|
This is an introductory talk intended for people of any background who have used the Web but wish to know more about how it works and what it can do.
|
The Internet is a loose federation of networks.
|
Cooperative organization: no central administration, no fees. Protocols and standards evolve through the Internet Engineering Task Force (IETF).
|
Most national and international networks are members: NSFNET, ESNET, ARPANET, BITNET
|
All these networks are packet-switched systems based on TCP/IP. Together, the TCP and IP protocols allow communication over a wide variety of technologies. Machines called gateways connect the networks.
|
Standard domain name system: a symbolic name is looked up by a name server to obtain the numeric internet address used for routing.
-
symbolic names: npac.syr.edu
-
internet addresses: 128.230.7.2
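
The two forms of name above can be related with a short Python sketch. This is an illustration only, not part of the original talk: the function names `lookup` and `address_as_integer` are hypothetical, and a live lookup of any real host may return a different address from the one quoted on the slide.

```python
import ipaddress
import socket


def lookup(symbolic_name: str) -> str:
    """Ask the system resolver (and, through it, a name server) for an address.

    Needs network access; raises OSError if the name cannot be resolved.
    """
    return socket.gethostbyname(symbolic_name)


def address_as_integer(dotted: str) -> int:
    """Return the single 32-bit number behind a dotted internet address."""
    return int(ipaddress.IPv4Address(dotted))


# The dotted form is just four bytes written in decimal:
# 128.230.7.2 -> 128*2**24 + 230*2**16 + 7*2**8 + 2
# Example call (requires a network connection): lookup("npac.syr.edu")
```

The name server does the work behind `lookup`; the integer form is shown only to make clear that the dotted address is a convenient notation for one number.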
|
1969 The first ARPANET locations commissioned by DOD (ARPA)
|
1971 #host computers = 23
|
1982 Standards for TCP and IP established.
|
1983-4 Name server and Domain Name System (DNS) developed.
|
1984 #host computers > 1,000
|
1986 NSFNET backbone established, 56Kbps
|
1987 #host computers > 10,000
|
1989 NSFNET backbone upgraded to T1 (1.544Mbps)
-
#host computers > 100,000
|
1992 Internet Society is chartered, World Wide Web released by CERN
-
NSFNET backbone upgraded to T3 (44.736Mbps)
-
#host computers > 1,000,000
|
1993 NSF experiments with 600 Megabit backbone
-
#host computers > 2,000,000
|
1990 Tim Berners-Lee at CERN in Geneva implements a hypertext system to provide efficient information access to the members of the international high-energy physics community.
|
1993 Marc Andreessen at NCSA at the University of Illinois develops a graphical user interface (the Mosaic browser).
|
1994 Web Servers increase by 10% per month.
|
1994 World Wide Web Consortium formed to guide the technical development of standards. The Consortium is run by the Laboratory for Computer Science at MIT, CERN, and INRIA (the French research institute).
|
1995 Netscape Communications Corp., co-founded by Marc Andreessen, offers many extensions in its browser.
|
1995 Commercial interest in the web grows. Prodigy, CompuServe and America Online offer Web access to the public.
|
1996 The Web is integrated with other computing technologies such as databases. Secure web commerce has not yet arrived.
|
Telnet allows you to log in to a system over a network just as though you were logging in from a terminal attached to the system or from a dial-up modem.
|
You may use telnet from a command line such as:
-
> telnet nova.npac.syr.edu
|
where you give the internet name of the machine that you wish to connect to. The remote system then asks for a login name and password, just as if you were logging in locally.
|
Or you may have a telnet program which prompts you for the same information.
|
Between two Unix systems, you can use the rlogin command instead.
|
In most cases, you must already have an account on the machine to log in. There are a few publicly available telnet machines, such as the FAA Flight Service at duats.gtefsd.com, where student pilots can log in to get the latest weather data.
|
FTP (File Transfer Protocol) is the way that people transfer files from one internet machine to another.
|
You can use the ftp protocol directly from Unix machines using a command line:
-
> ftp internethostmachinename
|
ftp will then prompt you for an account login name and password. Once logged in, you are connected to the home directory of that account. Use cd and ls to move around the remote directory structure, get to copy a file from the remote machine to your local machine, and put to copy a file from your local machine to the remote one.
|
Other ftp interfaces may be provided by your telnet program, or by other software programs such as Fetch.
|
FTP will transfer files of all types and formats. Files that are not plain text, such as images, should be transferred in binary mode (on many clients the default is ascii, which translates line endings and can corrupt such files).
|
Some machines may provide a special ftp account called "anonymous". You use your ftp program as usual, except that the login name is "anonymous". The password can be anything, but netiquette obliges you to give your email address. The directory that you are connected to is a public directory provided by the host machine.
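
The anonymous ftp session described above can be sketched with Python's standard `ftplib` module. This is a minimal illustration under stated assumptions, not a recommended client: the function name `fetch_anonymous`, the `connect` parameter, and any host, directory or file names passed to it are hypothetical.

```python
import ftplib
from typing import Callable


def fetch_anonymous(host: str, directory: str, filename: str, email: str,
                    connect: Callable[[str], ftplib.FTP] = ftplib.FTP) -> bytes:
    """Log in as "anonymous", change directory, and fetch one file in binary mode."""
    chunks: list[bytes] = []
    ftp = connect(host)                  # opens the connection to the ftp server
    try:
        ftp.login("anonymous", email)    # by convention the password is your email address
        ftp.cwd(directory)               # same as typing: cd directory
        # retrbinary selects binary mode before the transfer, so files that
        # are not plain text (images, for example) arrive unaltered.
        ftp.retrbinary(f"RETR {filename}", chunks.append)
    finally:
        ftp.quit()                       # same as typing: quit
    return b"".join(chunks)
```

The `connect` parameter exists only so the sketch can be tried without a network connection by passing in a stand-in object; in ordinary use it is left at its default.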
|
Usenet newsgroups provide discussion forums on a wide range of topics. You can read the forums from a news server installed at your site.
|
The topics are organized into hierarchies. Some of the main categories are
-
alt - alternative topics
-
comp - computers and computing
-
misc - miscellaneous newsgroups
-
rec - recreational topics
-
sci - science-related topics
-
soc - social and cultural topics
|
Subtopic names are always shown as part of the hierarchy, separated by periods:
-
sci.chem.electrochem and comp.parallel
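
The hierarchical naming can be made concrete with a few lines of Python. This is only a sketch: the function names `split_group` and `group_by_category` are hypothetical, and real news readers organize groups in their own ways.

```python
from collections import defaultdict


def split_group(name: str) -> tuple[str, list[str]]:
    """Split a newsgroup name into its top-level category and its subtopics."""
    parts = name.split(".")
    return parts[0], parts[1:]


def group_by_category(names: list[str]) -> dict[str, list[str]]:
    """Collect newsgroup names under their top-level category."""
    tree: dict[str, list[str]] = defaultdict(list)
    for name in names:
        category, subtopics = split_group(name)
        tree[category].append(".".join(subtopics))
    return dict(tree)


if __name__ == "__main__":
    # The two example groups named in the talk.
    print(split_group("sci.chem.electrochem"))
    print(group_by_category(["sci.chem.electrochem", "comp.parallel"]))
```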
|
People participate in newsgroups by contributing messages (called "posting"), which everyone else reading the group can see.
|
Some newsgroups are moderated, which means that posted messages are scanned by a human for appropriate content and style before being made public.
|
Many software packages can act as news readers, including Netscape web browsers; ask your systems administrator which news server to use.
|
The World Wide Web is a collection of documents located all over the world, which can link to one another and to images, motion video and audio files.
|
Links use Web addresses called URLs (Uniform Resource Locators), which have the form
-
http://www.place.org:8888/mydirectory/mydoc.html
|
where
-
http is the protocol (HyperText Transfer Protocol) used to talk to the web service
-
www.place.org is the internet name of the web server
-
8888 is the optional port number
-
/mydirectory/ is the directory or folder path to the document within the web server document space
-
mydoc.html is the document to be retrieved (with an html file extension)
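
The breakdown above can be reproduced with Python's standard `urllib.parse` module. This is a small sketch for illustration: the function name `url_parts` and the dictionary keys are hypothetical, chosen to mirror the wording of the list.

```python
from urllib.parse import urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into the pieces named in the list above."""
    parts = urlsplit(url)
    # Everything up to the last "/" is the directory path; the rest is the document.
    directory, _, document = parts.path.rpartition("/")
    return {
        "service": parts.scheme,      # e.g. "http"
        "server": parts.hostname,     # internet name of the web server
        "port": parts.port,           # None when the optional port is omitted
        "directory": directory + "/",
        "document": document,
    }


if __name__ == "__main__":
    print(url_parts("http://www.place.org:8888/mydirectory/mydoc.html"))
```

When the port is left out, `port` is `None` and the browser falls back to the default port for the service.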
|
File types follow the MIME (Multipurpose Internet Mail Extensions) standard, originally developed to allow multimedia and multi-part content in electronic mail messages.
|
The server uses a file's extension to decide which MIME type to report for it.
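
As a rough illustration of mapping extensions to MIME types, Python's standard `mimetypes` module keeps a table of this kind. The sketch below is not how any particular web server is configured; the function name `mime_type_for` and the fallback type are assumptions made for the example.

```python
import mimetypes


def mime_type_for(filename: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


if __name__ == "__main__":
    for name in ("mydoc.html", "picture.gif", "noextension"):
        print(name, "->", mime_type_for(name))
```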
|
The browser is configured to have a set of helper applications or "plug-ins" to appropriately display or play files in various MIME formats.
|
Indexing: the information gathered by the robots is organized into an indexing database at the search server.
-
Keyword indexing is primarily used at present; full-text searching is found only on single-site search engines.
-
The key issue is the size of the resulting database.
|
Searching: the indexing database allows (keyword) searches by the user.
-
Queries are formed, and a number of the most highly ranked results are returned.
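
The indexing and searching steps above can be sketched as a tiny keyword index in Python. This is a toy under loud assumptions: the function names are hypothetical, the page texts are supplied by hand rather than gathered by a robot, and ranking by raw keyword count is a stand-in chosen for simplicity, not the method of any search engine named in this talk.

```python
import re
from collections import Counter, defaultdict


def keywords(text: str) -> list[str]:
    """Lower-case the text and split it into simple alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict[str, str]) -> dict[str, Counter]:
    """Map each keyword to the pages containing it, with occurrence counts."""
    index: dict[str, Counter] = defaultdict(Counter)
    for url, text in pages.items():
        for word in keywords(text):
            index[word][url] += 1
    return index


def search(index: dict[str, Counter], query: str, limit: int = 10) -> list[str]:
    """Return up to `limit` page URLs, most keyword occurrences first."""
    scores: Counter = Counter()
    for word in keywords(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Sort by descending score, then by URL so ties come out in a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _score in ranked[:limit]]
```

Even in this toy, the index holds one entry per keyword per page, which hints at why the size of the resulting database is the key issue.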
|
User Interface
-
uniform interface for HTTP, FTP, GOPHER, WAIS, Harvest, Lycos
|
Challenge of WWW search:
-
size - the estimated total is 30 gigabytes in 5 million documents (many search engines now take months to crawl the web and update their index files).
-
diversity - huge distributed database, unstructured, non-relational, hierarchical information with many formats.
|
JavaScript was developed by Netscape from its HTML scripting language LiveScript and includes some features of Java. It allows HTML authors to have more control over the behavior of the browser.
|
JavaScript is text embedded in an HTML document using the <SCRIPT> tags, which a JavaScript browser will interpret (and other browsers ignore).
|
JavaScript can perform animations, respond to buttons and other forms of user input, and allow the author more control over the appearance of the Web Page.
|
JavaScript can also provide an object-oriented view of other browser plug-in programs.
|
Reference: JavaScript Authoring Guide at http://home.netscape.com/eng/mozilla/Gold/handbook/javascript/
|
VRML is a computer graphics language for describing 3-dimensional scenes. It was developed as a standard for the WWW from SGI's Open Inventor.
|
VRML includes language elements for creating simple shapes, various lighting effects, applying textures to shapes, and various points of view (referred to as cameras).
|
A VRML-enabled browser will recognize VRML files of the form file.wrl, and create an interface where the user has controls to fly through space and examine objects.
|
Objects within a VRML scene may be configured as URL links to other Web pages of any document type.
|
VRML documents are huge; the most serious current drawback to using VRML more widely on the Web is the slow download time.
|
New versions of VRML include motion in the scenes.
|