CIS 6930-01 Applications of Information Technology I
a.k.a.
Technologies for an Information Age I

Web Architectures and Technologies
Instructors: Geoffrey Fox and Bryan Carpenter
Dept. of Computer Science
School of Computational Science and Information Technology
400 Dirac Science Library
Florida State University
Tallahassee
Florida 32306-4130
http://www.csit.fsu.edu
fox@csit.fsu.edu 850-644-4587
dbc@csit.fsu.edu 850-644-0180

Abstract of CSIT IT1 Fall 2000 Introduction
This Foilset contains introductory material on CSIT/CS Course IT1 for fall 2000
Some Aspects of Course Logistics -- all students must go to web site for complete discussion of this
http://aspen.csit.fsu.edu/it1fall00/
Overview of Field and Material covered and relation to other courses IT2, IT3 (proposed) and the synergy between computational science and information technology
The Internet is the most important distributed computer system and it has spawned the most remarkable and general purpose software – In studying the Internet, we study distributed computing (hardware and software)
Students should be able to design and build any distributed system
Summary of Base Distributed Object Web and Internet Technologies

Leave Now Unless ……
You are practically minded and wish to learn how to write real software to build real distributed systems
Your software should work and be documented!
At the end of IT1 you will have basic knowledge for a .com job
At the end of IT2, with a good grade, you will be a top applicant for a .com job
At the end of IT3, you will be prepared for PhD study in Information Technology

Practical Issues
Grade will be based on about 6 homework sets. The first of these will be a report and the last a modest project. The rest will be largely Java oriented programming tasks
The Books are:
Core Java 2
Volume 1-Fundamentals
Volume 2-Advanced Features
The Sun Microsystems Press Java Series (Prentice Hall)
Cay S. Horstmann and Gary Cornell
ISBN 0-13-081933-6
ISBN 0-13-081934-4

Overview of CSIT Information Technology Courses - I
IT1 assumes good programming skills and familiarity with the Internet/web
It will teach Java and use Internet examples to illustrate use of language
Course could be useful even if you know Java – we will emphasize topics like
Servlets – Simple way of building Java Server side applications
RMI – Foundation of pure Java distributed objects and systems built in these terms
JDBC (Java Database Connectivity) – Universal interface between Java and databases
The beat up client side (in Microsoft-Netscape battle won by MSFT) – Applets, Dynamic HTML and JavaScript (good ideas albeit victims of the battle) and Java Server Pages (how to build client software if you sell servers and don’t like MSFT)

Overview of CSIT Information Technology Courses - II
We will try to cover basic concepts of distributed systems and use the most elegant technology to illustrate – we will generalize to other approaches which could be best to use in a particular application
Servlets illustrate Server side application where you can of course use Perl, C++, JavaScript, Fortran, Machine Language or what have you
RMI illustrates the integration of Internet and distributed object ideas – the Object Web that underlies all modern distributed systems
IT1 will be basic 3 tier systems with core Client and Server Side technologies – Java, JavaScript and Dynamic HTML
IT2 will assume IT1 (including mastery of Java) and cover remaining core technologies
IT3 will cover the nifty new emerging ideas – if taught today it will cover the Wireless Internet and Computational Grids

Overview of CSIT Information Technology Courses - III
Provisional IT2 Syllabus
XML and some exemplar applications such as MathML
More on IT1 technologies such as Dynamic HTML with W3C (World Wide Web Consortium) Document Object Model
Virtual Machines and Java from pico to enterprise editions (Smart Cards to Cell Phones to PC’s to Servers)
Security for Java and for heterogeneous Systems (Public Key infrastructure, Kerberos)
The four approaches to the Object Web
CORBA from the Object Management Group
SOAP  (Simple Object Access Protocol) from W3C – the pure web approach
RMI, Enterprise Javabeans (EJB) and Jini – the pure Java approach
COM from Microsoft
Component and event based programming – Javabeans on the client; CORBA and EJB on the server
Graphics on the Web: VRML X3D and SVG (Scalable Vector Graphics)

Overview of CSIT Information Technology Courses - IV
Some topics left over for “IT3” or other courses or Universities
Java API’s (frameworks) for Multimedia, Graphics
Java and High performance
Computational Grids - using internet for (parallel) Computing
The Wireless Internet – WAP, web clipping
Complete Web Frameworks like iPlanet (from Sun), ninja (from UC Berkeley) or e-Speak from Hewlett Packard
Applications: e-commerce, web-based education
Portals as the metaphor for web applications
Nifty companies and their technologies, Yahoo (Portals),  Google (search), Real Networks (Multimedia), Ariba (Business to Business), Akamai (Caching and Routing), Inktomi (Web Infrastructure)
There are also other issues besides software
Algorithms for caching and multimedia
Communication hardware – conventional, optical, wireless
Communication architecture for Internet Services including quality of service

Some Course Prerequisites
We will assume Basic Web Browsing and HTML expertise and programming experience
Permission of Instructor is needed for IT2 if you have not taken IT1
You should be familiar with either PC or UNIX environment and program in at least one real language including Java
Perl is still widely used, but not taught here?
We will not assume any database or CORBA knowledge and will review basic material such as SQL if needed
CSIT provides servers for you to access Oracle databases and other needed core resources
You need a UNIX workstation or a PC running Windows

Practical Computer Science
Computational Science is the Interdisciplinary field between computer science/engineering/math and application areas using simulation technologies
CSIT is the Interdisciplinary field between computer science/engineering/math and application areas using simulation and information technologies
CSIT can be studied from either a technology (CS CE Math) or an application perspective
It covers all of practical (applied) computer science and its application motivation
This course gives either technology or application scientists the base knowledge of information technology

Computational Science and Information Technology (CSIT)
Together cover practical use of leading edge computer science technologies to address “real” applications
There are many technologies in common

Is Computational Science an Academic Discipline?
There can be no doubt that topics in Computational Science are useful and those in CSIT (Computational Science and Information Technology) are even more useful
CSIT incorporates trend towards data intensive computing and networked scientific collaboration
CSIT includes e-commerce
The CSIT technologies are also difficult and involve fundamental ideas
The area is of interest to both those in computer science and application fields
Probably most jobs going to Computer Science graduates really need CSIT education but unfortunately
Employers only know about “Computer Science/Engineering”
So most current implementations treat Computational Science as a set of courses within existing disciplines
Situation could change though as CSIT is “correct”

Components of a Basic Web System

Original Structure of World Wide Web
Universal machine independent interfaces – Success of Web has as much to do with standards as with software/hardware
CGI Programs were originally usually written in PERL but can be essentially any Process and so can do simulation, database access (the role JDBC plays for Java), advanced document processing etc.
Java (servlets) is of growing importance in Server Code
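The idea of a server-side program that builds a page from the request can be sketched in Java as follows. This is a minimal sketch that uses the JDK's built-in com.sun.net.httpserver package as a stand-in; it is neither CGI nor the Servlet API. The class name DynamicPageSketch, the /hello path and the name parameter are assumptions made for illustration only.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class DynamicPageSketch {
    // Build the page from the raw query string, much as a CGI program would.
    // Only a leading "name=" parameter is understood; URL decoding is omitted.
    static String render(String query) {
        String who = "world";
        if (query != null && query.startsWith("name=")) {
            who = query.substring("name=".length());
        }
        return "<html><body><h2>Hello, " + escape(who) + "</h2></body></html>";
    }

    // Escape the characters that would otherwise be read as markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the operating system for any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/hello", exchange -> {
            byte[] body = render(exchange.getRequestURI().getRawQuery())
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        try {
            // Act as our own client: fetch one page, print it, then shut down.
            int port = server.getAddress().getPort();
            URI uri = URI.create("http://127.0.0.1:" + port + "/hello?name=IT1");
            try (InputStream in = uri.toURL().openStream()) {
                System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            }
        } finally {
            server.stop(0);
        }
    }
}
```

The page-building logic sits in render, separate from the transport, so the same method could be called from a servlet instead.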

Architecture of Web Software

The 1998 Information System Architecture

Distributed Object Web Technology Model - I
Basic Vision: The current incoherent but highly creative Web will merge with distributed object technology in a multi-tier client-server-service architecture with Java based combined Web-ORB’s
Need to abstract entities (Web Pages, database entries, simulations) and services as objects with methods(interfaces)
CORBA .. XML  is “just” CGI done right
COM(Microsoft) and CORBA(world) are competing cross platform and language object technologies
Every Netscape4 browser has a Visigenic ORB built in
Javabeans plus RMI and JINI is 100% pure Java distributed object technology
W3C says you should use XML which defines a better IDL and perhaps an object model -- certainly does for documents
How do we do this while technology is still changing rapidly!

Multi-Tier Client Server Service

Distributed Object Web Technology Model - II
Need to use mix of approaches -- choosing what is good and what will last
For example develop Web-based databases with Java objects using standard JDBC (Java Database Connectivity) interfaces
Oracle, DB2, Informix, Sybase, Lotus Notes, Object database choice becomes an issue of performance/robustness NOT functionality
Use CORBA (C++) or Java as software to wrap existing applications with XML as syntax to define these distributed objects
Note Middle tier insulates client from backend -- can use one object model for user level  (object functionality) and different one for backend (object access and persistent store)
Specialized object databases are getting “overwhelmed” by the multi-tier approach with traditional backends such as Oracle
Program the server not the backend or the client
Do this programming in Java
And this implied that Oracle won the database battle as they got model correct and supplied an appropriate backend database with functionality added through middle tier extensions
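The point that the middle tier insulates the client from the backend can be sketched in plain Java. This is an illustration under stated assumptions, not a prescribed design: the names GradeStore, InMemoryStore and StatusService are hypothetical, and the in-memory map stands in for a real backend such as a relational database reached over JDBC.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class MiddleTierSketch {
    // Backend-facing object model: how records are stored and fetched.
    interface GradeStore {
        Optional<String> lookup(String student);
    }

    // One possible backend. A database-backed class could implement the
    // same interface without any change to the client-facing service.
    static class InMemoryStore implements GradeStore {
        private final Map<String, String> rows = new HashMap<>();

        InMemoryStore put(String student, String status) {
            rows.put(student, status);
            return this;
        }

        @Override
        public Optional<String> lookup(String student) {
            return Optional.ofNullable(rows.get(student));
        }
    }

    // Middle-tier service: the user-level object model the client sees.
    static class StatusService {
        private final GradeStore store;

        StatusService(GradeStore store) {
            this.store = store;
        }

        String statusLine(String student) {
            return store.lookup(student)
                    .map(status -> student + ": " + status)
                    .orElse(student + ": unknown");
        }
    }

    public static void main(String[] args) {
        StatusService service =
                new StatusService(new InMemoryStore().put("Jane Doe", "Working Hard"));
        System.out.println(service.statusLine("Jane Doe"));
        System.out.println(service.statusLine("John Roe"));
    }
}
```

Swapping the backend means supplying a different GradeStore; StatusService and its callers stay as they are.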

3-Tier Architecture and Different Object Models
There are several important Object Models: COM, CORBA, Java, Web, Oracle Database ……
But it doesn’t matter!!

Distributed Objects
Examples of current object technologies
Documents -- URL
"General Programs including database invocations"
Old style Web -- CGI
New Style Web -- XML makes server side objects look like applets as far as invocation goes
CORBA and COM -- special "interface definition language" (IDL) defines invocation in C++ like syntax
RMI uses Java language as IDL language
Benefits of distributed objects
allows objects written in different languages to communicate seamlessly via standardized messaging protocols embodied by middleware.
Higher levels of transparency of interoperability
Objects can be “self-managing” of resources
provides flexible grain of decomposition for building complex systems
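The remark that RMI uses the Java language itself as the IDL can be sketched as follows: the remote interface is an ordinary Java interface that extends java.rmi.Remote, with no separate IDL file. This is a minimal single-process sketch. The names Greeter, GreeterImpl and RmiSketch are hypothetical, the registry lookup step is skipped by using the stub returned from exportObject directly, and pinning java.rmi.server.hostname to the loopback address is an assumption made so that the example does not depend on the machine's host name.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

class RmiSketch {
    // The remote interface doubles as the interface definition:
    // it is plain Java, and every method declares RemoteException.
    public interface Greeter extends Remote {
        String greet(String name) throws RemoteException;
    }

    // Server-side implementation of the remote interface.
    public static class GreeterImpl implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Export the object, call it through its stub, then unexport it.
    static String callThroughStub(String name) {
        // Assumption: use loopback so stubs do not rely on host name lookup.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        GreeterImpl impl = new GreeterImpl();
        try {
            // Port 0 lets RMI choose an anonymous port.
            Greeter stub = (Greeter) UnicastRemoteObject.exportObject(impl, 0);
            try {
                // The call travels through the RMI transport, not a direct method call.
                return stub.greet(name);
            } finally {
                UnicastRemoteObject.unexportObject(impl, true);
            }
        } catch (RemoteException e) {
            throw new IllegalStateException("RMI call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughStub("IT1"));
    }
}
```

In a real deployment the stub would normally be published in an RMI registry and looked up by a client in another process.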

Two Database Web Linkages

Two More 3 Tier Web Database Links

2 Tier and CORBA Models

Comparison of 2 3 and 4 Tier Models

Two ways of Implementing Data Objects
Old way: Use an Object Database
Current Approach: Use a Relational Database and business logic in EJB

Today’s Distributed Object Web: The Confusing Multi-Technology Real World Middleware Server Layer

Emerging Object Web Multi-Server Model

Computational Science Portal: Multi-Server Web Computing System

Summary of Pragmatic Object Web
3-(or more)-tier architecture - Web browser front-ends, legacy (e.g. databases, HPC modules) backends; fat middleware
Use as appropriate the alternative / competing Middleware models:
Java RMI+ EJB (Enterprise Javabean) - single language solution by Sun
CORBA - all languages solution by OMG
COM - multi-language solution by Microsoft
SOAP/XML - emergent solution by the Web Consortium
Each model has different tradeoffs (most elegant, powerful, fastest, simplest)
POW attempts to integrate  various models and services in terms of multi-protocol middleware servers
Note Java is often the best language to build middleware whether this is Java or some other distributed object model
Most commercial Java activity is on Server not Client

What is a Web Client?
Originally we thought of Web Systems as a set of communicating objects with
Not much on client linking to UNIX processes invoked by CGI
Then we excitedly got balanced client server applications with JavaScript and Java applets on client which was faster as no network traffic for “small” local actions
Servlets, Enterprise Javabeans and CORBA provided robust middle tier programming model
But browsers never became a good programming environment as actions (say of JavaScript) undefined or quality (of Java virtual machine in browser) poor.
So browsers are just display technology and one should use servers or applications for software
Java Server Pages provide similar functionality to Java Applets with Java running outside browser in a nice robust server
This is the old way we built applications done with faster networks and more elegant implementation (we used to invoke Perl CGI scripts to provide dynamic web pages but this was too slow)

Palm Tops help define Client Model
There is growing interest in wireless portable displays in the confluence of cell phone and personal digital assistant markets
One needs to design web systems so they can be accessed from either a PDA or a PC or a Powerwall
This implies that only code in browser should be that immediately needed to relay events between user and web system – all “logic” (state) should be outside browser.

Web Technologies in a Nutshell  -- Java
Java -- Object Oriented version of C/C++ supporting Interactive Distributed Computing.
Original Web architecture (e.g. CGI) was server-side. Java allowed design and Implementation of balanced Client Server Applications but this original motivation is less important now
Java likely to be a dominant software engineering and Scientific Computing  language  -- see http://www.javagrande.org
This course discusses Java as a language in context of a system building tool
Java will probably be preferred language for development of next generation general or custom Web servers and clients
Programmers more productive in Java
Java has frameworks (libraries) for key Internet functionalities
Java can build client side customized GUI's  and graphics/image processing but Microsoft JavaScript and DHTML compete here and MOST Industry use of Java is in middle tier
New Java 2 has several enhancements including very many specialized API’s
Javabeans are (visual) component model for Java applications
Enterprise Javabeans are Java middleware containers
Jini and RMI allow distributed objects to be found and communicate

Web Technologies in a Nutshell - JavaScript
JavaScript -- only superficially related to Java and was called LiveScript -- is Netscape's (somewhat supported by Microsoft) fully interpreted Client side extension of HTML. This is a good Client Window integration /customization technology where flexibility more important than performance
i.e. use JavaScript for Rapid Prototyping of Complex User Interfaces
First examples use JavaScript together with frames ( HTML extension) for interactive multi-window technologies
JavaScript is roughly equivalent to "Abstract Windowing Toolkit/ Layout Manager" in Java but applied to Browser Frames and not Java windows
JavaScript cannot build complex filters or simulations as slow
But JavaScript with dynamic HTML is powerful client technology which is often easier and faster than Java -- it is faster as invokes optimized browser functions
both Internet Explorer 4 and Netscape have excellent JavaScript support
Server side version of JavaScript called LiveWire runs on Netscape Servers -- unsuccessful
Originally expected client side use of JavaScript to grow in importance but new view of Web clients limits use of JavaScript to small critical event handling
JavaScript on Palmtops called WMLScript

Web Technologies in a Nutshell - DHTML
There is an emerging DOM or  Document Object Model which will be uniform model used by W3C, Netscape, Microsoft
It allows you to address individual components of a page e.g. text box, image or collections thereof as separate entities
DOM is quite close to IE 5 conventions and is based on XML
DOM ought to be critical for publishing industry – Microsoft Word does not use except implicitly in Web export
Cascading Style Sheets allow more powerful ways of assigning properties (such as color, fonts etc.) to these components using either name (id) or type (<h2> tag etc.)
DHTML or dynamic HTML allows one to address the components of document and change on the fly (without reloading page) the properties of these components
This includes not only natural style properties but also position, size and “visibility”
DHTML currently handicapped by major differences between IE5 and Netscape 4 -- functionalities are similar but syntax very different
JavaScript combined with DHTML allows animations, graphs and replacement of just parts of text
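The notion of addressing one component of a document and changing its properties in place can be sketched with Java's binding of the W3C DOM (the org.w3c.dom package in the JDK). This is only an illustration of the object model: in a browser the same kind of operation would be written in JavaScript, and the class name DomSketch and the tiny page fragment are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomSketch {
    // A small well-formed page fragment (hypothetical content).
    static final String PAGE =
            "<body><h2 style=\"color: black\">Students</h2>"
            + "<p>Jane Doe: Working Hard</p></body>";

    // Parse markup into a DOM tree; the input must be well-formed XML.
    static Document parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalArgumentException("markup is not well-formed", e);
        }
    }

    // Address one component (the first <h2>) and replace its style property.
    static String recolorHeading(Document page, String color) {
        Element heading = (Element) page.getElementsByTagName("h2").item(0);
        heading.setAttribute("style", "color: " + color);
        return heading.getAttribute("style");
    }

    public static void main(String[] args) {
        Document page = parse(PAGE);
        System.out.println(recolorHeading(page, "red"));
    }
}
```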

Web Technologies in a Nutshell - XML
HTML is powerful but does not separate display and form (structure of document component as an object)
XML is a generalization of HTML which allows definition of arbitrary tags
e.g. <student name="Jane Doe" class="CSIT:IT1" grade="…" >Working Hard</student> is a more elegant way of capturing information in a reliable fashion than HTML
<h2>Students</h2>
<ul><li>Jane Doe: Working Hard</li><ul>
<li>Class: IT1</li>
<li>Grade: …</li> …. </ul>
</ul>  with a PERL program to extract data
XML allows powerful way of defining dynamic ascii databases useful for “modest size data” such as people, document citations etc.
XML parsers map XML tags into HTML for display or hand to programs to interpret
XML can also be used to define extensions to HTML such as special tags for mathematics (MathML) or chemistry or …..
XML defines syntax for “serializing” Web objects and transmitting between clients and servers (as in SOAP)
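The advantage of the student element over the HTML list can be sketched in Java: a standard XML parser hands back the name, class and text directly, with no custom extraction program. This is a minimal sketch using the JDK's built-in parser. The class name StudentXmlSketch is hypothetical, and the grade attribute is kept elided exactly as on the slide (written as the escape \u2026).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class StudentXmlSketch {
    // The element from the slide; the grade stays elided as in the original.
    static final String SAMPLE =
            "<student name=\"Jane Doe\" class=\"CSIT:IT1\" grade=\"\u2026\">"
            + "Working Hard</student>";

    // Parse a document and return its root element.
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Element student = parseRoot(SAMPLE);
        // Attributes and text content are addressed by name, not by position.
        System.out.println("name:   " + student.getAttribute("name"));
        System.out.println("class:  " + student.getAttribute("class"));
        System.out.println("status: " + student.getTextContent());
    }
}
```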

Web Technologies in a Nutshell - PERL
PERL is a C like Interpreter with powerful direct access to UNIX system commands and very easy ways of processing text files
PERL is a relatively old technology which has been overtaken by the Java tidal wave.
Still PERL has significantly better Systems and Document handling capability than Java
Very good for UNIX as much easier than Shell for system scripts -- PC versions exist but not so well integrated into O/S
Wonderful regular expression handling
PERL is traditional but not best choice for server CGI extensions and development of filters even for simpler cases involving text documents
PERL5 is object oriented but much less elegant (in my opinion) than Java
PERL5 has very useful multidimensional associative and regular arrays
Use PERL for UNIX batch jobs to edit text files (e.g. map www.npac.syr.edu to aspen.csit.fsu.edu) and quick simple Web server extensions – Convert latter to Java for production
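For the "convert to Java for production" step, the host-name edit mentioned above might look roughly like this in Java. This is a sketch using java.util.regex; the class name HostRewriteSketch and the sample input line are assumptions, and only the two host names come from the slide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HostRewriteSketch {
    // Pattern.quote makes the dots literal, so "wwwXnpac..." does not match.
    private static final Pattern OLD_HOST =
            Pattern.compile(Pattern.quote("www.npac.syr.edu"));
    private static final String NEW_HOST = "aspen.csit.fsu.edu";

    // Replace every occurrence of the old host name in one line of text.
    static String rewrite(String line) {
        return OLD_HOST.matcher(line).replaceAll(Matcher.quoteReplacement(NEW_HOST));
    }

    public static void main(String[] args) {
        // Hypothetical input line for illustration.
        System.out.println(rewrite("<a href=\"http://www.npac.syr.edu/index.html\">home</a>"));
    }
}
```

A batch job would apply rewrite to each line of each file; that loop is left out to keep the sketch short.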

Web Technologies in a Nutshell - Databases
The Web provides a convenient integration environment for "mature" technologies migrating from existing computer environments.
Object Relational databases are a good example where it is now straightforward in Microsoft Access, Oracle, DB2, Informix, Sybase etc. to provide a Web Interface to access and edit database with Java/JavaScript/Forms based Interfaces
Object databases such as Illustra are also interfaced to the Web but this is the wrong way to think about the problem
Systems such as Cold Fusion and Dreamweaver provide convenient high level interfaces to Web-linked databases
Note Web Authoring “confusion” is another result of unfortunate browser war lost by Netscape
Several excellent Java to Database packages becoming available with the JDBC standard based on ODBC -- more powerful but lower level than systems like Cold Fusion
CORBA will have good Web and Java Interfaces and in IT2 we will discuss integration of Web CORBA and database technologies
CORBA views a database as a managed persistent object
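A JDBC query from Java can be sketched as below. This is a minimal sketch under loud assumptions: the table students and its columns name and class are hypothetical, no particular database is implied, and a JDBC driver for whatever database is used must be on the classpath. The JDBC URL is passed as a command-line argument, and without one the program only prints a usage message.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class JdbcSketch {
    // Hypothetical table and column names, for illustration only.
    static final String QUERY = "SELECT name FROM students WHERE class = ?";

    // Run the parameterized query and collect the matching names.
    static List<String> namesInClass(Connection conn, String className) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(QUERY)) {
            ps.setString(1, className); // bind the single "?" placeholder
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("name"));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 1) {
            System.err.println("usage: java JdbcSketch <jdbc-url>");
            return;
        }
        // The same code runs against any database with a JDBC driver;
        // only the URL (and the driver on the classpath) changes.
        try (Connection conn = DriverManager.getConnection(args[0])) {
            System.out.println(namesInClass(conn, "IT1"));
        }
    }
}
```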

Web Technologies in a Nutshell - VRML
VRML plays same role to 3D worlds that HTML does to documents
VRML 1.0 has been widely available and specifies static 3D scenes through which you can navigate. It already provides a universal visualization environment and we have examples of use in Geographical Information Systems
Note can embed clickable URL's as with ImageMaps which can be used to annotate images to provide interactive resources
VRML 2.0 is now the standard with critical enhancements so that individual elements of 3D world are dynamic and can be programmed
It is designed to support full interactivity (televirtuality) with texture mapped video, avatars etc.
VRML 2.0 could require huge computing resources whether used as the virtual car-dealership / interactive gaming or more academic uses such as collaboration between teachers and students in 3D virtual classroom
Bandwidth and computing needs of VRML are handicapping acceptance and it appears that VRML will NOT “make it” -- replacement unclear
Microsoft ChromeEffects (XML based) and
Java3D address some but not all VRML applications
X3D is XML syntax for VRML
SVG is XML for Vector Graphics Primitives (much more limited but perhaps more realistic than VRML)

Can Computer Science help Simulation Scientists ?
There are classic computational science areas based on high performance parallel computers and the software needed to use them
Cluster of PC’s to Teraflop machines in CSIT or DoD DoE NSF Centers
Nifty algorithms
New languages such as Java (Grande) optimized for performance
Information Technology can benefit simulations in many ways:
XML based scientific (meta)data standards to allow sensor data access to modern tools and support interoperability
Convenient access to multiple (super)computers and technology to integrate diverse simulations and data sources
Portal or “Problem Solving Environment”
Customization of generic portal services like real-time collaboration
We will look at this for a couple of examples

Commodity Portals are Web Interfaces for Consumers

Portal for Landscape Management Simulation

Example of a custom Web
User Interface
Land Management System

Interactive Mesh Generation Portal

GEM Portal Architecture

Services in Computing Portals
Security
Fault Tolerance
Object Lookup and Registration
Object Persistence and Database support (as in EIP’s)
Event and Transaction Services
Collaboration among scientists around world
Job Status as in HotPage (NPACI) and myGrid (NCSA)
File Services (as in NPACI Storage Resource Broker)
Support (XML based) computational science specific metadata like MathML, XSIL
Visualization
Programming
Application Integration (chaining services viewed as backend compute filters)
“Seamless Access” and integration of resources between different users/application domains
Parameter Specification Service (get data from Web form into Fortran program wrapped as backend object)

Job Status in High End Computing Portal
XML Separates Computer Data and Users

Typical XML Descriptor of Software as an Object
