Given by Mark Baker at Tutorial for CRPC Annual Meeting at Argonne on May 13 1996. Foils prepared August 4 1996
Summary of Material
Distributed Computing. |
The Challenge. |
Understanding the Functionality of a Metacomputer. |
Workings of Typical Cluster Management Software. |
Features of Metacomputing Management Software:
|
Status of CMS Packages - basic problems. |
Some Current Metacomputing Projects:
|
Near and Future Projects:
|
Metacomputing in the future ! |
Mark Baker |
Northeast Parallel Architectures Center |
Syracuse University |
111 College Place |
Syracuse, NY 13244-4100, USA |
tel: +1 (315) 443 2083 |
fax: +1 (315) 443 1973 |
email: mab@npac.syr.edu |
URL: http://www.npac.syr.edu/ |
Parallel and Distributed Computing |
Distributed Systems: Some Problems |
The Challenge |
Understanding the Functionality of a Metacomputer |
Workings of Typical Cluster Management Software |
Features of Metacomputing Management Software
|
Status of CMS Packages - Basic Problems |
Some Current Metacomputing Projects
|
Near and Future Projects
|
Metacomputing in the future ! |
Vast numbers of underutilised workstations are available for use. |
Huge numbers of unused processor cycles and resources that could be put to good use in a wide variety of applications areas. |
Reluctance to buy supercomputers due to their cost and short life span. |
Distributed compute resources fit better into today's funding model. |
Parallel Computing |
Communication - high bandwidth and low latency. |
Low flexibility in messages (point-to-point). |
Distributed Computing |
Communication can be high or low bandwidth. |
Latency typically high -- can be very flexible messages involving fault tolerance, sophisticated routing, etc. |
Why use Distributed Computing Techniques ? |
Expense of buying, maintaining and using traditional MPP systems. |
Rapid increase in commodity processor performance. |
Commodity networking technology (ATM/FCS/SCI) of greater than 200 Mbps at present with expected Gbps performance in the very near future. |
The pervasive nature of workstations in academia and industry. |
Price/Performance of using existing hardware/software. |
Comms1 - From ParkBench Suite |
High Initial and Maintenance Costs
|
Applications Development
|
Two Types of Environment |
1) Management Software |
- Configured outside the OS Kernel.
- Interacts with the OS.
- Used to manage user applications. |
2) Distributed Programming Environment |
- Configured within kernel or as a dynamic kernel-module.
- Low-level interface necessary to get performance.
- Typically used to support environments such as VSM. |
There is not a universally used or mandated environment with which to implement, manage and run a Metacomputer at this time. |
Presently there are only relatively small integrated local systems (LAN-based) - customised for local circumstances and conditions. |
To create the infrastructure for a Metacomputer you would ideally like to call-up your local computer vendor and buy a package that does everything that you want! |
The reality is that you will need to: |
- Buy/acquire the hardware.
- Buy/acquire/develop or create the necessary software.
- Integrate the hardware and software into a coherent system.
- Edit/debug/tune/optimise your application before you can run it.
- Support, maintain and "port" as time goes by. |
Support for all Programming Paradigms... |
- Sequential - monolithic.
- Sequential - with forked processes.
- Parallel - Shared objects (Legion).
- Parallel - Virtual Shared Memory (TreadMarks).
- Parallel - Shared task space (Linda).
- Parallel - Shared memory using directives (HPF).
- Parallel - Message passing interface (MPI).
- Parallel - Message passing Environment (PVM).
- Parallel - Thread based/object parallelism (Java). |
Transparent Utilisation of a Distributed |
Heterogeneous Computing Environment |
Want to fully utilise a heterogeneous computing environment where different types of processing resources and inter-connection technologies are effectively and efficiently used. |
Fully Utilise Available Resources |
Low utilisation rates of high-performance workstations (LLNL/Los Alamos 7-10%). As their performance grows, utilisation will become worse. |
Build a Metacomputer |
The use of distributed resources in this framework is known as Metacomputing and such an environment has the potential to maximise performance and cost effectiveness of a wide range of scientific and distributed applications. |
Do Not Want to Reinvent the "Wheel", So Must... |
Understand what we are trying to achieve: throughput and/or processor utilisation? |
Learn from experiences with current LAN-based Cluster Management Software (CMS) packages |
Extend existing knowledge to design and develop a WAN-based Metacomputing Management package. |
Use new and emerging technologies to help solve some of the existing problems. |
Becoming a more common means of increasing the throughput of user applications in the US and Europe. |
A significant number of packages exist - almost all originate from research projects, many have now been taken-up/adopted by commercial vendors. |
Importance can be seen by both the commercial take-up and also by the widespread installation of this software at most of the major computing facilities around the world. |
Much effort is being expended to increase throughput by load balancing the work that needs to be done. |
Most packages are designed to run with Unix. Some support Linux (PCs) - NT support is planned by many vendors. |
WWW software and HTTP protocols could clearly be used as part of an integrated management package. |
Little software of this type has so far been developed - several packages use a WWW browser as an alternative GUI. |
http://www.npac.syr.edu/techreports/hypertext/sccs-748/index.html |
Commercial Packages |
Research Packages |
Step 1 - Job Description File |
Produce some type of resource description file. |
ASCII text file (produced using a normal text editor or with the aid of a GUI) which contains a set of keywords to be interpreted by the CMS. |
The nature and number of keywords available depends on the CMS package, but will at least include the job name, the maximum runtime and the desired platform. |
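As an illustration, a job description file of this kind could be a handful of `keyword = value` lines. The keyword names (`job_name`, `max_runtime`, `platform`) and the file syntax below are hypothetical and do not correspond to any particular CMS package; the sketch only shows a client checking that the minimum set of keywords is present before submission.

```python
# Hypothetical job description format: one "keyword = value" pair per line.
# Keyword names are illustrative only, not taken from any real CMS package.
REQUIRED_KEYWORDS = {"job_name", "max_runtime", "platform"}

def parse_job_description(text):
    """Parse keyword/value lines into a dict and check the required keywords."""
    job = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {raw_line!r}")
        job[key.strip().lower()] = value.strip()
    missing = REQUIRED_KEYWORDS - set(job)
    if missing:
        raise ValueError(f"missing keywords: {sorted(missing)}")
    job["max_runtime"] = int(job["max_runtime"])  # assumed to be in seconds
    return job

EXAMPLE = """
# example job description (hypothetical syntax)
job_name    = sieve_run
max_runtime = 3600
platform    = sparc-solaris
"""

job = parse_job_description(EXAMPLE)
print(job)
```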
Step 2 - Submit Job |
Job description file is sent by the client software resident on the user's workstation, to a master scheduler. |
The Master Scheduler |
Daemons on each of the resource workstations communicate their state at regular intervals to the master scheduler. |
One of the tasks of the master scheduler is to load balance the resources that it is managing. |
When a job is submitted it not only has to match the requested resources with those that are available, but also needs to ensure that the resources being used are load balanced. |
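The matching step can be sketched as follows. This is a minimal illustration, assuming each daemon reports a platform name and a load figure; the `Workstation` record and the "least-loaded matching host" rule are assumptions made for the example, not the policy of any specific package.

```python
from dataclasses import dataclass

@dataclass
class Workstation:
    name: str
    platform: str
    load: float             # most recent load figure reported by the daemon
    available: bool = True

def select_host(workstations, platform):
    """Return the least-loaded available host of the requested platform, or None."""
    candidates = [w for w in workstations if w.available and w.platform == platform]
    if not candidates:
        return None         # job stays queued until a matching host reports in
    return min(candidates, key=lambda w: w.load)

pool = [
    Workstation("ws1", "sparc", 0.9),
    Workstation("ws2", "sparc", 0.1),
    Workstation("ws3", "alpha", 0.0),
    Workstation("ws4", "sparc", 0.0, available=False),
]
chosen = select_host(pool, "sparc")
print(chosen.name)
```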
Multiple Queues |
Typically multiple queues, each being appropriate for different types of job - for example:
|
The number of possible queue configurations will depend on the profile of the typical throughput of jobs on the system being managed. |
Fault Tolerance |
The master scheduler is also tasked with the responsibility of ensuring that jobs complete successfully. |
It does this by monitoring jobs until they successfully finish. |
If a job fails, due to problems other than an application runtime error, it will reschedule the job to run again. |
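The distinction between the two kinds of failure can be made concrete with a small sketch. The exception classes and the "try the next host" strategy are assumptions for illustration; a real scheduler would detect failures through its daemons rather than through Python exceptions.

```python
class ApplicationError(Exception):
    """The job itself failed (e.g. a runtime error in the user's code)."""

class ResourceFailure(Exception):
    """The host or network failed while the job was running."""

def run_with_rescheduling(job, hosts, max_attempts=3):
    """Run job(host) on successive hosts; reschedule only on resource failures."""
    attempts = []
    for host in hosts[:max_attempts]:
        attempts.append(host)
        try:
            return job(host), attempts
        except ResourceFailure:
            continue        # reschedule on the next host
        # ApplicationError is not caught: rerunning would fail the same way
    raise RuntimeError(f"job failed on all hosts tried: {attempts}")

def demo_job(host):
    if host == "ws1":
        raise ResourceFailure("ws1 crashed")
    return f"done on {host}"

result, tried = run_with_rescheduling(demo_job, ["ws1", "ws2", "ws3"])
print(result, tried)
```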
Computing Environments Supported |
Platforms Supported
|
Operating Systems
|
Additional Hardware/Software
|
Application support |
Batch/Interactive Job Support |
- For example, a debugging session or a job that requires user command-line input. |
Application Programming Support |
Application support |
Queue Type |
Support for multiple, configurable queues.
- This feature is necessary for managing large multi-vendor clusters where jobs ranging from short interactive sessions to compute intensive parallel applications need to run. |
Job Scheduling and Allocation Policy |
Dispatching Policy |
Job Scheduling and Allocation Policy |
Minimise Impact on a Workstation and Owner |
- Suspend jobs, change priority (nice), migrate jobs, etc.
- Undesirable impact when a job is suspended, checkpointed or migrated.
- Process migration requires a job to save its state, which is then physically moved over the network to another w/s.
- Impact on CPU/memory/diskspace while state is saved, and on the network when Mbytes of data are transferred. |
Job Scheduling and Allocation Policy |
Load Balancing |
Load balances the resources that it is managing.
- Customise the default configuration to suit the local conditions - in the light of experience... |
Job Scheduling and Allocation Policy |
Check Pointing
|
Useful, but can be costly in terms of resources
- Small job - rerun
- Time critical job - necessary
- Large jobs - costly |
Job Scheduling and Allocation Policy |
Checkpointing Needs |
Additional diskspace per workstation |
Filestore may be remotely mounted, impact on NFS performance and the network bandwidth. |
Existing systems may not have the resources (local diskspace). |
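A minimal sketch of application-level checkpointing shows where the disk cost comes from. It assumes the job's whole state fits in a small dictionary and uses Python's standard `pickle` module; the file name, the checkpoint interval and the simulated crash are all made up for the example. Real CMS packages typically checkpoint the full process image, which is far larger.

```python
import os
import pickle
import tempfile

def run_job(n_steps, checkpoint_path, every=100, fail_at=None):
    """Sum the integers 0..n_steps-1, saving state to disk every `every` steps."""
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, "rb") as f:
            state = pickle.load(f)          # resume from the last checkpoint
    else:
        state = {"step": 0, "total": 0}
    while state["step"] < n_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated workstation crash")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % every == 0:
            tmp = checkpoint_path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump(state, f)
            os.replace(tmp, checkpoint_path)  # atomic swap: never a half-written file
    return state["total"]

path = os.path.join(tempfile.mkdtemp(), "job.ckpt")
try:
    run_job(1000, path, fail_at=250)        # first run dies part-way through
except RuntimeError:
    pass
total = run_job(1000, path)                 # second run resumes from step 200
print(total)
```

Work done between the last checkpoint and the crash is repeated on restart, which is the trade-off between checkpoint frequency and checkpoint cost.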
Job Scheduling and Allocation Policy |
Process Migration
|
- Minimise impact on w/s - owner takes back control.
- Suspend job and then migrate it onto another w/s after a certain time interval.
- Load balancing - move jobs from heavily to lightly loaded systems.
- Complicated for anything other than sequential jobs.
- Similar impact to checkpointing, plus large state files moved around the network - impact on network. |
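The "suspend, then migrate after a certain time interval" policy can be written as a small decision function. The function name, the action labels and the 600-second default are hypothetical choices for the sketch.

```python
def next_action(owner_active, seconds_suspended, migrate_after=600):
    """Decide what to do with a guest job on an owned workstation.

    owner_active:       True if the owner is currently using the machine.
    seconds_suspended:  how long the job has already been suspended.
    migrate_after:      assumed time limit before migration is attempted.
    """
    if not owner_active:
        return "run"
    if seconds_suspended < migrate_after:
        return "suspend"
    return "migrate"

for case in [(False, 0), (True, 30), (True, 900)]:
    print(case, next_action(*case))
```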
Job Scheduling and Allocation Policy |
Job Monitoring and Rescheduling |
- Monitor that jobs are running and, in the event of a job failure, reschedule the job. |
Suspension/Resumption of Jobs |
- Help minimise the impact of jobs on the w/s owner.
- Useful in the event of a system or network wide problem. |
Configurability |
Resource Administration
|
- Administrator should control who has access to what resources and also what resources are used (CPU load, diskspace, memory). |
Job Runtime Limits
|
Configurability |
Process Management |
Configure the resources and manage jobs |
- Control over the number of processes running.- Exclusive access to a resource by a particular job.- Control the priority of jobs.- Management and control of forked child processes. |
Configurability |
Job Scheduling Control |
- User/administrator can schedule when a job will be run. |
GUI/Command-line |
- Dramatic increase in usage and popularity of the HTTP protocol and the WWW, so a GUI based on this technology seems likely to be a common standard in the future. |
Configurability |
Ease of Use
|
User Allocation of Jobs
|
User Job Status Query and Statistics |
Configurability |
Scalability |
The system should be scalable:
- Across administrative boundaries.
- Reduce SPF and increase resilience.
- Practically based on "domains" but scalable from thousands to tens of thousands of machines. |
Dynamics of Resources |
Runtime Configuration |
Reconfigure dynamically at runtime - resources available, queues and other configurable features, i.e. it should not be necessary to restart. |
Dynamic Resource Pool |
- Add and withdraw resources dynamically during runtime. |
Dynamics of Resources |
Single Point of Failure (SPF) |
Dynamics of Resources |
Fault Tolerance |
- System should check that resources are available before submitting jobs.
- Rerun a job after a workstation has crashed.
- Guarantee that a job will complete.
- Automatically recover status and continue to run after failure.
- Level of fault tolerance is determined by the service being provided. |
Dynamics of Resources |
Security Issues |
- Provide at least normal Unix security.
- Takes advantage of NIS and other industry standard packages. |
Truly heterogeneous platforms support, across:
|
Good documentation (on-line and manual) |
Vendor support |
Plug and Play Installation |
Batch and interactive usage |
Easy to use GUI |
Scalable |
Easy maintenance and support |
Easy to manage, reconfigure and administrate |
Ease of submitting, monitoring and controlling jobs |
Support for all programming paradigms |
Security |
Statistics |
No Single point of Failure |
Fault tolerant |
Checkpointing/Process-Migration |
LAN-Based - not scalable! |
Limited platform and operating system support - not truly heterogeneous |
Do not support all programming paradigms |
Load Balancing is generally naive |
Single-points-of-failure |
Limited-fault tolerance |
Projects |
Legion - University of Virginia |
WAMM - Italian Research Labs |
WANE - Florida State University |
The Legion project is an attempt to design and build system services that provide the illusion of a single virtual machine. |
Legion targets wide-area assemblies of workstations and supercomputers. |
Developed at the University of Virginia. |
It aims to provide - |
- Shared-object and shared-name spaces,
- Application adjustable fault-tolerance,
- Improved response time and greater throughput,
- Wide-area network support,
- Management and exploitation of heterogeneity,
- Security,
- Scheduling,
- Resource management,
- Parallel processing,
- Object inter-operability. |
Legion is an object-oriented system - designed around C++ |
The principles of the object-oriented paradigm are the foundation for the construction of Legion; the following features are exploited: |
- Encapsulation,
- Inheritance,
- Software reuse,
- Fault containment - classes,
- Reduction in complexity. |
Campus Wide Virtual Computer (CWVC) |
The CWVC is a heterogeneous distributed computing environment built on top of Mentat. |
The CWVC is used to demonstrate the benefits of a Legion-like system - provides departments at UV with an interface to high performance distributed computing. |
The CWVC allows researchers at UV to share resources and to develop applications that will be usable in a true Legion setting. |
Organisations |
The CWVC is currently used by organisations at UV and the NASA Langley Research Center. |
Hardware Resource |
CWVC contains more than 100 workstations of varying types, including - IBM/Sun/HP/SGI. |
Mixture of network technologies, including ATM, FDDI, and Ethernet. |
Legion Tools |
The CWVC provides a set of tools that facilitate application development, debugging, and resource management. |
Mentat |
Federated File System |
Thermostat |
Resource Accounting Service |
MAD |
Prophet |
Mentat |
Parallel C++ : The Mentat object-oriented programming language. |
Provides high-level abstractions that mask the complex aspects of parallel programming, including communication, synchronisation, and scheduling, from the programmer. |
Allows the programmer to concentrate on their application. |
Federated File System (FFS) |
Objects that execute on hosts in the CWVC are presented with a single unified file system abstraction - FFS. |
The FFS interacts with local file systems, i.e. NFS mount structures, etc., so files visible on one host may not be available on another. |
The FFS allows objects to view a single, unified file name space, and thus execute in a location independent manner. |
The interface is similar to the Unix standard library file system - little change to existing code is necessary. |
Thermostat |
Thermostat provides a GUI to manage resources. |
Allows a resource owner to schedule the times of day and the days of the week that hosts will be available for CWVC use. |
It also allows the resource owner to specify the percent of the CPU time and memory that may be used by the CWVC during individual available time slots. |
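An availability policy of this shape can be represented as a list of time slots. The sketch below is an assumption about how such a schedule might be modelled; the `TimeSlot` fields and the example policy are invented for illustration and are not taken from the Thermostat tool itself.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class TimeSlot:
    days: frozenset          # weekday numbers, Monday = 0
    start_hour: int          # inclusive
    end_hour: int            # exclusive
    max_cpu_percent: int
    max_memory_percent: int

def allowance(slots, when):
    """Return (cpu %, memory %) allowed at `when`, or (0, 0) if the host is unavailable."""
    for slot in slots:
        if when.weekday() in slot.days and slot.start_hour <= when.hour < slot.end_hour:
            return slot.max_cpu_percent, slot.max_memory_percent
    return 0, 0

policy = [
    TimeSlot(frozenset(range(5)), 0, 8, 100, 80),    # weekday early mornings
    TimeSlot(frozenset({5, 6}), 0, 24, 100, 100),    # weekends, all day
]
print(allowance(policy, datetime(1996, 5, 13, 3, 0)))    # a Monday, 03:00
print(allowance(policy, datetime(1996, 5, 13, 14, 0)))   # a Monday, 14:00
```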
Resource Accounting Services |
Resource utilisation is logged on a per-user basis, while resource availability (controlled via the Thermostat) is logged on a per machine basis. |
Usage and machine availability "credits" are scaled based on the computing power of the hosts involved - time on an SP-2 node is worth more than time on a Sun IPC. A report generation tool is provided to extract and summarise usage statistics. |
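Power-scaled accounting amounts to weighting each logged hour by a rating for the host type. The ratings below are hypothetical placeholders (the source gives no figures), and the log format is assumed for the sketch.

```python
# Hypothetical relative power ratings; real values would come from benchmarks.
POWER_RATING = {"sun-ipc": 1.0, "sp2-node": 8.0}

def usage_credits(log):
    """Sum power-scaled hours per user from (user, host_type, hours) records."""
    totals = {}
    for user, host_type, hours in log:
        totals[user] = totals.get(user, 0.0) + hours * POWER_RATING[host_type]
    return totals

log = [
    ("alice", "sun-ipc", 10.0),
    ("alice", "sp2-node", 2.0),
    ("bob", "sp2-node", 1.5),
]
report = usage_credits(log)
print(report)
```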
MAD |
MAD is a set of tools that enables programmers to debug their CWVC applications with the debugger of their choice. |
MAD supports post-mortem debugging: CWVC programs run to completion, or until an error occurs.
The programmer can replay a specific object, i.e. reproduce its execution, under the control of a sequential debugger, and can use the traditional debugging cycle to find the bug. |
Data Parallel Computation Scheduler - Prophet |
Automatic run-time scheduling system for SPMD computations. |
Helps with processor selection, data decomposition, task placement. |
Chooses the best subset of processors to use based on the problem computation granularity. |
Decomposes data to provide processor load balance. |
Assigns tasks to processors to limit communication overhead. |
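One simple way to decompose data for load balance is to give each processor a share proportional to its relative speed. The sketch below illustrates that idea only; it is not Prophet's actual algorithm, and the integer speed ratings are assumed inputs.

```python
def decompose(n_items, speeds):
    """Split n_items among processors in proportion to their integer speed ratings."""
    total_speed = sum(speeds)
    shares = [n_items * s // total_speed for s in speeds]   # floor of each share
    # Hand out the remainder one item at a time, fastest processors first.
    leftover = n_items - sum(shares)
    order = sorted(range(len(speeds)), key=lambda i: speeds[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares

print(decompose(1000, [1, 1, 2, 4]))
print(decompose(10, [1, 2]))
```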
Scientific Applications |
Genome Library Comparison |
Atmospheric Simulation |
Automatic Test Pattern Generation for Integrated Circuits |
Electrical Engineering has developed a parallel Automatic Test Pattern Generation (ATPG) application. |
URL http://www.cs.virginia.edu/~legion |
WAMM (Wide Area Metacomputer Manager) |
WAMM is a graphical tool, built on top of PVM. |
Provides user with a GUI to assist in tasks such as: host add, check, removal, process management, compilation on remote hosts, remote commands execution. |
WAMM (Wide Area Metacomputer Manager) |
Sites Involved (Italy) |
CINECA - Interuniversity Consortium of Northeast Italy for Automatic Comp - Bologna |
CASPUR - University and Research Consortium for Supercomputing Apps - Rome |
CRS4 - Centre for Advanced Studies, Research and Development - Sardinia |
CNUCE - Institute of the Italian National Research Council - Pisa |
Scuola Normale Superiore - Pisa |
Connection |
Networked by GARR, the Italian research network - 2 Mbps. |
GUI |
All functions are accessible via menus and buttons. |
Geographical view of the system. |
Hosts are grouped following a tree structure (WAN, MAN and LAN). |
WAMM (Wide Area Metacomputer Manager) - WAMM Tree |
WAMM (Wide Area Metacomputer Manager) |
Remote Command Execution |
UNIX commands (e.g. ls, uptime, who, etc.) as well as X11 programs (e.g. xload, xterm, etc.) can be executed on remote hosts. |
WAMM takes care of showing command output (for UNIX ones) and windows (for X11 ones) on the user's display. |
WAMM (Wide Area Metacomputer Manager) |
Remote Compilation |
Compilation of modules on remote nodes is greatly simplified. |
The user selects a group of hosts to compile onto and a set of source files to be compiled. |
WAMM copies sources on remote nodes, compiles them in parallel and shows progress in separate windows, one for each host. |
WAMM (Wide Area Metacomputer Manager) |
Configuration |
The Metacomputer configuration is specified through an external file, written in a simple declarative language. |
The number and grouping of hosts, the remote commands for each node, and the icons can all be specified. |
Graphical aspect (colours, fonts, etc.) can be customised via standard X11 resource files. |
Software Requirements |
PVM version 3.3 or higher |
X11 Release 5 or higher |
Motif version 1.2 or higher |
XPM version 3.4 or higher |
WAMM (Wide Area Metacomputer Manager) |
Supported Platforms |
HP/Sun/IBM/SGI/Digital. |
URL http://miles.cnuce.cnr.it/pp/wamm/ |
Introduction |
WANE is a Wide Area Networked Environment being developed at the Supercomputer Computations Research Institute (SCRI) as part of a three year grant from the US Department of Energy. |
It is designed to provide highly scalable, fault tolerant computing and information encompassment. |
WANE is the integration of several ongoing software projects into one single transparent environment. |
Key technologies exploited by WANE |
- DQS - Job scheduling
- Postgres - database accesses
- Tcl/Tk - X-Window development
- Mosaic/WWW/gopher - internet information services
- FreeNet - information services |
All software developed in this project will be supported as a "best-effort" by SCRI. |
WANE is a comprehensive Internet service package providing all the underlying software necessary to connect a server to the Internet. |
Furthermore, WANE provides the client software for a variety of platforms, allowing users to access all Internet services. |
WANE is a way for users to connect to the Internet quickly and without great expense or problems. |
The Wane Server |
The server software package is a turn key solution targeted at a diverse audience, which includes commercial, governmental, educational, and individual users. |
The system is designed to support thousands of users, is scalable, and fault tolerant. |
The package is modular, with services ranging from a simple mail or WWW server to a full fledged free net service package. |
The server package contains: |
Extensive user access controls. |
Hierarchical administrative levels. |
Multiple user domains. |
Optional support of separate administrative groups, allowing delegation of administrative tasks. |
User friendly GUI. |
Administration. |
A complete distribution of Linux. |
Multi-user operating system on inexpensive PC clones. |
Mail Hub service |
World Wide Web multimedia server software |
Internet connection tools/SW |
Internet Newsgroups |
Comprehensive suite of interactive personal communication tools (Text, Audio, Video) |
Access to vast archives of software and data |
Client Software: |
The client software includes a variety of public domain and shareware packages enabling users to connect to the Internet via modem or LAN. |
The WANE client distribution provides support for a diverse set of platforms, including Macintosh, DOS, Windows, Windows NT, OS/2, and Unix. |
The connectivity tools provided allow full multimedia access over both LAN and phone lines. |
Client Software |
The Client distribution includes: |
- Network connection packages - PPP and SLIP. |
- WWW browsers. |
- Electronic mailer programs. |
Internet News readers |
HTML editor - Hot Metal, XHTML Edit & Phoenix |
Optional menu based environments (graphical and text) for administration and/or novice users |
Typical WANE Server Usage Diagram |
URL - http://www-wane-leon.scri.fsu.edu/ |
WWW/CGI Computing |
WWW is now the most promising candidate for the universal access core component of the NII. |
Current Web is ~15,000 servers and expands at the rate of ~1 new server/hour. |
Software industry starts adding value (Netscape, Netsite, Mosaic licenses, HotMetal, Netforce, Web support in OS/2 Warp and Windows95). |
WWW/CGI Computing |
So far, the Web has mainly been used for static hypermedia such as local information pages, digital libraries, Internet directories etc. However, the WWW model also offers extension mechanisms (CGI, CCI) towards dynamic services and, in fact, arbitrary computation. |
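To make the CGI extension mechanism concrete: a CGI program receives the request through environment variables such as `QUERY_STRING` and writes headers, a blank line and a document body to standard output. The sketch below performs a trivial computation on two query parameters; the parameter names `a` and `b` and the function name are made up for the example.

```python
from urllib.parse import parse_qs

def handle_request(environ):
    """Build a CGI response that adds the query parameters `a` and `b`."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        a = int(params["a"][0])
        b = int(params["b"][0])
        body = f"<html><body>{a} + {b} = {a + b}</body></html>"
        status = "200 OK"
    except (KeyError, ValueError):
        body = "<html><body>usage: ?a=INT&amp;b=INT</body></html>"
        status = "400 Bad Request"
    # A CGI program emits headers, then a blank line, then the document body.
    return f"Status: {status}\r\nContent-Type: text/html\r\n\r\n{body}"

# Under a real Web server the argument would be os.environ.
response = handle_request({"QUERY_STRING": "a=2&b=40"})
print(response)
```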
Early interactive Web services appearing. Examples include: WebCalc (NASA Goddard), Easy HTML (NCSA), WebChat (Internet Society), Virtual Doors (Unique, Inc.), Visioneering's Imaging Machine (VRL, Inc.). |
WWW/CGI Computing |
Key points in Web Technology: Characteristics |
Current main components: HTTP; HTML; CGI; Fillout Form |
Client-server communication model - (Flat hierarchical UNIX) File system as the major file (data) management system |
WWW/CGI Computing |
Key points in Web Technology: Strengths |
Established Internet as the major vehicle in networking industry |
Universal, hyperlinked information access and dissemination |
Transparent network navigation and a GUI with multimedia access for information dissemination. |
WWW/CGI Computing |
Key points in Web Technology: Weaknesses |
Static, browser-oriented client. |
Document update done manually, hard to automate. |
Flat UNIX file system supports only primitive information system functions such as open, read/write and close. |
WWW-Based Project Undertaken at NPAC |
In collaboration with Boston University and Cooperating Systems, NPAC has been developing concepts and prototypes of Compute-Webs. |
Partly motivated by the integration of information processing and computation for both a better programming environment and for a natural support of data intensive computing. |
The Web represents the largest available computer, with some 20 million potential nodes worldwide; this is expected to grow by an order of magnitude as the Superhighway is fully deployed. |
WWW-Based Project Undertaken at NPAC |
First prototype was built on compute-extended Web Servers using the standard CGI mechanism and applied successfully to the factorisation of the RSA 130 digit number which was distributed to a net of Web servers. |
This work was presented at SC'95 and was given the award as the most geographically dispersed and heterogeneous metacomputing solution in the Teraflop Challenge contest. |
WWW-Based Project Undertaken at NPAC |
RSA Factoring Challenge |
Public-key cryptosystem for both encryption and authentication; invented in 1977 by Rivest-Shamir-Adleman (RSA). |
RSA is a cryptosystem where each party has two keys: a public key and a corresponding secret key. |
The public key is made public, the secret key is kept secret. |
WWW-Based Project Undertaken at NPAC |
RSA Factoring Challenge |
In RSA the secret key could be derived from the public key by factoring the modulus that the public key contains. |
Factoring large numbers is believed to be hard - so RSA is believed to be secure. |
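A toy example with tiny primes shows why factoring matters. The numbers below are chosen purely for illustration (real keys use moduli of 155 digits or more), and trial division stands in for a real factoring method such as the Number Field Sieve. The code needs Python 3.8 or later for the modular inverse via `pow(e, -1, m)`.

```python
# Toy RSA key built from two tiny primes, for illustration only.
p, q = 61, 53
n = p * q                       # public modulus
e = 17                          # public exponent
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)             # secret exponent, known only to the key owner

message = 65
cipher = pow(message, e, n)     # encrypt with the public key (n, e)
assert pow(cipher, d, n) == message   # decrypt with the secret key

def factor(n):
    """Trial division: feasible only because n is tiny."""
    f = 2
    while n % f:
        f += 1
    return f, n // f

# An attacker who can factor n recovers the secret exponent from (n, e) alone.
fp, fq = factor(n)
recovered_d = pow(e, -1, (fp - 1) * (fq - 1))
print(d, recovered_d)
```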
WWW-Based Project Undertaken at NPAC |
RSA Factoring Challenge |
People are protecting their data and money using 155-digit (i.e., 512-bit) numbers. |
The WWW-factoring project is the first large scale project that makes use of a new factoring method - the Number Field Sieve (NFS). |
First goal was to factor RSA 130 - the ultimate goal is to break RSA 155. |
WWW-Based Project Undertaken at NPAC |
RSA Factoring Components - FAFNER |
FAFNER is a collection of Perl scripts, HTML pages, and associated documentation which together comprise the server-side of the Web factoring effort. |
The FAFNER software provides interactive registration, task assignment, and solution database services to sieving clients. |
WWW-Based Project Undertaken at NPAC |
GNFS (General Number Field Sieve) |
The GNFS client package implements the sieving algorithm that converts a task specification into a set of useful results (called relations). |
GNFS performs relatively little I/O, is embarrassingly parallel, and has large (but configurable, and constant for an entire run) memory requirements. |
WWW-Based Project Undertaken at NPAC |
GNFSD (General Number Field Sieving Daemon) |
GNFSD is an augmented sieving client that allows a GNFS process to interact with a task server over the net, rather than requiring task specification on the command line. |
Other key features are automatic failure detection and restart via a watchdog timer, persistent configuration state, and a TCP/IP monitor interface at port 5453. |
WWW-Based Project Undertaken at NPAC |
How It All Fits Together |
The FAFNER servers are hierarchical; root server, plus several major subservers - each in turn has subservers, and so forth. |
Each subserver depends on its parent for sieving tasks. |
The sieving clients (GNFS or GNFSD) are the leaves of the FAFNER tree; they get a single task from a FAFNER server, and then spend time computing the problem. |
WWW-Based Project Undertaken at NPAC |
When the answers are ready (in the form of a text file containing a few hundred or a few thousand relations), the clients send them back to their FAFNER server. |
There, they are distilled, archived, and ultimately sent back to Bellcore, where they are integrated into the final solution - the factoring of the RSA number. |
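The flow of tasks down the server tree and results back up can be sketched in a few lines. This is an illustration of the hierarchy only, not FAFNER's actual protocol: the `TaskServer` class, the batch size and the placeholder "relations" are all assumptions, and the real system communicated over HTTP/CGI rather than by method calls.

```python
from collections import deque

class TaskServer:
    """A node in a server tree; subservers fetch batches of tasks from their parent."""
    def __init__(self, parent=None, tasks=(), batch=2):
        self.parent = parent
        self.queue = deque(tasks)
        self.batch = batch
        self.results = []

    def get_task(self):
        if not self.queue and self.parent is not None:
            for _ in range(self.batch):          # refill from the parent server
                task = self.parent.get_task()
                if task is None:
                    break
                self.queue.append(task)
        return self.queue.popleft() if self.queue else None

    def submit(self, task, relations):
        # Results are passed up the tree; only the root archives them.
        if self.parent is not None:
            self.parent.submit(task, relations)
        else:
            self.results.append((task, relations))

def sieve_client(server):
    """Stand-in for a sieving client: the 'relations' are placeholder strings."""
    done = 0
    while (task := server.get_task()) is not None:
        server.submit(task, [f"relation-for-{task}"])
        done += 1
    return done

root = TaskServer(tasks=range(5))
sub = TaskServer(parent=root)
completed = sieve_client(sub)
print(completed, len(root.results))
```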
Successfully computed RSA-130. |
WWW-Based Project Undertaken at NPAC |
The Problem with Fafner |
The major problem with the CGI enhanced Web servers that supported RSA factoring was that they did not provide the standard support which one expects from clustered computing packages, |
such as load balancing, fault tolerance, process management, automatic minimisation of job impact on user workstations, security, and accounting support. |
A Scalable Metacomputer Management Package |
The overall aim of this project is to design, develop and implement a WWW-based Metacomputer management package - MetaWeb. |
Project will build on existing knowledge and experiences with the management of LAN-based computing clusters to produce a software package capable of managing a potentially globally distributed Metacomputer. |
Objective is to increase the through-put of user applications by utilising the wealth of existing networked computing resources efficiently and effectively together. |
Encourage collaboration between groups using MetaWeb. |
Truly heterogeneous, capable of managing resources ranging from PCs to vector/MPP supercomputers. |
Capability will be based on the use of pervasive WWW software and take advantage of Java's architecture-neutral bytecode. |
MetaWeb: designed to be fully fault tolerant. |
Not only will it be able to reboot itself and retain its previous status but also be able to resume or restart failed application jobs. |
This ability is enabled by the fully duplicated design of MetaWeb and also by the use of a persistent database to maintain the Metacomputer's current status |
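The role of a persistent status database can be sketched with a small job table that survives a restart of the manager. SQLite (via Python's standard `sqlite3` module) is used here only as a convenient stand-in; the source does not say which database MetaWeb would use, and the table layout and state names are invented for the example.

```python
import os
import sqlite3
import tempfile

class StatusStore:
    """Job status kept in an SQLite file so that it survives a manager restart."""
    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS jobs (name TEXT PRIMARY KEY, state TEXT)")
        self.db.commit()

    def set_state(self, name, state):
        self.db.execute(
            "INSERT OR REPLACE INTO jobs (name, state) VALUES (?, ?)", (name, state))
        self.db.commit()

    def unfinished(self):
        rows = self.db.execute(
            "SELECT name FROM jobs WHERE state != 'done' ORDER BY name")
        return [name for (name,) in rows]

    def close(self):
        self.db.close()

path = os.path.join(tempfile.mkdtemp(), "status.db")
store = StatusStore(path)
store.set_state("job-a", "done")
store.set_state("job-b", "running")
store.close()                       # simulate the manager going down

restarted = StatusStore(path)       # a fresh manager instance reads the same file
to_resume = restarted.unfinished()
print(to_resume)
```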
The MetaWeb Prototype |
Using existing WWW-based technologies such as Perl CGI-scripts, C-modules and HTTP servers. |
This prototyping phase of the project will allow the MetaWeb design to be proven and adapted as necessary. |
Ensure that the implemented version of the package will be functional, robust and work as it is intended to. |
MetaWeb - The Product. |
MetaWeb will replace the need to use existing research and commercial cluster management packages, such as Codine, LoadLeveler, DQS, etc, by exploiting emerging technologies and the ubiquitous nature of WWW. |
MetaWeb will exhibit all the best features of the existing management packages but will have the advantage of being specifically designed and developed with all the latest and emerging WWW technologies at hand. |
MetaWeb Project Collaboration |
Northeast Parallel Architectures Center |
Cornell Theory Center
|
The Future Trends... |
Long term is hard to predict - See changes over last 5 Years!! |
Can see trends, however... |
Hardware Trends (5 - 10 Years) |
Computers |
Millions (100 - 300) of "settop" boxes |
One in every US household |
More worldwide |
Ranging from Supercomputer to Personal Digital Assistants |
Hardware Trends (5 - 10 Years) |
Networks |
Networks (1 - 20 MBytes/s) - fulfil needs of "home" entertainment industry. |
Technologies ranging from high-bandwidth fibre to electromagnetic types such as microwave. |
Hardware Trends (5 - 10 Years) |
Software |
Very hard to predict in the relatively short term - JAVA has been a product for about a year!!
Ubiquitous and pervasive (WWW/JAVA-like). |
Can forget about underlying h/w and OS. |
Metacomputing "plug-ins" |
Micro-kernel-like JAVA based servers with add-on services that can support Metacomputing (load balancing, migration, checkpointing, etc...). |