Given by Mark Baker, Geoffrey Fox at Tutorial for CRPC MRA Meeting at Cornell on May 7 1996. Foils prepared May 7 1996
Outside Index
Summary of Material
A Brief History of Scientific Computing |
- Performance of Supercomputers and Networks |
Some Terminology |
The Need for Supercomputing - A Rationale for Metacomputing |
- Some Examples |
- Current Access to Resources |
- Need for Alternatives |
- Computer Architectures |
- Why Use Parallel Computing Techniques |
Parallel versus Distributed Computing |
- Pros and Cons |
- Why Use Distributed Computing Techniques |
- Examples of Communications Performance |
The Challenge |
Infrastructure and Technology |
Features of Distributed Systems
|
The Problem... |
The Reality |
Some Relevant Parallel Programming Languages - Legion - TreadMarks - Linda - HPF - MPI - PVM - JAVA |
Outside Index Summary of Material
Mark Baker and Geoffrey Fox |
Northeast Parallel Architectures Center |
Syracuse University |
111 College Place |
Syracuse, NY 13244-4100, USA |
tel: +1 (315) 443 2083 |
fax: +1 (315) 443 1973 |
email: mab@npac.syr.edu |
URL: http://www.npac.syr.edu/ |
Overview |
A Brief History of Scientific Computing |
- Performance of Supercomputers and Networks |
Some Terminology (1 - 5) |
The Need for Supercomputing - A Rationale for Metacomputing |
- Some Examples |
- Current Access to Resources |
- Need for Alternatives |
- Computer Architectures |
- Why Use Parallel Computing Techniques |
Parallel versus Distributed Computing |
- Pros and Cons |
- Why Use Distributed Computing Techniques |
- Examples of Communications Performance |
The Challenge |
Infrastructure and Technology |
Features of Distributed Systems
|
The Problem... |
The Reality |
Software |
Software (Contd)
|
Late 1970's and early 1980's |
Departmental Systems - VAX's come into use - Time Shared Resources - Relatively high initial capital and maintenance costs - Skilled system administrator |
Early 1980's to Mid 1980's |
Unix workstations became available - and affordable - Limited CPU, memory and disk resources - Skilled system administrator |
Mid 1980's to late 1980's |
Networking - more commonly - Ethernet - Distributed file systems and servers |
Late 1980's to early 1990's |
Desktop workstations start to proliferate - CPU/Memory/Disk more affordable - Parallel Languages - P4, PVM, Linda, etc. - Hippi and FDDI networks |
Early 1990's to Mid 1990's
|
Further Advances in the 1990's
|
Bandwidth - The communications capacity (measured in bits per |
second) of a transmission line or of a specific path through the |
network. |
Clustered Computing - An environment consisting of many |
workstations connected together by a local area network and can |
be generalised to a heterogeneous collection of machines with |
arbitrary architecture and interconnect. |
High-Performance Distributed Computing (HPDC) - The use of |
distributed networked computers to achieve high performance on a |
single problem, i.e., the computers are co-ordinated and |
synchronised to achieve a common goal. |
LAN, MAN, WAN - Local, Metropolitan, and Wide Area Networks can |
be made from any or many of the different physical network media, |
and run the different protocols. LAN's are typically confined to |
departments (less than a kilometre), MAN's to distances of order 10 |
kilometres, and WAN's can extend worldwide. |
Metacomputer - This term describes a collection of heterogeneous |
computers networked by a high-speed wide area network. Such an |
environment would recognise the strengths of each machine in the |
Metacomputer, and use it accordingly to efficiently solve so-called |
Metaproblems. The World Wide Web has the potential to be a |
physical realisation of a Metacomputer. |
Metaproblem - Term describing a class of problem which is outside the |
scope of a single computer architecture, but is instead best run on a |
Metacomputer. These problems consist of many constituent subproblems. |
Ex. the design and manufacture of a modern aircraft, which presents |
problems in geometry and grid generation, fluid flow, acoustics, structural |
analysis, operational research, visualisation, and database management. |
The Metacomputer for such a Metaproblem would be networked |
workstations, array processors, vector supercomputers, MPP, and |
visualisation engines. |
Protocol - A set of conventions and implementation methodologies defining |
the communication between nodes on a network. Ex. seven layer OSI |
standard model going from physical link (optical fiber to satellite) to |
application layer (such as Fortran subroutine calls). |
Supercomputer - Most powerful computer that is available at any given |
time. As performance is roughly proportional to cost, this is not very |
well defined for a scalable parallel computer. Traditionally, computers |
costing some $10 - $30 M are termed supercomputers. |
Symmetric Multiprocessor (SMP) - A Symmetric Multiprocessor |
supports a shared memory programming model -- typically with a UMA |
memory system, and a collection of up to 32 nodes connected with a |
bus. |
Transmission Control Protocol (TCP) - A connection-oriented |
transport protocol used in the DARPA Internet. TCP provides for the |
reliable transfer of data, as well as the out-of-band indication of urgent |
data. |
WWW Clients and Servers - A distributed set of clients (requesters |
and receivers of services) and servers (receiving and satisfying |
requests from clients) using Web Technologies. |
World Wide Web and associated Technologies - A very important |
software model for accessing information on the Internet based on |
hyperlinks supported by Web technologies, such as HTTP, HTML, |
MIME, Java, Applets, and VRML. |
Scientists - Verify or disprove things and get their work done faster... |
Engineers - Simulate things (Car, drugs, aeroplanes...) in ever greater detail before producing expensive prototypes or production systems |
Retailers - Examine sales records to tailor their marketing and production needs to the most lucrative market. |
Airlines - Optimise the carriage of passengers on their aircraft - fill flights, understand route-trends, mine data for valuable information. |
Financiers - Make vast profits by capitalising on the intelligent probabilistic prediction of stock market shares faster than their competitors. |
Consumers - Access to movies, downloaded to the home. |
Current Access to Supercomputing Resources |
1. Community of users for FREE Supercomputing far exceeds the |
available resources. |
2. Computational needs of these users far exceed the |
Supercomputing resources available now or in the near future. |
1. Vast numbers of under-utilised workstations available for use. |
2. Huge numbers of unused processor cycles and resources that |
could be put to good use in a wide variety of application areas. |
3. Reluctance to buy Supercomputers due to their cost and short life |
span. |
4. Distributed compute resources "fit" better into today's funding |
model. |
Architectures of Supercomputers |
Single processor - past/traditional |
Vector |
Multiple processor Shared Memory |
Multiple processors |
Combinations - modern/innovative |
Main Architectures are Either Shared or Distributed Memory |
Vary greatly in complexity: |
small machine with a handful of processors on one circuit board. |
larger machines might have many thousands of complex processors connected by a number of specialised communications networks contained in many cabinets. |
Basically, any parallel computer consists of three main elements: |
Processors |
Memories |
Interconnection network to enable communication between these elements. |
Vary greatly in complexity: |
A typical way to classify these architectures is with Flynn's taxonomy, which labels architectures according to instruction stream and data stream. |
For example, an idealised serial computer would be labeled Single instruction Single Data (SISD) as it executes one instruction at a time on a single piece of data. |
The SIMD architecture consists of: |
Many (typically simple) processors, with some local memory. |
Executing the same instruction in lockstep, on a small piece of data in its local memory, with the instructions issued by the controller processor. |
Such architectures are good for applying algorithms which require the same operation on a large array of elements, however, they suffer badly if the problem results in load imbalances as the processors synchronise after every step. This is not usually a problem for data parallel programs. |
Examples - AMT DAP - Maspar |
MIMD architectures consist of a number of powerful processors which can each execute individual instruction streams. The usual subdivision of this class is by the relationship between the processor and memory. |
1. MIMD - Shared Memory |
2. MIMD - Distributed Memory |
3. MIMD - Virtual Shared Memory |
All processors associated with the same shared memory structure access the exact same storage location |
Synchronisation is achieved by controlling tasks' reading from and writing to the shared memory. |
A shared memory location must not be changed by one task while another, concurrent task is accessing it. |
Data sharing among tasks is fast (speed of memory access). |
An attractive feature of shared memory is that the time to communicate among the tasks is effectively a single fixed value: the time it takes a single task to read a single location. |
Disadvantage: scalability is limited by number of access pathways to memory. |
If there are more tasks than connections to memory, you have contention for access to the desired locations, and this amounts to increased latencies while all tasks obtain the required values. So the degree to which you can effectively scale a shared memory system is limited by the characteristics of the communication network coupling the processors to the memory units. |
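As an illustration of this synchronisation requirement (a minimal sketch, not from the original foils, assuming POSIX threads on a single shared-memory machine), a mutex ensures that no task changes a shared location while another concurrent task is accessing it: |
#include <stdio.h> |
#include <pthread.h> |
#define NTASKS 4 |
static long counter = 0;                         /* the shared memory location          */ |
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; |
/* Each task increments the shared counter; the mutex guarantees that no task        */ |
/* changes the location while another concurrent task is accessing it.               */ |
static void *task(void *arg) |
{ |
    int i; |
    for (i = 0; i < 100000; i++) { |
        pthread_mutex_lock(&lock); |
        counter++;                               /* critical section                    */ |
        pthread_mutex_unlock(&lock); |
    } |
    return NULL; |
} |
int main(void) |
{ |
    pthread_t t[NTASKS]; |
    int i; |
    for (i = 0; i < NTASKS; i++) |
        pthread_create(&t[i], NULL, task, NULL); |
    for (i = 0; i < NTASKS; i++) |
        pthread_join(t[i], NULL); |
    printf("counter = %ld\n", counter);          /* expect NTASKS * 100000              */ |
    return 0; |
} |
(Compiled with a thread library, e.g. cc sum.c -lpthread; without the mutex the updates could interleave and the final count would be unpredictable.) |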
Examples - SM can be either via hardware - SMP's or emulated through software. |
Software - HPF (various sources), Craft (Cray), Legion (Virginia), TreadMarks (Rice). |
Hardware - Cray -C90, SGI - power series, Convex - Exemplar. |
Memory is physically distributed among processors; each local memory is directly accessible only by its processor. |
Similar to buying an ordinary workstation, each component of a distributed memory parallel system is, in most cases, a self-contained environment, capable of acting independently of all other processors in the system. |
To achieve the true benefits of this system, there must be a way for all of the processors to act in concert, which means "control" |
Synchronisation is achieved by moving data between processors (communication). |
The only link among these distributed processors is the traffic along the communications network that couples them; therefore, any "control" must take the form of data moving along that network to the processors. |
A Major Concern |
A major concern is data decomposition -- how to divide arrays among local CPUs to minimise communication |
Here is a major distinction between shared- and distributed-memory: |
In SM the processors don't need to worry about communicating with their peers, only with the central memory, while in the DM there really isn't anything but the processors. |
A single large regular data structure, such as an array, can be left intact within SM, and each co-operating processor is simply told which ranges of indices it is to deal with; |
In DM, once the decision as to index-ranges has been made, the data structure has to be decomposed, i.e., the data within a given set of ranges assigned to a particular processor must be physically sent to that processor in order for the processing to be done, and then any results must be sent back to whichever processor has responsibility for co-ordinating the final result. |
And, to make matters even more interesting, it's very common in these types of cases for the boundary values, the values along each "outer" side of each section, to be relevant to the processor which shares that boundary. |
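A minimal sketch (not from the original foils) of this index-range decision for a one-dimensional block decomposition; the function name and the printed ranges are illustrative only, and the comments note where the boundary ("halo") values would have to be exchanged: |
#include <stdio.h> |
/* Compute the index range [lo, hi) owned by processor `rank` when an                */ |
/* N-element array is block-decomposed over `nprocs` processors.                     */ |
static void block_range(int N, int nprocs, int rank, int *lo, int *hi) |
{ |
    int chunk = N / nprocs; |
    int rem   = N % nprocs;            /* the first `rem` ranks get one extra element */ |
    *lo = rank * chunk + (rank < rem ? rank : rem); |
    *hi = *lo + chunk + (rank < rem ? 1 : 0); |
} |
int main(void) |
{ |
    int N = 10, nprocs = 4, rank, lo, hi; |
    for (rank = 0; rank < nprocs; rank++) { |
        block_range(N, nprocs, rank, &lo, &hi); |
        /* In a distributed-memory code, elements lo-1 and hi (the boundary or       */ |
        /* "halo" values) live on neighbouring processors and must be communicated   */ |
        /* before each step.                                                          */ |
        printf("rank %d owns [%d, %d)\n", rank, lo, hi); |
    } |
    return 0; |
} |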
Distributed memory is, for all intents and purposes, virtually synonymous with message-passing, although the actual characteristics of the particular communication schemes used by different systems may hide that fact. |
Message-passing approach: |
Tasks communicate by sending data packets to each other. |
Messages are discrete units of information and can be distinguished from all other messages (theory). |
Parallel tasks use these messages to send information, and requests for the same, to their peers. |
Message-passing approach: |
The overhead is proportional to size and number of packets (more communication means greater costs; sending data is slower than accessing shared memory.). |
Message-passing is not cheap: every one of those messages has to be individually constructed, addressed, sent, delivered, and read, all before the information it contains can be acted upon. |
In the general case, message-passing will take more time and effort than shared-memory. |
Message-passing approach: |
Shared memory scales less well than message passing, and, once past its maximum effective bandwidth utilisation, the latency associated with message-passing may actually be lower than that encountered on an over-extended shared memory communications network. |
There are ways to decrease the overhead associated with message-passing, the most significant being to somehow arrange to do as much valuable computation as possible while communication is occurring. |
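One minimal sketch of this overlap, written with the non-blocking calls of MPI (a library covered later in these foils); the ring exchange and the dummy work loop are purely illustrative assumptions: |
#include <stdio.h> |
#include "mpi.h" |
int main(int argc, char *argv[]) |
{ |
    int rank, size, left, right, i; |
    double mine, from_left, local = 0.0; |
    MPI_Request reqs[2]; |
    MPI_Status  stats[2]; |
    MPI_Init(&argc, &argv); |
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); |
    MPI_Comm_size(MPI_COMM_WORLD, &size); |
    left  = (rank - 1 + size) % size;            /* neighbours in a ring                */ |
    right = (rank + 1) % size; |
    mine  = (double) rank; |
    /* Post the communication first ...                                                 */ |
    MPI_Irecv(&from_left, 1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]); |
    MPI_Isend(&mine,      1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]); |
    /* ... do useful local computation while the messages are in flight ...             */ |
    for (i = 0; i < 1000000; i++) |
        local += 1.0e-6; |
    /* ... and wait only when the remote data is actually needed.                       */ |
    MPI_Waitall(2, reqs, stats); |
    printf("rank %d received %g from rank %d (local work = %g)\n", rank, from_left, left, local); |
    MPI_Finalize(); |
    return 0; |
} |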
hypercube machines (e.g., nCube) |
Hypercube architectures typically utilise off-the-shelf processors coupled via proprietary networks (both hardware and message-passing software); |
IBM SPx |
Utilising standard RS6000-type processors coupled via a very high speed, high bandwidth switch networking architecture, the nodes within an SPx system use a special very low-level message-passing library for the lowest-level (i.e., most efficient, fastest) mechanism for communicating with one another. |
An important class of these hybrids is the so-called virtual shared memory architecture. |
Each processor has its own local memory (like a distributed machine), however, direct remote memory access (like a shared memory machine) is possible via a global address space. |
Relies on special hardware or software to deal with the communication, while the processors continue computing. |
Such a system benefits from fast communications and a scalable architecture. |
Why use parallel computing techniques ? |
Limitations of traditional Supercomputers
|
Existing and emerging network technologies are making the practical use of distributed computing resources on a range of user applications a reality. |
Parallel/Distributed computing is the natural |
evolution of the techniques and methods that will |
fulfil our computational needs in the future. |
Parallel Computing |
Communication must be high bandwidth and low latency. |
Low flexibility (and overhead) in messages (point-to-point). |
Distributed Computing |
Communication can be high or low bandwidth. |
Latency typically high ---> can be very flexible messages involving fault tolerance, sophisticated routing, etc. |
Parallel Computing should, on technical grounds, define standards as it is more restrictive (demanding)? However, it is the smallest field. |
Distributed Computing |
World Wide Web revolutionising both parallel and distributed approaches and standards |
Why use Distributed Computing Techniques ? |
Expense of buying, maintaining and using traditional MPP systems. |
Rapid increase in commodity processor performance. |
Commodity networking technology (ATM/FCS/SCI) of greater than 200 Mbps at present with expected Gbps performance in the very near future. |
The pervasive nature of workstations in academia and industry. |
Price/Performance of using existing hardware/software |
Comms1 - From ParkBench Suite |
To fully utilise a heterogeneous computing environment where different types of processing elements and inter-connection technologies are effectively and efficiently used. |
The use of distributed resources in this framework is known as Metacomputing and such an environment has the potential to maximise performance and cost effectiveness of a wide range of scientific and distributed applications. |
Features of Distributed Systems |
The Pros and Cons of distributed systems - examine the following |
areas: |
- Performance - Distributed File System - User Name Space - Service Availability - Scheduling Capability - System Management - Security. |
Decreased turnaround time |
- Time-critical applications... - Size of workload increased - More effective use of users' time o Short turnaround - rapid iteration/refinement o Visualisation and data analysis |
Solving Larger problems than would fit on a single system |
- Easing of memory restrictions - out-of-core calculations... - Improved cache utilisation |
Utilise Pooled Resources more effectively |
- "fat" and "thin" systems - MPP/Vector/... |
Better fault tolerance |
- Replication of data - Replication of service |
Ideally a single environment-wide namespace is needed for all files |
- Provides high levels of security - Reduces the replication of data and programs by users - Simplifies needs for tape-type media - Simplifies backup and archive - Identical access paths from any system - Reduces data-movement by user - Simplifies distributed applications - Common environment customised for individual systems |
Single version of password entry across all systems
|
Reference Password File
|
Crossing Administrative Boundaries
|
High Availability
|
System independent resource naming access mechanisms
|
Name service to maintain resource information
|
Single job acceptance system for all resources (CMS)
|
Single Environment for execution
|
Replicated Scheduler (CMS)
|
Interface to Security
|
Authentication
|
Access Control
|
Auditing
|
Accounting
|
High Initial and Maintenance Costs
|
Applications Development
|
General
|
The net result is that there is not a universally used or mandated environment with which to implement, manage and run a Metacomputer at this time. |
At present there are only relatively small integrated local systems. These are customised for local circumstances and conditions. |
To create the infrastructure for a Metacomputer you would ideally like to call up your local computer vendor and buy a package that does everything that you want! |
The reality is that you will need to: |
- Buy/acquire the hardware. |
- Buy/acquire/develop or create the necessary software. |
- Integrate the hardware and software into a coherent system. |
- Edit/debug/tune/optimise your application before you can run it. |
- Support, maintain and "port" as time goes by. |
Sequential - Traditional codes - "dusty deck" FORTRAN, COBOL or C++ etc., typically "single threaded". |
Data Parallel - Synchronous: Tightly coupled, software needs to exploit features of problem structure to get good performance. Comparatively easy as different data elements are essentially identical. |
Data Parallel - Loosely Synchronous: As above but data elements are not identical. Still parallelises due to macroscopic time synchronisation. |
Asynchronous - Functional (or data) parallelism that is irregular in space and time. Often loosely coupled and so need not worry about optimal decompositions to minimise communication. Hard to parallelise (massively) unless .... |
Embarrassingly parallel - Essentially independent execution of disconnected components. |
Metaproblems - Asynchronous collection of (loosely) synchronous components where these programs themselves can be parallelised (coarse grain task parallelism - each data component data parallel). |
Simple, but general and extensible to many more nodes is domain decomposition |
Most successful concurrent machines with many nodes have obtained good performance from "Data Parallelism" or "Domain Decomposition" |
Problem is an algorithm applied to data set
|
The three architectures considered here differ as follows:
|
Metaproblems are of growing importance in general HPCC community |
One important Example is: |
Manufacturing and Design including Multi-disciplinary Optimisation which combines many fields to look at total product design, manufacturability etc.
|
Example of an application that has been designed to use the NII from the outset. |
ASOP consortium is made up of:
|
Aims to examine the problems of inter-disciplinary collaborative design and manufacture. |
Motivation is the complexity of designing modern aircraft |
- Hundreds of contractors- Thousands of engineers- Use of thousands of computer programs |
Web technology is proposed to bind the management of large projects |
- Provide uniform interface to all computer platforms- Allow the management of remote executions |
For example: |
- A database engine on an enterprise server |
- A CFD calculation on an MPP |
- CAD/CAM visualisation |
Problem - Machine |
Synchronous - SIMD, MIMD |
Loosely Synchronous - MIMD |
Asynchronous - Unclear |
Metaproblems - Heterogeneous network |
Embarrassingly Parallel - NOW, SIMD, MIMD |
Information Production |
PVM, MPI, Linda, HPF, TreadMarks, Legion, JAVA |
Parallel / Distributed Computing Runtime Tools - debuggers, profilers, and performance visualisation |
Image processing software - AVS and Extensions |
Parallel Operating Systems
|
Information Analysis, Access and Integration |
Parallel (Relational) Database e.g. Oracle 7.0 |
Object database |
High Speed Networks |
Multilevel Mass Storage |
Integration Software ("glue") |
Integration of Parallel and Distributed Computing |
Compression |
Parallel Rendering |
Linkage Analysis (between records of database) |
Sorting (large databases), etc... |
ATM networks have rapidly transitioned from research Gigabit networks to commercial deployment
|
Computer Hardware trends imply that all computers (PC's -> supercomputers) will be parallel by the year 2000 |
- Up to 1993, parallel computers came from small start-up companies (except the Intel Supercomputer Division) |
-- Now Cray, Convex (HP), Digital and IBM have massively parallel computing systems and Silicon |
Graphics is becoming a powerful force among high performance computing vendors
|
Software is the challenge and could prevent/delay the hardware trend that suggests parallelism will be a mainline computer architecture
|
Ideally, one should be able to use any Programming Paradigm. |
The Programming Spectrum:
|
Legion (University of Virginia) |
TreadMarks (Rice University - Tx) |
Network Linda (Yale University) |
HPF |
MPI |
PVM (ORNL/UTK) |
Java (Sun) |
The Legion project is an attempt to design and build system services that provide the illusion of a single virtual machine. |
Legion targets wide-area assemblies of workstations and supercomputers |
It aims to provide - |
- Shared-object and shared-name spaces, - Application adjustable fault-tolerance, - Improved response time, - Greater throughput, - Wide-area network support, - Management and exploitation of heterogeneity, - Protection, - Security, - Scheduling, - Resource management, - Parallel processing- Object inter-operability. |
Legion is an object-oriented system - designed around C++ |
The principles of the object-oriented paradigm are the foundation for the construction of Legion; the following features are exploited: |
Underlying theme: one cannot design a system that will satisfy every |
user's needs, so one must design an extensible system. |
For example: |
Security - Trade-off between security and performance (due to the cost of authentication, encryption, etc.). Rather than providing a fixed level of security, users may choose their own trade-offs by implementing their own policies or by using existing policies via inheritance. |
Fault-Tolerance - Users select the level of fault-tolerance needed, and pay for only what they use. By allowing the user to implement their own services or to inherit them from library classes, flexibility is provided, complemented by a menu of existing choices. |
Campus Wide Virtual Computer (CWVC) |
The Campus Wide Virtual Computer is a heterogeneous distributed computing system built on top of Mentat, an object-oriented parallel processing system. |
The CWVC is used to demonstrate the benefits of a Legion-like system, providing departments across the UVa campus with an interface to high performance distributed computing. |
The CWVC allows researchers at the University to share resources and to develop applications that will be usable in a true Legion setting. |
Organisations |
The CWVC is currently used by, and based on resources owned by, |
the following organisations: |
- The Department of Computer Science, UVa |
- Information Technology and Communication, UVa |
- The School of Medicine, UVa |
- The Department of Electrical Engineering, UVa |
- The Department of Engineering Physics, UVa |
- NASA Langley Research Center, Hampton, Va |
Hardware Resources |
CWVC contains more than 100 workstations of varying types, including: |
- IBM RS6000 workstations, and a small 8 node SP-2- Sun Microsystems SparcStations, including several small multiprocessor workstations- Hewlett Packard PA-RISC workstations- Silicon Graphics Indy, Indigo, and Onyx workstations, including a two processor Onyx |
Mixture of network technologies, including ATM, FDDI, and Ethernet. |
The CWVC is an integrated interface to heterogeneous campus-wide distributed computing. It provides a set of tools that facilitate application development, debugging, and resource management. |
These tools include: |
Parallel C++ : The Mentat Programming Language |
Mentat is an object-oriented parallel processing system designed to directly address the difficulty of developing architecture- independent parallel programs. |
The fundamental objectives of Mentat are to |
(1) Provide easy-to-use parallelism, |
(2) Achieve high performance via parallel execution |
(3) Facilitate the execution of applications across a wide range of platforms. |
The Mentat approach exploits the object-oriented paradigm to provide high-level abstractions that mask the complex aspects of parallel programming, including communication, synchronisation, and scheduling, from the programmer. Instead of managing these details, the programmer concentrates on the application. |
The programmer uses application domain knowledge to specify those object classes that are of sufficient computational complexity to warrant parallel execution. |
Federated File System |
Objects that execute on hosts in the CWVC are presented with a single unified file system abstraction called the "Federated File System" (FFS). |
The FFS is necessary because the hosts made available by participating departments have their own local file systems, NFS mount structures, etc., so files visible on one host may not be available on another. |
The FFS allows objects to view a single, unified file name space, and thus execute in a location independent manner. |
In the CWVC FFS, files are simply objects which persist. |
Library routines to access FFS files are provided for C, C++, |
the Mentat Programming Language (MPL) and Fortran; these interfaces are similar to the Unix C standard library file system interface; therefore, utilisation of the CWVC-FFS typically requires little change to existing code. |
Thermostat |
Thermostat provides a GUI to manage resources. |
It allows a resource owner to schedule the times of day and the days of the week that hosts will be available for CWVC use. |
It also allows the resource owner to specify the percent of the CPU time and memory that may be used by the CWVC during individual available time slots. |
Resource Accounting Services |
Resource utilisation is logged on a per-user basis, while resource availability (controlled via the Thermostat) is logged on a per machine basis. |
Usage and machine availability "credits" are scaled based on the computing power of the hosts involved - a minute of time on an SP-2 node is clearly worth more than a minute on a Sun IPC. A report generation tool is provided to extract and summarise usage statistics. |
MAD |
MAD is a set of tools that enables programmers to debug their Mentat/CWVC applications using the debugger of their choice. |
MAD supports post-mortem debugging: CWVC programs run to completion, or until an error occurs, and are only examined after the fact. |
MAD is based on the record and replay technique, and consists of two phases. |
In the recording phase, objects of a CWVC application log their incoming messages to a file. |
In the playback phase, objects can reproduce their original execution by extracting the appropriate messages from the log file. |
The programmer can then replay a specific object, i.e. reproduce its execution, under the control of a sequential debugger, and can use the traditional cyclic debugging technique. |
Data Parallel Computation Scheduler |
Prophet is an automatic run-time scheduling system for SPMD computations designed for heterogeneous workstation networks. |
Prophet solves three problems associated with scheduling data parallel computations in this environment:
|
Data domain decomposition decomposes the data domain to provide processor load balance. |
Placement assigns tasks to processors to limit communication overhead. All of these factors contribute to reduced completion time. |
Prophet has been integrated into the Mentat-Legion system. |
Scientific Applications |
Genome Library Comparison |
Atmospheric Simulation |
Automatic Test Pattern Generation for Integrated Circuits |
The Quest group in the University of Virginia department of Electrical Engineering has developed a parallel Automatic Test Pattern Generation (ATPG) application. |
URL http://www.cs.virginia.edu/~legion |
TreadMarks is a software Distributed Shared Memory system built at Rice University. |
It is an efficient user-level DSM system that runs on commonly available Unix systems. |
URL http://www.cs.rice.edu/~willy/TreadMarks/Overview.html |
TreadMarks provides primitives similar to those used in hardware shared memory machines. |
Application processes synchronise via two primitives: barriers and mutex locks. |
Barriers |
The routine Tmk_barrier(i) stalls the calling process until all processes in the system have arrived at the same barrier. |
Locks |
Locks are used to control access to critical sections. |
The routine Tmk_lock_acquire(i) acquires a lock for the calling processor, and the routine Tmk_lock_release(i) releases it. |
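A minimal sketch (not from the TreadMarks documentation) of how these primitives might be combined: each process adds its id into a shared counter inside a lock-protected critical section; the shared-allocation calls Tmk_malloc and Tmk_distribute are assumptions not described in these foils. |
#include <stdio.h> |
#include "Tmk.h" |
int *counter;                           /* pointer into shared (DSM) memory             */ |
int main(int argc, char **argv) |
{ |
    Tmk_startup(argc, argv); |
    if (Tmk_proc_id == 0) { |
        counter = (int *) Tmk_malloc(sizeof(int));           /* assumed shared allocator */ |
        *counter = 0; |
        Tmk_distribute((char *) &counter, sizeof(counter));  /* assumed pointer broadcast */ |
    } |
    Tmk_barrier(0);                     /* wait until the shared counter exists          */ |
    Tmk_lock_acquire(0);                /* critical section: one process at a time       */ |
    *counter += Tmk_proc_id; |
    Tmk_lock_release(0); |
    Tmk_barrier(0); |
    if (Tmk_proc_id == 0) |
        printf("sum of process ids = %d\n", *counter); |
    Tmk_exit(0); |
    return 0; |
} |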
TreadMarks uses a lazy invalidate version of release consistency and a multiple-writer protocol to reduce the amount of communication involved in implementing the SM abstraction. |
The virtual memory hardware is used to detect accesses to shared memory. |
#include <stdio.h> |
#include <unistd.h> |
#include <errno.h> |
#include <sys/signal.h> |
#include "Tmk.h" |
void main(int argc, char **argv) |
{ |
int c ; |
while ((c = getopt(argc, argv, "")) != -1) ; |
Tmk_startup(argc, argv); |
printf("hello from process %d\n", Tmk_proc_id) ; |
/* All processes wait for keyboard input on local host. */ |
if(Tmk_proc_id == 0) |
do { |
c = getchar() ; |
} while (c < 0 && errno == EINTR); |
Tmk_barrier(0); |
Tmk_exit(0); |
} |
It is easier to program using TreadMarks than using MP. |
Although there is little difference in programmability for simple programs, for those with complicated communication patterns, such as 3-D FFT, it takes more effort to figure out what to send and whom to send it to. |
Results show that, with the use of release consistency and the multiple-writer protocol, TreadMarks performs comparably with MP on a variety of problems. |
The separation of synchronisation and data transfer, and the request-response nature of data communication in TreadMarks, are responsible for lower performance for all the TreadMarks programs. |
In MP, data communication and synchronisation are integrated together. The send and receive operations not only exchange data, but also regulate the progress of the processors. |
In TreadMarks, synchronisation is through locks/barriers, which do not communicate data. Moreover, data movement is triggered by expensive page faults, and a diff request is sent out in order to get the modifications. |
In addition, MP benefits from the ability to aggregate scattered data in a single message, an access pattern that would result in several miss messages in the invalidate-based TreadMarks protocol. |
Although the multiple-writer protocol addresses the problem of simultaneous writes to the same page, false sharing still affects the performance of TreadMarks. |
Overview |
- What is Linda? - The Linda Model - Master/Worker Model using Virtual Shared Memory |
Linda Basics |
- Definitions - Operations - Templates - Template Matching Rules |
URL http://www.cs.yale.edu/HTML/YALE/CS/Linda/linda.html |
What is Linda? |
Parallel programming language based on C (C-Linda) and FORTRAN (Fortran-Linda) |
Combines co-ordination language of Linda with programming languages of C and FORTRAN |
Enables users to create parallel programs that perform on wide range of computing platforms |
Ease of use |
Based on logically global, associative object memory called tuple space. |
Tuple space provides inter-process communication and synchronisation logically independent of the underlying computer or network |
Implements parallelism with a small number of simple operations on tuple space to create and co-ordinate parallel processes |
Commercially available - Scientific Computing Associates Inc. |
Virtual Shared Memory |
Different parts of the data can reside on different processors. |
Looks like one single global memory space to component processes. |
Linda's Virtual Shared Memory is known as tuple space |
Can be used to implement many different types of algorithms |
Lends itself well to master/worker distributed data structure algorithms |
Tasks and workers are independent of each other |
Master divides work into discrete tasks and puts into global space |
Workers repeatedly retrieve tasks and put results back into global space |
Workers notified of work completion by having met some condition, receiving an end-signal or terminated by some other means |
Master gathers results from global space |
Possible ways that tasks can be distributed:
|
Tuple Space |
Linda's name for its shared data space. Tuple space contains tuples. |
Tuples |
The fundamental data structure of tuple space. |
Tuples are represented by a list of up to 16 fields, separated by commas and enclosed in parentheses. |
Examples: - ('arraydata', dim1, 13, 2) - (var1, var2) |
Associative Memory Model |
A tuple is accessed by specifying its contents. |
From the programmer's point of view, there is no address associated with a tuple. |
Operations - There are four basic operations |
Tuple Generation |
out |
- Generates a data (passive) tuple - Each field is evaluated and put into tuple space - Control is then returned to the invoking program |
eval |
- Generates a process (active) tuple - Control is immediately returned to the invoking program - Logically, each field is evaluated concurrently by a separate process and then placed into tuple space |
Tuple Extraction |
in |
- Uses a template to retrieve tuple from tuple space. - Once retrieved, it is taken out of tuple space and no longer available for other retrievals. - If no matching tuple is found process will block.- Provides for synchronisation between processes. |
rd |
- Uses a template to copy data without removing it from tuple space. - Once read it is still available for others. - If no matching tuple is found process will block. |
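For example (an illustrative sketch, not from the original foils), a tuple deposited with out can later be retrieved purely by matching its contents; fields prefixed with ? are formals that receive the matched values: |
void real_main() |
{ |
    int dim1 = 64, d, n; |
    out("arraydata", dim1, 13, 2);   /* deposit a four-field tuple in tuple space     */ |
    in("arraydata", ?d, 13, ?n);     /* match on the string, the literal 13 and the   */ |
                                     /* field count; d receives 64 and n receives 2   */ |
} |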
Name of program has an .fl or .cl extension |
Top level routine must be named real_main but has the same argc and argv parameters as the C main does - C |
Top level routine must be a parameter-less subroutine named real_main - Fortran |
Bulk of program is pure FORTRAN/C |
Linda operations, eval, in, out |
Program is asynchronous - no guarantee that any particular process will execute before any other |
#define NNODES 4 |
#include <stdio.h> |
int node(int taskId) { |
printf("Hello from task %d.\n", taskId) ; |
return 0 ; |
} |
void real_main() |
{ |
int i ; |
for (i = 0 ; i < NNODES ; i++) |
eval("worker", node(i)) ; |
} |
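Building on the example above, the following minimal sketch (illustrative only, not from the original foils) shows the master/worker model described earlier: real_main deposits task tuples, workers repeatedly withdraw them with in and return results with out, and a -1 task acts as the end-signal. |
#include <stdio.h> |
#define NWORKERS 4 |
#define NTASKS   16 |
int worker() |
{ |
    int id, result; |
    while (1) { |
        in("task", ?id);                /* block until a task tuple is available       */ |
        if (id < 0) break;              /* -1 is the end-signal: no more work          */ |
        result = id * id;               /* stand-in for the real computation           */ |
        out("result", id, result);      /* return the answer to tuple space            */ |
    } |
    return 0; |
} |
void real_main() |
{ |
    int i, id, result, sum = 0; |
    for (i = 0; i < NWORKERS; i++)      /* create the worker processes                 */ |
        eval("worker", worker()); |
    for (i = 0; i < NTASKS; i++)        /* deposit the tasks into tuple space          */ |
        out("task", i); |
    for (i = 0; i < NTASKS; i++) {      /* gather the results in any order             */ |
        in("result", ?id, ?result); |
        sum += result; |
    } |
    for (i = 0; i < NWORKERS; i++)      /* tell each worker to stop                    */ |
        out("task", -1); |
    printf("sum of squares = %d\n", sum); |
} |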
The discussions on an HPF standard were started at SC'91 where DEC had organised a discussion meeting for interested parties. |
This meeting led to the formation of the HPF Forum (HPFF). |
The first meeting of the HPFF was held in Houston, Texas, in Jan `92 and was attended by 130 people. |
Detailed work began in March when a working group of 40 people started fleshing out a standard with an intention to finish it by Jan `93. A further 8 meetings were held during `92 leading to the publication of the HPF Specification v1.0 in May `93. |
The working group - members from industry, universities and (US) Labs |
HPFF goals were to define a language that offers: |
Data parallel programming (Single threaded, global name space, loosely synchronous parallel computation). |
Top performance with SIMD and MIMD machines with non-uniform memory access costs. |
Code tuning for various architectures. |
These goals were addressed in the form of new intrinsics and first class language constructs. |
Subsidiary goals were also established, including: |
- Portability of existing code - Efficient portability of new code - Maintained compatibility with existing standards (particularly Fortran 90), - Simplicity and ease of implementation - Open interface to other languages or programming styles. - Availability of compilers in the near future |
HPF is based upon the Fortran 90 programming language - Fortran 90 includes F77 as a subset and provides many new features including: |
- Array operations - Improved facilities for numeric computation - User defined data types - New control constructs - Facilities for modular data and procedure definition - New source form - Optional procedure arguments and recursion - Dynamic storage allocation - Pointers |
HPF adds the following features to those in Fortran 90. |
Support for controlling the alignment and distribution of data on a parallel machine. |
New data parallel constructs. |
An extended set of intrinsic functions and standard library providing much useful functionality at a high level of abstraction. |
EXTRINSIC procedures which standardise the interface with the other languages or styles |
Directives to address some sequence and storage association issues. |
In principle, a sequential algorithm is portable to any architecture supporting the sequential paradigm. |
However, programmers require more than this: they want their realisation of the algorithm in the form of a particular program to be portable - source-code portability. |
The same is true for message-passing programs and forms the motivation behind MPI. |
URL http://www.mcs.anl.gov/MPI |
MPI provides source-code portability of message-passing programs written in C or Fortran across a variety of architectures. |
Just as for the sequential case, this has many benefits, including protecting investment in a program, allowing development of the code on one architecture (e.g. a network of workstations) before running it on the target machine (e.g. fast specialist parallel hardware) |
Although the basic concept of processes communicating by sending messages to one another has been understood for a number of years, it is only relatively recently that message-passing systems have been developed which allow source-code portability. |
MPI was the first effort to produce a message-passing interface standard across the whole parallel processing community. |
Sixty people representing forty different organisations -- users and vendors of parallel systems from both the US and Europe -- collectively formed the "MPI Forum". |
The discussion was open to the whole community and was led by a working group with in-depth experience of the use and design of message-passing systems (including PVM, PARMACS, and P4). |
The two-year process of proposals, meetings and review resulted in a document specifying a standard Message Passing Interface (MPI). |
To provide source-code portability |
To allow efficient implementation across a range of architectures |
It also offers: |
A great deal of functionality |
Support for heterogeneous parallel architectures |
Deliberately outside the scope of MPI is any explicit support for:
|
#include <stdio.h> |
#include "mpi.h" |
int main(int argc, char* argv []) { |
int rank ; |
MPI_Init(&argc, &argv) ; |
MPI_Comm_rank(MPI_COMM_WORLD, &rank) ; |
printf("hello from processor %d\n", rank) ; |
MPI_Finalize() ; |
} |
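The example above starts processes but passes no messages; a minimal point-to-point sketch (an illustrative assumption, requiring at least two processes) using MPI_Send and MPI_Recv might look like this: |
#include <stdio.h> |
#include "mpi.h" |
int main(int argc, char* argv []) { |
    int rank, value ; |
    MPI_Status status ; |
    MPI_Init(&argc, &argv) ; |
    MPI_Comm_rank(MPI_COMM_WORLD, &rank) ; |
    if (rank == 0) { |
        value = 42 ;                                             /* data to send       */ |
        MPI_Send(&value, 1, MPI_INT, 1, 99, MPI_COMM_WORLD) ;   /* to rank 1, tag 99  */ |
    } else if (rank == 1) { |
        MPI_Recv(&value, 1, MPI_INT, 0, 99, MPI_COMM_WORLD, &status) ; |
        printf("rank 1 received %d from rank 0\n", value) ; |
    } |
    MPI_Finalize() ; |
    return 0 ; |
} |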
PVM - What is it? |
- Background Information - Advantages/Disadvantages - Components o Deamon o Libraries |
URL http://www.epm.ornl.gov/pvm/ |
PVM (Parallel Virtual Machine) is a software system to link clusters of machines. |
It enables a heterogeneous collection of computers (the virtual machine) to be programmed as a single machine. |
It provides the user with process control and message passing calls for communication between tasks running on different hosts. |
The individual computers in the virtual machine may be shared- or local-memory multiprocessors, vector supercomputers, specialised graphics engines, or scalar workstations that may be interconnected by a variety of networks, such as ethernet or FDDI. |
User programs written in C, C++ or Fortran access PVM through library routines. |
The PVM system provides a set of user interface routines that may be incorporated into existing C or Fortran programs. |
Routines exist for process control, message transmission and reception, broadcasting, synchronisation via barriers, mutual exclusion, shared memory, dynamic process groups and some forms of error detection and recovery. |
Processes may be initiated synchronously or asynchronously, and may be conditioned upon the initiation or termination of another process, or upon the availability of data values. |
Message transmission is governed by routines that ensure data is transmitted in a machine independent form. |
Overall, these routines allow for parallel implementation of the most appropriate computational model and for division into tasks that can be assigned to the most appropriate computing resources. |
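As an illustration of these routines, here is a minimal sketch modelled on the classic PVM 'hello' pair of programs (the executable name hello_other and the message text are illustrative assumptions): a master task spawns a slave, which packs a greeting and sends it back. |
/* hello.c - the master task                                                          */ |
#include <stdio.h> |
#include "pvm3.h" |
int main() |
{ |
    int tid ;                                   /* task id of the spawned slave       */ |
    char buf[100] ; |
    pvm_mytid() ;                               /* enrol this process in PVM          */ |
    if (pvm_spawn("hello_other", (char**)0, PvmTaskDefault, "", 1, &tid) == 1) { |
        pvm_recv(tid, 1) ;                      /* wait for the reply, tag 1          */ |
        pvm_upkstr(buf) ;                       /* unpack the greeting string         */ |
        printf("from t%x: %s\n", tid, buf) ; |
    } else |
        printf("can't start hello_other\n") ; |
    pvm_exit() ;                                /* leave PVM before exiting           */ |
    return 0 ; |
} |
/* hello_other.c - the slave task                                                     */ |
#include <string.h> |
#include "pvm3.h" |
int main() |
{ |
    int ptid = pvm_parent() ;                   /* tid of the task that spawned us    */ |
    char buf[100] ; |
    strcpy(buf, "hello from a PVM slave") ; |
    pvm_initsend(PvmDataDefault) ;              /* XDR encoding by default            */ |
    pvm_pkstr(buf) ; |
    pvm_send(ptid, 1) ;                         /* send back to the parent, tag 1     */ |
    pvm_exit() ; |
    return 0 ; |
} |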
The first version of PVM was written during Summer of `89 at Oak Ridge National Laboratory. It was not released, but was used in applications at the lab. |
Version 2 was written from scratch during Feb `91 at the University of Tennessee, Knoxville, and released in March of that year. It was intended to clean up and stabilise the system so that external users could benefit. |
Version 3 was redesigned from scratch, and a complete rewrite started in Sep `92, with first release of the software in March `93. While similar in spirit to version 2, version 3 includes features that did not fit the old framework - most importantly fault tolerance, better portability and scalability. |
Advantages |
It is fairly small, with a limited user routine library, yet it is sufficient to implement any parallel programming scheme. |
There is a large and growing user base. |
There are tools (XPVM) for debugging, performance analysis and visualisation. |
Also, continued upgrades are expected in the areas of visualisation, fault tolerance, multimedia applications and performance optimisation for specific architectures. |
Advantages |
Portability: Probably the most portable message passing library available. |
Promises of distributed computing for the future's computing requirements |
Scalable Parallelism |
Easy to install and use |
Public domain software |
Popular - probably the most widely used parallel MP library |
Local, wide-area or combination of networks |
Flexible |
Advantages |
Easy definition and modification of your parallel virtual machine |
Arbitrary control and dependency structures - your application "decides": |
- where and when tasks are spawned and terminated |
- which machines to add or delete from your parallel virtual machine |
- how tasks communicate and/or synchronise with each other. |
Disadvantages |
Performance: Even with recent advancements, PVM may be slower than other message passing languages |
There is no real PVM standard |
It is deficient in some of the message passing functionality. |
Limited fault tolerance. The user must monitor the processes and hosts to ensure that no hosts have gone down during execution. In this event, the user must reset the virtual machine and restart the processes. |
Disadvantages |
Network latencies can cause problems in applications that have a high communication to computation ratio. |
There is as yet no automation of the compilation of different object modules on different architectures, though this may be added in the future. |
Components are the parts of your Parallel Virtual Machine responsible for things such as: |
- Communication - Process control - The programming interface for the user. |
There are two - the PVM Daemon and the PVM Libraries. |
The PVM daemon pvmd3 is a Unix process which oversees the operation of user processes within a PVM application and co-ordinates inter-machine PVM communications. |
One daemon runs on each machine configured into your parallel virtual machine; each host has its own pvmd3 running. |
The "master" PVM daemon is the first PVM daemon started and is responsible for starting all other PVM daemons in your parallel virtual machine. Typically started on your login machine. |
Each daemon maintains a table of configuration and process information relative to your parallel virtual machine. |
User processes communicate with each other through the daemons:
|
The PVM daemon software must reside on every machine which you will be using:
|
The three PVM libraries are:
|
Contain simple calls that the application programmer embeds in concurrent or parallel application code. Provide the ability to:
|
Library routines do not directly communicate to other processes. Instead, they send commands to the local daemon and receive status information back. |
Can be installed in user filespace - default location is $HOME/pvm3/lib. |
External Data Representation (XDR) format conversion is performed automatically between hosts of different architectures. This is the default. |
HotJava is a Web browser that supports dynamically downloadable interactive content. |
Arbitrarily sophisticated dynamic multimedia application inserts called Applets can be embedded in regular HTML pages and activated on each exposure of a given page. |
Applet constructs are implemented in terms of a special HTML tag: |
<APPLET codebase="URL directory path" code="Java class file name" width=".." height=".." > |
Where the URL and class file name point to a chunk of server-side software that is downloaded and executed at the client side on each presentation of a page containing this applet; the applet executes in a window whose size is specified by width and height in pixels. |
Applets are written in Java - a new general purpose object-oriented programming language from Sun Microsystems. |
Started in 1991 by Project Green --- a group in Sun focused on operating software for consumer electronic devices such as smart set-top boxes. Gosling (creator of Sun NeWS, which had a major conceptual impact on both the current Java and Telescript models) realised that C++ was not adequate and initiated development of a new language, Oak, later renamed Java. |
A PDA (Personal Digital Assistant -- codename *7) based on Oak/Java ready in 1993. Green Team incorporates as FirstPerson, Inc. *7 proposal to Time-Warner rejected in 1993. 3DO deal falls through in 1994. FirstPerson, Inc. dissolves. |
Small group (~30 people, now Java Team) continues development and decides to adapt Oak as a Web technology. |
The document 'Java: A White Paper' by Sun Microsystems -- October 1995 draft by Gosling & McGilton -- lists the key features of Java:
|
Java omits several rarely used, poorly understood and confusing features of C++ including operator overloading, multiple inheritance, pointers and automatic type coercions. |
It adds automatic garbage collection which makes dynamic programming easier in Java than in C or C++. |
No more mallocs! |
It also adds the 'Interface' construct, similar to the Objective C concept, which often compensates for the lack of multiple inheritance by allowing method calling syntax to be "inherited". |
The resulting language is familiar as it looks like C++ but is simpler and hence easier to program in. |
It also results in a much smaller kernel which is suitable for planned Java ports to consumer electronic devices. The base (alpha) interpreter is ~40Kb; libraries and threads add an additional 175Kb. |
Java model can be viewed as a C++ subset, with some dynamic elements inherited from Objective-C (method overloading, garbage collection). |
Structures, Unions and Functions are absorbed into data and methods of Java classes -- Java is Simple! |
The strength of the Java object-oriented model is not in its sophistication but in its simplicity and the extensive class library associated with the system (some 250 public classes were released in both alpha and beta). |
The Java class also plays the role of a communication atom in the Web embedding model. Applet classes identify themselves by names in the HTML applet tag. |
The applet downloads other classes present in the applet source. |
Hence, the Java class names play the role of addressing mode for the distributed Java code database. |
C/C++ programming in a heterogeneous network environment requires use and compatibility across several vendor platforms and the corresponding compilers. This problem is solved in Java by designing platform-independent binary representation called Java bytecode (or opcode). |
The Java compiler (written in Java and platform-independent) reads Java source and produces Java bytecode. |
These bytecodes are shipped to client machines upon browser requests. |
Each client machine must run a Java interpreter which performs runtime execution of Java bytecodes. |
Java interpreter is written in POSIX compliant ANSI C and needs to be ported to and conventionally compiled (once) on each individual platform. |
Once the interpreter is ported, application developers don't need to worry at all about platform specificity and differences between native compilers. |
They write portable Java which is compiled by platform independent Java compiler and executed on Java interpreter, ported to various platforms. |
Java language offers a uniform abstract (virtual) machine model which is identical for all platforms. SUN owns the Java Virtual Machine (see online report) -- it is universal while classes can be added by any user. |
Unlike in C/C++, where various integers match the architecture of the physical machine at hand, Java byte, char, short, int and long are always of the same size: 8, 16 (Unicode), 16, 32 and 64 bits, respectively. |
No header files, preprocessors, #define etc. |
Floating point is always IEEE 754 |
Differences between vendor specific windowing environments (X Windows, MS Windows, Macintosh) are removed in terms of the Abstract Windowing Toolkit (AWT) metaphor. |
AWT is given by ~60 Java classes (alpha) which offer a universal GUI programming model, portable between UNIX, PC and Mac, and translated automatically to native windowing systems on individual platforms by Java interpreters. |
Popular TCP/IP based protocols such as FTP or HTTP are supported in terms of network protocol classes. |
This facilitates various forms of distributed processing. New protocols (e.g. PVM etc.) can be added and dynamically installed. |
Distributed computing model of Java is mainly client-server, with Java compiler preparing the opcodes at the server side, and Java interpreter executing it at the client side. |
One can expect more dynamic uses of Java with Java threads on both Server and Client side communicating with each other. |
Java binaries are shipped across the network and executed on client machines. Security is therefore a critical issue and strongly enforced in Java. |
Java contains its own networking classes which are designed to be secure |
Modifications of the C++ model such as eliminating pointer arithmetic and coercion were dictated mainly by the security requirements. |
Most viruses are based on acquiring access to private/protected sectors of computer memory which is impossible in Java. |
Java opcodes are executed at the client side by Java interpreter which operates exclusively on the virtual memory. Hence, unless there are security bugs in the Java interpreter itself, the model is safe and users cannot create security holes by incorrectly or maliciously written applets. |
The byte codes sent across network are verified at the client which prevents evil/corrupted classes from causing problems. |
Java model offers preemptive multithreading, implemented in terms of the Thread class. |
Thread methods offer a set of synchronization primitives based on the monitor and condition variable paradigm by C.A.R. Hoare. |
Java threads inherit some features from the pioneering Cedar/Mesa System by Xerox PARC that gave birth to Macintosh and object-oriented programming. |
A typical use of Java multithreading in applet programming is to have several independent but related simulations (e.g. various sorting algorithms), running concurrently in an applet window. |
Multithreading is also used internally by the HotJava browser to handle multiple document dynamics. |
Another interesting application domain is multi-HotJava environments to come, such as collaboratories or gaming. |
Java threads don't have built-in point-to-point communication primitives. Various thread communication environments can be provided by coupling the thread and network protocol objects. |
Source code of a Java program consists of one or more compilation units, implemented as files with .java extension. |
Each compilation unit can contain: |
- A package statement |
- Import statements |
- Class declarations |
- Interface declarations |
Java compiler (called javac) reads java source and produces a set of binary bytecode files with .class extensions, one for each class declared in the source file. For example, if Foo.java implements Foo and Fred classes, then javac Foo.java will generate Foo.class and Fred.class files. |
Suppose that Foo implements an applet and Fred is an auxiliary class used by Foo. If HotJava/Netscape encounters a tag <APPLET code="Foo.class">, it will download Foo.class and Fred.class files and it will start interpreting bytecodes in Foo.class. |
This prints the classic Hello World string in the applet window! |
import java.awt.Graphics; |
public class HelloWorld extends java.applet.Applet |
{ |
public void init() { |
} |
public void paint(Graphics g) { |
g.drawString("Hello world!", 50, 25); |
} |
} |
class HelloWorldApp { |
public static void main (String args[]) { |
System.out.println("Hello World!"); |
} |
} |
Note that args are of class String and are needed even though HelloWorldApp has no command line arguments