(09/27/95) Conclusions from the discussion with Miloje
-
Miloje is convinced that so far nobody in NPAC, NASA or JPL knows what the
4DDA requirements for parallel IO are. They are all working on the computational
part. Of course, the IO problem will become relevant in the future as the 4DDA system
deals with large volumes of data, but he expects that to happen no earlier than one
year from now. It is not clear what parts of the parallel IO functionality will
be relevant by that time, as HPC systems will probably have moved towards virtual memory,
so out-of-core computation will no longer be an issue. The mapping
between disks and memory arrays will probably be the dominant problem, together with overlapping
IO actions and checkpointing.
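To make the overlap issue concrete, here is a minimal sketch of overlapping a disk read with computation using POSIX asynchronous IO; this is only one possible mechanism, and the file name, block size and compute routine are placeholders of my own, not anything taken from 4DDA:

/* Sketch: overlap reading the next data block with computation on the
   current one, using POSIX asynchronous IO.  File name and block size
   are placeholders.  (Link with -lrt on some systems.) */
#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK (1 << 20)                 /* 1 MB per block, arbitrary */

static void compute_on(double *buf, size_t n) { (void)buf; (void)n; /* ... */ }

int main(void)
{
    int fd = open("field.dat", O_RDONLY);        /* hypothetical data file */
    if (fd < 0) { perror("open"); return 1; }

    double *cur = malloc(BLOCK), *next = malloc(BLOCK);
    struct aiocb cb;

    read(fd, cur, BLOCK);               /* read block 0 to prime the pipeline */

    for (off_t off = BLOCK; ; off += BLOCK) {
        /* post an asynchronous read of the next block ... */
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = next;
        cb.aio_nbytes = BLOCK;
        cb.aio_offset = off;
        if (aio_read(&cb) != 0) { perror("aio_read"); break; }

        /* ... and compute on the current block while the disk works */
        compute_on(cur, BLOCK / sizeof(double));

        /* wait for the prefetch, then swap buffers */
        const struct aiocb *list[1] = { &cb };
        aio_suspend(list, 1, NULL);
        if (aio_return(&cb) <= 0) break;          /* end of file or error */
        double *tmp = cur; cur = next; next = tmp;
    }
    free(cur); free(next); close(fd);
    return 0;
}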
-
As it is difficult to design anything without knowing the true nature of the problem,
I would suggest carrying out this project as a systematic survey of the different
approaches to parallel IO, identifying the problems and how they are tackled
by the various parallel IO systems. I can also think about extending TCE towards
efficient IO operations and checkpointing.
(09/25/95) The 4DDA Issues
-
Still too little information to shape any sensible design for the 4DDA parallel
IO requirements. I found the description of the 4DDA principles (click here for
a summary), but it only sketches some requirements and algorithmic issues.
-
Two suggested requirements are mentioned there - a possible need for the
capability of restarting an interrupted computation (checkpointing), and software
support for efficient transfers between disks and memory-mapped arrays. The first
issue is mentioned in only one project and generally requires some serious software
engineering; the second is related to the PASSION work and also to MPI-IO.
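To show what the checkpointing requirement amounts to at its simplest, here is a rough sketch of a per-process dump/restore of a local array section; the file naming and the layout are my own illustration, not something taken from the 4DDA description or from PASSION:

/* Sketch: per-process checkpoint/restart of a local array section.
   The file naming scheme and array layout are illustrative only. */
#include <stdio.h>
#include <stdlib.h>

/* write the local section plus the iteration counter */
static int checkpoint(const char *base, int rank, int iter,
                      const double *a, size_t n)
{
    char name[256];
    sprintf(name, "%s.%d.chk", base, rank);
    FILE *f = fopen(name, "wb");
    if (!f) return -1;
    int ok = fwrite(&iter, sizeof iter, 1, f) == 1 &&
             fwrite(a, sizeof *a, n, f) == n;
    fclose(f);
    return ok ? 0 : -1;
}

/* reload the section; returns the iteration to resume from, or 0 */
static int restart(const char *base, int rank, double *a, size_t n)
{
    char name[256];
    sprintf(name, "%s.%d.chk", base, rank);
    FILE *f = fopen(name, "rb");
    if (!f) return 0;                    /* no checkpoint: start fresh */
    int iter = 0;
    if (fread(&iter, sizeof iter, 1, f) != 1 || fread(a, sizeof *a, n, f) != n)
        iter = 0;
    fclose(f);
    return iter;
}

In the real system the dumps would of course have to be coordinated across processes, so that all saved sections correspond to the same iteration; that is where the serious engineering comes in.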
-
An open question is which HPC platforms 4DDA is going to run on. The paper
mentions the Intel Paragon, the Cray T3D and also the IBM SP-1. For the Intel Paragon the
work done for PASSION can easily be reused (either at the language or at the
runtime level); for the other architectures, however, there is no PASSION implementation.
-
Where the targeted platforms have virtual memory (SP-1/2 or HPC workstation clusters), the out-of-core computation factor may not be relevant, but the two-phase disk access strategy of PASSION can still be used - although only as a model, since PASSION does
not run on clusters.
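As a reminder of what the two-phase model actually does, here is a minimal sketch using ordinary message passing: phase one reads large contiguous blocks that conform to the file layout, phase two redistributes them in memory to the distribution the computation wants. The cyclic target distribution, the array size and the file name are all assumptions of mine:

/* Sketch of PASSION's two-phase idea with plain MPI message passing:
   phase 1 - each process reads one large contiguous block, conforming
   to the file layout; phase 2 - the blocks are redistributed to the
   distribution the computation wants (cyclic here, purely as an example).
   Assumes N is divisible by P*P; the file name is a placeholder. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1024                      /* global array size, illustrative */

int main(int argc, char **argv)
{
    int rank, P;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &P);

    int blk = N / P;                /* elements per contiguous file block */
    int per = blk / P;              /* elements each pair of processes exchanges */
    double *file_blk = malloc(blk * sizeof *file_blk);
    double *packed   = malloc(blk * sizeof *packed);
    double *local    = malloc(blk * sizeof *local);   /* cyclic section */

    /* Phase 1: one large sequential read per process */
    FILE *f = fopen("array.dat", "rb");
    if (!f) MPI_Abort(MPI_COMM_WORLD, 1);
    fseek(f, (long)rank * blk * sizeof(double), SEEK_SET);
    fread(file_blk, sizeof(double), blk, f);
    fclose(f);

    /* pack by destination: global index g = rank*blk + j goes to g % P */
    int *fill = calloc(P, sizeof *fill);
    for (int j = 0; j < blk; j++) {
        int dest = (rank * blk + j) % P;
        packed[dest * per + fill[dest]++] = file_blk[j];
    }

    /* Phase 2: redistribute in memory, which is far cheaper than making
       every process do many small, strided reads directly from the file */
    MPI_Alltoall(packed, per, MPI_DOUBLE, local, per, MPI_DOUBLE,
                 MPI_COMM_WORLD);

    /* 'local' now holds this process's cyclic elements in increasing
       global-index order, grouped by the source block they came from */
    free(file_blk); free(packed); free(local); free(fill);
    MPI_Finalize();
    return 0;
}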
-
It seems to me that the MPI-IO project could be very useful to look at. It has all the relevant
components (except checkpointing, that is) and will be the closest in spirit to what
I would propose using TCE. Both MPI-IO and a PIO based on TCE deal with IO
issues at the runtime level only, and both extend the message passing constructs
to cover the IO domain. Both models assume very little about the underlying hardware and software support, so they can cover various HPC platforms.
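For flavour, here is a minimal sketch of how a collective read looks in the MPI-IO style of extending message passing to the IO domain; the exact call names vary between drafts, so the ones used below may not match the current proposal, and the file name and sizes are placeholders:

/* Sketch: IO expressed with message-passing-like collective calls,
   in the MPI-IO style.  Call names may differ from the current draft. */
#include <mpi.h>
#include <stdlib.h>

#define LOCAL_N 4096          /* per-process element count, illustrative */

int main(int argc, char **argv)
{
    int rank;
    MPI_File fh;
    MPI_Status st;
    double *buf = malloc(LOCAL_N * sizeof *buf);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* collective open, as opposed to each process opening on its own */
    MPI_File_open(MPI_COMM_WORLD, "field.dat",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

    /* each process reads its own contiguous slice; the _all variant lets
       the library coordinate (and potentially merge) the accesses */
    MPI_Offset off = (MPI_Offset)rank * LOCAL_N * sizeof(double);
    MPI_File_read_at_all(fh, off, buf, LOCAL_N, MPI_DOUBLE, &st);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}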
(09/11/95) Assuming that we are moving from parallel MPP systems towards workstation (PC) clusters, I have
the following conclusions (based on the PASSION Summary):
-
A large portion of PASSION is dedicated to out-of-core programs. While this is very relevant for MPP
architectures, it is much less relevant for computing clusters. The reason is simple: nodes
on parallel machines don't have virtual memory, so when something doesn't fit into physical
memory it has to be divided between the available physical memory window and the disks. Workstations, on the
other hand, have their virtual memory limited only by the available swap space. Since the swap space is
rarely less than 100 MB, the chance that something won't fit is rather slim. There are of course cases where
the problem can still arise (a heap limit for user processes, for example), but I believe such obstacles
can be removed through appropriate system administration. A conversation with Miloje should verify this
claim.
-
If it is agreed that out-of-core data is less relevant for clusters, then the whole portion of PASSION
that deals with compiler and HPF language support will have little relevance.
-
The model of the portable virtual file system VIP-FS defined in PASSION can be utilized in any environment.
The important question is whether the design choices made for MPPs are really the optimal way of doing things -
clusters are much more flexible where process placement is concerned (two or more processes can reside on the same machine,
which in general is not possible on parallel nodes).
-
The model of the Two-Phase Access Strategy should be equally applicable to MPPs and to clusters.
-
What I think would be a sensible thing to do in the cluster case is to design a parallel/distributed
file server (or reuse the VIP-FS design). This file server would interact with the appropriate runtime via message passing.
The runtime would support language or library constructs which perform ALL IO operations in a prefetching/non-blocking
mode. The runtime would use TCE functionality to implement the non-blocking IO access using threads (a small
sketch of this follows below), and the same framework could be used to implement the file server itself. Finally, we could add HPF directives to express
the "new" asynchronous IO at the language level, and to allow the runtime to execute the Two-Phase Access Strategy.