ASBURY 1 ******************************************************************

Tom and I talked this morning about the presentation next week. It is my opinion that the Day 1 presentations should offer an overall vision of a grid environment within DoD over the near and long term. Capabilities within the context of Gateway should be highlighted, and some priority or realistic timeline for achieving these capabilities should be offered. I would like a clear distinction to be made between the Gateway efforts and other lower/system-level efforts, along with the importance of both and how they should be constructed in order to blend them into a single functional grid environment for DoD.

I believe that Gateway offers the only context for this type of high-level grid discussion. Gateway allows for the inclusion of a wide variety of functions that must occur to operate within a grid, such as an interface for on-line training, collaborative tools, and User Service functions like service tickets, allocation mods, etc. My hope is that we can weave a set of consistent goals and priorities to present to the HPCMP directors that will encourage them to actively participate in activities such as the Grid Forum. I am afraid that if Gateway does not offer this type of vision, this workshop will get bogged down in technical details that may prevent the desired outcome of producing recommendations to HPCMP that can set the stage for a future DoD grid environment.

HAUPT 1 ***********************************************************

Please find enclosed my HPDC-8 talk on Gateway. I think that slide no. 6, "The Software Hierarchy", is of particular importance. It shows the relationship between the Gateway system and a metacomputing infrastructure ("the Grid"). We propose five INDEPENDENT layers.

1. The user interface (access to services of all kinds, from access to compute servers to remote data to user services ...)
is independent of the middle-tier implementation (which happens to be WebFlow). The communication between the front end and the middle tier goes through technology-neutral XML. It can work with perl/cgi through secure, mutually authenticated https (San Diego's HotPage does that) or secure CORBA over Kerberos, as we do.

2. The framework defined by WebFlow is independent of the underlying distributed object model. At this time we use CORBA. Ultimately we can go to POW, HLA, DCOM, you name it.

3. We use CORBA to build proxy objects (wrappers to back-end services), and in doing this we differ from the Computer Science crowd that tries to extend the component model down to the resource level. And as you can expect, they end up crying over CORBA/RMI performance or trying to reengineer CORBA (this is what the DOE guys are doing with CCA, the Common Component Architecture). We build on commodity technologies and industry standards. This also means that we are building on top of an existing/emerging grid infrastructure.

4. I guess that the workshop is about to attempt a vision of what the grid interface could look like. We need a common interface for security, file access, job submission and monitoring, accounting, mass storage, databases, user services, etc. In a very rough approximation, Globus does that, and therefore we place Gateway on top of Globus. I cannot imagine seamless access to resources without a well-defined grid interface.

5. The actual grid implementation. We have very little to say here, even though I believe this will be the focus of the workshop. We build on top of that.

Now, tell me what aspects of Gateway you want me to elaborate on (make foils). With the HPDC presentation included, you have access to all my foils.

HAUPT 2 ***************************************************************************

> 1) and doing this we differ from the Computer Science crowd that tries to
> extend the component model down to the resources level.
> And as you can expect they end up with crying over CORBA/RMI performance or trying to
> reengineer CORBA (this is what DOE guys are doing with CC
> Does this mean that they do not have a "true multi tier model"
> separating functionality (middle tier) and implementation (backend)
> If so aren't they obviously wrong?

Well, yes and no. For data-flow kinds of applications, they are obviously wrong. On the other hand, for non-trivially coupled applications, I am not that sure. In Gateway, we do not have a clear idea of what to do, either. Currently, we just assume that the codes are coupled, that they talk to each other using whatever protocol they want (MPI, etc.), and that Gateway's responsibility is to co-allocate them. I believe that this Gateway model is good, but we cannot offer any tools that would assist the user in programming interactions/data exchange between components. To do that, we need a component model for the back end as well. But this is far in the future, I think.

>
> 2)
> We need a common
> interface for security, file access, job submission and monitoring,
> accounting, mass storage, databases, user services, etc. In a very rough
> approximation, Globus does that, and therefore we place Gateway on top
> of Globus. I cannot imagine seamless access to resources without well
> defined grid interface.
>
> Please take each topic and state how Gateway interfaces
> Do we define XML structure or what?

Here, by common, I mean common for all hardware and schedulers. For example, Globus defines a common method to submit a job: you specify a contact address (which defines the machine and scheduler), and you specify an RSL string (which defines executables, files, etc.). That is, there is no difference in syntax when submitting a job to Origin2000/PBS, SP2/LL, or a command line on a workstation.
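
As an illustration only (this is not Gateway or Globus code), the uniform submission syntax described above can be sketched in Python. The helper names, host names, and contact strings are hypothetical. The `&(attribute=value)` form follows the general style of Globus RSL, and the quoting rule used here is an assumption that should be checked against the RSL specification.

```python
# Illustrative sketch only: helper names and contact strings are hypothetical.

def rsl_literal(value):
    """Quote a value for RSL; doubling embedded quotes is an assumed rule."""
    return '"' + str(value).replace('"', '""') + '"'


def build_rsl(executable, arguments=(), count=1):
    """Return one RSL string describing what to run, independent of the target."""
    parts = ["(executable=%s)" % rsl_literal(executable)]
    if arguments:
        quoted = " ".join(rsl_literal(a) for a in arguments)
        parts.append("(arguments=%s)" % quoted)
    parts.append("(count=%d)" % count)
    return "&" + "".join(parts)


# Hypothetical contact strings: each one names a machine and its scheduler.
CONTACTS = {
    "Origin2000/PBS": "origin.example.org/jobmanager-pbs",
    "SP2/LL": "sp2.example.org/jobmanager-loadleveler",
    "workstation": "ws.example.org/jobmanager-fork",
}


def plan_submissions(rsl):
    """Pair the same RSL string with every contact; only the contact differs."""
    return {target: (contact, rsl) for target, contact in CONTACTS.items()}


if __name__ == "__main__":
    rsl = build_rsl("/bin/hostname", ["-f"], count=2)
    for target, (contact, job) in plan_submissions(rsl).items():
        print(target, "->", contact, job)
```

The point of the sketch is that the job description is built once and only the contact string changes from target to target.
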
There is an identical authentication mechanism in each case, there is a standardized way to handle remote files using GASS, there is a standardized information service (MDS), and there is a standardized HBM (heartbeat monitor) that makes it possible to add fault tolerance even at the application level.

The other question is whether these Globus standards are good enough, and this is a question to be answered by the GridForum, or this DoD workshop. It is clear that Globus, as is, is not sufficient. A lot of stuff is missing, such as authorization, mass-storage support, user services, etc., and the information service supports only hardware at this time. Also, it remains to be seen whether the technologies adopted by Globus are acceptable to others (LDAP for information services, etc.).

Nevertheless, we use Globus. To submit a job, we need to generate RSL. Each application is 'imported' into Gateway through a single XML document that describes it. In particular, the XML file contains all the information that is needed to
- either generate the RSL string, or
- install the application (by moving the executable or compiling the source), and then generate the RSL.

Now, the applications can be arranged into networks (AVS-style or more object oriented). Such a network is described by another XML document, where applications are nodes and links represent data transfers. A data transfer is translated into a GASS request or a Gateway (CORBA-based) module. We do not use HBM yet.

There is no need to duplicate MDS info as XML documents, as long as the MDS info is adequate. However, at this time we store information that, in our view, is not grid related, such as user profiles or applications, outside MDS. Our documents are consistently written in XML.

>
> 3) We must define something of value -- namely
> a) User Interface
> b) Middle Tier proxy interfaces (aka tasks etc.)
> or not?
>

I would formulate it this way: as soon as the grid services exist, we can build a user-friendly, three-tier system on top of them.
With all the bells and whistles we promise for Gateway. This is because we come with a component model in the middle tier that allows for the introduction of proxy interfaces that can be manipulated through an intuitive GUI (setting parameters, selecting the target host, etc.). Moreover, these proxies can be combined into more complex structures (because we come with a component model) representing sophisticated, high-level tasks and services, visually controlled by the user. And standardizing the components of the user interfaces, and the ways they communicate with each other and with the middle tier, simplifies development of the front-end tools. But this will work fine only if put on top of a well-defined grid interface.

HAUPT 3 ***************************************************************************

So far I have been trying to sell the idea that once a meta- (web-, virtual-, ...) computer is built, one can build a powerful, high-level, web-based interface to it. And it is my understanding that Gateway does just that, in the same way that AVS is built on top of Unix.

Gateway is a specific way of building such an interface and, in our opinion, superior to any other known approach. It is object oriented (unlike Globus), it builds on top of industry standards for distributed objects (unlike Legion), and it introduces a specific object framework built on top of JavaBeanChild interfaces (again, an industry standard) and an event model yet to be defined by CORBA and EJB, as current solutions are not adequate. We need personalized point-to-point event notification (thus not the CORBA event channel) without the necessity of implementing an event listener interface (as Java 1.1+ and JavaBeans require). Without an event listener interface we can truly develop reusable modules without any knowledge of other modules, and still make them interoperate with each other.

In addition, we use XML for serialization of the middle-tier objects, and as a communication protocol between the middle tier and the front end.
This way we keep the front end and the middle tier independent of each other, and in turn this allows us to build many different types of front ends. Also, since the objects can be defined in XML, it is a convenient way of bringing them into the Gateway system (for example, applications).

Geoffrey Fox wrote:
> So precisely where (i.e. to what features) do we
> a) Define novel (i.e. Gateway only) Interfaces:
> b) Provide an XML document that translates into RSL or some existing
> defined interface.
>

Both; we build on top of the Grid interface (such as Globus) and define high-level interfaces for objects outside/above the grid: applications, metaapplications, high-level services, user-specific data, etc.

>
> e.g. we must have a novel interface for graphs but use Globus machine interface?
>

Yes.

>
> What do we have to do to Link to Legion?
>

I am not a Legion expert. My understanding is that Legion introduces an OO grid interface. We need to map our object framework onto Legion's. Bridges?

>
> We have a defined proxy component model
> Why doesn't this give backend model by implication

I do not know. It is tough. Personally, I like the idea of Globus with MPI implemented over Nexus, so it can detect the best possible connections automatically. But this means that communication is defined inside the code (MPI calls), beyond the control of Gateway, unless we do a trick such as DARP. Typically, CORBA/RMI... are too slow for high performance computing. Legion has something to offer. On the other hand, it is nice that you can run things on HPCC platforms without implementing Gateway there (Globus is enough). I really do not know. We need more experience. For easy stuff (such as LMS), there is no problem. Indeed, we extend our model to the back end. In general, it is tough. Question: do we really need it? Can we expect that coupling applications "automatically", in a way significantly more complex than dataflow, will be important any time soon?