From furm@npac.syr.edu Mon Nov 2 12:54:04 1998
Date: Sun, 1 Nov 1998 16:37:44 -0500
From: Wojtek Furmanski
To: timucin@npac.syr.edu
Cc: furm@npac.syr.edu
Subject: Sandia issues

Here are some recent exchanges related to the Sandia project. It is not clear what's going on, but apparently Sandia still needs help and the project will likely continue after the SC98 meetings.

Could you write a few paragraphs on heartbeat and XML support in JWORB as input for the 'care package'? We will also need a list of all our demos available at SC98 (JWORB, OW-RTI, Jager, CMS, OMBuilder, FMS Training Space, what else?) - please make your list with a one paragraph description per demo, I will make my list, and let's compare/combine tomorrow or Tuesday. Geoffrey wants to discuss issues Thursday.

Tom Haupt's role is unclear and he apparently does not know yet what's going on, but it might suggest more Sandia money on the horizon (and more of the usual uncertainties re project management..).

In any case, I think it is worth putting some effort into preparing the care package and demo handouts for Sandia this week. Please give it some thought and let's talk more tomorrow. I will check the OMG Fault Tolerance RFP tonight - it might be a useful item for the care package/next year's proposal.

thanks
Wojtek

>From furm@npac.syr.edu Sun Nov 1 16:18:06 1998
Date: Wed, 28 Oct 1998 14:17:03 -0500 (EST)
From: Wojtek Furmanski
To: Geoffrey Fox
Cc: furm@nova.npac.syr.edu
Subject: Re: FYI

Here are some more comments on this interesting email. I think we could indeed be of some help in providing JWORB based support for integrating various metacomputing environments, basically by continuing and extending the work we already started on cluster management for C-Plant.

I think the concept of adopting HLA and extending it for more generic federation services applies to metacomputing as well. Perhaps we could call it Metagrid or GridHLA or MetaHLA or VHLA or something like that..? Each domain such as Globus, Legion, Condor etc. would be represented as a federate that conforms to a common FOM (Federation Object Model) and can at any time join an existing Metagrid Federation or start a new one, and interact with other federates via RTI bus services and their extensions. The latter can be naturally experimented with and supported by our Object Web RTI, for example using Globus/Nexus for high performance communication, or using Java for rapid prototyping of CORBA services, or using XML for universally parsable control messaging, metadata or trader formats etc.
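To make the federate idea concrete, here is a minimal Java sketch. The MetagridBus and MetagridFederate interfaces below are purely hypothetical stand-ins - not the DMSO RTI or OW-RTI API - that mirror the HLA pattern (join a federation, publish object classes from a shared FOM, reflect updates from other federates), with an imagined Condor pool wrapped as a federate.

import java.util.HashMap;
import java.util.Map;

// Hypothetical bus interface, NOT the DMSO RTI API; it stands in for the
// RTI services a Metagrid federate would use.
interface MetagridBus {
    void join(String federation, String federate, MetagridFederate callback);
    void publish(String fomClass);                      // e.g. "Metagrid.Resource"
    void updateAttributes(String fomClass, Map<String, String> attrs);
}

// Callbacks a federate implements, analogous to a FederateAmbassador.
interface MetagridFederate {
    void reflectAttributes(String fomClass, Map<String, String> attrs);
}

// Wraps one metacomputing domain (an imagined Condor pool) as a federate.
class CondorFederate implements MetagridFederate {
    private final MetagridBus bus;

    CondorFederate(MetagridBus bus) {
        this.bus = bus;
        bus.join("MetagridFederation", "CondorPool-1", this);
        bus.publish("Metagrid.Resource");
    }

    // Advertise local resource state to the federation via the common FOM.
    void reportIdleNodes(int idle) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("domain", "Condor");
        attrs.put("idleNodes", Integer.toString(idle));
        bus.updateAttributes("Metagrid.Resource", attrs);
    }

    // React to state published by other federates (Globus, Legion, ...).
    public void reflectAttributes(String fomClass, Map<String, String> attrs) {
        System.out.println("update on " + fomClass + ": " + attrs);
    }
}

A GlobusFederate or LegionFederate would look the same from the bus's point of view - that common shape against one FOM is exactly what the email means by interoperable federation services.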
I include some more specific answers or comments below.

On Tue, 13 Oct 1998, Geoffrey Fox wrote:
>
> Geoffrey Fox gcf@npac.syr.edu, http://www.npac.syr.edu
> Director of NPAC and Professor of Physics and Computer Science
> Phone 3154432163 (NPAC central 3154431723) Fax 3154434741
>
> ------- Forwarded Message
>
> Date: Mon, 12 Oct 1998 14:09:25 -0600
> From: "Pollock, Robert"
> To: "'gcf@npac.syr.edu'"
> Subject: Follow up on Conversation
>
> Jeffrey,
>
> I wanted to follow up on a few items you mentioned to me (and my colleagues)
> last Friday on the way to the Chicago airport from the Java Grande workshop.
>
> As we stated at the workshop, we are attempting to develop a DRM system so
> that consistent access to, and management of, high-end computational
> resources that are distributed throughout the Defense Programs complex can
> be made available to geographically dispersed users (for example, the ASCI
> and C-Plant resources).

It is perhaps worth mentioning that HLA is already making some inroads into the DoE - I noticed some interesting papers by Argonne people in the area of logistics simulations during the last HLA conference, the Fall 98 SIW in Orlando, where we presented our WebHLA work.

> 1st. We are attempting to track down the reason you have not been able to
> receive a copy of the C-Plant software so that you may load it on your
> "C-Seed Plant" hardware for evaluation. Please stand by.
>
> 2nd. As you already know, we (SNL's DisCom2/DRM Group) are trying to
> identify an implementation model that would fit nicely into our overall DRM
> logical architecture model (seven conceptual layers) that was presented last
> week at the workshop. You mentioned the WebFlow and JWorb three tier model
> as a possible implementation for our logical model. The question I have
> has to do with the maturity of the JWorb services. What environments do
> the JWorb servers support today? Does JWorb provide any APIs for accessing
> Globus managed resources? If so, to what extent? If not, are there any
> plans (either from your group or from the Globus community) for incorporating
> JWorb interfaces into Globus? You also mentioned a paper that is currently
> in draft form that addresses in more detail the three-tier model and
> specifically focuses on the JWorb concept. Is it possible for you to
> provide Sandia with a copy of this paper for our review?

Our main focus so far has been on providing JWORB services for DoD Modeling and Simulation and its Web based HPC extensions - we therefore selected DMSO HLA/RTI as the target for our first JWORB service, called Object Web RTI (i.e. DMSO RTI implemented in Java as a CORBA service - note, btw, that DMSO now has an HLA proposal before the OMG for which our OW-RTI is already a prototype implementation). The next step, and early work in progress, is a JWORB based Cluster Management Service for C-Plant - here we currently have heartbeat support operational, and we are starting to build a Clustering FOM and XML data support for resource allocation and management. Regarding Globus, we have some involvement in wrapping Globus applications as coarse grain WebFlow modules. The new WebFlow based on JWORB middleware will offer natural visual Metagrid authoring tools, with Globus as one of the supported metacomputing domains. The RCI paper included only a top level description of JWORB - more info will be available in our Wiley book by the end of this year.
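To make the two ingredients above concrete - heartbeat monitoring and XML descriptions of resource state - here is a minimal Java sketch. All class and element names are illustrative assumptions, not the actual JWORB Cluster Management code.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical heartbeat monitor in the spirit of the JWORB C-Plant
// cluster service described above; names are illustrative only.
class HeartbeatMonitor {
    private final Map<String, Long> lastBeat = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called (e.g. over the ORB) whenever a node daemon reports in.
    void beat(String nodeId) {
        lastBeat.put(nodeId, System.currentTimeMillis());
    }

    // A node is presumed dead once it misses the timeout.
    boolean isAlive(String nodeId) {
        Long t = lastBeat.get(nodeId);
        return t != null && System.currentTimeMillis() - t < timeoutMillis;
    }

    // Node status as an XML fragment, of the kind a Clustering FOM or
    // resource trader could carry as a universally parsable message.
    String toXml(String nodeId) {
        return "<node id=\"" + nodeId + "\" alive=\"" + isAlive(nodeId) + "\"/>";
    }
}

A per-node daemon would call beat() every few seconds; a missed timeout marks the node dead, and the XML fragments are the sort of universally parsable control messages the thread keeps returning to.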
> 3rd. Do you currently have any plans for writing JWorb APIs for
> interfacing with the PBS scheduler? I believe the PBS scheduler (provided
> by NASA) is being ported this year (FY99) to run on the C-Plant hardware.
> See Art Hale for further details on this tasking.

We are looking into PBS and we are planning to provide support for various clustering/scheduling tools via our Clustering FOM - but priorities are not clear yet. So far, we have looked in more detail into Condor and Beowulf while waiting for more hints from the C-Plant team.

> 4th. Do you know of any commercially available DRM systems that can
> support several meta-computing service models (i.e., Globus, Condor, Legion,
> etc...)?

I doubt there are any robust commercial tools in this area, as the market for interoperable metacomputing is yet to be built. Our ansatz is that the DoD M&S community has the most experience in this area (or at least in the subset of irregular distributed computing), and the various large scale simulation communities are now being pushed hard to interoperate due to the DoD budget cuts. Hence the HLA initiative and its early products, which seem quite promising - SISO for IEEE standards, a DoD-wide mandate, OMG presence, a target for initial commercial activities, significant interest by Boeing and other large manufacturers, etc. Adopting the HLA standard and extending it towards Web based HPCC seems to be our unique angle - so perhaps we are closer than others to a robust interoperable metacomputing framework?

> As part of our FY '99 demonstration of a DRM system, we are looking at the
> C-Plant clusters as being a critical supported component in our solution.
> In creating the DRM system, we recognize the need to take full advantage of
> commercially available products and tools as part of the overall solution. I am
> hoping that as we better understand your goals and vision, we might be able
> to leverage some of your activities and vice versa.

I would love to learn more about the DRM status. As I said before, more material on our approach, including JWORB, WebHLA etc., will be available in our new Wiley book by the end of this year.

> Any insight, clarification, or guidance you can provide is much appreciated.
>
> bp
> rdpollo@sandia.gov
> 505-844-4442
>
> ------- End of Forwarded Message

>From gcf@npac.syr.edu Sun Nov 1 16:18:41 1998
Date: Fri, 30 Oct 1998 15:09:03 -0500
From: Geoffrey Fox
To: "Pollock, Robert"
Cc: furm@npac.syr.edu, Art Hale
Subject: Comments on your interesting email!

Who from Sandia will be at SC98 -- maybe we could meet there!

We think we could indeed be of some help in providing JWORB based support for integrating various metacomputing environments, basically by continuing and extending the work we already started on cluster management for C-Plant. We think the concept of adopting HLA and extending it for more generic federation services applies to metacomputing as well. Perhaps we could call it Metagrid or GridHLA or MetaHLA or VHLA or something like that..? Each domain such as Globus, Legion, Condor etc. would be represented as a federate that conforms to a common FOM (Federation Object Model) and can at any time join an existing Metagrid Federation or start a new one, and interact with other federates via RTI bus services and their extensions. The latter can be naturally experimented with and supported by our Object Web RTI, for example using Globus/Nexus for high performance communication, or using Java for rapid prototyping of CORBA services, or using XML for universally parsable control messaging, metadata or trader formats etc.

We include some more specific answers or comments below. Note two references:
1. A long paper we wrote for RCI called "High Performance Commodity Computing on the Pragmatic Object Web"
2. A book we are writing called "Building Distributed Systems on the Pragmatic Object Web"

> ------- Forwarded Message
>
> Date: Mon, 12 Oct 1998 14:09:25 -0600
> From: "Pollock, Robert"
> To: "'gcf@npac.syr.edu'"
> Subject: Follow up on Conversation
>
> Jeffrey,
>
> I wanted to follow up on a few items you mentioned to me (and my colleagues)
> last Friday on the way to the Chicago airport from the Java Grande workshop.
>
> As we stated at the workshop, we are attempting to develop a DRM system so
> that consistent access to, and management of, high-end computational
> resources that are distributed throughout the Defense Programs complex can
> be made available to geographically dispersed users (for example, the ASCI
> and C-Plant resources).

It is perhaps worth mentioning that HLA is already making some inroads into the DoE - we noticed some interesting papers by Argonne people in the area of logistics simulations during the last HLA conference, the Fall 98 SIW in Orlando, where we presented our WebHLA work.

> 1st. We are attempting to track down the reason you have not been able to
> receive a copy of the C-Plant software so that you may load it on your
> "C-Seed Plant" hardware for evaluation. Please stand by.
>
> 2nd. As you already know, we (SNL's DisCom2/DRM Group) are trying to
> identify an implementation model that would fit nicely into our overall DRM
> logical architecture model (seven conceptual layers) that was presented last
> week at the workshop. You mentioned the WebFlow and JWorb three tier model
> as a possible implementation for our logical model. The question I have
> has to do with the maturity of the JWorb services. What environments do
> the JWorb servers support today? Does JWorb provide any APIs for accessing
> Globus managed resources? If so, to what extent? If not, are there any
> plans (either from your group or from the Globus community) for incorporating
> JWorb interfaces into Globus? You also mentioned a paper that is currently
> in draft form that addresses in more detail the three-tier model and
> specifically focuses on the JWorb concept. Is it possible for you to
> provide Sandia with a copy of this paper for our review?

Our main focus so far has been on providing JWORB services for DoD Modeling and Simulation and its Web based HPC extensions - we therefore selected DMSO HLA/RTI as the target for our first JWORB service, called Object Web RTI (i.e. DMSO RTI implemented in Java as a CORBA service - note, btw, that DMSO now has an HLA proposal before the OMG for which our OW-RTI is already a prototype implementation). The next step, and early work in progress, is a JWORB based Cluster Management Service for C-Plant - here we currently have heartbeat support operational, and we are starting to build a Clustering FOM and XML data support for resource allocation and management. Regarding Globus, we have some involvement in wrapping Globus applications as coarse grain WebFlow modules. The new WebFlow based on JWORB middleware will offer natural visual Metagrid authoring tools, with Globus as one of the supported metacomputing domains. The RCI paper included only a top level description of JWORB - more info will be available in our Wiley book by the end of this year.

> 3rd. Do you currently have any plans for writing JWorb APIs for
> interfacing with the PBS scheduler? I believe the PBS scheduler (provided
> by NASA) is being ported this year (FY99) to run on the C-Plant hardware.
> See Art Hale for further details on this tasking.

We are looking into PBS and we are planning to provide support for various clustering/scheduling tools via our Clustering FOM - but priorities are not clear yet. So far, we have looked in more detail into Condor and Beowulf while waiting for more hints from the C-Plant team.

> 4th. Do you know of any commercially available DRM systems that can
> support several meta-computing service models (i.e., Globus, Condor, Legion,
> etc...)?
We doubt there are any robust commercial tools in this area, as the market for interoperable metacomputing is yet to be built. Our ansatz is that the DoD M&S community has the most experience in this area (or at least in the subset of irregular distributed computing), and the various large scale simulation communities are now being pushed hard to interoperate due to the DoD budget cuts. Hence the HLA initiative and its early products, which seem quite promising - SISO for IEEE standards, a DoD-wide mandate, OMG presence, a target for initial commercial activities, significant interest by Boeing and other large manufacturers, etc. Adopting the HLA standard and extending it towards Web based HPCC seems to be our unique angle - so perhaps we are closer than others to a robust interoperable metacomputing framework?

> As part of our FY '99 demonstration of a DRM system, we are looking at the
> C-Plant clusters as being a critical supported component in our solution.
> In creating the DRM system, we recognize the need to take full advantage of
> commercially available products and tools as part of the overall solution. I am
> hoping that as we better understand your goals and vision, we might be able
> to leverage some of your activities and vice versa.

We would love to learn more about the DRM status. As we said before, more material on our approach, including JWORB, WebHLA etc., will be available in our new Wiley book by the end of this year.

> Any insight, clarification, or guidance you can provide is much appreciated.
>
> bp
> rdpollo@sandia.gov
> 505-844-4442
>
> ------- End of Forwarded Message

>From gcf@npac.syr.edu Sun Nov 1 16:19:03 1998
Date: Sat, 31 Oct 1998 01:04:41 -0500
From: Geoffrey Fox
To: furm@nova.npac.syr.edu, haupt@boss.npac.syr.edu
Subject: Sandia

I agreed to meet Art Hale and others Wednesday Nov 11 at SC98. He agrees that the Sandia CPLANT software is not ready for release and so encourages our general approach.

1) We need a "care package" documenting our work, probably also a proposed follow-on for next year.

We should have comments on the following:

a) Security
His concern is that Sandia security people have to check out everything on a case by case basis. He believes this is not viable, e.g. they are now checking "ORBIX on Solaris" -- this approach will not "scale". Can we arrange a general "wrapper" which, once tested by their security folks, will allow flexible implementations (ORBIX/Zeus/JWORB/NT/Solaris/Linux) underneath? What do we use in our implementation for communication? They have some sort of Kerberos environment now.

b) Thin vs. Thick Nodes
Do we need a Java VM? Will we run JWORB on all nodes or just their service nodes, leaving their single user compute nodes untouched (host-node model)?

c) Pollard (who we sent email to) likes Globus but is not so keen on Nexus.

d) He likes Vic Holmes' work at Sandia, which uses an object database to store metadata describing computation - this seems similar to the classic WebFlow linkage of simulations, data and visualization. He sees such metadata as being part of future product databases in engineering design processes.

Geoffrey Fox gcf@npac.syr.edu, http://www.npac.syr.edu
Director of NPAC and Professor of Physics and Computer Science
Phone 3154432163 (NPAC central 3154431723) Fax 3154434741
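Point a) above asks for a generic "wrapper" that could be security-reviewed once, with interchangeable ORBs and platforms underneath. A minimal Java sketch of that shape follows; every name here is a hypothetical illustration, not a real API.

import java.util.Set;

// Hypothetical single surface that security people could vet once,
// independent of which ORB (ORBIX, Zeus, JWORB) or OS sits underneath.
interface SecureChannel {
    byte[] invoke(String service, String operation, byte[] payload);
}

// Policy wrapper around any transport binding (a JworbChannel,
// OrbixChannel etc. would all implement SecureChannel alike).
class VettedChannel implements SecureChannel {
    private final SecureChannel transport;        // the interchangeable part
    private final Set<String> clearedServices;    // the reviewed policy

    VettedChannel(SecureChannel transport, Set<String> clearedServices) {
        this.transport = transport;
        this.clearedServices = clearedServices;
    }

    public byte[] invoke(String service, String operation, byte[] payload) {
        // Authentication, allow-listing and audit live here, so only this
        // class needs case-by-case review - the transports do not.
        if (!clearedServices.contains(service)) {
            throw new SecurityException("service not cleared: " + service);
        }
        System.out.println("audit: " + service + "." + operation);
        return transport.invoke(service, operation, payload);
    }
}

The point of the design is that swapping ORBIX for JWORB, or Solaris for Linux, changes only the transport binding, not the reviewed policy surface - which is what would make the review "scale".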
>From haupt@npac.syr.edu Sun Nov 1 16:19:29 1998
Date: Sun, 1 Nov 1998 11:53:42 -0500
From: Tomasz Haupt
To: "'gcf@npac.syr.edu'", "furm@nova.npac.syr.edu", "haupt@boss.npac.syr.edu"
Subject: RE: Sandia

A few comments to start with....

On Saturday, October 31, 1998 1:05 AM, Geoffrey Fox [SMTP:gcf@npac.syr.edu] wrote:
> I agreed to meet Art Hale and others Wednesday Nov 11 at SC98. He agrees
> that the Sandia CPLANT software is not ready for release and so encourages
> our general approach.
>
> 1) We need a "care package" documenting our work, probably also a proposed
> follow-on for next year.

What does "our" work mean? A collection of QS, LMS, WebHLA, ...? A coordinated overview? A general target, or one specific to Sandia? I guess it would help if we start with some outline and expectations for the total volume.

> We should have comments on the following:
>
> a) Security
> His concern is that Sandia security people have to check out everything on a
> case by case basis. He believes this is not viable, e.g. they are now checking
> "ORBIX on Solaris" -- this approach will not "scale". Can we arrange a general
> "wrapper" which, once tested by their security folks, will allow flexible
> implementations (ORBIX/Zeus/JWORB/NT/Solaris/Linux) underneath?
> What do we use in our implementation for communication?
>
> They have some sort of Kerberos environment now.

This is a very complex issue because the word "secure" is so vaguely defined. I think our strategy should be to point to solutions that are accepted by others. A good candidate is AKENTI. Thus my suggestion is to promise something along the lines of AKENTI, or at least some of its components, and delegate "responsibility" to experts in this field.

> b) Thin vs. Thick Nodes
> Do we need a Java VM? Will we run JWORB on all nodes or just their service
> nodes, leaving their single user compute nodes untouched (host-node model)?

We can adopt several different strategies. WebFlow can work as a job broker and delegate resource allocation to some other system (such as Globus or Condor) - a rough sketch of this model closes this thread. This is the model suggested by the "Seamless Access..." effort of Java Grande. Does it mean a thin node? Well, from the point of view of WebFlow - yes. On the other hand, there must be some resource allocation daemon running on each node; if they have one (Globus, Condor, etc.) there is no need to duplicate it with WebFlow. Another approach is to use a "pure" RMI/CORBA approach, which means changing the WebFlow philosophy (why not?), moving away from commodity components (?), and addressing the security issues ourselves (I do not like this idea). The "pure" WebFlow approach means fat nodes. What is interesting is that we can offer any combination of the above, because in some cases one wants a fat node, while in others (like using NSF/DOE/DoD HPCC resources) one wants to stay away from the administrative/security issues.

> c) Pollard (who we sent email to) likes Globus but is not so keen on Nexus.

This is a weird statement. Globus is layered on top of Nexus. Does he want to retarget Globus onto something else? That seems to me like a major effort (look at the resources - $$$ + manpower - Globus got), so I do not understand. What exactly is wrong with Nexus? As a user you never see Nexus.

> d) He likes Vic Holmes' work at Sandia, which uses an object database to store
> metadata describing computation - this seems similar to the classic WebFlow
> linkage of simulations, data and visualization. He sees such metadata as being
> part of future product databases in engineering design processes.
>
> Geoffrey Fox gcf@npac.syr.edu, http://www.npac.syr.edu
> Director of NPAC and Professor of Physics and Computer Science
> Phone 3154432163 (NPAC central 3154431723) Fax 3154434741
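As referenced in the comment on b) above, here is a rough Java sketch of the "WebFlow as job broker" model: the broker delegates resource allocation to whatever backend a site already runs (Globus, Condor, PBS, ...), keeping the compute nodes thin. None of these names come from WebFlow itself; they are illustrative assumptions only.

import java.util.HashMap;
import java.util.Map;

// Hypothetical backend interface: one per resource allocation system.
interface ResourceManager {
    // Submit a job description; returns a backend-specific handle.
    String submit(String jobSpec);
}

// Example backend binding; a GlobusManager or PbsManager would look alike.
class CondorManager implements ResourceManager {
    public String submit(String jobSpec) {
        // A real binding would hand the spec to the Condor scheduler;
        // stubbed here for illustration.
        return "condor-job-42";
    }
}

// The broker delegates allocation and never touches compute nodes directly
// (the thin-node model discussed above).
class JobBroker {
    private final Map<String, ResourceManager> backends = new HashMap<>();

    void register(String domain, ResourceManager rm) {
        backends.put(domain, rm);
    }

    String dispatch(String domain, String jobSpec) {
        ResourceManager rm = backends.get(domain);
        if (rm == null) {
            throw new IllegalArgumentException("no backend for domain: " + domain);
        }
        return rm.submit(jobSpec);
    }
}

In this shape the existing per-node daemons (Condor, Globus, PBS) keep doing the allocation, while the "pure" fat-node alternative would replace them with WebFlow's own runtime on every node - the combination Tom suggests is simply registering different backends per site.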