* I am trying to understand the role for XSIL in this project. Generally, it seems that a user will have input consisting of parameters (name/value pairs), input data (either internal to the XSIL doc, or external files, possibly URLs), output data, etc. For Gaussian, there are 150+ pages of keywords in the manual. I don't think we should get into the business of breaking down Gaussian input into XSIL, since this would be a mammoth task that we would have to repeat for every single code. So I think the thing to do is assume the Chemistry user knows how to construct his/her input file. An XSIL version of this would simply wrap the input file (test000.com) and output file (test000.out) and not worry about the Gaussian-specific content at all.

  Here test000.com is the good old water input file:

      # SP, RHF/STO-3G punch=archive trakio scf=conventional

      Gaussian Test Job 00
      Water with archiving

      0 1
      O
      H 1 0.96
      H 1 0.96 2 109.471221

  The first line contains some directives specifying a single-point (SP) energy calculation, the method and basis functions to use (RHF/STO-3G), a "punched" output file (sent to the archive), routine-by-routine statistics on I/O and cpu usage (trakio), and that two-electron integrals are stored on disk and read in for each SCF iteration (scf=conventional). Some title information follows, and then the coordinates of the water molecule are given in Z-matrix form. This is about as simple as it gets.

  Anyway, I presume that XSIL should be able to read in the .xml input wrapper and extract the relevant information: what to name the job, what the input file is, and what the output file is.

  On the other hand, if my code has relatively generic input (such as my Ising code), the user will need to create an input file like the following:

      1000
      2.4
      19212211
      30
      ising.out

  And of course I may want to string together any number of these as part of a parameter search. For instance, I may want to do a parameter search over temperature, incrementing "Temp" by 0.1, sweeping the range 0.0 to 3.0.
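  A minimal sketch of what such wrappers might look like, using XSIL's <XSIL> and <Param> elements. The Name attribute values (InputFile, OutputFile, Sweeps, Seed, LatticeSize) are my own illustrative guesses at what the fields mean, not anything defined by the XSIL spec or by the codes themselves; only "Temp" comes from the Ising example above.

```xml
<?xml version="1.0"?>
<!-- Hypothetical wrapper for the Gaussian job; Param names are guesses -->
<XSIL Name="test000">
  <Param Name="InputFile">test000.com</Param>
  <Param Name="OutputFile">test000.out</Param>
</XSIL>
```

```xml
<?xml version="1.0"?>
<!-- Hypothetical wrapper for the Ising job; names other than Temp are guesses -->
<XSIL Name="ising">
  <Param Name="Sweeps">1000</Param>
  <Param Name="Temp">2.4</Param>
  <Param Name="Seed">19212211</Param>
  <Param Name="LatticeSize">30</Param>
  <Param Name="OutputFile">ising.out</Param>
</XSIL>
```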
  One problem with this approach is that XSIL only defines input streams, so the output tags should be changed to parameters, and we will then write our own output-handling code. Possibly this is the place for an extension.

* I will take it as given that the following pieces are in some working order, although improvements will need to be made:
  - Overall architecture
    # JSP front end
    # WebFlow middle tier, perhaps supplemented by javabeans
  - Job submission
  - Job monitoring (although the current version will need enhancement)
  - Record keeping, via the ContextData
  - How to store info collected from the user (ContextData again)
  - Remote file management, including uploading and downloading via http, and transferring files between different backend systems (i.e., moving output files to archival mass storage systems)

  The following tasks need to be fleshed out and are top priority:
  - XML/DTD description of objects, hardware, queues. Do we need just one mega-XML file that describes everything about a system, or do we need several complementary XML descriptions: one for codes ("application objects"), one for hardware ("hardware objects"), one for ...?
  - Generation of queue scripts from input data. Need to be able to handle different queuing systems (PBS, LoadLeveler). Need examples of how different codes take input and generate output. I can think of three cases:
    # Standard I/O: a.out < input.dat > output.dat
    # C-style: a.out input1.dat input2.dat param1 output1.dat
    # Internal: Data files have standard names and are expected to be in certain locations that are specified within the code.
    I will modify the Ising.f test code so that there are versions that do each of the above three.
  - XSIL description of input data. What is the proper role for XSIL? I'm not sure the <Stream> tag is useful for running codes, because we don't want to read data files into a Java program. We could use this for building visualization/analysis tools, however. The <Param> tags will be useful as is.
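  The queue-script generation above could be sketched as a small class hierarchy, with one child per queuing system. This is only an illustration: the class and method names (QueueScriptWriter, PbsScriptWriter, stdIoScript) are mine, not the existing createJobScript bean, and the PBS directives shown are the standard -N and -l walltime options.

```java
// Hypothetical sketch: one abstract writer, one subclass per queuing system.
abstract class QueueScriptWriter {
    // Each queuing system supplies its own header directives.
    abstract String header(String jobName, int minutes);

    // Standard-I/O case from the notes: a.out < input.dat > output.dat
    String stdIoScript(String jobName, int minutes,
                       String exe, String in, String out) {
        return header(jobName, minutes)
             + exe + " < " + in + " > " + out + "\n";
    }
}

class PbsScriptWriter extends QueueScriptWriter {
    String header(String jobName, int minutes) {
        return "#PBS -N " + jobName + "\n"
             + "#PBS -l walltime=00:" + minutes + ":00\n";
    }
}
```

  A LoadLeveler child would override only header(), leaving the I/O-case logic shared.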
  The following will be needed in the long term:
  - Better event generation from the backend. I want to do away with the notify.pl script.
  - Revise the job monitor so that (i) the guts form a new WebFlow module, (ii) additional capabilities are present, and (iii) it uses existing WebFlow modules (submitJob, ContextManager) where appropriate.

* My first whack at writing the ApplDesc XML/DTD needs to be revised or thrown out. The Christmas-tree approach tends to be harder to manipulate and write generic methods for. Also, we may be able to capture all of the information using just the XSIL DTD, even though this is not the intent of XSIL. We gain the advantage of using XSIL's built-in document-eating capabilities.

* I think the really killer selling point for this system is to allow a user to easily do parametric studies. He must be able to easily submit and manage dozens (perhaps hundreds or more) of jobs with slightly varying inputs. For example, in the Ising model simulation, I may want to submit 30 jobs with the temperature ranging from 0.1 to 3.0 to study the energy and magnetization over this range. It looks like XSIL's <Param> tag may be perfect for this.

* Xlook extensions can probably be placed in applets and used for displaying results, etc. However, it seems that this would be a specialized activity for specific codes.

* I suppose that we can do away with ServerServlet. JSP provides a way to maintain persistence ("session" or "application" scopes) between pages. So it is actually not necessary to share the static hash tables that ServerServlet creates. We can create a more natural interface that is not tied to browsers but can be incorporated with Java GUIs as well.

* We have a set of defined services: job submission, job monitoring, vertical file services (upload and download), horizontal file services (transfer to/from backend archival services), security, session management (ContextManager, ContextData), and event notification. The events need to be defined.
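  The temperature sweep above could be driven by a small generator that emits one <Param> entry per job. This is a sketch under assumptions: the class name TempSweep is mine, and only the parameter name "Temp" and the 0.1-to-3.0 range come from the notes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: generate one XSIL-style <Param> entry per job for a temperature
// sweep from 0.1 to 3.0 in steps of 0.1 (30 jobs).
public class TempSweep {
    public static List<String> paramEntries() {
        List<String> entries = new ArrayList<String>();
        for (int i = 1; i <= 30; i++) {
            // Build "0.1" ... "3.0" with integer arithmetic to avoid
            // floating-point and locale formatting surprises.
            String t = (i / 10) + "." + (i % 10);
            entries.add("<Param Name=\"Temp\">" + t + "</Param>");
        }
        return entries;
    }
}
```

  Each entry would then be substituted into a job description and handed to the submission service, giving the user a whole parameter study from one form.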
  We also have our data representations and our user interface components. We want to keep this last bit as general as possible, to make it easily extensible to whatever format. So what we really need at some point is to define interfaces (Java interfaces, in fact) that capture the desired functionality. This is different from the WebFlow modules and their IDLs (although they may define the same things). My idea is that we have another layer between the user and WebFlow. Define Java interfaces that "developer"-written code can extend for acquiring the various services. For instance, for a browser interface, all of the functionality can be deferred to beans, which extend the ContextManager, submitJob, etc., interfaces as needed. The interface is actually implemented with WebFlow objects, but in principle we can rip out the guts and use something entirely different from WebFlow, without having to change the code that uses the interfaces.

* Anyway, I think we should get out some early applications, while keeping an eye on issues like the above.

* I want to modify the tomcat configuration so that it has a new context, GOW (for Generic Object Wrapper or maybe Gateway Object Wrapper). Added a new line to server.xml. Also modified tomcat-apache.conf to mount another servlet directory. Moved the Generic directory to GOW, created WEB-INF underneath. Moved classes_iiop, classes_kerb, beancontext_2.2.3 (kerb and iiop) and Servlets (kerb and iiop) underneath WEB-INF. Also, descriptors.

* Jini may be perfect for delivering events persistently. Need to investigate.

* I'd like to do away with ServerServlet. That idea was based on an incomplete understanding of Tomcat. Tomcat servers already provide the persistence we need, so we don't need the static hashtables of ServerServlet.

* The createJobScript bean actually needs to be a hierarchy, with a different child for each type of queuing system.

* Do we need something besides the ApplDesc.xml database? Something like MDS? Can Jini fill this role?
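  The interface layer described above might look like the following. All names here (JobSubmissionService, LocalStubSubmitter, submit) are illustrative assumptions, not existing WebFlow IDL; the point is only that callers depend on the interface, so the WebFlow-backed implementation can be swapped out.

```java
// Hypothetical service-layer sketch; names are mine, not WebFlow's.
public interface JobSubmissionService {
    // Returns a job id. A WebFlow-backed implementation would delegate
    // to the submitJob module, but callers never see WebFlow directly.
    String submit(String queueScript);
}

// A trivial stand-in implementation, showing that code written against
// the interface works without any WebFlow machinery behind it.
class LocalStubSubmitter implements JobSubmissionService {
    private int counter = 0;

    public String submit(String queueScript) {
        // Real implementations would hand the script to a queuing system;
        // here we just mint a sequential id.
        return "job-" + (++counter);
    }
}
```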