Abstract Application Descriptor (AAD)

Version 1.03, November 1999. Author: T. Haupt, Syracuse University

Application

Each atomic computational task is called an <application>. An application has a unique ID (the application name) and a flag telling whether it can be installed by the user (the installable attribute of the <application> tag). A user-installable application can be installed in user space by uploading sources and running a makefile (or by uploading Java bytecode); no special privileges, such as root access or updating a license server, are needed. The Abstract Application Descriptor comprises a list of applications, <ApplicationList>.

For each application, <requirements> can be specified (operating system, memory, number of processors, etc.). These help determine a host on which the application is to be installed, if it is installable. Note that this part of the AAD definition is very preliminary.

Finally, zero or more <target> tags can be put within the <application> tag.

Example:
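
A minimal sketch of an application list, written against the DTD at the end of this document. The application names, host names, and the free-text content of <requirements> are hypothetical; the document does not define a format for the requirements text.

```xml
<ApplicationList>
  <!-- A user-installable application known on two hosts (hypothetical names). -->
  <application id="matmul" installable="Yes">
    <!-- Free-text requirements: the format shown here is an assumption. -->
    <requirements>Solaris, 64 MB memory, 1 processor</requirements>
    <target id="host1.example.org">
      <status installed="Yes"/>
    </target>
    <target id="host2.example.org">
      <status installed="No"/>
    </target>
  </application>
  <!-- An application that the user cannot install. -->
  <application id="licensedSolver" installable="No">
    <target id="host1.example.org">
      <status installed="Yes"/>
    </target>
  </application>
</ApplicationList>
```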

Target

The <target> tag describes the installation of the application on a target host. The name of the host is provided as its id attribute. The <target> tag must include a <status> tag, and it can include optional <installed> and <howto> tags. The installed attribute of <status> simply indicates whether the application is installed on this host or not. <installed> describes the installed code, while <howto> describes how to install the code, if it is installable.

Example:
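
A minimal sketch of a single target, assuming a hypothetical host name and executable location. The optional <howto> tag is omitted because the application is already installed on this host.

```xml
<target id="host1.example.org">
  <!-- Required: is the application installed on this host? -->
  <status installed="Yes"/>
  <!-- Optional: how to run the installed code (hypothetical path and name). -->
  <installed>
    <CmdLine>
      <Command path="/usr/local/bin" exec="matmul"/>
    </CmdLine>
  </installed>
</target>
```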

Installed application

If the application is installed on a given host, the <installed> tag describes how to run it. There are 15 different tags that can be used to specify that. Most are self-describing, and only a few are discussed here (for a complete list, see the Document Type Definition at the end of this document).

The most important tag is <CmdLine>. Within it, <Command> and <Parameters> can be defined. (For backward compatibility we still preserve the <RunCommand> tag, which is identical to <Command> and should not be used.) <Command> defines the name and location of the executable, and <Parameters> describes its parameters: arguments and switches. Arguments (the <arg> tag) can be literal or symbolic, can be optional, or can occur several times.

Examples:

<arg name="infile" type="name" m="question"/>
This argument is optional (it can occur zero or one times) because of m="question". Its name is symbolic (type="name"), hence the string "infile" should be replaced by the actual name of the input file.

<arg name="infile" type="name" m="plus"/>
In this case the string "infile" is to be replaced by a list (one or more) of actual file names. (With m="star", the list can comprise zero or more file names.)

<arg name="infile" type="literal" m="fixed"/>
This argument is fixed (it must occur exactly once), and its name is literal: no name substitution is allowed.

The <NameValue> tag allows specification of arguments in the form Name=Value.
The <switch> tag makes it possible to specify command line switches. This tag has six attributes: name, ms, separator, value, type, m.

name is the name of the switch (e.g., -s).
ms takes the values fixed and question. question means optional (zero or one), while fixed forces use of the switch (for example, if -cp is declared fixed then the classpath must be defined, as in java -cp myclasspath myClass).
separator takes the values none (-sValue), space (-s Value), and equals (s=Value).
value is a string.
type can be literal (the value should be literally the string defined by the value attribute) or name (the string defined in the value attribute should be replaced by the actual value). For example, with type=name, in java -cp myclasspath myClass, myclasspath should be replaced by the actual value, such as /home/haupt/java/classes. On a given host, it may be desirable to use a predefined classpath. In such a case, it can be declared with type=literal.
m defines the "multiplicity" of switch values (a single value, a list), following the common ?, *, + conventions (question, star, plus).
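
As a sketch, the java -cp example above could be described as follows. The executable path, the <NameValue> pair, and the assumption that the elements are listed in command-line order are illustrative only; this document does not define them.

```xml
<CmdLine>
  <!-- Hypothetical location of the java executable. -->
  <Command path="/usr/bin" exec="java"/>
  <Parameters>
    <!-- Mandatory switch; "myclasspath" is symbolic and is replaced at run time. -->
    <switch name="-cp" ms="fixed" separator="space"
            value="myclasspath" type="name" m="fixed"/>
    <!-- Literal argument: the class name is used exactly as written. -->
    <arg name="myClass" type="literal" m="fixed"/>
    <!-- Argument of the form Name=Value (hypothetical). -->
    <NameValue name="verbose" value="true"/>
    <!-- One or more actual input file names replace "infile". -->
    <arg name="infile" type="name" m="plus"/>
  </Parameters>
</CmdLine>
```

Under these assumptions, a resolved command line might read: java -cp /home/haupt/java/classes myClass verbose=true a.dat b.dat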

Sometimes codes are written in such a way that they expect input data in some default location (optionally overridden by command line switches or environment variables). Similarly, the output data are stored in a predefined location. The <input> and <output> tags handle such situations. Each <input> tag requires a pair of <inFile> and <source> tags. The former describes the location where the code expects the input file, while the latter gives its actual location. Analogously, the <output> tag requires <outFile> and <dest> tags. This information is used by the proxy middle-tier module to make file transfers accordingly, if the actual files are placed on a host different from the one on which the code is running.

There are also tags to define environment variables, working spaces, and other requirements such as the number of processors; to request redirection of the standard input, output, and error streams; and to define input and output ports (events and methods) for type consistency checks while constructing the ATD.

It would follow that the information provided in the <installed> tag should be sufficient to run the code. This is not exactly true. In some cases the name of the executable should be prefixed with a string such as mpirun (with an optional -np switch). In a batch system, a batch script has to be created to actually submit a job. When Globus is used for job submission, an RSL string has to be generated, too. We believe that the information provided in <installed> is sufficient to generate batch scripts and RSL strings, and to generate the actual submit command. Those can be generated on the fly, say by middle-tier proxy objects. In some cases, however, it may be more convenient to define the RSL string in advance (and specify its location in the <RSL> tag) or to specify the location of a batch script to be generated (the <Script> tag).
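
A minimal sketch of the <input> and <output> pairs, using the attribute names from the DTD (Host, Path, Name). All hosts, directories, and file names are hypothetical.

```xml
<installed>
  <input>
    <!-- Where the code expects to find its input file. -->
    <inFile Path="/scratch/matmul" Name="input.dat"/>
    <!-- Where the file actually resides before the run. -->
    <source Host="data.example.org" Path="/home/user/data" Name="run01.dat"/>
  </input>
  <output>
    <!-- Where the code writes its output. -->
    <outFile Path="/scratch/matmul" Name="output.dat"/>
    <!-- Where the output should be delivered after the run. -->
    <dest Host="data.example.org" Path="/home/user/results" Name="run01.out"/>
  </output>
</installed>
```

With such a description, the middle tier would be expected to copy run01.dat to /scratch/matmul/input.dat before the run and to copy output.dat to the destination afterwards.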
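
A sketch of some of these tags, assuming the element name environment as spelled in its own declaration in the DTD. The variable name, directories, hosts, and the free-text requirements are hypothetical.

```xml
<installed>
  <!-- Working and data directories (hypothetical paths). -->
  <workdir name="/scratch/matmul"/>
  <datadir name="/scratch/matmul/data"/>
  <dataoutdir name="/scratch/matmul/out"/>
  <!-- Environment variables to set before the run. -->
  <environment>
    <EnvVariable name="DATA_HOME" value="/scratch/matmul/data"/>
  </environment>
  <!-- Redirection of the standard streams. -->
  <stdin Host="data.example.org" Path="/home/user/data" Name="run01.in"/>
  <stdout Host="data.example.org" Path="/home/user/results" Name="run01.stdout"/>
  <stderr Host="data.example.org" Path="/home/user/results" Name="run01.stderr"/>
  <!-- Free-text requirements: the format is an assumption. -->
  <requirements>4 processors</requirements>
</installed>
```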

How to install

At this time, this part of the AAD is very preliminary.
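
Even so, the DTD at the end of this document already fixes a shape for <howto>. The following is a sketch only: the host, directory, file, and compiler values are hypothetical, the directory element is assumed to be named source_directory as in its own declaration, and <compiler_name> is left out because the DTD does not declare its content.

```xml
<howto>
  <OS type="Solaris">
    <!-- Where the sources to be uploaded and built can be found. -->
    <sourceCode type="source">
      <host hostname="src.example.org"/>
      <source_directory dname="/home/user/matmul/src"/>
      <source_file sname="matmul.f"/>
    </sourceCode>
    <!-- Compiler location and switches (hypothetical). -->
    <compiler>
      <compiler_path path="/usr/local/bin/f77"/>
      <compiler_switch switch="-O"/>
    </compiler>
    <install_requirements/>
  </OS>
</howto>
```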

Application Descriptor (AD)

An application descriptor is a subset of the AAD with some or all ambiguities resolved. It contains information on a single application. The AAD file is created once by the Gateway administrator, whereas an AD is created for each user request. Constructing an AD starts with extracting the information on a selected application. The file is then further reduced as soon as the choice of host, arguments, switches, input and output files, and so forth is made. The decisions are made either interactively by the user or automatically by the middle-tier services (by analyzing the computational graph, by contacting a resource broker, etc.). The processing of the AD results in generating RSL strings and batch scripts, and in the actual submission of the task for execution.
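
As an illustration, a fully resolved AD might look like the sketch below: one application, one chosen host, and every symbolic name replaced by a literal value. This document does not define the exact form of an AD, so the root element, the names, and the values are all assumptions.

```xml
<!-- Hypothetical AD: a single application reduced to one target. -->
<application id="matmul" installable="Yes">
  <target id="host1.example.org">
    <status installed="Yes"/>
    <installed>
      <CmdLine>
        <Command path="/usr/local/bin" exec="matmul"/>
        <Parameters>
          <!-- The switch value has been chosen, so it is now literal. -->
          <switch name="-n" ms="fixed" separator="space"
                  value="512" type="literal" m="fixed"/>
          <!-- The symbolic "infile" has been replaced by an actual file name. -->
          <arg name="run01.dat" type="literal" m="fixed"/>
        </Parameters>
      </CmdLine>
    </installed>
  </target>
</application>
```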

AAD.dtd

<!ELEMENT ApplicationList (application)* >
<!ELEMENT application (requirements|target)*>
<!ATTLIST application
id CDATA #REQUIRED
installable (Yes|No) #REQUIRED>
<!ELEMENT requirements (#PCDATA)>
<!ELEMENT target (status,installed?,howto?)>
<!ATTLIST target
id CDATA #REQUIRED>
<!ELEMENT status EMPTY>
<!ATTLIST status
installed (Yes|No) #REQUIRED>
<!ELEMENT installed (CmdLine|RSL|Script|input|output|workdir|datadir|dataoutdir|environment|stdin|stdout|stderr|event|method|requirements)*>
<!ELEMENT CmdLine (Command|RunCommand|Parameters)*>
<!ELEMENT Command EMPTY>
<!ATTLIST Command
path CDATA #IMPLIED
exec CDATA #REQUIRED>
<!ELEMENT RunCommand EMPTY>
<!ATTLIST RunCommand
path CDATA #IMPLIED
exec CDATA #REQUIRED>
<!ELEMENT Parameters (arg|switch|NameValue)*>
<!ELEMENT arg EMPTY>
<!ATTLIST arg
name CDATA #REQUIRED
type (literal|name) #IMPLIED
m (fixed|question|plus|star) #IMPLIED>
<!ELEMENT NameValue EMPTY>
<!ATTLIST NameValue
name CDATA #REQUIRED
value CDATA #REQUIRED>
<!ELEMENT switch EMPTY>
<!ATTLIST switch
name CDATA #REQUIRED
ms (fixed|question) #IMPLIED
separator (none|space|equals) #IMPLIED
value CDATA #IMPLIED
type (literal|name) #IMPLIED
m (fixed|question|plus|star) #IMPLIED>
<!ELEMENT RSL (#PCDATA)*>
<!ATTLIST RSL
rsl CDATA #REQUIRED
value CDATA #REQUIRED>
<!ELEMENT Script (#PCDATA)*>
<!ATTLIST Script
module CDATA #REQUIRED
class CDATA #REQUIRED
idl CDATA #REQUIRED
command CDATA #REQUIRED>
<!ELEMENT input (#PCDATA|inFile|source)*>
<!ELEMENT output (#PCDATA|outFile|dest)*>
<!ELEMENT inFile EMPTY>
<!ATTLIST inFile
Path CDATA #REQUIRED
Name CDATA #REQUIRED>
<!ELEMENT outFile EMPTY>
<!ATTLIST outFile
Path CDATA #REQUIRED
Name CDATA #REQUIRED>
<!ELEMENT source EMPTY>
<!ATTLIST source
Host CDATA #REQUIRED
Path CDATA #REQUIRED
Name CDATA #REQUIRED>
<!ELEMENT dest EMPTY>
<!ATTLIST dest
Host CDATA #REQUIRED
Path CDATA #REQUIRED
Name CDATA #REQUIRED>
<!ELEMENT workdir EMPTY>
<!ATTLIST workdir
name CDATA #REQUIRED>
<!ELEMENT datadir EMPTY>
<!ATTLIST datadir
name CDATA #REQUIRED>
<!ELEMENT dataoutdir EMPTY>
<!ATTLIST dataoutdir
name CDATA #REQUIRED>
<!ELEMENT environment (EnvVariable)*>
<!ELEMENT EnvVariable EMPTY>
<!ATTLIST EnvVariable
name CDATA #REQUIRED
value CDATA #REQUIRED>
<!ELEMENT stdin EMPTY>
<!ATTLIST stdin
Host CDATA #REQUIRED
Path CDATA #REQUIRED
Name CDATA #REQUIRED>
<!ELEMENT stdout EMPTY>
<!ATTLIST stdout
Host CDATA #REQUIRED
Path CDATA #REQUIRED
Name CDATA #REQUIRED>
<!ELEMENT stderr EMPTY>
<!ATTLIST stderr
Host CDATA #REQUIRED
Path CDATA #REQUIRED
Name CDATA #REQUIRED>
<!ELEMENT event (#PCDATA)*>
<!ATTLIST event
event_name CDATA #REQUIRED>
<!ELEMENT method (#PCDATA)*>
<!ATTLIST method
method_name CDATA #REQUIRED>
<!ELEMENT howto (OS)+>
<!ELEMENT OS (sourceCode,compiler,install_requirements)*>
<!ATTLIST OS
type (WinNT|Solaris|Compaq|IRIX5|IRIX6|JDK1.1|JDK1.2) #REQUIRED>
<!ELEMENT sourceCode (host|source_directory|source_file)*>
<!ATTLIST sourceCode
type (source|executable) #REQUIRED>
<!ELEMENT host EMPTY>
<!ATTLIST host
hostname CDATA #REQUIRED>
<!ELEMENT compiler (compiler_name|compiler_path|compiler_switch)*>
<!ATTLIST compiler_name
language (Fortran77|Fortran90|HPF|C|CPP|Java1.1|Java1.2) #REQUIRED>
<!ELEMENT compiler_path EMPTY>
<!ATTLIST compiler_path
path CDATA #REQUIRED>
<!ELEMENT compiler_switch EMPTY>
<!ATTLIST compiler_switch
switch CDATA #REQUIRED>
<!ELEMENT source_directory EMPTY>
<!ATTLIST source_directory
dname CDATA #REQUIRED>
<!ELEMENT source_file EMPTY>
<!ATTLIST source_file
sname CDATA #REQUIRED>
<!ELEMENT install_requirements EMPTY>