.opennet, or the non-Microsoft approach to Peer-to-Peer Grid Services

This consists of:

** XML databases (files and streams)
** Oracle 8i and equivalent (web-enabled DB)
** Oracle 9i and equivalent (XML-enabled DB)
** Java and possibly Enterprise JavaBeans
** Advanced (in major ways) versions of JMS (changing the model to XML-defined topic objects matched to XML object subscriber profiles) and JDBC (an XML Query rather than SQL interface)
** Web or Grid Service models
** Peer-to-Peer network technology, as we want to build communities linking people (P2P) with resources (defined as Grid or Web services)
** C++ and CORBA (not so certain here; maybe this is "legacy .opennet". I was told that a new start-up from Iona (CORBA) just got substantial funding for a SOAP-based environment!)

These appear to be modern technologies to be used in building distributed systems that are not Microsoft-specific. .net is of course the Microsoft alternative, which will in fact use some technologies, like SOAP, in common with .opennet. Presumably the Grid is the sum of .net and .opennet. The relevant technologies are mostly immature, but I think they are important in identifying functionalities -- e.g. RDF is a controversial technology, but I believe it teaches us about some important features of .opennet, which may eventually get implemented in a very different way.

It appears to me that the building blocks of .opennet are XML Schemas. These seem destined to be the way all data structures are defined in the next generation of systems. There are several areas within the XML and Schema specifications which are either unclear or just missing. For instance, in the following we seem to need several functionalities that can be present in all Schemas -- roughly the equivalent of interfaces for Java classes. We need XML parsers that understand these interfaces, which imply additional elements/attributes that can be present in XML instances even if not defined in the basic Schema. XML interfaces of interest include:

** javarization or programization: defines the conversion of an XML Schema into associated Java (or any programming language) classes.
** binarydataencoding: specifies how one can incorporate non-ASCII data (such as Microsoft Word files) into an XML stream. This is presumably some implementation of the MIME techniques that are well established in similar e-mail situations.
** talkabout: this is meant to generalize the capability of RDF to make assertions about XML instances. For example, it implies that elements of a Schema implementing the RDF talkabout interface can have attributes like rdf:resource.

These particular interfaces could be viewed as associated with the Schema or "just" as control parameters of the parser. javarization is a property that could need customization for individual Schemas, with a default approach specified to the system. The name doesn't sound very nice; maybe roasting, or whatever one does to coffee in its processing, is a better term.

Simplification: hereafter I assume that Java is the programming language of .opennet. Obviously one could substitute another language. In the applications I am interested in, I need several hundred different classes, and I assume large-scale commercial applications can be orders of magnitude larger. Thus it seems that we must systematize the conversion between XML Schemas and Java classes. We assume that the process is:

a) Design the XML Schema of the External Data -- the data to be transmitted between the different components of the system.

b) Choose a default roasting method for mapping Schemas into Java, and specify Schema-specific roastings when necessary -- either because the mapping is unclear or because it does not produce an elegant structure for the Java programmer. The generated Java classes are produced with an inverseroasting (xmlarization) interface, which defines the Java-to-XML conversion. Castor and Indiana have roasting schemes.

c) The programmer is free to use any class specifications for objects used internally to the Java program.
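To make roasting concrete, here is a minimal sketch of what a default roasting might emit; everything in it -- the Photo element, the Roastable interface, the toXML method -- is a hypothetical illustration, not part of Castor or any actual specification:

    // A Schema fragment such as
    //   <xs:element name="Photo">
    //     <xs:complexType><xs:sequence>
    //       <xs:element name="title" type="xs:string"/>
    //       <xs:element name="url"   type="xs:anyURI"/>
    //     </xs:sequence></xs:complexType>
    //   </xs:element>
    // could roast into a Java class whose fields mirror the children.

    // Hypothetical interface carried by every roasted class; its one
    // method is the inverseroasting (xmlarization) direction, Java to XML.
    interface Roastable {
        String toXML();
    }

    public class Photo implements Roastable {
        private String title;
        private String url;

        public String getTitle()       { return title; }
        public void setTitle(String t) { title = t; }
        public String getUrl()         { return url; }
        public void setUrl(String u)   { url = u; }

        // inverseroasting: regenerate the XML instance (escaping omitted)
        public String toXML() {
            return "<Photo><title>" + title + "</title>"
                 + "<url>" + url + "</url></Photo>";
        }
    }

A Schema-specific roasting would override such defaults wherever the generated structure is not what the Java programmer wants.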
We assume that all External Data, when outside our Java program, is logically represented in XML. Information transmitted by the system consists of XML forms of External Data and also "events" or messages controlling the system. The latter are also logically represented as XML when outside our program; in fact such events are just additional External Data generated by the running system. The external XML External Data is wrapped up in some protocol like SOAP and can use the talkabout interface to annotate the data to specify the purpose of the transmission.

Maybe XML is not efficient enough as a transmission format, and so we define fastXML as the actual transmitted form. This is just a performance issue: fastXML could be used to avoid time-consuming conversions at each end of a communication link, and it could also represent element and attribute values more efficiently than character strings. The difference between fastXML and XML is seen only by the transport and marshalling subsystems; it is transparent to the applications (see the sketch after this list). A natural choice for fastXML appears to be RMI using serialization of the Java version of the External Data. Thus we find the following transport scenarios:

** XML over SOAP -- the natural but not always most efficient transport mechanism in .opennet. The XML could be produced by inverseroasting of Java objects.
** RMI over SOAP -- as developed at Indiana. Used when RMI (serialized Java) is selected as the data representation format and SOAP as the appropriate transport protocol. In principle the serialized Java could come from roasting XML.
** Java over RMI -- where the Java could be fastXML, either obtained by roasting XML instances or by "not inverseroasting" data in our Java program.
** XML over RMI -- here XML is roasted to Java as fastXML and transported with RMI.
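As a sketch of that transparency, assume a marshalling layer sits between the application and the transport; the Marshaller interface, ExternalDatum class and both implementations below are hypothetical names, not an existing API:

    import java.io.*;

    // Hypothetical marshalling layer: applications hand over External
    // Data objects; only transport code decides the wire form.
    interface Marshaller {
        byte[] marshal(ExternalDatum d) throws IOException;
    }

    // An External Data object carrying its inverseroasting (toXML)
    // method; Serializable so it can also travel as fastXML.
    class ExternalDatum implements Serializable {
        String title;
        ExternalDatum(String title) { this.title = title; }
        String toXML() { return "<datum><title>" + title + "</title></datum>"; }
    }

    // Wire form 1: textual XML produced by inverseroasting (SOAP transport).
    class XmlMarshaller implements Marshaller {
        public byte[] marshal(ExternalDatum d) {
            return d.toXML().getBytes();
        }
    }

    // Wire form 2: fastXML as serialized Java, the natural payload for RMI.
    class FastXmlMarshaller implements Marshaller {
        public byte[] marshal(ExternalDatum d) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(d);   // standard Java serialization
            }
            return bytes.toByteArray();
        }
    }

The application constructs ExternalDatum objects and never learns whether XmlMarshaller or FastXmlMarshaller produced the bytes on the wire.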
It is essential for many applications that one be able to include binary data in XML data nuggets. In GXOS, one wishes to be able to reference an object either externally (as in CORBA, Java etc.) or internally as an attachment. We should model this on the SOAP approach (http://www.w3.org/TR/SOAP-attachments, http://www.gotdotnet.com/team/xml_wsspecs/dime/default.htm) and perhaps add an appropriate XML attribute (binarydataencoding) to specify the approach and so allow it to change.

The Java program must query the backend store; even if this ends up as a table eventually, a SQL-like syntax (JDBC) seems very unnatural. It is more natural to use a DOM-like interface (XPATH). Approaches are XQUERY and http://www.xmldb.org. We term this emerging collection of interfaces java.opennet. For small problems (Personal.opennet) the easiest implementation is to read all the information into memory, so that in this case java.opennet must reduce to natural memory lookup for the given roasting (a sketch follows below). For large problems (Enterprise.opennet), the CPU memory typically forms a cache, and Enterprise JavaBeans appear a useful implementation vehicle.
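Here is a minimal sketch of the Personal.opennet case, using the standard Java XPath machinery as a stand-in for the eventual java.opennet interface; the PersonalStore class and its select method are hypothetical names:

    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.*;
    import org.w3c.dom.*;
    import java.io.*;

    // Hypothetical java.opennet-style lookup for Personal.opennet: the
    // whole tree is read into memory and queried with XPath, not SQL.
    public class PersonalStore {
        private final Document tree;   // the single Schema-described tree
        private final XPath xpath = XPathFactory.newInstance().newXPath();

        public PersonalStore(InputStream xml) throws Exception {
            tree = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(xml);
        }

        // A DOM/XPath-style query replaces the SQL of JDBC; e.g.
        //   select("//Photo[title='Beach']/url")
        public NodeList select(String path) throws XPathExpressionException {
            return (NodeList) xpath.evaluate(path, tree, XPathConstants.NODESET);
        }
    }

For Enterprise.opennet the same select call would instead be delegated to an XQUERY-capable backend, with memory acting only as a cache.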
In my applications a single tree (Schema) describes all XML instances; a typical data nugget is a node of the tree described by a Subschema. This has some implications described later.

We presumably want to view the existing Java-JDBC-DBMS model as a special case of .opennet. "Legacy" web-linked databases either are treated "as is", bypassing .opennet completely, or we find the Schema equivalent to a given relational structure and fit in the legacy approach this way. If we use the optional mediator of the figure, the JDBC calls are translated into XML messages sent to the mediator, and the traditional database driver runs on the mediator or on a subsystem invoked by it.

The mediator (or intelligent router) is nominally a publish/subscribe server. We term the standard here GMS -- the Grid Message Service. As discussed elsewhere, GMS can be implemented either as one or a few servers (JMS) or as a P2P network as in JXTA. All transmission in GMS uses fastXML, with Schemas (in addition to those needed by External Data) to define the special attributes needed for such a service. GXOS has a start on this but needs enhancement for the P2P case, among other things. GMS would have the following differences from JMS (a sketch of the changed matching appears below):

** Transport could be SOAP or RMI; data encoding could be XML or serialized Java.
** The publish/subscribe model would be implemented as interacting agents representing publishers and subscribers. This service would use XML objects and not the SQL-like property testing in JMS.
** Currently leading JMS providers do refined tests on clients; for efficiency this should be moved to servers.

The mediator approach supports distributed heterogeneous data subsystems, collaboration (with multiple subscribers to a given publication) and asynchronous data access and delivery. Perhaps most importantly, the use of a mediator (or more precisely multiple networked mediators) allows one to support P2P networks with .opennet technology. Looking a tiny bit more deeply at the mediator for P2P, we note that:

a) A mediator is the control center of a collection of JXTA pipes -- asynchronous message queues.

b) The "mediator intelligence" is less critical than in the central server case -- messages move just a bit in the right direction each hop; in the central server case discussed above (and used in JMS) one has but one chance to get the multicasting right!
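A minimal sketch of the changed publish/subscribe model, assuming XPath expressions serve as the XML object subscriber profiles; the Mediator class and its methods are hypothetical, not JMS or JXTA API:

    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.*;
    import org.w3c.dom.Document;
    import java.io.ByteArrayInputStream;
    import java.util.*;

    // Hypothetical GMS mediator core: a subscriber profile is an XPath
    // expression over the XML event, replacing JMS SQL-like selectors.
    public class Mediator {
        private final Map<String, List<String>> profiles = new HashMap<>();
        private final XPath xpath = XPathFactory.newInstance().newXPath();

        // subscribe with an XML object profile, e.g. "//Photo[owner='me']"
        public void subscribe(String subscriber, String profile) {
            profiles.computeIfAbsent(subscriber, k -> new ArrayList<>())
                    .add(profile);
        }

        // publish an XML event; the refined matching runs here on the
        // server, not on the clients as in current JMS implementations
        public List<String> publish(String eventXml) throws Exception {
            Document event = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(eventXml.getBytes()));
            List<String> recipients = new ArrayList<>();
            for (Map.Entry<String, List<String>> e : profiles.entrySet())
                for (String profile : e.getValue())
                    if ((Boolean) xpath.evaluate("boolean(" + profile + ")",
                                                 event, XPathConstants.BOOLEAN)) {
                        recipients.add(e.getKey());
                        break;
                    }
            return recipients;
        }
    }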
search.opennet

This can have either a central (server) or P2P style of search, depending on the nature of the mediator service. As well as the distribution of search requests (the mediators), one can distinguish the style of the search request. These styles would reflect different interfaces in java.opennet. Two obvious styles are:

** Directory: equivalent to a directory in Google or Yahoo, with a URI label being part of the search. This could correspond to searching database indices -- the URI must be indexed in any reasonable database support of .opennet.
** Gallimaufry: this, you will find, is the hodge-podge of scattered information, which has a search interface nearer that of Web Search in Google or Yahoo.

Technically, in Directory search the URI and/or indexed metadata are considered critical, as they classify the meaning of the information. In the Gallimaufry one searches on less precise information, as the URI is probably not so helpful: it often specifies some accidental property (such as computer location) and not meaningful knowledge. Consider the infamous 1000-person accelerator physics experiment. The raw data and later processed data are clearly naturally organized as a Directory with precise URIs and precise computer-generated metadata. The intellectual electronic heart of this endeavor is different; it is the contents of the disks of the personal and other computers associated with the involved scientists. Here are PowerPoint presentations, half-finished reports, group pictures, results of analysis jobs etc. As in Napster, one needs to search this Gallimaufry to truly support the experiment. Both forms of information are important: the Directory is nearest the classical digital library, while the Gallimaufry captures the intrinsic disorganization of innovation and is nearest the P2P model. Both search styles must be supported.

uri.opennet

Both to include the semantic web concepts and as a central organizing principle, it is useful to assume that any permanent resource in .opennet has a URI. Here a resource is any digital entity, i.e. any distributed object. This URI can have the allowed form (see http://www.isi.edu/in-notes/rfc2396.txt) opennet://one/two/three/ ...... /nthnumber/, but it need not have any special hierarchical significance; rather, the strings between slashes can be the values of your favorite keywords.

GXOS.*.opennet

Here * can be Personal or Enterprise. In this .opennet system, all actions are provoked by events, which contain an XML script provisionally written in RDF. One critical feature is the use of a URI to uniquely label all items in one's Enterprise or Personal data; the Personal data just consist of a selection of nodes from the full Enterprise GXOS tree. The intended applications -- education/training and scientific research -- appear to lead naturally to data nuggets with a unique URI. Anything permanent must have a URI; this includes messages to be preserved either as a log or to replay a session. Ephemeral messages can be labeled with 128-bit UUIDs as in JXTA.

Consider a document store generated with GXOS. The entries in this store are not necessarily the documents themselves but are typically meta-objects containing the meta-data for an object, including a URL or other object access specification. These are service advertisements in JXTA. These meta-objects are labeled by URIs, which could be looked up by java.opennet. The store is grown by any methodology that generates an event sent to the mediator and containing a specification of a resource. These messages can come from several sources, including e-mail, the output of an online form etc. The messages are processed asynchronously and in distributed fashion.

Suppose GXOS is used to manage your photo collection. You can send e-mail to the mediator with a photo as an attachment and meta-data in the mail body; if this work was being done on your disconnected laptop, these messages would be sent later, and gradually GXOS would build the photo database as a collection of simple messages (a sketch of such a message follows at the end of this section). Instead of an attachment, the digital photos could be on a web server, and one can just record the URL of this in your meta-data message. As well as this low-technology solution, one could have a wizard directly generating XML messages from a custom GUI requesting the allowed meta-data elements. The same approach can be used to build databases of documents for training, research, project management etc.

The stored information can be modified by update events referencing the URI and specifying one or more changed elements. These update events could be processed in a FIFO fashion for each resource. More precisely, one can associate a lock attribute with each element, first issue a message requesting a lock (i.e. setting the lock attribute) and, after getting this, do the update safely (also sketched below). Powerful threaded discussion lists could naturally be built with this technology.
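Returning to the photo example, here is a hypothetical sketch of the XML event a mail handler might generate to grow the store; the gxos element names, the namespace URL and the URI scheme shown are all invented for illustration:

    // Hypothetical XML event built from an e-mailed photo message.
    public class PhotoEventDemo {
        static String metadataEvent(String uri, String title, String photoUrl) {
            return "<gxos:event xmlns:gxos='http://gxos.example/ns'>"
                 + "<gxos:resource uri='" + uri + "'>"
                 + "<title>" + title + "</title>"
                 // the photo stays on its web server; we record only its URL
                 + "<location binarydataencoding='none'>" + photoUrl + "</location>"
                 + "</gxos:resource>"
                 + "</gxos:event>";
        }

        public static void main(String[] args) {
            // composed on a disconnected laptop, sent to the mediator later
            System.out.println(metadataEvent(
                "opennet://me/photos/2001/beach/",
                "Beach",
                "http://photos.example/beach.jpg"));
        }
    }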
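And a minimal sketch of the lock-then-update discipline, assuming the mediator tracks lock holders per resource URI; the class and method names are again illustrative, not part of GXOS:

    import java.util.*;

    // Hypothetical per-resource locking for update events: a client first
    // requests the lock (setting the lock attribute), then updates safely.
    public class ResourceLocks {
        private final Map<String, String> lockHolder = new HashMap<>();

        // event 1: request the lock; returns true if granted
        public synchronized boolean requestLock(String uri, String client) {
            String holder = lockHolder.putIfAbsent(uri, client);
            return holder == null || holder.equals(client);
        }

        // event 2: apply the update only if the sender holds the lock
        public synchronized boolean update(String uri, String client,
                                           String newXml) {
            if (!client.equals(lockHolder.get(uri))) return false;
            // ... write newXml into the store here (FIFO per resource) ...
            lockHolder.remove(uri);   // release the lock after a safe update
            return true;
        }
    }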