.opennet, or the non-Microsoft approach to Peer-to-Peer Grid Services

This consists of:

** XML databases (files and streams)
** Oracle 8i and equivalent (web-enabled DB)
** Oracle 9i and equivalent (XML-enabled DB)
** Java and possibly Enterprise JavaBeans
** Advanced (in major ways) versions of JMS (changing the model to XML-defined topic objects matched to XML object subscriber profiles) and JDBC (an XML Query rather than SQL interface)
** Web or Grid Service models
** Peer-to-Peer network technology, as we want to build communities linking people (P2P) with resources (defined as Grid or Web services)
** C++ and CORBA (not so certain here; maybe this is "legacy .opennet". I was told that a new start-up from Iona (CORBA) just got substantial funding for a SOAP-based environment!)

These appear to be modern technologies to be used in building distributed systems that are not Microsoft-specific. .net is of course the Microsoft alternative, which will in fact use some technologies, like SOAP, in common with .opennet. Presumably the Grid is the sum of .net and .opennet. The relevant technologies are mostly immature, but I think they are important in identifying functionalities -- e.g. RDF is a controversial technology, but I believe it teaches us about some important features of .opennet, which may eventually get implemented in a very different way.

It appears to me that the building blocks of .opennet are XML Schemas. These seem destined to be the way all data structures are defined in the next generation of systems. There are several areas within the XML and Schema specifications which are either unclear or just missing. For instance, in the following we seem to need several functionalities that can be present in all Schemas -- roughly the equivalent of interfaces for Java classes. We need XML parsers that understand these interfaces, which imply additional elements/attributes that can be present in XML instances even if not defined in the basic Schema. XML interfaces of interest include:

** javarization or programization: defines the conversion of an XML Schema into associated Java (or any programming language) classes.
** binarydataencoding: specifies how one can incorporate non-ASCII data (such as Microsoft Word files) into an XML stream. This is presumably some implementation of the MIME techniques that are well established in similar e-mail situations.
** talkabout: this is meant to generalize the capability of RDF to make assertions about XML instances. For example, it implies that elements of a Schema implementing the RDF talkabout interface can have attributes like rdf:resource.

These particular interfaces could be viewed as associated with the Schema or "just" as control parameters of the parser. javarization is a property that could need customization for individual Schemas, with a default approach specified to the system. The name doesn't sound very nice; maybe roasting, or whatever one does to coffee in its processing, is a better term.

Simplification: hereafter I assume that Java is the programming language of .opennet. Obviously one could substitute another language. In the applications I am interested in, I need several hundred different classes, and I assume large-scale commercial applications can be orders of magnitude larger. Thus it seems that we must systematize the conversion between XML Schemas and Java classes. We assume that the process is:

a) Design the XML Schema of the External Data -- the data to be transmitted between the different components of the system.

b) Choose a default roasting method for mapping Schemas into Java, and specify Schema-specific roastings when necessary -- either because the mapping is unclear or because it does not produce an elegant structure for the Java programmer. The generated Java classes are produced with an inverseroasting (xmlarization) interface, which defines the Java-to-XML conversion. Castor and Indiana have roasting schemes.

c) The programmer is free to use any class specifications for objects used internally to the Java program.
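To make roasting concrete, here is a minimal sketch of what a default roasting might emit; everything in it -- the Photo element, the Roastable interface, the toXML method -- is a hypothetical illustration, not part of Castor or any actual specification:

    // A Schema fragment such as
    //   <xs:element name="Photo">
    //     <xs:complexType><xs:sequence>
    //       <xs:element name="title" type="xs:string"/>
    //       <xs:element name="url"   type="xs:anyURI"/>
    //     </xs:sequence></xs:complexType>
    //   </xs:element>
    // could roast into a Java class whose fields mirror the children.

    // Hypothetical interface carried by every roasted class; its one
    // method is the inverseroasting (xmlarization) direction, Java to XML.
    interface Roastable {
        String toXML();
    }

    public class Photo implements Roastable {
        private String title;
        private String url;

        public String getTitle()       { return title; }
        public void setTitle(String t) { title = t; }
        public String getUrl()         { return url; }
        public void setUrl(String u)   { url = u; }

        // inverseroasting: regenerate the XML instance (escaping omitted)
        public String toXML() {
            return "<Photo><title>" + title + "</title>"
                 + "<url>" + url + "</url></Photo>";
        }
    }

A Schema-specific roasting would override such defaults wherever the generated structure is not what the Java programmer wants.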
We assume that all External Data, when outside our Java program, is logically represented in XML. Information transmitted by the system consists of XML forms of External Data and also "events" or messages controlling the system. The latter are also logically represented as XML when outside our program; in fact such events are just additional External Data generated by the running system. The external XML External Data is wrapped up in some protocol like SOAP and can use the talkabout interface to annotate the data to specify the purpose of the transmission.

Maybe XML is not efficient enough as a transmission format, and so we define fastXML as the actual transmitted form. This is just a performance issue: fastXML could be used to avoid time-consuming conversions at each end of a communication link, and it could also represent element and attribute values more efficiently than character strings. The difference between fastXML and XML is seen only by the transport and marshalling subsystems; it is transparent to the applications (see the sketch after this list). A natural choice for fastXML appears to be RMI using serialization of the Java version of the External Data. Thus we find the following transport scenarios:

** XML over SOAP -- the natural but not always most efficient transport mechanism in .opennet. The XML could be produced by inverseroasting of Java objects.
** RMI over SOAP -- as developed at Indiana. Used when RMI (serialized Java) is selected as the data representation format and SOAP as the appropriate transport protocol. In principle the serialized Java could come from roasting XML.
** Java over RMI -- where the Java could be fastXML, either obtained by roasting XML instances or by "not inverseroasting" data in our Java program.
** XML over RMI -- here XML is roasted to Java as fastXML and transported with RMI.
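As a sketch of that transparency, assume a marshalling layer sits between the application and the transport; the Marshaller interface, ExternalDatum class and both implementations below are hypothetical names, not an existing API:

    import java.io.*;

    // Hypothetical marshalling layer: applications hand over External
    // Data objects; only transport code decides the wire form.
    interface Marshaller {
        byte[] marshal(ExternalDatum d) throws IOException;
    }

    // An External Data object carrying its inverseroasting (toXML)
    // method; Serializable so it can also travel as fastXML.
    class ExternalDatum implements Serializable {
        String title;
        ExternalDatum(String title) { this.title = title; }
        String toXML() { return "<datum><title>" + title + "</title></datum>"; }
    }

    // Wire form 1: textual XML produced by inverseroasting (SOAP transport).
    class XmlMarshaller implements Marshaller {
        public byte[] marshal(ExternalDatum d) {
            return d.toXML().getBytes();
        }
    }

    // Wire form 2: fastXML as serialized Java, the natural payload for RMI.
    class FastXmlMarshaller implements Marshaller {
        public byte[] marshal(ExternalDatum d) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(d);   // standard Java serialization
            }
            return bytes.toByteArray();
        }
    }

The application constructs ExternalDatum objects and never learns whether XmlMarshaller or FastXmlMarshaller produced the bytes on the wire.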
It is essential for many applications that one be able to include binary data in XML data nuggets. In GXOS, one wishes to be able to reference an object either externally (as in CORBA, Java etc.) or internally as an attachment. We should model this on the SOAP approach (http://www.w3.org/TR/SOAP-attachments, http://www.gotdotnet.com/team/xml_wsspecs/dime/default.htm) and perhaps add an appropriate XML attribute (binarydataencoding) to specify the approach and so allow it to change.

The Java program must query the backend store; even if this ends up as a table eventually, a SQL-like syntax (JDBC) seems very unnatural. It is more natural to use a DOM-like interface (XPATH). Approaches are XQUERY and http://www.xmldb.org. We term this emerging collection of interfaces java.opennet. For small problems (Personal.opennet) the easiest implementation is to read all the information into memory, so that in this case java.opennet must reduce to natural memory lookup for the given roasting (a sketch follows below). For large problems (Enterprise.opennet), the CPU memory typically forms a cache, and Enterprise JavaBeans appear a useful implementation vehicle.
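Here is a minimal sketch of the Personal.opennet case, using the standard Java XPath machinery as a stand-in for the eventual java.opennet interface; the PersonalStore class and its select method are hypothetical names:

    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.*;
    import org.w3c.dom.*;
    import java.io.*;

    // Hypothetical java.opennet-style lookup for Personal.opennet: the
    // whole tree is read into memory and queried with XPath, not SQL.
    public class PersonalStore {
        private final Document tree;   // the single Schema-described tree
        private final XPath xpath = XPathFactory.newInstance().newXPath();

        public PersonalStore(InputStream xml) throws Exception {
            tree = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(xml);
        }

        // A DOM/XPath-style query replaces the SQL of JDBC; e.g.
        //   select("//Photo[title='Beach']/url")
        public NodeList select(String path) throws XPathExpressionException {
            return (NodeList) xpath.evaluate(path, tree, XPathConstants.NODESET);
        }
    }

For Enterprise.opennet the same select call would instead be delegated to an XQUERY-capable backend, with memory acting only as a cache.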
In my applications a single tree (Schema) describes all XML instances; a typical data nugget is a node of the tree described by a Subschema. This has some implications described later.

We presumably want to view the existing Java-JDBC-DBMS model as a special case of .opennet. "Legacy" web-linked databases either are treated "as is", bypassing .opennet completely, or we find the Schema equivalent to a given relational structure and fit in the legacy approach this way. If we use the optional mediator of the figure, the JDBC calls are translated into XML messages sent to the mediator, and the traditional database driver runs on the mediator or on a subsystem invoked by it.

The mediator (or intelligent router) is nominally a publish/subscribe server. We term the standard here GMS -- the Grid Message Service. As discussed elsewhere, GMS can be implemented either as one or a few servers (JMS) or as a P2P network as in JXTA. All transmission in GMS uses fastXML, with Schemas (in addition to those needed by External Data) to define the special attributes needed for such a service. GXOS has a start on this but needs enhancement for the P2P case, among other things. GMS would have the following differences from JMS (a sketch of the changed matching appears below):

** Transport could be SOAP or RMI; data encoding could be XML or serialized Java.
** The publish/subscribe model would be implemented as interacting agents representing publishers and subscribers. This service would use XML objects and not the SQL-like property testing in JMS.
** Currently leading JMS providers do refined tests on clients; for efficiency this should be moved to servers.

The mediator approach supports distributed heterogeneous data subsystems, collaboration (with multiple subscribers to a given publication) and asynchronous data access and delivery. Perhaps most importantly, the use of a mediator (or more precisely multiple networked mediators) allows one to support P2P networks with .opennet technology. Looking a tiny bit more deeply at the mediator for P2P, we note that:

a) A mediator is the control center of a collection of JXTA pipes -- asynchronous message queues.

b) The "mediator intelligence" is less critical than in the central server case -- messages move just a bit in the right direction each hop; in the central server case discussed above (and used in JMS) one has but one chance to get the multicasting right!
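A minimal sketch of the changed publish/subscribe model, assuming XPath expressions serve as the XML object subscriber profiles; the Mediator class and its methods are hypothetical, not JMS or JXTA API:

    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.*;
    import org.w3c.dom.Document;
    import java.io.ByteArrayInputStream;
    import java.util.*;

    // Hypothetical GMS mediator core: a subscriber profile is an XPath
    // expression over the XML event, replacing JMS SQL-like selectors.
    public class Mediator {
        private final Map<String, List<String>> profiles = new HashMap<>();
        private final XPath xpath = XPathFactory.newInstance().newXPath();

        // subscribe with an XML object profile, e.g. "//Photo[owner='me']"
        public void subscribe(String subscriber, String profile) {
            profiles.computeIfAbsent(subscriber, k -> new ArrayList<>())
                    .add(profile);
        }

        // publish an XML event; the refined matching runs here on the
        // server, not on the clients as in current JMS implementations
        public List<String> publish(String eventXml) throws Exception {
            Document event = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(eventXml.getBytes()));
            List<String> recipients = new ArrayList<>();
            for (Map.Entry<String, List<String>> e : profiles.entrySet())
                for (String profile : e.getValue())
                    if ((Boolean) xpath.evaluate("boolean(" + profile + ")",
                                                 event, XPathConstants.BOOLEAN)) {
                        recipients.add(e.getKey());
                        break;
                    }
            return recipients;
        }
    }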
search.opennet

This can have either a central (server) or P2P style of search, depending on the nature of the mediator service. As well as the distribution of search requests (the mediators), one can distinguish the style of the search request. These styles would reflect different interfaces in java.opennet. Two obvious styles are:

** Directory: equivalent to a directory in Google or Yahoo, with a URI label being part of the search. This could correspond to searching database indices -- the URI must be indexed in any reasonable database support of .opennet.
** Gallimaufry: this, you will find, is the hodge-podge of scattered information, which has a search interface nearer that of Web Search in Google or Yahoo.

Technically, in Directory search the URI and/or indexed metadata are considered critical, as they classify the meaning of the information. In the Gallimaufry one searches on less precise information, as the URI is probably not so helpful: it often specifies some accidental property (such as computer location) and not meaningful knowledge. Consider the infamous 1000-person accelerator physics experiment. The raw data and later processed data are clearly naturally organized as a Directory with precise URIs and precise computer-generated metadata. The intellectual electronic heart of this endeavor is different; it is the contents of the disks of the personal and other computers associated with the involved scientists. Here are PowerPoint presentations, half-finished reports, group pictures, results of analysis jobs etc. As in Napster, one needs to search this Gallimaufry to truly support the experiment. Both forms of information are important: the Directory is nearest the classical digital library, while the Gallimaufry captures the intrinsic disorganization of innovation and is nearest the P2P model. Both search styles must be supported.

uri.opennet

Both to include the semantic web concepts and as a central organizing principle, it is useful to assume that any permanent resource in .opennet has a URI. Here a resource is any digital entity, i.e. any distributed object. This URI can have the allowed form (see http://www.isi.edu/in-notes/rfc2396.txt) opennet://one/two/three/ ...... /nthnumber/, but it need not have any special hierarchical significance; rather, the strings between slashes can be the values of your favorite keywords.

GXOS.*.opennet

Here * can be Personal or Enterprise. In this .opennet system, all actions are provoked by events, which contain an XML script provisionally written in RDF. One critical feature is the use of a URI to uniquely label all items in one's Enterprise or Personal data; the Personal data just consist of a selection of nodes from the full Enterprise GXOS tree. The intended applications -- education/training and scientific research -- appear to lead naturally to data nuggets with a unique URI. Anything permanent must have a URI; this includes messages to be preserved either as a log or to replay a session. Ephemeral messages can be labeled with 128-bit UUIDs as in JXTA.

Consider a document store generated with GXOS. The entries in this store are not necessarily the documents themselves but are typically meta-objects containing the meta-data for an object, including a URL or other object access specification. These are service advertisements in JXTA. These meta-objects are labeled by URIs, which could be looked up by java.opennet. The store is grown by any methodology that generates an event sent to the mediator and containing a specification of a resource. These messages can come from several sources, including e-mail, the output of an online form etc. The messages are processed asynchronously and in distributed fashion.

Suppose GXOS is used to manage your photo collection. You can send e-mail to the mediator with a photo as an attachment and meta-data in the mail body; if this work was being done on your disconnected laptop, these messages would be sent later, and gradually GXOS would build the photo database as a collection of simple messages (a sketch of such a message follows at the end of this section). Instead of an attachment, the digital photos could be on a web server, and one can just record the URL of this in your meta-data message. As well as this low-technology solution, one could have a wizard directly generating XML messages from a custom GUI requesting the allowed meta-data elements. The same approach can be used to build databases of documents for training, research, project management etc.

The stored information can be modified by update events referencing the URI and specifying one or more changed elements. These update events could be processed in a FIFO fashion for each resource. More precisely, one can associate a lock attribute with each element, first issue a message requesting a lock (i.e. setting the lock attribute) and, after getting this, do the update safely (also sketched below). Powerful threaded discussion lists could naturally be built with this technology.
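Returning to the photo example, here is a hypothetical sketch of the XML event a mail handler might generate to grow the store; the gxos element names, the namespace URL and the URI scheme shown are all invented for illustration:

    // Hypothetical XML event built from an e-mailed photo message.
    public class PhotoEventDemo {
        static String metadataEvent(String uri, String title, String photoUrl) {
            return "<gxos:event xmlns:gxos='http://gxos.example/ns'>"
                 + "<gxos:resource uri='" + uri + "'>"
                 + "<title>" + title + "</title>"
                 // the photo stays on its web server; we record only its URL
                 + "<location binarydataencoding='none'>" + photoUrl + "</location>"
                 + "</gxos:resource>"
                 + "</gxos:event>";
        }

        public static void main(String[] args) {
            // composed on a disconnected laptop, sent to the mediator later
            System.out.println(metadataEvent(
                "opennet://me/photos/2001/beach/",
                "Beach",
                "http://photos.example/beach.jpg"));
        }
    }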
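And a minimal sketch of the lock-then-update discipline, assuming the mediator tracks lock holders per resource URI; the class and method names are again illustrative, not part of GXOS:

    import java.util.*;

    // Hypothetical per-resource locking for update events: a client first
    // requests the lock (setting the lock attribute), then updates safely.
    public class ResourceLocks {
        private final Map<String, String> lockHolder = new HashMap<>();

        // event 1: request the lock; returns true if granted
        public synchronized boolean requestLock(String uri, String client) {
            String holder = lockHolder.putIfAbsent(uri, client);
            return holder == null || holder.equals(client);
        }

        // event 2: apply the update only if the sender holds the lock
        public synchronized boolean update(String uri, String client,
                                           String newXml) {
            if (!client.equals(lockHolder.get(uri))) return false;
            // ... write newXml into the store here (FIFO per resource) ...
            lockHolder.remove(uri);   // release the lock after a safe update
            return true;
        }
    }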