|
A message or event encapsulates application data. It generally comprises headers or descriptors, which describe the data, its origins and other application-related information and properties. Then there is the payload, which is the chunk of data that needs to be processed by the application. Finally, there are the routing footprints, which are added by every location the message/event has traversed.
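The three parts described above can be pictured as a small data structure. The following is only an illustrative sketch: the class name `Message`, its field names and the `stamp` method are hypothetical and are not taken from any particular messaging system.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Message:
    """Hypothetical message layout: headers, payload and routing footprints."""
    headers: Dict[str, Any]   # descriptors: origin, content type and so on
    payload: bytes            # the data the application actually processes
    footprints: List[str] = field(default_factory=list)  # locations visited so far

    def stamp(self, location: str) -> None:
        """Record that this message has passed through `location`."""
        self.footprints.append(location)


# Example: a message that has traversed two (made-up) locations.
msg = Message(headers={"source": "sensor-7", "type": "text/plain"}, payload=b"21.5C")
msg.stamp("broker-A")
msg.stamp("broker-B")
```

Each location appends its own footprint, so the list grows with the path the message has taken.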
What is Message Oriented Middleware? Message Oriented Middleware (MOM) facilitates communications between entities through the exchange of messages or events. Events and messages are for the most part identical, and the terms tend to be used interchangeably. MOM facilitates asynchronous communications and allows for the development of far richer applications.
How does distributed messaging differ from other distributed frameworks? Remote method invocations have been used in distributed systems for quite some time. Frameworks such as CORBA from the Object Management Group (OMG) have had schemes in place to facilitate invocations on remote objects for more than a decade. There has also been support for remote invocations in programming languages, a case in point being the Java Remote Method Invocation (RMI) Framework. In these cases we could think of the remote object as providing a service comprising a set of functions. The provider exposes the service’s capability through an appropriate description language, which specifies the function names, the number and types of the arguments that a given service function takes and, finally, the type of the value returned upon completion of the invocation. A similar principle of remote invocations exists for Grid and Web Services.
Distributed messaging relies on entities responding, interacting and facilitating communications through the exchange of events rather than remote invocations. Both approaches are needed and serve different purposes.
What about messaging systems based on queuing? Messaging systems based on queuing rely on the store-and-forward approach for communications. Such systems also tend to rely on point-to-point communications and on setting up queues at each end to facilitate communications.
In publish/subscribe systems the routing of messages from the publisher to the subscriber is within the purview of the message oriented middleware (MOM), which is responsible for routing the right content from the producer to the right consumers. Publish/Subscribe systems provide a clear decoupling of the message producer and consumer roles that interacting entities might have.
This is especially useful if there are a large number of potential consumers for a given event. In such cases a producer need not keep track of the large number of consumers that the event should be routed to. The middleware performs this function on the producer's behalf.
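The decoupling described above can be sketched with a toy in-memory broker. This is a minimal illustration under stated assumptions, not how any real middleware is implemented: the `Broker` class, its method names and the topic string are all hypothetical, and matching is by exact topic only.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, str], None]


class Broker:
    """Toy stand-in for the middleware: it alone knows who the consumers are."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # A consumer registers interest in a topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, body: str) -> int:
        # The producer names only the topic; the broker finds the consumers.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, body)
        return len(handlers)


received = []
broker = Broker()
broker.subscribe("stocks/IBM", lambda topic, body: received.append(("alice", body)))
broker.subscribe("stocks/IBM", lambda topic, body: received.append(("bob", body)))
delivered = broker.publish("stocks/IBM", "price=101")
```

The producer calls `publish` once, regardless of how many consumers have subscribed; adding a third subscriber would not change the producer's code.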
Is every messaging style asynchronous? No. Several systems impose implicit/explicit timing constraints on exchanges and incorporate timeouts in request-response interactions.
Is every messaging style publish/subscribe? No, not all messaging is publish/subscribe. Systems such as queuing systems involve the static setting up of queues and these queues then proceed to perform targeted forwarding of messages from one queue to another. The consumers and producers of content are thus closely tied to each other.
Are all middleware systems based on publish/subscribe? No. Middleware can be based on RPC and queuing too. In these cases the middleware is clearly not based on the publish/subscribe paradigm.
An overlay network is a virtual network built over one or more physical networks. The Internet is itself an example of an overlay network. In overlay networks the individual links which connect nodes can comprise multiple routers and hosts.
Why do you need overlay networks? It is difficult, if not impossible, to come up with efficient routing algorithms in a network of nodes without a structure. Such systems resort to flooding, where an interaction is routed throughout the entire network to ensure that all interested entities receive it.
The flooding approach can lead to congestion, queuing threshold overflows and accompanying delays in response times. The problem is exacerbated in conditions involving dense client concentrations and high rates of interactions.
Another related issue is that of continuous echoing, which refers to a situation where the same interaction is routed over and over the same set of nodes due to loops in connectivity. This echoing can be eliminated through the use of dissemination traces, which keep track of the nodes that an interaction has traversed. However, in unstructured networks this approach becomes infeasible, especially when the number of nodes within the system is quite large. Systems which have a structure to them facilitate the creation of efficient routing algorithms that exploit the underlying structure. Continuous echoing can also be eliminated by exploiting the structure, whereby a single footprint in the trace can be made to account for several nodes at a time. For example, a cluster footprint would imply that all nodes in the cluster processed that interaction.
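The role of a dissemination trace can be shown with a small simulation. This is a sketch under simplifying assumptions: the function name `flood_with_trace` and the three-node topology are made up for illustration, and every copy of the interaction carries its own per-node trace (no cluster footprints).

```python
from collections import deque
from typing import Deque, Dict, FrozenSet, List, Tuple


def flood_with_trace(links: Dict[str, List[str]], origin: str) -> Dict[str, int]:
    """Flood from `origin`; each copy carries a trace of nodes already visited.

    Returns how many copies each node received.
    """
    deliveries = {node: 0 for node in links}
    queue: Deque[Tuple[str, FrozenSet[str]]] = deque([(origin, frozenset())])
    while queue:
        node, trace = queue.popleft()
        deliveries[node] += 1
        trace = trace | {node}  # add this node's footprint to the trace
        for neighbour in links[node]:
            if neighbour not in trace:  # never send a copy back into its own path
                queue.append((neighbour, trace))
    return deliveries


# A triangle contains a loop, so untraced flooding would echo forever.
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
counts = flood_with_trace(triangle, "A")
```

The trace guarantees termination because no copy revisits a node on its own path, but duplicates still arrive over different paths and every copy carries a trace that grows with the path length. That growth is the cost that becomes prohibitive in large unstructured networks.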
P2P systems and overlay networks P2P systems have traditionally been based on unstructured networks, where routing is based on forwarding. Messages routed have a Time-to-Live (TTL) indicator in them, which is decremented at every hop; a message is no longer forwarded once its TTL reaches zero. The routing is thus akin to ripples in a pond which attenuate after a while.
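The ripple effect can be sketched as a hop-limited flood, assuming the common convention in which the TTL starts at a maximum hop count and drops by one at each hop until it reaches zero. The function name `ttl_flood` and the chain topology are hypothetical, and for brevity the sketch suppresses duplicate deliveries, which a real unstructured network would not do for free.

```python
from typing import Dict, List, Set


def ttl_flood(links: Dict[str, List[str]], origin: str, ttl: int) -> Set[str]:
    """Return the nodes reached when a message floods outward with a hop limit."""
    reached = {origin}
    frontier = [origin]
    while frontier and ttl > 0:
        ttl -= 1  # one hop consumed for this wave of forwarding
        next_frontier = []
        for node in frontier:
            for neighbour in links[node]:
                if neighbour not in reached:
                    reached.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return reached


# A chain of five peers: A - B - C - D - E
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
nearby = ttl_flood(chain, "A", ttl=2)
```

With a TTL of 2 the message from A dies out after reaching C, so peers D and E never see it even if they hold matching content.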
Distributed Hash Tables (DHTs) have been quite popular in several P2P systems. Here each data object is associated with a key. A lookup service to locate this object returns the IP address of the node hosting this object. As with a traditional hashtable data structure, other operations supported in the DHT include put and get. P2P overlay networks are those in which the nodes are organized based on the content that they possess. DHTs are used to locate, distribute, retrieve and manage data in these settings. This scheme provides bounded lookup times.
P2P overlay networks do not facilitate keyword-based searching. The lookups are instead based on identifiers derived from the content, generally computed by a hashing function such as SHA-1.
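The put, get and lookup operations can be sketched with a toy, single-process table. This is an assumption-laden illustration, not the design of any specific DHT: the class name `ToyDHT`, the ring-based placement rule and the example addresses are hypothetical, and a real DHT would route the lookup across nodes rather than consult one sorted list.

```python
import hashlib
from bisect import bisect_left
from typing import Dict, List, Optional


def sha1_id(data: bytes) -> int:
    """Map arbitrary bytes to a 160-bit integer identifier using SHA-1."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


class ToyDHT:
    def __init__(self, addresses: List[str]) -> None:
        # Place every node on the identifier ring at the hash of its address.
        ring = sorted((sha1_id(a.encode("utf-8")), a) for a in addresses)
        self._ids = [node_id for node_id, _ in ring]
        self._addresses = [address for _, address in ring]
        self._stores: Dict[str, Dict[int, bytes]] = {a: {} for a in addresses}

    def lookup(self, key: int) -> str:
        """Return the address of the node responsible for `key`."""
        # First node at or after the key on the ring, wrapping past the end.
        index = bisect_left(self._ids, key) % len(self._ids)
        return self._addresses[index]

    def put(self, content: bytes) -> int:
        key = sha1_id(content)  # the identifier is derived from the content
        self._stores[self.lookup(key)][key] = content
        return key

    def get(self, key: int) -> Optional[bytes]:
        return self._stores[self.lookup(key)].get(key)


dht = ToyDHT(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
key = dht.put(b"some shared file")
```

Because the key is a hash of the content, a search for a keyword such as "shared" has nothing to match against; only the exact identifier locates the object.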
The push by Java to include publish/subscribe features in its messaging middleware includes efforts like the Java Message Service (JMS) specification. One of the goals of JMS is to offer a unified API across publish/subscribe implementations. JMS comprises a set of interfaces. JMS providers, or JMS-compliant systems, provide implementations of these interfaces.
The JMS specification results in JMS clients being able to interoperate with any service provider. This generally requires clients to incorporate a few changes in the initialization sequences that are specific to the vendor being used, after which interactions continue as specified in the JMS API. JMS clients are provider agnostic, and with a change in initialization sequences a client should be able to function just as well with any other provider. JMS does not provide for interoperability between JMS providers, though interactions between clients of different providers can be achieved through a client that is connected to the different JMS providers.
NaradaBrokering is an open source technology supporting a suite of capabilities for reliable/robust flexible messaging; given the message based service architecture of Grids, this project is aimed at providing for the transport of messages between services and between services and clients. NaradaBrokering is designed around a scalable distributed network of cooperating message routers and processors. Messages/Events within NaradaBrokering encapsulate expressive power at multiple levels. Where, when and how these events reveal their expressive power constitutes information flow. NaradaBrokering manages this information flow.
What are some of the features available in NaradaBrokering?
Is NaradaBrokering firewall and proxy friendly? Yes, NaradaBrokering can tunnel through firewalls and authentication proxies. NaradaBrokering can deal with 3 different types of authentication challenges (Basic, Digest and NTLM). We have also tested NaradaBrokering with Microsoft’s ISA firewall/proxy and it has been successfully deployed on Linux systems residing behind firewalls. The next release of NaradaBrokering will have a firewall/proxy configuration manager which will guide users/administrators through setting up NaradaBrokering communications across such boundaries.
What is the difference between NaradaBrokering and Multicast? One of the advantages of using the NaradaBrokering system is that you don't need MBONE for the system to work, whereas hardware multicast does. Furthermore, the biggest drawback of using hardware multicast is that representing subscription sets as groups entails an enormous number of groups. For example, one could run into 2^n groups for n subscribers. NaradaBrokering also allows you to tunnel through authenticating firewalls and proxies, domains where multicast communication is not possible.
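The growth in the number of groups follows from counting subsets: n subscribers have 2^n possible subsets (including the empty one), and in the worst case each distinct set of interested subscribers would need its own multicast group. The short sketch below simply enumerates them; the function name and subscriber labels are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple


def subscriber_groups(subscribers: List[str]) -> List[Tuple[str, ...]]:
    """Enumerate every possible subset of subscribers, one per potential group."""
    groups: List[Tuple[str, ...]] = []
    for size in range(len(subscribers) + 1):
        groups.extend(combinations(subscribers, size))
    return groups


three = subscriber_groups(["a", "b", "c"])
ten = subscriber_groups([f"s{i}" for i in range(10)])
```

Three subscribers already give 8 subsets and ten give 1024, so the count doubles with every subscriber added.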
Robust messaging pertains to the ability to deliver messages to interested entities irrespective of the failures that might take place within the system. Similarly entities should be able to retrieve events after prolonged disconnects. NaradaBrokering provides this capability.
A single broker in NaradaBrokering can handle between 100 and 400 concurrent video streams with acceptable performance. NaradaBrokering has also been deployed in real-time settings by providing back-end support for the Anabas conferencing software running in JMS emulation mode. Several online seminars have been conducted using this Anabas-NaradaBrokering combination, and the number of users collaborating concurrently has exceeded 30 on several occasions. The strains imposed by Shared Display, both in terms of the size of these messages and the rate of changes, were handled adequately by NaradaBrokering.
The distributed architecture of NaradaBrokering allows us to support large concentrations of clients.
NaradaBrokering supports a very wide variety of constraints. These could be String based topics, Integer topics, Tag=Value pairs, XPath queries and SQL queries. We plan to add Regular Expression support in subsequent releases.
NaradaBrokering goes beyond other operational publish/subscribe systems in many ways. This is predicated on several factors including its support for JMS, P2P interactions, audio-video conferencing, integrated performance monitoring, communication through firewalls among others. The system has been designed to scale over a wide variety of devices - from hand held computers at one end to high performance computers and sensors at the other extreme.
What are the applications that currently use NaradaBrokering? NaradaBrokering has been used to support A/V conferencing applications, collaboration software, communications within portals and fault-tolerant Grid FTP, among others. For a more comprehensive list please refer to the SC03 handout.
|