DISorientation

Paul F. Reynolds, Jr.
Dept of Computer Science
University of Virginia
reynolds@virginia.edu

1.0 Abstract

Instant popularity is a double-edged sword. The meteoric rise of the Distributed Interactive Simulation (DIS) Protocol has even DIS visionaries concerned about potential unproductive attempts by simulationists to induce DIS-based interoperability where it just won't fit. That is a well-founded concern. DIS, like so many bottom-up development efforts before it, lacks sufficient abstraction to adapt and evolve. Alternatives should be explored with vigor. Successful evolution occurs in those systems sufficiently abstract to incorporate the unexpected. A key to success is identifying the right set of abstractions. Another is the language with which those abstractions are described and manipulated. What are the proper abstractions for distributed simulations? What capabilities should be natural and efficient? Can a single, monolithic linkage technology satisfy the needs of military simulations? What other kinds of simulations should be considered? Answers to these and related questions can only be determined after establishing the broadest set of technologically supportable requirements for the capabilities desired and anticipated. That is a substantial task, humbly begun here.

2.0 Introduction

From a recent DIS document outlining visions for DIS [1], we encounter: "Expectations of what can be done...are growing. At some point it may be necessary to look at inherent limits of the process to curb unrealistic expectations and subsequent disillusionment." The "necessary look" is the task undertaken here. However, the reader is cautioned that what follows is by no means meant to be the final word on the topic. If the following ideas spark discussion and lead to the development of a more robust, scalable and interoperable distributed simulation protocol, then the author's intentions will have been met.

So, what capabilities should a linkage technology that calls itself Distributed Interactive (alternatively "Interoperable") Simulation Protocol [2] provide to *anyone* wishing to use it? Moving out of the military realm (or is it?) for a moment, an economist might expect support in modelling world economies, a meteorologist could expect support for modelling worldwide weather, and an entomologist might expect to simulate the migration of pests across continents. Any or all of the three might realize they need what the others provide in order to have a truly accurate model. Demands for fidelity will likely drive military simulationists to a similar conclusion: there are important processes beyond their own domain that must be considered.

What are the requirements of a linkage technology that supports, in a manageable fashion, an ever-expanding base of capabilities? Logically centralized control, generality, scalability, flexibility, robustness and support for varying levels of fidelity and abstraction appear to be important characteristics of a useful, extensible linkage technology. The economist, left to his or her own devices, may decide that a linkage technology based on common assumptions about currency flow and exchange policy is sufficient. But such a technology becomes inadequate as soon as the migration of fire ants from South America has to be incorporated into the model as a significant event. In the same vein, a linkage technology that defines itself in terms of platforms with weapons and sensors will miss, say, the natural event that disables critical elements of a force.
Lest any military simulationists dismiss these considerations as unnecessary for accurate military simulations, a review of the role of external and often unexpected events, natural or otherwise, in the success or failure of military missions is in order. That won't be done here. It will be a premise that those who strive for accuracy at the "object level" in simulations, distributed or otherwise, must include considerations for all sorts of phenomena that can have a direct effect on military operations. Thus, even if we concede ownership of the general purpose name "Distributed Interactive Simulation Protocol" to military simulationists, we conclude that the protocol must be general, flexible, scalable, robust, capable of variable fidelity and a natural manager of abstractions.

Computing scientists have learned over the last ten years that object-oriented design has many advantages, not the least of which is organization by extraction of common elements. Through inheritance, then, new types of objects can be described in the framework of that which is already captured. This is the power of proper management of abstractions. Additional benefits include improved understanding of the relationships among the objects that have been captured. Extraction of common elements also provides the benefit of software reuse. While it is not the intent of the author to join in the excessive overplay given to the benefits of object-oriented design, it is a premise here that the organization and insight that object-oriented design offers distributed simulations is critical. Object-oriented design began in a simulation language (Simula-67). It is applicable here.

At the moment, DIS is working well. As requirements move beyond platforms with weapons and sensors, and the limits of a protocol that captures entity interactions in a flat (non-hierarchical) manner become evident, DIS will not work well. DIS can't scale. Among other things, this means that, in DIS, interactions among entities must be individually handcrafted. Lacking the proper abstraction mechanisms with which to build, DIS restricts in unacceptable ways the kinds of entities and processes that can be successfully captured in a scalable manner. To have interoperability, we need better abstractions.
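The scaling argument can be made concrete with a small sketch. The code below is illustrative only and is not drawn from DIS or from any proposal in this paper: it contrasts the flat style, in which every pairing of entity types needs its own handcrafted interaction routine (on the order of n^2 routines for n types), with an interaction written once against a shared base abstraction, which then covers entity types that were never anticipated. All class and function names (Entity, Tank, WeatherFront, resolveDetonation) are invented for the illustration.

    // Illustrative only: handcrafted pairwise interactions versus one
    // interaction written against a shared abstraction. All names are
    // invented for this sketch; nothing here is part of DIS or DIScover.
    #include <iostream>
    #include <memory>
    #include <vector>

    // Flat style: every pairing needs its own handcrafted routine, so n
    // entity types require on the order of n^2 interaction routines:
    //   void detonationVsTank(...);
    //   void detonationVsHelicopter(...);
    //   void detonationVsInfantry(...);   // ...and so on for every new type.

    // Abstraction style: one base class captures what a detonation needs.
    class Entity {
    public:
        virtual ~Entity() = default;
        virtual void applyBlast(double overpressure) = 0;  // type-specific response
    };

    class Tank : public Entity {
    public:
        void applyBlast(double overpressure) override {
            std::cout << "Tank absorbs blast of " << overpressure << "\n";
        }
    };

    class WeatherFront : public Entity {   // an entity DIS never anticipated
    public:
        void applyBlast(double) override { /* unaffected, but still participates */ }
    };

    // Written once, against the abstraction; works for all current and future types.
    void resolveDetonation(std::vector<std::unique_ptr<Entity>>& entities,
                           double overpressure) {
        for (auto& e : entities) e->applyBlast(overpressure);
    }

    int main() {
        std::vector<std::unique_ptr<Entity>> world;
        world.push_back(std::make_unique<Tank>());
        world.push_back(std::make_unique<WeatherFront>());
        resolveDetonation(world, 4.2);   // no per-pair code was written
    }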
In the sequel we characterize some requirements for truly distributed simulations. Ultimately these requirements are best satisfied by a paradigm that supports the definition of interactions among abstractions of varying levels. We present a simple framework for an alternative distributed simulation protocol with the expectation that it will inspire discussion. Finally, we discuss the benefits of this framework to distributed interoperable simulation.

3.0 Identifying Requirements

We outline some requirements for a distributed interactive simulation architecture. Emphasis is on general purpose distributed simulations with an eye towards the needs of the military simulation community. The needs turn out not to be very different. That, in fact, is a primary reason this discussion is necessary.

Perhaps the most important aspect of a distributed simulation, a simulation which can grow to hundreds of thousands or millions of objects and concepts, is control. Control manifests itself in a number of places: control of basic concepts, control of standards, configuration control and exercise control.

With respect to exercise control, the DIS vision document notes: "planning, setup, execution and monitoring of a large, multi-site exercise is a complex exercise that may ultimately prove to be a greater challenge than managing the network traffic itself." [1] Indeed. And controlling concepts, standardization and configuration is even more difficult.

Concept control, as its name implies, is central to the design of a simulation architecture. It gives the architecture meaning and drives its design and implementation. Concepts, in turn, are defined by the true requirements of intended applications. What is it we want a particular distributed simulation application and/or architecture to do? What kind of fidelity is required, or conversely, are we willing to tolerate? How are entities and/or concepts in both intended applications and architectures related? These are central questions leading to basic decisions, and they should be made with the best input from the best minds on a given topic. They are central because omissions and oversights can preclude validity. Decisions on these topics must be made early and they must be carried out in a centralized manner. Leaving them to distributed, ad hoc development will lead to a design disaster.

Standards are an essential requirement. As military simulationists have attempted to connect disparate simulations in the name of interoperability over the last eight years, an appreciation for standards has evolved. Standards are often applied to fairly low level concepts such as coordinate systems and environmental data. DIS applies standards primarily in the design of protocol data units (PDUs) [2]. Standards are needed for higher level concepts as well. As argued below, distributed simulations must allow flexibility in the description and incorporation of entities and concepts in a simulation environment. Flexibility, managed correctly, however, requires a significant amount of control, best embodied as standards in the system architecture.

Configuration control presents its own difficulties. Again, the DIS vision document recognizes a part of the problem: "another daunting dimension of this problem is configuration management.... Where interfaces to wargames are included, they can easily represent thousands of parameters to be recorded." [1] What the document fails to recognize is that with a flat structure, as represented by the DIS PDU approach, the problem noted is an n^2 problem: every system connected to DIS faces it with every other system connected to DIS. Without a development structure that leads naturally to the sharing of ideas and code, configuration control will be a lot of configuration and little control. Configuration control should be proactive. The tools and languages provided to developers should encourage sharing and interface development in a natural way. This requirement goes beyond computing-related concerns: it addresses the need for a social structure that is conducive to the sharing of ideas and the reuse of the work of others.

Exercise control should consist of more than planning, setting up, executing and monitoring a large multi-site exercise. It should be more flexible and dynamic in nature. It should provide and enforce standards control during the exercise. For many, standards control ends when the simulation begins. This static view practically ensures that dynamic configuration and dynamic creation of entities and concepts cannot occur easily.
Yet, for a system that is to support simulations that execute for weeks, dynamic creation of entities and concepts is an important requirement. With the proper development structure, users could propose "what if" concepts that could be readily incorporated into the system. In order to ensure the integrity of the system, much of exercise control would have to be automated. In particular, standards enforcement should be automated.

Besides control-oriented requirements, a useful distributed simulation protocol should provide for generality, scalability, flexibility, robustness and support for varying levels of fidelity and abstraction. We discuss each of these in turn.

Imagine a military scenario in which an opposing force is marginally loyal to its fanatical leader. At any time components of the opposing force may defect. Knowing this, the leader may issue orders that defy all military doctrine. Now, should an accurate distributed military simulation include socio-political and/or psychological models as a part of a high fidelity simulation? It would be difficult to incorporate models such as these into a simulation protocol that is standardized on platforms with weapons and sensors. A mechanism for incorporating what was once unforeseen must be present in any useful distributed simulation protocol. Thus the requirement for generality.

Scalability, as with control, is multifaceted. There is scalability of simulations, scalability of the simulation paradigm itself, and scalability with respect to cost. In all three cases we desire that costs and complexity grow linearly, or no worse, with system growth. Scalability requires that complexity be managed carefully. The degree to which one can benefit from what has already been accomplished is inversely proportional to complexity. To make a distributed simulation cost- and complexity-scalable, success in many of the control issues mentioned above must be achieved. Adding a feature to an existing simulation must fit the design paradigm in place. In turn, the design paradigm must support the addition of features through reuse, flexible standards and careful, yet flexible, configuration control.

Flexibility and generality have much in common. We distinguish them in the following sense: generality allows for the incorporation of new concepts and ideas in a natural way; flexibility allows them to occur dynamically. Imagine an ongoing exercise in which an inspired officer notices that the utility of a particular sensor platform could be improved significantly if only a certain capability existed. He acquires the requisite permission and changes the characteristics of the platform as the exercise continues. One problem is that his enhanced sensors require additional information about various simulation objects. Can this information be obtained with an assurance that doing so won't disrupt the ongoing simulation? A system that supports the addition or modification of simulation components in an ongoing exercise meets the flexibility requirement. By extension, a flexible system supports the dynamic replacement of a low level object by a higher level abstraction. This capability is often discussed with respect to DIS-based simulations in the form of interchanging humans in the loop with software modules that may represent them at a more abstract level.
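The kind of dynamic replacement just described can be sketched briefly. The sketch below is an assumption-laden illustration, not part of DIS or of the framework proposed later: a platform holds its sensor model behind a stable abstract interface, a candidate replacement (here, an automated module standing in for a human-in-the-loop model) is checked by a simple automated safeguard, and the swap occurs while the exercise continues. All names (SensorModel, SensorPlatform, replaceModel) are hypothetical.

    // Illustrative only: dynamic replacement of a simulation component behind
    // a stable interface, with an automated safeguard applied before the swap.
    #include <iostream>
    #include <memory>
    #include <string>

    class SensorModel {                       // the stable abstraction
    public:
        virtual ~SensorModel() = default;
        virtual double detectionRange() const = 0;
        virtual std::string name() const = 0;
    };

    class HumanInTheLoopSensor : public SensorModel {
    public:
        double detectionRange() const override { return 20.0; }
        std::string name() const override { return "human-in-the-loop"; }
    };

    class AutomatedSensor : public SensorModel {   // more abstract replacement
    public:
        double detectionRange() const override { return 18.5; }
        std::string name() const override { return "automated module"; }
    };

    class SensorPlatform {
    public:
        explicit SensorPlatform(std::unique_ptr<SensorModel> m) : model_(std::move(m)) {}

        // Swap the model mid-exercise, but only if an automated integrity
        // check (a stand-in for standards enforcement) accepts the candidate.
        bool replaceModel(std::unique_ptr<SensorModel> candidate) {
            if (!candidate || candidate->detectionRange() <= 0.0) return false;
            model_ = std::move(candidate);
            return true;
        }

        void report() const {
            std::cout << model_->name() << " range " << model_->detectionRange() << "\n";
        }
    private:
        std::unique_ptr<SensorModel> model_;
    };

    int main() {
        SensorPlatform platform(std::make_unique<HumanInTheLoopSensor>());
        platform.report();
        platform.replaceModel(std::make_unique<AutomatedSensor>());  // exercise continues
        platform.report();
    }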
A system is robust if it supports activities such as the dynamic modification just discussed; it is only reasonable to carry out such modification through automated safeguards to the extent possible. In addition to supporting flexibility, the robustness of a system is measured by its fault tolerance. It should be possible to dynamically substitute alternative software components whenever the components they represent fail unexpectedly. This aspect of robustness is not discussed further in this paper.

Fidelity and abstraction are often confused. It is possible to have high fidelity in highly abstract systems [3], something virtual (as opposed to constructive) simulationists are often loath to admit. Obviously, any time a low cost abstract model can faithfully replace a high cost concrete model, resources are conserved. A distributed simulation architecture should provide support for the incorporation of abstractions of varying concreteness. While simply stated, this requirement leads to the well known aggregation/disaggregation problem, which is still the subject of basic research. Nonetheless, provisions should be made for incorporating abstractions of varying concreteness.

Finally, we must get our abstractions right. All of the requirements above are influenced by the quality of the abstractions we choose to work with. If we choose our abstractions to be platforms with weapons and sensors, then the resulting system is restricted in two ways: additional abstractions must be defined separately and integrated in an ad hoc manner, and the ability to build on the work of others is lost. Thus we lose scalability in all senses. Furthermore, the system is neither general nor flexible. How does one choose the right set of abstractions? Abstractions should be as general as possible while still supporting efficiency and still taking advantage of automation to the greatest extent possible. We want our abstractions to be general and our standards to force compatibility at all levels: conceptual as well as implementation.

There are many other requirements a distributed simulation protocol should satisfy, among them security, conformance with existing external standards, portability of implementation, ease of use and the like. Furthermore, the standards embodied in DIS should be considered seriously. In the implementation proposed in the next section, the difference lies in how those standards are embodied, and how they can be extended.

4.0 Making Theory Work in Practice

From the DIS vision document [1]: "Most architectural schemes for DIS have evolved in an ad hoc fashion." That mistake should be avoided the second time around. The proposed architecture that follows is a sketch at best. It is meant to spark discussion. A distinct advantage in presenting a framework of such limited detail is that it offers a wealth of opportunity to experiment with various implementation details.

Another useful insight from the DIS vision document [1]: "A fundamental challenge is how to describe the attributes and characteristics of DIS elements such that the user and exercise control can determine that the elements are appropriate (i.e. valid) for the purposes of the DIS exercise and capable of functioning together acceptably for that purpose." A thesis in this paper is that there is only one good way to do this, and that is through logically centralized control and abstraction management as offered by object-oriented design techniques. These days, with communications as they are, "centralized" no longer has to mean "physically centralized." It is quite possible for a small group to behave as though centralized while being physically distributed.
In a similar sense, a centralized aspect of a computer architecture can be distributed as long as consistency is properly maintained among sites. So as not to confuse the reader, we will use the term "logically centralized" to reflect the fact that physical proximity is not required.

Consider a paradigm for distributed simulation in which all important aspects of control are logically centralized. Small groups are responsible for the conceptual design of the system as well as for maintaining standards and all implementation-related materials. The groups need not be the same or collocated. The important idea is that control in each case is in the hands of a logically centralized group.

The details of decisions made by the group responsible for conceptual design are beyond the scope of this paper. In general, such a group would be responsible for describing those concepts that are core to the distributed simulation of interest to them. In the case of a DIS-oriented simulation, they would be responsible for describing the entities and related concepts important to military simulation and the relations among those entities. They would be responsible for decisions to exclude certain capabilities and for designing the system to be sufficiently general to allow for the later incorporation of the rejected capabilities, should their decision to exclude turn out to be in error. Finally, they would be responsible for identifying the important attributes of the entities and concepts they deem important.

This is where an object-oriented framework applies. Object-oriented design allows the structured characterization of entities such as wheeled vehicles, tracked vehicles and the like. At the same time it supports aggregation in a fairly straightforward manner: more abstract concepts, such as higher echelons, can be represented as base classes, while lower level concepts can be represented in classes derived from the base classes. Of course, caution is advised here: the object-oriented paradigm is not perfect, but it does provide an excellent starting framework.

Recent successes with distributed object-oriented paradigms make object-oriented design even more attractive. A system such as Legion [4] makes it possible to encapsulate existing codes as independent concurrent objects. With this development it is possible to incorporate existing codes of varying levels of abstraction and use them immediately. Evolution and refinement of the code can follow later. We conclude that DIS itself could be so encapsulated in the architecture we propose here.

Consider a system, call it DIScover, with the following characteristics. This system, a distributed one, consists of a logically centralized library of class definitions for entities and concepts important to distributed simulations. Some of the definitions would pertain to simulation management, no matter what the type of simulation application. Others would pertain to particular simulation applications (e.g. hierarchical definitions of simulation entities). Gluing them all together would be a simple underlying communication system that, at its most abstract level, supports communication with both the logically centralized library and any sites that have imported the communication routines from the centralized library. Communications could be in the form of inquiries, requests for class definitions in the library, communications pertaining to the nature of a particular site, as well as communications pertaining to an ongoing simulation.
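A toy version of such a library can suggest the flavor of the idea. The sketch below is a minimal, single-process illustration under stated assumptions: entity classes derived from a common base are registered with a library by name, and a site obtains an instance by inquiry. Distribution, standards enforcement and concurrency, all essential to DIScover, are deliberately elided, and every name (SimEntity, ClassLibrary, TrackedVehicle, Battalion) is hypothetical.

    // Illustrative only: a toy "logically centralized" library in which entity
    // classes derived from a common base are registered by name and handed out
    // on inquiry. Distribution, standards checks and concurrency are elided.
    #include <functional>
    #include <iostream>
    #include <map>
    #include <memory>
    #include <string>

    class SimEntity {                                // common base abstraction
    public:
        virtual ~SimEntity() = default;
        virtual void advance(double dt) = 0;         // one simulation step
    };

    class TrackedVehicle : public SimEntity {
    public:
        void advance(double) override { std::cout << "tracked vehicle moves\n"; }
    };

    class Battalion : public SimEntity {             // a more aggregate concept
    public:
        void advance(double) override { std::cout << "battalion advances\n"; }
    };

    // The library maps class names to factories. In DIScover, additions would
    // pass through simulation control's standards checks before acceptance.
    class ClassLibrary {
    public:
        using Factory = std::function<std::unique_ptr<SimEntity>()>;

        void registerClass(const std::string& name, Factory f) {
            factories_[name] = std::move(f);
        }
        std::unique_ptr<SimEntity> inquire(const std::string& name) const {
            auto it = factories_.find(name);
            if (it == factories_.end()) return nullptr;
            return it->second();
        }
    private:
        std::map<std::string, Factory> factories_;
    };

    int main() {
        ClassLibrary library;
        library.registerClass("TrackedVehicle",
                              [] { return std::make_unique<TrackedVehicle>(); });
        library.registerClass("Battalion",
                              [] { return std::make_unique<Battalion>(); });

        // A site inquires for a definition it did not author and uses it at once.
        if (auto e = library.inquire("Battalion")) e->advance(1.0);
    }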
An important idea here is that the actions related to building a simulation and those related to conducting one are managed concurrently. If this is done successfully, it is easy to see how such a system could support the flexibility requirement stated in the previous section.

Coexisting in DIScover with the logically centralized library would be logically centralized simulation control. Here the various forms of control would be carried out. There is no reason that configuration control and exercise control could not operate concurrently; the system would simply need to ensure that configuration changes that occur in the course of a simulation pass through prescribed human and automated integrity checks.

Access to the centralized library would be managed by simulation control. In addition to the reading operations discussed above, provisions would exist for adding to the centralized library. Additions would include new class definitions (of simulation entities, inter-class relations and the like). Additions to the library would have to meet established standards, as enforced by simulation control. The line between developing code for a simulation and executing a simulation would be less distinct than it is in DIS, although it would be equally well controlled through a properly crafted simulation control. (There is no reason a simulation controller could not freeze the instances of the definitions used to build a simulation, just as is currently done in DIS exercises.)

What we are proposing here is a system that is logically centralized in control and in the maintenance of all critical information; it is flexible, in conformance with standards that are enforced in a combined human-machine procedure; it is hierarchical in nature, with code reuse and inheritance employed to the extent the object-oriented paradigm permits; it is scalable, again to the extent that one can exploit logically centralized control and hierarchical, object-oriented design; it supports varying levels of fidelity through well defined methods for relating abstractions (easy to say, difficult to do); and finally, it supports multiple levels of abstraction through class structuring. It is robust in ways not pursued here.

As a way of contrasting DIS as it exists with DIScover as it is proposed, we make two observations:

1) DIScover represents a different culture: design and control are logically centralized throughout the lifetime of the protocol. Members of a DIScover confederation operate in a hierarchically organized environment. Building on standards and fundamental operations provided centrally, they build a distributed simulation in accordance with standards that are always enforced by a logically centralized (and automated to the extent possible) control.

2) There need be only one syntax for messages in DIScover: a single tuple whose first field is the message type and whose remaining fields are interpreted according to that type. With this syntax, an arbitrary number of message types could be supported (as dynamically determined by logically centralized communication control). Multicasts could be supported, as could real time delivery requirements. The message type would uniquely identify all remaining fields in the message. In this way, standards are where they should be: encapsulated in the abstractions that make up a part of the control portion of a simulation architecture. There they are accessible for inquiry and are, in general, just one more piece in the abstractions comprising a distributed simulation environment.
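The single message syntax can be sketched as follows. Because the original tuple specifies only that the leading field is the message type, the layout below is an assumption: a type tag plus an opaque body, with handlers for new message types registered dynamically with a hypothetical communication control rather than frozen into a fixed set of PDUs. It is a sketch of the idea, not a definitive wire format.

    // Illustrative only: a single message syntax in which the leading type
    // field determines how the remaining fields are interpreted. The field
    // layout and handler registry are assumptions, not a DIScover specification.
    #include <cstdint>
    #include <functional>
    #include <iostream>
    #include <map>
    #include <vector>

    struct Message {
        std::uint32_t type;               // uniquely identifies the rest of the message
        std::vector<std::uint8_t> body;   // interpreted according to 'type'
    };

    // Communication control: message types (and their handlers) are registered
    // dynamically rather than fixed in advance.
    class CommControl {
    public:
        using Handler = std::function<void(const Message&)>;

        std::uint32_t registerType(Handler h) {
            std::uint32_t id = next_++;
            handlers_[id] = std::move(h);
            return id;
        }
        void deliver(const Message& m) const {
            auto it = handlers_.find(m.type);
            if (it != handlers_.end()) it->second(m);   // dispatch on the type field
        }
    private:
        std::uint32_t next_ = 1;
        std::map<std::uint32_t, Handler> handlers_;
    };

    int main() {
        CommControl control;

        // A new message type is introduced at run time; no protocol revision needed.
        std::uint32_t statusReport = control.registerType([](const Message& m) {
            std::cout << "status report, " << m.body.size() << " bytes\n";
        });

        control.deliver(Message{statusReport, {1, 2, 3}});
    }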
5.0 Benefits

Let's say it can work. What do we have?

First, by incorporating standards into the way entities and concepts can interoperate, rather than in the way they must communicate, we provide for a significant amount of flexibility without sacrificing much. With logically centralized control, developers still must answer to standards as imposed by the controlling group. However, the turnaround from concept to implementation could be hours or days, rather than the months or years it now takes to develop DIS PDU standards.

By imposing an object-oriented paradigm on the structure of distributed simulations and the entities represented within them, we guarantee that control over configurations, as well as concept and code sharing, are enhanced. Control becomes logically centralized. Concepts and code become logically centralized, and logically connected, since they are part of a centrally developed simulation structure. And yet, by logically centralizing these functions we give developers more flexibility and more access to simulation design and control.

6.0 Conclusions

DIS is a flat architecture suffering from a lack of abstraction and therefore a lack of ability to grow and/or scale in the directions it will soon be pushed. The time to look at alternatives is now, before DIS's inertia makes it impossible to seriously consider alternatives. This is not to say that DIS is not doing well under current circumstances; it simply won't do well when organizations try to use it to simulate other than platforms with weapons and sensors. By capturing attributes and standards in PDUs, DIS fails to exploit the commonality among varying entities that supports scalability.

We have explored requirements for a more general distributed simulation environment, drawing on knowledge of existing scalable object-oriented systems. We have proposed that flexibility, generality, scalability, robustness and proper management of and support for abstractions are the hallmark of a useful distributed simulation protocol. We have sketched the design of a framework in which distributed simulations could operate while meeting the requirements just mentioned. This framework is by no means complete, either in its description or in its conceptual coverage. It is a strawman, meant to inspire the discussion that the author hopes will lead to the design and development of a more usable distributed simulation protocol.

7.0 References

1) Seidensticer, S. (Ed.), "The DIS Vision," DIS Steering Committee, September 1993.
2) IST-CR-93-15, "Proposed IEEE Standard Draft, Standard for Information Technology - Protocols for Distributed Interactive Simulation Applications, Version 2.0, Third Draft," May 1993.
3) Cheatham, P. (Ed.), "Peer Review of Aggregate Level Linkage Technology," DMSO, February 1993.
4) Grimshaw, A., et al., "Legion," University of Virginia, Computer Science working document, May 1994.