12/15

An Interactive Visualization Environment for Financial Modeling on Heterogeneous Computing Systems

OUTLINE

Introduction
  define problem: interactive GUI based on sequential technology
  purpose: integrate GUI with parallel computing environment
  use option pricing application to illustrate
  focus: functionality of system, integration issues, preliminary performance analysis
Option Price Modeling
  stock option pricing models (describe briefly what model does, set of 4 models)
  want to evaluate model performance, incorporate expert user, perform model/market comparison
System Configuration
  hardware
  software
  An introduction to AVS
  high performance distributed computing system
  distributed I/O, computation, memory
System Integration and Implementation
  AVS framework
  mix of sequential, parallel, C, Fortran models
  data flow
  control flow
  GUI implementation
Performance Analysis
Discussion and Conclusion
  generalize to other applications

INTRODUCTION

Advances in parallel computing systems provide a feasible solution to many computationally intensive applications that would otherwise be impossible or impractical on conventional computers. High performance computing is expected to play a leading role in the 1990s. Many applications, however, especially modeling and simulation applications, require an interactive graphical user interface (GUI) in a real-life computing environment. GUIs are mostly event-driven and thus serial in nature, and it is difficult to exploit parallelism in them. On the other hand, most visualization tools on parallel systems either require special hardware support or are implemented for those specific systems, so they are not portable. Thus, portable interactive graphical user interface tools, developed on sequential computing systems, are also needed in a parallel computing environment.

Another observation is that there is currently no general-purpose supercomputer that is well suited to every class of problems.
We need a metacomputer consisting of networked computing resources of various architectures and computational power, in which the subproblems of a large application are distributed to the architectures best suited to them.

In this study, we integrate an interactive visualization environment into a heterogeneous computing system to run stock option price modeling. A commercially available visualization system, AVS, is used both as a data visualization tool and as a networking tool to integrate this distributed system, in which high performance computational modules, running on two parallel systems and two high-end workstations with different architectures, are coupled with network-based visualization, interactive system control, and distributed I/O modules on three workstations. Using stock option pricing as an example of a challenging application problem, we describe design issues and performance requirements for integrating visual and scientific computing in a heterogeneous computing environment. We break down our analysis of design issues and performance requirements into three components: system integration, visualization functionality, and the ratio of system components.

PRICING MODELS

Stock option pricing models are used to calculate a price for an option contract based on a set of variables defined by the market (e.g., stock price, exercise price, interest rate, and maturity time) and a set of model parameters: the volatility of the stock price, the variance of the volatility, and the correlation between volatility and stock price. These model parameters are not directly observable and must be estimated from market data.

We use a set of four option pricing models in this study. Simple pricing models treat stock price volatility as a constant and treat only European options (exercised only at maturity of the contract). More sophisticated models incorporate stochastic volatility processes and allow American contracts (exercised at any time in the life of the contract) [1][2][3].
However, those stochastic volatility pricing models are computationally intensive and have significant communication requirements. In previous studies [4][5], the following four pricing models were investigated:

BS model: European, constant volatility, Black-Scholes
AMC model: American, constant volatility, binomial model
EUS model: European, stochastic volatility, binomial model
AMS model: American, stochastic volatility, binomial model

We developed serial and data parallel versions of these models on various sequential and parallel systems, conducted a comparison of model and market prices, and evaluated model performance and parallel software issues. The parallel models developed on the Connection Machine-2/5 and DECmpp-12000 run approximately 100 times faster than the sequential models on a high-speed workstation. Extremely fast model run times are important for trading purposes, for pricing large portfolios, and for incorporating optimization techniques for model parameter estimation.

To further evaluate and optimize the pricing models on massively parallel machines, we need, along with high performance computing modules for real-time pricing, real-time visualization of model results and market conditions and a graphical user interface that allows expert interaction with the pricing models. We now incorporate dynamic visualization and system control in a heterogeneous computing environment to further develop the use of optimization techniques for model parameter estimation and improved model accuracy, and we develop a graphical interface to our model system so that a market expert can start and stop the models, adjust model parameters, and call optimization routines according to dynamically changing market conditions. Analytic models are useful tools in the financial market, but they require expert interpretation. Comprehensive visualization functionality is required for the dynamic visualization of market and model price fluctuations and for the graphical model interface.
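As an illustration of the two constant-volatility models listed above, the following minimal Python sketch prices a European call with the Black-Scholes closed form and a call on a binomial tree with an optional early-exercise check. It uses standard textbook formulas and is not the serial or data parallel code used in this study. The Cox-Ross-Rubinstein tree parameters, the no-dividend simplification, and all function names are assumptions made for the example.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s, k, r, sigma, t):
    """European call price with constant volatility and no dividends."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def binomial_call(s, k, r, sigma, t, steps, american=False):
    """Call price on a Cox-Ross-Rubinstein tree (assumed tree, no dividends)."""
    dt = t / steps
    u = math.exp(sigma * math.sqrt(dt))   # up factor per time step
    d = 1.0 / u                           # down factor per time step
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-r * dt)              # one-step discount factor
    # Payoffs at maturity; node j has j up moves and (steps - j) down moves.
    values = [max(s * u ** j * d ** (steps - j) - k, 0.0) for j in range(steps + 1)]
    # Roll back through the tree one time step at a time.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            cont = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                # Early exercise: compare with the immediate payoff at this node.
                cont = max(cont, s * u ** j * d ** (i - j) - k)
            values[j] = cont
    return values[0]
```

With enough time steps the European tree price approaches the closed-form value, which gives a simple consistency check between the two models before any market comparison is attempted.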
SYSTEM CONFIGURATION

The overall system configuration is shown in Figure 1. A total of 7 machines with 5 different architectures are used in this application. All workstations in the system, including the front-end workstations of the DECmpp-12000 and the CM5, are connected by a 10 Mbit/sec local Ethernet network. UNIX and its software environment run on all the workstations.

Introduction of AVS as a visualization tool and a networking tool: (to be added)
. comprehensive visualization functionality (graphical programming style, high-level object-oriented or modular design)
. networking capability (integrator), transparent networking, fault tolerance, etc.
. high portability across vendors (industry standard), XDR
. support for the data flow programming model

High performance distributed computing system

The computing modules for the four option pricing models run on 4 different remote machines: the BS model on a DEC5000, the AMC model on a SUN4, the EUS model on a CM5, and the AMS model on a DECmpp-12000.

The DECmpp-12000 is a massively parallel SIMD machine with 8192 processors. Each RISC-like processor has a 1.8-MIPS control processor, forty 32-bit registers, and 16 KBytes of RAM. The peak performance is 650 Mflops DP, with 117 Gbytes/sec memory bandwidth and 1.5 Gbytes router bandwidth. The PEs are arranged in a rectangular two-dimensional grid and are tightly coupled with a DEC5000 front-end workstation. The DECmpp-12000 has a high speed overlapping I/O subsystem.

The Connection Machine 5 is a parallel SIMD/MIMD machine with 32 processing nodes (PNs). Each PN consists of a SPARC processor for control, four proprietary vector units for numerical computation, and 32 MBytes of RAM. The control node (front end) of the CM5 is a SUN4 workstation.

Our home machine is an IBM RS/6000 graphics workstation which has a GTO Graphics Adapter with 24-bit color. This machine is devoted to displaying the graphical user interface, rendering graphical output, and monitoring the user's runtime interaction.
windows (the ideal machine is an SGI workstation or an X terminal). As the home machine on which the user is logged in, it is also the physical interface through which the user sets up the system at start and interacts with it using the mouse, keyboard, and other peripheral I/O devices. The AVS kernel and system modules run on this machine.

The actual user interface runs on a remote SUN4. This machine serves mainly as a file server for bulk data input from databases on disk, in addition to other functions such as overall system synchronization and the broadcast of collected input data to the remote computing machines. In our application, the input sources are historical market data in disk files and user runtime input from the GUI on the home machine. Future improvements to our modeling environment include a real-time market data service.

Another IBM RS/6000 is used as a second file server for non-graphical output of bulk data generated by the system. Based on the output data produced by all remote computing modules, it can also perform other non-graphical functions such as data synthesis and statistical analysis. In our application, the output destinations are databases created for later analysis.

Note that the computing modules on each of the 4 computing machines can also perform I/O on that remote machine.

As can be seen in Figure 1, we have distributed the following requirements/resources for a large problem:
(1) Distributed Computing
(2) Distributed Memory
(3) Distributed Input/Output

SYSTEM INTEGRATION AND IMPLEMENTATION

Issues involved in the integration of modules written in C, Fortran, Fortran90, and MPL and implemented on the SUN4, DEC5000, IBM RS/6000, DECmpp, and CM5 (CM2).

AVS framework

AVS provides us with both a visualization tool and a high-level networking tool. Using the AVS library routines, we implemented portable user input/output interfaces on SUN, DEC, and IBM workstations.
Using the remote modules and the network editor, we distributed the interfaces and computing modules on different machines over Ethernet and integrated them in a single workstation-based visualization environment. This section describes the implementation details of this remote visualization environment on a heterogeneous system.

All modules on the different machines are compiled and linked like stand-alone programs at the operating system level. The only requirement is that the programmer must define the input and output ports in each module, using specific library routines provided by AVS.

System issues include the integration of diverse functions running on the hardware best suited for high performance. Our present problem includes two computationally intensive models running on two massively parallel machines, with visualization and system control functions implemented on network-connected workstations. Larger, more complex application problems will likely require us to construct a network of machines that integrates functions such as scientific computing, visualization, database services, and the use of large memory distributed over a network, while providing system integration and synchronization. We require a high-level framework that is portable across systems rather than architecture-specific computation and visualization libraries and tools.

We use multiple languages, and we especially need calls between sequential and data parallel versions of C and Fortran. The compute nodes use a mix of programming languages: Fortran77 on the DECstation 5000, C on the SUN4, CMFortran on the CM-5, and MPL (data parallel C) on the DECmpp-12000.

current research issue: integrating the programming environment (compiler, OS, debuggers) from the sequential environment (the front end of a parallel machine is a sequential environment)

In AVS, message passing among different processes on the same machine is done through shared memory, while message passing from one machine to another goes through TCP/IP in XDR format.
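To make the role of XDR concrete, the sketch below encodes one model price the way XDR represents basic types: a 4-byte big-endian signed integer and an 8-byte big-endian IEEE 754 double. It uses only Python's `struct` module. The record layout (a model index followed by a price) and the function names are hypothetical; this is not the AVS wire format, which AVS handles internally.

```python
import struct

# XDR stores an int as 4 bytes and a double as 8 bytes, both big-endian.
# The record layout (model index, then price) is a hypothetical example.
_RECORD = struct.Struct(">id")


def encode_price(model_index, price):
    """Pack a (model index, price) pair into 12 XDR-style bytes."""
    return _RECORD.pack(model_index, price)


def decode_price(payload):
    """Unpack bytes produced by encode_price, independent of host byte order."""
    model_index, price = _RECORD.unpack(payload)
    return model_index, price
```

Because the byte order is fixed by the format rather than by the host, a value packed on one architecture unpacks to the same value on another, which is the property a heterogeneous network of machines relies on.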
Data Flow:

As shown in Figure 1, the input data of this system come from two sources:

(1) Historical market data in disk files on the file server machine (the data are broadcast from the input file server to each computing machine).
model primitive input
. initial stock price
. exercise (striking) price
. risk-free rate
. time to maturity
. time to dividend (early exercise time)
. dividend amount
. call price
initial (default) model parameters
. variance of stock volatility
. correlation between stock price and its volatility

(2) User runtime input from the GUI on the home machine (the data go from the home machine to the input file server machine).
. variance of stock volatility
. correlation between stock price and its volatility
. each model's initial volatility
(if these are not defined, default values are calculated by a function in the user interface module)

The output data of this system go to three destinations:
. each new model price (a scalar value), together with a certain number of previously generated values, is rendered graphically in a graphics window on the home machine (the data go from each computing machine to the home machine)
. each new model price is also displayed numerically in a shell window on the home machine
. all 4 new model prices, together with other model input data, are stored in databases on the file server machine in binary (or ASCII) format (the data go from each computing machine to the file server machine)
. (optional) each new model price, together with other data, is stored in a file on the computing machine

Control flow

This is an event-driven system that involves the following events from different machines:
. user runtime interaction through the GUI on the home machine
. model input from disk files on the (input) file server machine
. sending/receiving data to/from other machine(s)
. the computing module on each model computing machine
. model output to disk files on the (output) file server machine
. model graphical output on the home machine
. system synchronization on the (input) file server machine

Figure 2 shows the control flows of the overall network and those on the home machine and the (input) file server machine. We use a simple overall system synchronization method: a new modeling cycle (started by broadcasting new input data to each computing machine) begins if and only if the interface module on the file server machine has verified that the previous modeling cycle is finished (a cycle ends when all model output has been rendered on the home machine).

GRAPHICAL INTERFACE

The graphical interface is the component through which the stock market expert interacts with the modeling environment. It manages user runtime input and output and the system configuration. As discussed under data flow, runtime input includes user-defined model parameters and system execution styles, and the output consists of 2-dimensional displays of model and market prices calculated on the compute nodes. The system configuration includes the choice of pricing models, network configurations, and interface layouts. Plans for future improvements include integrating a real-time decision support system for traders to improve expert use of the modeling system.

GUI: Input interface on home machine

Why we design buttons for optimization, sleep, single:
Before reading the model parameters from a file, a control module (right name??) checks for user input. Each compute node checks the input market data, and in cases of flagged market observations, the pricing model is inverted to estimate a value of sigma, termed implied volatility, that is used to price options until the next flagged option is observed. We flag options at half-hour intervals and apply the estimate of implied volatility through the remainder of the half-hour. Depending on the approach to estimating xi and rho, these values are read from data files, entered at run time by the user, or estimated using optimization techniques.
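Inverting a pricing model for implied volatility can be sketched as a one-dimensional root search. The example below uses the Black-Scholes call formula and bisection, relying on the fact that the call price increases with sigma. The choice of bisection, the search bracket, and all names are assumptions, since the text does not state which inversion method the compute nodes use.

```python
import math


def bs_call(s, k, r, sigma, t):
    """Black-Scholes European call price (constant volatility, no dividends)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s * cdf(d1) - k * math.exp(-r * t) * cdf(d2)


def implied_volatility(market_price, s, k, r, t, lo=1e-4, hi=5.0, tol=1e-8):
    """Find sigma such that bs_call(...) matches market_price, by bisection.

    The bracket [lo, hi] is an assumed search range for sigma.
    """
    if not bs_call(s, k, r, lo, t) <= market_price <= bs_call(s, k, r, hi, t):
        raise ValueError("market price is outside the bracketed price range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The call price is increasing in sigma, so keep the half that
        # still brackets the observed market price.
        if bs_call(s, k, r, mid, t) < market_price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In the setting described above, a routine of this kind would be called once per flagged observation, and the returned sigma would then be reused for pricing until the next flagged option arrives.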
Why buttons for xi, rho, 4 v's:
Pricing models are extremely sensitive to sigma, xi, and rho, yet these variables cannot be directly observed. We use a variety of methods, including expert opinion, historical values, and optimization, to estimate sigma, xi, and rho. These parameters may be read from data files (historical estimates), calculated just prior to running the pricing model (optimization), or defined at run time (expert user).

Performance Analysis: see additional sheet.

DISCUSSION AND CONCLUSION

outline functionality required for our application
what are the key integration issues and how we addressed them

Interactive control, real-time visualization of model output and market data, and system integration make this an attractive software environment for future research in financial modeling. This highly portable software environment running on massively parallel computers can be implemented with relatively small programming effort and allows for rapid prototyping.

In the future, we will take advantage of ongoing research (Hariri citation) in performance prediction tools meant to analyze application and system input parameters and predict the performance of the application on a specified hardware system. This will allow users to examine alternative system configurations for the given application and to examine the predicted execution time, communication time, computation time, and idle time for each node and for the system as a whole.

application code implemented on hardware/software system
hardware: compute nodes, network connection, I/O devices
include visualization
run-time system loads on compute nodes and network

emergence of high performance distributed computing systems based on availability of high speed networks and advances in computational power
need software to integrate, support heterogeneous, network based computing environment...
must combine computation with dynamic visualization and system control

Ethernet network provides 10 Mbit/sec but only 1-2 Mbit/sec available for application
host/system interface must be improved
current software implemented as stack of software layers, consuming much of system capacity, leaving little bandwidth for application
research needed in high speed communication protocols, ...

we focus here on functionality required for industry applications on heterogeneous distributed systems
performance prediction remains a research issue

\bigskip
\centerline{\psfig{figure=gang_fig1.idr,height=4.1in}}
\bigskip
\centerline{\psfig{figure=gang_fig2.idr,height=4.1in}}
\begin{flushleft}
[1] K. Mills, M. Vinson, and G. Cheng, ``A Large Scale Comparison of Option Pricing Models with Historical Market Data,'' in {\it The 4th Symposium on the Frontiers of Massively Parallel Computation}, Oct. 19-21, 1992, McLean, Virginia.
\smallskip
[2] K. Mills, G. Cheng, M. Vinson, S. Ranka, and G. Fox, ``Software Issues and Performance of a Parallel Model for Stock Option Pricing,'' in {\it The Fifth Australian Supercomputing Conference}, Dec. 6-7, 1992, Melbourne, Australia.
\end{flushleft}
\author{\underbar{Gang Cheng}\\ Kim Mills\\ Geoffrey Fox\\\\ Northeast Parallel Architectures Center\\ Syracuse University, Syracuse, NY 13244-4100}
\end{document}