Introduction and Impact: Earthquakes in urban centers are capable of causing enormous damage. The recent January 16, 1995 Kobe, Japan earthquake was only a magnitude 6.9 event and yet produced an estimated $200 billion loss. Despite an active earthquake prediction program in Japan, this event was a complete surprise. Similar scenarios are possible in Los Angeles, San Francisco, Seattle, and other urban centers around the Pacific plate boundary.

The development of forecast or prediction methodologies has been complicated by the fact that the large events responsible for the greatest damage repeat at irregular intervals of hundreds to thousands of years, leaving a historical record too limited for purely phenomenological studies. An alternative approach, proposed here, is to develop and analyze realistic earthquake simulations that generate arbitrarily long seismicity catalogs. This approach provides a "numerical laboratory" in which the physics of earthquakes can be investigated from a systems viewpoint. Physical processes on a wide range of spatio-temporal scales, such as friction, fluid flow, and the branching and interaction of fractures, can be integrated into a common theoretical framework. "Model based inference" is then possible, in which seemingly diverse data sets taken at different scales can be interrelated and reconciled, and predictions can be extrapolated to motivate and guide further observational studies. The impact of this new simulation technology on earthquake science will be to allow hypothesis testing and data integration on a scale not heretofore possible.

Nature of Earthquake Physics -- Space and Time Scales: Complex, nonlinear fault systems exhibit a wealth of emergent, dynamical phenomena over a large range of spatial and temporal scales, including space-time clustering of events, self-organization, and scaling. Examples of such scaling include the Gutenberg-Richter magnitude-frequency relation and the Omori law for aftershocks (and foreshocks). The physical processes associated with earthquakes also occur on a wide range of spatial and temporal scales (Figure 1).
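
For concreteness, these two relations may be written in their standard forms (the Omori law in its modified form) as

    \log_{10} N(\geq M) = a - bM, \qquad n(t) = \frac{K}{(c+t)^{p}},

where N(≥M) is the cumulative number of events with magnitude at least M, n(t) is the rate of aftershocks at time t after a mainshock, and a, b, K, c and p are empirically determined constants.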

Frictional processes, the primary nonlinearity associated with earthquakes, are known to be physically significant from the molecular scale, on time intervals of less than 10^-8 seconds and lengths of Angstroms (10^-10 m), to plate-motion scales, on time intervals of 10^6 years and lengths in excess of 1000 km. The physical processes directly associated with faulting and seismology, including nucleation and quasistatic crustal deformation, occur over time intervals of fractions of seconds to thousands of years, and lengths of meters to hundreds of km. By contrast, experiments on frictional sliding of rock samples are usually carried out at laboratory bench-top scales, over typical time intervals of 10^-3 seconds to days and length scales of cm to m. Figure 1 indicates that the length scales associated with processes of faulting, seismology and crustal deformation are as far removed from laboratory experiments on rock samples as the laboratory processes are from those on the molecular scale.

Some of the spatial scales for physical fault geometries include:

The microscopic scale (~10^-6 m to 10^-1 m) associated with static and dynamic friction (the primary nonlinearities associated with the earthquake process).

The fault-zone scale (~10^-1 m to 10^2 m) that features complex structures containing multiple fractures and crushed rock.

The fault-system scale (10^2 m to 10^4 m), in which faults are seen to be neither straight nor simply connected, but in which bends, offsetting jogs and sub-parallel strands are common and known to have important mechanical consequences during fault slip.

The regional fault-network scale (10^4 m to 10^5 m), where seismicity on an individual fault cannot be understood in isolation from the seismicity on the entire regional network of surrounding faults, and where concepts such as "correlation length" and "critical state" borrowed from statistical physics have led to new approaches to understanding regional seismicity.

The tectonic plate-boundary scale (10^5 m to 10^7 m), at which the boundaries between plates can be approximated as thin shear zones and the motion is uniform over long time scales.

Broad Scientific and Computational Objectives: GEM models and simulations are needed to address fundamental questions related to earthquakes that can be understood by no other means. The types of broad scientific objectives that a GEM-type approach can ultimately address include:

Cataloging and understanding the nature and configurations of space-time patterns of earthquakes, and examining whether these are scale-dependent or scale-invariant in space and time. Certain characteristic patterns may indicate that a given event is a candidate foreshock of a future, larger event.

Developing and testing potential earthquake forecast algorithms, based primarily upon the use of space-time patterns in the fault system of interest.

Understanding the physical conditions that allow space-time coarse-graining of sub-grid scale processes, and whether these processes can be represented as noise.

Developing the theoretical framework to integrate diverse data and extrapolate existing data in space and time so that model predictions can be tested with new observations (Model-Based Inference).

Developing an understanding of the hardware, software, and algorithmic issues in computational support for multi-scale science and engineering simulations, which are of pervasive importance in many fields.

These objectives underscore the need for models and simulations that mirror the wide range of space and time scales of the underlying processes.

Implications of Multi-Scale Physics: Past earthquake research has focused on problems that display only a limited range of spatial and temporal scales. Examples include laboratory studies of friction at the bench-top scale and crustal deformation at the plate boundary scale. However, the Gutenberg-Richter and Omori scaling laws, the scale-invariant physical structure of fault zones, and observations that seismic events cluster at all space and time scales investigated, all imply that multi-scale processes are at work, similar to those observed in many other areas of physics. Other complex nonlinear systems that exhibit qualitatively and quantitatively similar multi-scale features include the earth's weather and climate, neural networks with learning and cognition, superconductors and charge density wave systems that exhibit magnetic depinning transitions, liquid crystals undergoing smectic or nematic transitions, and ferromagnets and thin films in which magnetized domains evolve.

Large-scale computing approaches have had a significant and lasting impact in many areas of science where the underlying phenomena span wide ranges in spatial and temporal scales. It is therefore highly likely that a computational approach to the earthquake problem will produce a similarly successful outcome, provided that the major focus of the work is understanding the origins and implications of the fundamentally multi-scale phenomena that are observed in many seemingly disparate systems.

Classes of Earthquake Problems: Within the region of spatial and temporal scales in Figure 1 describing earthquakes (denoted "faulting and seismology"), there exists a sub-hierarchy of processes roughly defined by current research activity within the scientific community. These include:

1. Deterministic prediction of rupture on a single planar fault (Time Scales: 10^-1 sec to 10^2 sec; Length Scales: 10^-1 km to 10^2 km).

2. Prediction of stress transfer and seismic triggering of multiple earthquakes on several faults (Time Scales: 10^4 sec to 10^8 sec; Length Scales: 10^1 km to 10^3 km).

3. Space-time patterns and emergent structures in large populations of earthquakes on multi-scale fault systems (Time Scales: 10^2 sec to 10^12 sec; Length Scales: 10^-2 km to 10^4 km).

4. Geologic structure and evolution of faults and fault zones including the roles of fragmentation, fluids, and thermomechanics (Time Scales: 10^3 to 10^8 years; Length Scales: 10^-6 km to 10 km).

Research over the last two decades has demonstrated that the significant physical processes associated with problems 1 and 2 occur over a narrow range of scales. For these problems, a classical "reductionist" approach continues to produce new results. However, progress on problems 3 and 4 has been limited, precisely because few methods have been available to earthquake scientists that allow physical processes to be linked across large ranges in scale. Because our interests lie directly in the area of understanding and learning to forecast large and damaging earthquakes, we propose a Pilot Project to address some of the important facets of problem 3. The remainder of this pre-proposal describes this Pilot Project.

Pilot Project: The purpose of the proposed work is to develop the modeling and analysis technologies that will be needed to implement large GEM simulations. We have chosen to focus on Problem 3 above because it directly addresses issues of scale in earthquake physics and because uncertainty in the details of friction and rupture processes is relatively unimportant at these time scales. The spatial and temporal details of the rupture process will be coarse-grained in the dynamical equations. Only the differences between the coarse-grained pre- and post-earthquake physical states, and the statistical evolution of the stress correlations that these differences imply, are relevant. For these models, simple scale-invariant Coulomb friction will suffice.

Over the past decade, several members of the GEM Team (Rundle, Ward, and Ben-Zion) have demonstrated the workstation-level feasibility of fault system computations of the kind contemplated. Typically, these computations involve about 100 to 1000 individual fault elements, usually of roughly similar size and having Coulomb-type friction laws. Interactions of faults on different scales have generally not been addressed. Computations, data assimilation, pattern analysis, and other interpretation techniques have been severely limited by available computational resources, although a preliminary parallel version has been developed by the collaboration for one code. The ability to include realistic fault geometry and rheology has been limited. Moreover, existing code is often neither portable nor transparent to other users. Significant progress on this critical problem requires a major leap forward in modeling, simulation, analysis, and software technology.

Models: A centerpiece of early development will be a newly constructed, variable-scale fault-interaction model, to be built by a team of earthquake scientists and experts in parallel computation using modern distributed object and component paradigms. Specifically, the proposed Pilot Project will consist of a set of N coupled, nonlinear boundary value problems obtained by formulating the balance of forces at each site x at time t on all of the faults in the model. The space-time evolution of the fault slip s(x,t) is a function both of the nonlinear friction on the fault surface and of the stress Δσ_ij(x,t) transferred to the site x at time t from slip at site x' at the earlier time t'. Stress transfer is characterized by a stress Green's function tensor T_ijkl(x-x', t-t'). Coulomb friction leads to a problem characterized by a sharp failure stress threshold: no slip occurs until the total shear stress σ_S(x,t) exceeds a value proportional to the total normal stress σ_N(x,t). Other friction laws, of either the slip-weakening or rate-and-state type, lead to equations of evolution for s(x,t) that do not have such sharp thresholds. For these types of friction laws, recent research has led to efficient algorithms for solving these problems on workstations for small numbers of faults.
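
To make the structure of these coupled threshold problems concrete, the following minimal sketch (Python, for illustration only) evolves N fault sites under slow tectonic loading, a Coulomb failure threshold, and stress redistribution through a precomputed interaction matrix. The kernel, friction coefficient, loading rate, and slip rule are illustrative stand-ins rather than the elastic Green's functions and parameter values the project would use.

    import numpy as np

    # Minimal sketch of the class of model described above: N fault sites with a
    # Coulomb failure threshold, interacting through a precomputed stress-transfer
    # (Green's function) matrix. All numerical values are illustrative.
    rng = np.random.default_rng(0)
    N = 200                                    # fault sites along a 1-D fault trace
    x = np.arange(N, dtype=float)

    # Stress-transfer matrix K[i, j]: shear-stress change at site i per unit slip
    # at site j. Off-diagonal terms load neighboring sites (decaying with distance);
    # the diagonal unloads the slipping site itself.
    r = np.abs(x[:, None] - x[None, :])
    with np.errstate(divide="ignore"):
        K = 0.2 / r**3
    np.fill_diagonal(K, -1.0)

    mu = 0.6                                   # Coulomb friction coefficient
    sigma_n = np.full(N, 10.0)                 # normal stress (held fixed here)
    arrest = 0.5 * mu * sigma_n                # stress level a failed site slips back to
    shear = rng.uniform(arrest, mu * sigma_n)  # initial shear stress
    load_rate = 0.01                           # tectonic loading per time step

    catalog = []                               # (time step, number of failed sites)
    for step in range(20000):
        shear += load_rate                     # slow, uniform tectonic loading
        failed = np.zeros(N, dtype=bool)
        while True:
            over = shear > mu * sigma_n        # Coulomb threshold: sigma_S > mu * sigma_N
            if not over.any():
                break
            failed |= over
            dslip = np.zeros(N)
            dslip[over] = shear[over] - arrest[over]   # slip enough to fall back to arrest
            shear += K @ dslip                 # redistribute stress; may trigger a cascade
        if failed.any():
            catalog.append((step, int(failed.sum())))

    sizes = np.array([n_failed for _, n_failed in catalog])
    print(f"{len(catalog)} events; largest ruptured {sizes.max()} of {N} sites")

Even this toy version produces cascades in which slip at one site pushes neighboring sites over their Coulomb threshold, which is the basic mechanism by which the full model will generate long synthetic seismicity catalogs.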

The planned adaptation of the highly successful astrophysical N-body algorithm known as the Fast Multipole Method to the implementation of the stress Green's functions T_ijkl(x-x', t-t') for interacting fault systems will represent a major advance that, when completed, will provide a unique capability for modeling complex crustal dynamics. The expected computer requirements will eventually exceed 1 teraflop; this capability will become available over the duration of the project. Our approach will ensure that these resources are well used, since the N-body algorithm is both accurate and efficient. The primary friction model(s) used for the GEM Pilot Project will be of the scale-invariant Coulomb type. However, the code and software will be constructed so that other friction laws, including the well-known slip-weakening and rate-and-state models, can be tested.
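
The essential idea exploited by such fast summation schemes can be illustrated with a Barnes-Hut-style far-field grouping, a simpler relative of the Fast Multipole Method: distant sources are collected into cells and replaced by a single aggregate contribution, so that long-range interactions need not be summed pairwise. The sketch below uses a generic 1/r^2 kernel in one dimension, with arbitrary cell-size and opening-angle parameters; none of these choices are taken from the actual GEM stress Green's functions.

    import numpy as np

    # Far-field grouping: the core cost-saving idea behind tree codes and the
    # Fast Multipole Method. Distant clusters of sources are replaced by one
    # aggregate term; only nearby sources are summed pairwise.
    def kernel(r):
        return 1.0 / r**2                      # generic long-range decay, not elasticity

    def direct_sum(x, s, i):
        """Exact field at target i from every other source."""
        r = np.abs(x - x[i])
        mask = r > 0
        return np.sum(s[mask] * kernel(r[mask]))

    def grouped_sum(x, s, i, cell=0.25, theta=0.5):
        """Approximate field at target i: near cells summed directly, far cells
        replaced by their total strength placed at the cell centroid."""
        idx = np.floor((x - x.min()) / cell).astype(int)   # assign sources to cells
        total = 0.0
        for c in np.unique(idx):
            in_cell = (idx == c)
            in_cell[i] = False                 # never include the target itself
            if not in_cell.any():
                continue
            centroid = np.average(x[in_cell], weights=s[in_cell])
            d = abs(x[i] - centroid)
            if d > cell / theta:               # far cell: one aggregate evaluation
                total += s[in_cell].sum() * kernel(d)
            else:                              # near cell: evaluate pairwise
                r = np.abs(x[in_cell] - x[i])
                total += np.sum(s[in_cell] * kernel(r))
        return total

    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 10.0, 2000)           # source positions along a fault trace
    s = rng.uniform(0.5, 1.5, 2000)            # source strengths (e.g., slip increments)
    print(direct_sum(x, s, 1000), grouped_sum(x, s, 1000))   # approximately equal values

A production implementation would retain higher-order multipole terms and a hierarchical tree of cells, but the cost saving comes from the same near/far split shown here.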

Pattern Analysis: Observations accumulated over many years clearly indicate the existence of recognizable space-time patterns in seismicity data. The exact nature of these patterns has been the subject of considerable debate. As part of the project, computational tools will be developed to analyze the information content of both real and simulated earthquake distributions using new ideas about pattern analysis and pattern reconstruction in complex systems. These include methods based upon the discrete Karhunen-Loeve expansion, scatter matrices and separability criteria, Bhattacharyya distance, correlation dimension, and "log-periodic" techniques.
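
As a concrete example of one such tool, the discrete Karhunen-Loeve expansion of a gridded seismicity-rate history reduces, in its simplest form, to a principal-component factorization of the space-time rate matrix. The sketch below applies this to a purely synthetic catalog; the grid size, window count, and Poisson rates are arbitrary illustrative choices.

    import numpy as np

    # Discrete Karhunen-Loeve (principal component) decomposition of a synthetic
    # space-time seismicity-rate matrix: rows are spatial cells, columns are
    # time windows.
    rng = np.random.default_rng(1)
    n_sites, n_windows = 50, 200
    rates = rng.poisson(2.0, size=(n_sites, n_windows)).astype(float)

    # Remove the mean rate at each site, then factor the anomaly matrix.
    anomalies = rates - rates.mean(axis=1, keepdims=True)
    U, svals, Vt = np.linalg.svd(anomalies, full_matrices=False)

    # Columns of U are the spatial KL modes; their time-dependent amplitudes are
    # the rows of diag(svals) @ Vt. The variance fraction of each mode measures
    # how much of the space-time pattern it explains.
    variance_fraction = svals**2 / np.sum(svals**2)
    amplitudes = np.diag(svals) @ Vt
    print(f"leading KL mode explains {100 * variance_fraction[0]:.1f}% of the variance")

For real or simulated catalogs, the leading modes and their amplitudes are the candidate space-time patterns whose information content can then be compared between data and models.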

Computations: The current computing environment is exemplified by isolated research groups using in-house software. The move to new, more fruitful collaborations marked by shared resources will take advantage of the emerging web-based object technology to create an environment where computing platforms, software, data and researchers will easily communicate. Early in the collaboration, agreement will be reached on a standard computational framework, consisting of a web-server object broker system and web-browser-based development environment, together with data-flow and visualization tools. Components of the framework will be acquired by the collaborating institutions. Because this is a rapidly evolving field, we defer the choice of tools to the initial phase of the work, and expect it to evolve as the collaboration proceeds. Promising technologies are described in "Building Distributed Systems for the Pragmatic Object Web" (see www.npac.syr.edu/users/shrideep/book), and prototypes have been studied at JPL and NPAC.

Data Assimilation and Model Validation: GEM will build and test modules that will allow comparison to observational data. First, models will be tested for self-consistency. Then data assimilation capabilities will be added so that the models can be calibrated and initialized using real data. The climate and ocean modeling community has developed data assimilation techniques that start with a given model code, from which the "adjoint model" is calculated. The adjoint can be run backward in time, beginning with current data, to predict the initial and boundary conditions that led to the current conditions. This computational procedure allows a subset of model variables to be adjusted so that initial conditions, historical observations, and current data can be optimally matched. Field observations of interest include GPS, InSAR, seismicity, and paleoseismic data, all of which are readily available in computerized databases.
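
A minimal sketch of the adjoint idea, assuming a toy linear forward model x_{t+1} = A x_t rather than an actual fault-system code, is given below: the data-misfit gradient is propagated backward in time with the transpose (adjoint) of the forward operator and used to adjust the initial state until the model trajectory matches the observations. The operator, synthetic observations, and step size are invented for illustration.

    import numpy as np

    # Toy variational (adjoint-based) assimilation: recover the initial state of a
    # linear model x_{t+1} = A x_t from noisy observations of the trajectory.
    rng = np.random.default_rng(2)
    n, T = 4, 20
    A = 0.9 * np.eye(n) + 0.02 * rng.normal(size=(n, n))   # contractive toy operator
    x0_true = rng.normal(size=n)

    def run_forward(x0):
        """Integrate the forward model, keeping the whole trajectory."""
        states = [x0]
        for _ in range(T):
            states.append(A @ states[-1])
        return states

    obs = [xt + 0.01 * rng.normal(size=n) for xt in run_forward(x0_true)]  # synthetic data

    x0 = np.zeros(n)                          # first guess at the initial state
    step = 0.05
    for _ in range(500):
        states = run_forward(x0)
        lam = states[T] - obs[T]              # adjoint variable at the final time
        for t in range(T - 1, -1, -1):        # the adjoint model runs backward in time
            lam = A.T @ lam + (states[t] - obs[t])
        x0 -= step * lam                      # lam is the gradient of the misfit w.r.t. x0
    print("error in recovered initial state:", np.linalg.norm(x0 - x0_true))

In the full system, the adjoint would be derived from the GEM model code itself, and the adjusted variables could include boundary conditions and fault parameters as well as the initial state.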

Multidisciplinary Aspects, HPC/NCC and Relation to KDI: The proposed Pilot Project in the GEM New Computational Challenge involves further development and use of a variety of technologies from other fields. These technologies, which have never been contemplated for use in earthquake-related problems, include:

Fast Multipole Methods that have been used extensively in gravitational N-body problems and computational fluid mechanics.

Karhunen-Loeve Expansions and other pattern recognition, feature extraction, mode shaping and "data mining" techniques that have been developed for the study of electronic systems, weather forecasting and oceanographic analysis.

Data Assimilation Methodologies that have been developed to perform "model steering" feedback in ocean and climate simulations.

Analysis Techniques from the theory of random-field systems, for use in understanding the impacts of random properties, un-modeled sub-grid-scale physics, and space and time coarse graining.

Computational, Web-Based, Multi-user Interactive Software that will use modern web-based technologies allowing large groups of scientists to interact meaningfully on difficult computational problems.

Current evidence suggests that forecasting earthquakes of magnitude ~6 and greater will depend upon understanding the space-time patterns displayed by smaller events, i.e., the magnitude 3's, 4's and 5's. With at least 40,000 km^2 of fault area in southern California, as many as 10^8 grid sites will be needed to accommodate events down to magnitude 3. Extrapolations based upon existing calculations indicate that, with time steps of ~100 sec, ~10^8 time steps will be required to simulate several earthquake cycles. With current computational technology, documented run-time estimates for this problem are several months.
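
These figures can be checked with simple arithmetic, assuming (for illustration) grid cells of order 20 m on a side, the resolution implicit in the 10^8-site estimate:

    # Back-of-the-envelope check of the grid and run-length figures quoted above.
    fault_area_km2 = 40_000        # southern California fault area (from the text)
    cell_m = 20.0                  # assumed cell dimension; not stated in the text
    n_sites = fault_area_km2 * 1.0e6 / cell_m**2
    print(f"grid sites ~ {n_sites:.0e}")                 # ~1e8

    time_step_s = 100.0            # from the text
    n_steps = 1.0e8                # from the text
    years = time_step_s * n_steps / 3.15e7
    print(f"simulated interval ~ {years:.0f} years")     # roughly 300 years

The implied simulated interval is roughly 300 years, consistent with the several earthquake cycles cited above.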

It should be emphasized that the technologies described above for the Pilot Project involve considerations of both scale and structure, as well as issues arising from the interplay between computations and data. Development of a working forecast technology demands that human interaction with the computational simulation be possible in real time. Thus, the space and time scales of this complex system, together with the novel multidisciplinary and human-interaction aspects, clearly place GEM in the category of KDI/NCC problems.