General Earthquake Models (GEM)

Plans, Options, and Goals

A Program to Enhance Understanding

of Earthquake Physics

through High Performance Computation

Prepared by:

John B. Rundle

Department of Physics

Director, Colorado Center for Chaos & Complexity

Cooperative Institute for Research in Environmental Sciences

University of Colorado

Boulder, CO

December, 1998

Introduction

The General Earthquake Model (GEM) group has the goal of developing the capability to carry out realistic, high-performance computational simulations of earthquake processes on individual faults and on systems of faults within the Earth's crust. Advances have recently been made in understanding specific aspects of earthquake processes on various spatial and temporal scales; however, these efforts are generally carried out by individual scientists, leaving the field of earthquake science somewhat disjointed and poorly integrated. The next logical step is to bridge the gaps between models at different scales, which will greatly advance our understanding of earthquake processes.

We propose to develop a set of tools for use on high-performance computers to simulate earthquake processes. By developing these tools so that the gaps between small- and large-scale models are bridged, we will effectively create a "numerical laboratory". Investigating the physics of earthquakes can then be approached as an "experimental science" that complements existing observational and theoretical approaches. GEM will allow us to test models for consistency with one another, and to explore the effects of friction, segmentation, loading due to plate motion, boundary conditions, random events, and the neglect of sub-grid-scale processes, all of which will result in more physically realistic models and further advance our understanding of earthquakes.

Data will be assimilated to validate the models, and through GEM, diverse data sets can be related to each other by a process known as "Model-Based Inference". Data collection and archiving will not be a part of GEM; however, we anticipate that GEM will identify data needs, which will help to drive data collection efforts in a variety of disciplines.

We plan to use the GEM simulations to develop new space-time pattern recognition and analysis techniques. These may well lead to forecasting and prediction methodologies that reduce the earthquake hazard to lives and property.

Spatial and Temporal Scales

Processes associated with earthquakes are known to occur over a wide variety of spatial and temporal scales. Figure 1 illustrates this point. For example, frictional processes, the primary nonlinearity associated with earthquakes, are known to be physically significant from the molecular scale, on time intervals of less than 10^-8 seconds and lengths of Ångstroms (Å), to plate motion scales, on time intervals of 10^6 years and lengths in excess of 1000 km. The physical processes directly associated with faulting and seismology, including nucleation and quasistatic crustal deformation, occur over time intervals of fractions of seconds to thousands of years, and lengths of meters to hundreds of km. By contrast, experiments on frictional sliding of rock samples are carried out over laboratory scales extending over typical time intervals of 10^-3 seconds to days, and over length scales of cm to m. Figure 1 indicates that, in terms of length scales, processes of faulting, seismology, and crustal deformation are as far removed from laboratory experiments on rock samples as those laboratory rock processes are from processes on the molecular scale. There are no reliable means presently available to relate results observed on one set of scales to those on another. It is therefore of critical importance to develop an experimental numerical capability that can help to span these large ranges in scales, so that physical (and chemical) processes on one scale can be evaluated for their importance on other, distinctly different scales. An example is the need to determine whether empirical or theoretical friction laws developed to describe the nonlinearity of sliding on one temporal and spatial scale apply to sliding on other scales.

Within the region of spatial and temporal scales in Figure 1 describing earthquakes (denoted "faulting and seismology"), there exist "sub-scales" that describe additional hierarchies of processes. These include, but are not limited to, the following:

1. Individual earthquakes on a single planar fault. Processes that may be important include inertia; stress transfer via waves and its comparison to quasistatic stress transfer calculations; elastic heterogeneity at small scales (elastic wave scattering); quasistatic and dynamic friction; pore fluid flow; and wear and evolution of the sliding surface. It is important to have friction models for both the quasistatic and dynamic regimes, based upon velocity-, slip-, or rate- and state-weakening processes.

Time Scales: 10^-1 sec to 10^2 sec

Space Scales: 10^-1 km to 10^2 km

2. Multiple earthquakes on several faults. Processes that may be important include quasistatic and dynamic friction; pore fluid flow; dynamic stress transfer via waves; quasistatic/static stress transfer via elasticity and viscoelasticity; earth rheology; Coulomb Failure Functions; Viscoelastic Coulomb Failure Functions; stress shadows; geometry of faults; and how faults influence one another.

Time Scales: 10 sec to 10^8 sec

Space Scales: 10^-1 km to 10^3 km

3. Statistics, correlations, and space-time patterns of earthquakes in large populations of earthquakes on large fault systems. Processes that may be important include quasistatic friction; pore fluid flow; quasistatic/static stress transfer; earth rheology; sub-grid-scale faults (sources of noise); plate-tectonic stress loading processes; and evolution in fault geometry due to fracturing of fresh rock. Since space-time correlations in populations of earthquakes are of primary interest, only processes on time scales longer than an earthquake source time are important, because all smaller time scales are coarse-grained out of the problem. Scaling, critical phenomena, and nucleation processes are clearly relevant on these time and space scales, as is obvious from observations of the Gutenberg-Richter relation, Omori's law, and so forth.

Time Scales: 10^2 sec to 10^12 sec

Space Scales: 10^-1 km to 10^3 km

4. Physics of fault gouge formation, the role of granular media in the geologic structure of faults, and heat flow questions. Processes that may be important include comminution; formation and mechanics of granular gouges; formation of stress "bridges" between the sides of faults; fluid infiltration; fracturing and abrasion of intact rock; and spatial-temporal evolution of fault geometries. These processes are primarily associated with the formation of geologic fault structures, and their influence on heat flow and the thermomechanics of slip.

Time Scales: 10^-1 sec to 10^6 years

Space Scales: 10^-6 km to 1 km

It will clearly not be possible to address all of these problems in a single proposal, or perhaps even within a single computational or physical framework. Choices must therefore be made as to which problem(s) to address in a given proposal, and in which priority. These choices and priorities will be determined in consideration of a given funding target of opportunity.

GEM is an HPC-class Problem

The very large range of spatial and temporal scales that these problems involve clearly demonstrates that GEM is an HPC-class problem. Current evidence indicates that forecasting damaging earthquakes of magnitude ~6 and greater almost certainly depends upon understanding the space-time patterns displayed by smaller events, e.g., the magnitude 3's, 4's, and 5's. At least 40,000 km^2 of fault area in southern California is capable of participating in magnitude 6 and greater events. Hence, with a spatial resolution of about 100 m needed to eliminate grid-scale effects and to capture the physical processes of the magnitude 3 events, and a temporal resolution of ~100 sec, one concludes that as many as 10^6 grid sites will be necessary for a maximally realistic simulation. If grid sizes at the 10 m scale are used to capture the failure physics of the magnitude 3 events, then ~10^8 grid sites will be needed. Using current computational technology, run-time estimates of several months for this problem can be documented.
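
The grid-count arithmetic behind these figures is summarized in the short sketch below (Python is used purely for illustration; the fault area and cell sizes are the values quoted above).

    # Back-of-the-envelope estimate of the number of grid sites, using the
    # figures quoted above (40,000 km^2 of fault area; 100 m or 10 m cells).
    fault_area_km2 = 40_000.0

    def grid_sites(cell_size_m):
        """Number of square cells of side cell_size_m needed to tile the fault area."""
        area_m2 = fault_area_km2 * 1.0e6
        return area_m2 / cell_size_m ** 2

    print(grid_sites(100.0))   # ~4e6 sites at 100 m resolution
    print(grid_sites(10.0))    # ~4e8 sites at 10 m resolution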

Scientific Objectives:

A well-constructed simulation technology should hold the promise of an iterative solution to earthquake problems through a direct comparison of simulations with a wide variety of data. In this context, specific scientific objectives of our research include:

Objective 1. Cataloguing and understanding the nature and configurations of space-time patterns of earthquakes and examining whether these are scale-dependent or scale-invariant in space and time. Correlated patterns may indicate whether a given event is a candidate foreshock. We want to study how patterns form and persist. One application will be to assess the validity of the "gap" and "antigap" models for earthquake forecasting. Another will be to understand the physics of "correlation at a distance", and "time delayed triggering", which result in seismicity that seems correlated over larger distances and time intervals than previously thought possible.

Objective 2. Identifying the key parameters that control the physical processes and space-time patterns. We want to understand how fault geometry, friction laws, and Earth rheology enter the physics of the process, and which of these are the controlling parameters.

Objective 3. Understanding the importance of inertia and seismic waves in determining details of space-time patterns and slip evolution.

Objective 4. Understanding the physical justifications for space-time coarse-graining over smaller scales, the role of sub-grid scale processes, and whether these might be parameterized in terms of uncorrelated or correlated noise.

Objective 5. Ascertaining the possible effects of unmodeled processes, including neglected, hidden, or blind faults, lateral heterogeneity, variability in friction laws, the nature of the tectonic forcing, and Earth rheology.

Objective 6. Developing and testing potential earthquake forecast algorithms, based primarily upon the use of space-time patterns and correlations in the fault system of interest.

Project Plans

Models: The first step in developing General Earthquake Models will be to identify existing codes and models ranging in scale from tenths to tens of kilometers. We expect that GEM will encompass the dynamics and physics of individual faults all the way through systems of faults. This includes fault rupture, the causes of rupture growth into a large earthquake, fault interactions on short and long time scales, and the roles of rheology, heterogeneity, random properties, and noise.

A centerpiece of early development will be a newly constructed fault-interaction model, to be built by a world-class team of parallel-computing physical modelers using modern object/component paradigms. The stress transfer interactions between faults will be described by stress Green's functions whose mathematical forms have been tabulated in the literature. Computing these interactions directly, however, is known to be an order N^2 operation, and will therefore represent a major time constraint on the practical size of the models. Recently, however, dramatic progress in astrophysical and fluid mechanical computations has been made possible by the development of "Fast Multipole Methods". In this technique, tremendous computational savings may be realized by combining bodies into "cells" and approximating their external field with a truncated multipole expansion. When this idea is applied systematically, the number of interactions may be reduced to O(N log N). The cells are generally arranged in a tree, with the root of the tree representing the entire system, and descendants representing successively smaller regions of space. Using this method, it has been demonstrated that such codes can run in parallel on thousands of processors, simulating highly irregular cosmological systems of over 300 million bodies.

The planned adaptation of this hugely successful astrophysical N-body algorithm (the Fast Multipole Method) to the implementation of stress Green's functions on interacting fault systems will represent a major advance, which when completed will provide a unique capability for modeling complex crustal dynamics. It will scale up naturally to our anticipated scenarios of increasing complexity within the computer power available now and expected to grow over the life of the project. The computer requirements as the model matures will exceed 1 teraflop of performance, which will become available through a number of computing centers during this time. Our approach ensures that these resources are well spent, as the N-body algorithm is both accurate and efficient.
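
To make the tree-code idea concrete, the following is a minimal, illustrative sketch in Python (Barnes-Hut style, keeping only the zeroth-order "monopole" term in one dimension). It is not the GEM implementation: the kernel, cell-splitting rule, opening criterion, and all numerical values are placeholder assumptions, and a production fast multipole code would use the actual elastic Green's functions, higher-order expansion terms, and a parallel tree traversal.

    import numpy as np

    # Illustrative Barnes-Hut-style tree summation in one dimension. The kernel
    # below is a placeholder stand-in for a stress Green's function.

    def kernel(r):
        return 1.0 / np.abs(r)              # placeholder far-field decay

    class Cell:
        def __init__(self, x, s, lo, hi):
            self.x = x[lo:hi]               # positions of the elements in this cell
            self.s = s[lo:hi]               # slip (source strength) of each element
            self.center = self.x.mean()
            self.moment = self.s.sum()      # zeroth-order (monopole) moment of the cell
            self.size = float(self.x.max() - self.x.min())
            self.children = []
            if hi - lo > 8:                 # split large cells into two children
                mid = (lo + hi) // 2
                self.children = [Cell(x, s, lo, mid), Cell(x, s, mid, hi)]

    def field_at(cell, xp, theta=0.5):
        """Approximate sum_j kernel(xp - x_j) * s_j using the cell tree."""
        if not cell.children:               # leaf cell: do the exact sum
            return float(np.sum(kernel(xp - cell.x) * cell.s))
        if cell.size < theta * abs(xp - cell.center):
            return cell.moment * kernel(xp - cell.center)   # well separated: one term
        return sum(field_at(c, xp, theta) for c in cell.children)

    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(0.0, 100.0, 4000))   # element coordinates (arbitrary units)
    s = rng.uniform(0.0, 1.0, 4000)              # slip on each element
    root = Cell(x, s, 0, x.size)
    print(field_at(root, 150.0))                 # stress-like sum at a point off the fault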

One of the major ingredients in all of problems 1-3 outlined above is the nonlinear friction law used to describe the physics of sliding. In general, the friction models used for GEM will be based upon 1) laboratory experiments, such as the slip-weakening or rate-and-state models; 2) computationally simple models that capture important aspects of frictional sliding in a simple way, such as the classic Coulomb-Amontons law; or 3) statistical mechanics, in which the basic phenomenology of friction has been incorporated on coarse-grained space-time scales. Examples of the latter include the traveling density wave model and a variety of hierarchical statistical models arising from Renormalization Group considerations. In simulations of problem 4 above, the macroscopic friction law is a result of the detailed contact interactions of the many granular bodies contained between the rigid walls of the fault.
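
As an illustration of the rate-and-state class of laws mentioned above, the sketch below integrates a Dieterich-Ruina friction law with the "aging" state evolution at an imposed sliding velocity. The parameter values and the forward-Euler integration are illustrative assumptions only, not specifications of a GEM friction module.

    import numpy as np

    # Minimal sketch of a Dieterich-Ruina rate-and-state friction law with "aging"
    # state evolution, integrated at an imposed sliding velocity.
    mu0, a, b = 0.6, 0.010, 0.015       # reference friction, direct and evolution effects (illustrative)
    V0, Dc = 1.0e-6, 1.0e-5             # reference velocity (m/s), critical slip distance (m)

    def friction(V, theta):
        """mu = mu0 + a*ln(V/V0) + b*ln(V0*theta/Dc)."""
        return mu0 + a * np.log(V / V0) + b * np.log(V0 * theta / Dc)

    def step_state(theta, V, dt):
        """Aging law: d(theta)/dt = 1 - V*theta/Dc (forward Euler step)."""
        return theta + dt * (1.0 - V * theta / Dc)

    V = 1.0e-5                          # impose a tenfold jump in sliding velocity
    theta = Dc / V0                     # state variable at the previous steady state
    dt = 1.0e-3
    for _ in range(20000):              # integrate 20 s of sliding
        theta = step_state(theta, V, dt)

    print(friction(V, theta))           # approaches mu0 + (a - b)*ln(V/V0), about 0.588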

As our understanding of the physics of the various models increases, we will want to increase the scale and complexity of the codes. Thus the community should expect to begin with a series of pilot projects addressing problems 1-4 described above, using the simplest physics possible. To facilitate ease of use of the models, additional tools (parallel, distributed, or desktop, as appropriate) will be coded to construct and import details of fault geometry and earth rheology, to model details of fault friction and rupture, to perform analysis of the space-time patterns of the simulated fault-slip history, and to obtain, display and compare archived physical and simulated data sets.

These models and associated tools will be immediately and broadly available through a web visual-graph data-flow tool, in the context of a new web-distributed development/communication/visualization environment. This will enable concurrent refinement and improvement of the initial coarse large-scale model of the target fault system with respect to many levels and kinds of physical modeling. New physical models that represent the details will be rapidly plugged in for trial testing, with many experts at remote sites able to compare and discuss the results. In this way we can most speedily evaluate the impact of new model components as they are constructed, and enable fruitful comparisons of simulated results with the rapidly expanding sets of geophysical data. This approach is the most promising for closing the loop between physical observations and models. The models will be more swiftly evaluated and improved based on the data, and the understanding gained by comparing hypothetical models will indicate fruitful new modes of data collection.

Pattern Analysis: Anecdotal evidence accumulated over many years clearly indicates the existence of space-time patterns in seismicity data. The exact nature of these patterns, however, has so far eluded identification. As part of the project, computational tools will be developed to analyze the information content of both real and simulated earthquakes using new ideas about pattern analysis and pattern reconstruction in complex systems.

Attempts to identify patterns and use them to forecast seismic activity have been based upon several approaches. One of the oldest is exemplified by the M8 algorithm, in which several seismic activity functions are tracked as functions of time. When these attain preset values, a "Time of Increased Probability" (TIP) is triggered and remains in effect for several years. Another method relies on identification of a precursory "Active Zone" before the largest events, a feature that seems to be evident in a variety of numerical simulations. Still another promising approach is the log-periodic time-to-failure method, which relies on a characteristic signature arising from an earthquake failure process involving a discrete, scale-invariant hierarchy of smaller events. Finally, standard pattern recognition techniques based on calculations of correlation dimension have been applied to simulations in an effort to categorize the kinds of space-time patterns that may exist in real data. All of these approaches implicitly assume that space-time patterns do exist in the data and can be discovered through analytical techniques.
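
For reference, one commonly used parameterization of such a log-periodic time-to-failure signature (shown here purely as an illustration; this specific functional form is our assumption and is not prescribed by this plan) expresses a cumulative measure of seismic release, epsilon(t), approaching a failure time t_c as

    \epsilon(t) \approx A + B\,(t_c - t)^m \left[ 1 + C \cos\left( \frac{2\pi \ln(t_c - t)}{\ln \lambda} + \phi \right) \right]

where \lambda is the preferred scaling ratio of the underlying discrete hierarchy and A, B, C, m, and \phi are fit parameters.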

More recently, a new "pattern dynamics" (PD) approach has been devised that is based upon representing seismicity patterns as a discrete Karhunen-Loève expansion in the eigenstates of an appropriately constructed autocorrelation (equal-time correlation) operator. This approach assumes that the physical process of interest is statistically stationary, and is also based upon the premise that space-time patterns are encoded within the autocorrelation function obtained from the time history of seismic activity. The existence of such patterns has been demonstrated in realistic numerical simulations of seismic activity, and preliminary work strongly suggests that similar patterns can be detected in real data. However, given the long temporal scales of seismic activity (compared to human life spans), it is likely that the Karhunen-Loève eigenstates, or eigen-patterns, can never be accurately determined from observational data alone. Instead, it will almost certainly be necessary to construct the eigenstates by the use of realistic simulations that capture the important aspects of the statistics (but not necessarily the detailed temporal ordering) of the real data.
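
A minimal numerical sketch of this decomposition follows (Python with numpy); the random activity matrix is a stand-in for a gridded catalogue of real or simulated seismicity, and the array sizes are arbitrary assumptions.

    import numpy as np

    # Sketch of the Karhunen-Loeve ("pattern dynamics") decomposition described
    # in the text. Each column of the activity matrix would, in practice, hold
    # the activity time series of one spatial box covering the fault system.
    rng = np.random.default_rng(0)
    n_times, n_boxes = 2000, 50
    activity = rng.poisson(1.0, size=(n_times, n_boxes)).astype(float)

    # Equal-time correlation operator over the spatial boxes.
    demeaned = activity - activity.mean(axis=0)
    corr = demeaned.T @ demeaned / n_times

    # Eigenstates ("eigen-patterns") of the correlation operator and their weights.
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Project the most recent activity snapshot onto the leading eigen-patterns
    # to see where the system currently sits in pattern space.
    amplitudes = eigvecs[:, :5].T @ demeaned[-1]
    print(eigvals[:5])                            # variance carried by the dominant patterns
    print(amplitudes)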

Once the eigenstates have been determined, and their corresponding time-evolution operators reconstructed, they can be used on real data sets 1) to identify in which combination of eigen-patterns the real fault system currently resides, and 2) to forecast into which space-time pattern the real fault system is likely to evolve. A similar approach is currently used in El Niño forecasting, with an approximately 70% success rate.

Computations: The current environment is exemplified by isolated research groups that specialize in home-brewed software. To move to new, more fruitful collaborations marked by shared resources, we will take advantage of emerging web-based object technology to create an environment in which computing platforms, software, data, and researchers can easily communicate. A standard computational framework, consisting of a web-server brokerage system and web-browser-based development, data-flow, and visualization tools, will be agreed upon and its components acquired by the collaborating institutions. Because this is a rapidly evolving field, we defer the choice of tools to the initial phase of the work, and expect it to evolve as the collaboration proceeds. Promising technologies are described in "Building Distributed Systems for the Pragmatic Object Web" (Fox et al., http://www.npac.syr.edu/users/shrideep/book/), and prototypical systems have been investigated at JPL and NPAC.

One feature of this computational infrastructure will be the inexpensive reuse of existing software and the ability to rapidly develop and widely test new software model features. In order to realize these advantages, an early activity of the collaboration will be the encapsulation of many existing codes within web-accessible "wrappers". This will allow any collaborator to learn and run any application in the shared resource pool, locally (if language, platform, and proprietary interests permit), or else across the network on the machine(s) supported by the developers of the original software. Technology to do this securely exists now in the form of Java wrappers and Common Object Request Broker Architecture (CORBA)-based process initiation and communication. Collaborators may choose to share their source code and/or executables, or use the distributed object technology to allow others to run the process on otherwise private host machines.
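
As a concrete, if simplified, illustration of such encapsulation, the sketch below wraps a hypothetical legacy executable behind a single Python function. The executable name, input deck, and output file are placeholders, and the actual project would expose this kind of wrapper through Java/CORBA for secure remote execution rather than as a purely local script.

    import os
    import subprocess
    import tempfile

    # Thin "wrapper" around an existing simulation code (all names hypothetical).
    def run_legacy_fault_code(input_text, executable="./fault_sim"):
        """Write an input deck, run the legacy executable, and return its output."""
        with tempfile.TemporaryDirectory() as workdir:
            in_path = os.path.join(workdir, "input.dat")
            out_path = os.path.join(workdir, "output.dat")
            with open(in_path, "w") as f:
                f.write(input_text)
            subprocess.run([executable, in_path, out_path], check=True)
            with open(out_path) as f:
                return f.read()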

Some applications, such as regional interacting fault models, will require exceptional computational resources. Such applications will be written for parallel supercomputers using portable language features such as MPI and high-performance FORTRAN or C++, but will be accessible to all collaborators as web-based client-server objects to be run on the high-performance computers at some of the participating institutions. As some applications are encapsulated and others developed, instructions for use, descriptions of input/output, source code (when supplied), and means for initiating remote execution will be made available to all collaborators via the world-wide web. In this way the web-based computational infrastructure will enable both resource sharing and rapid prototyping and testing.
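
The following sketch illustrates, in miniature, the kind of domain decomposition such a parallel application might use. It is written with mpi4py for brevity (the plan itself calls for FORTRAN or C++), and the element count and the stress "update" are placeholders.

    import numpy as np
    from mpi4py import MPI

    # Toy domain decomposition: each MPI rank owns a contiguous block of fault elements.
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    n_elements = 1_000_000                    # fault elements in the whole model (placeholder)
    lo = rank * n_elements // size            # this rank's contiguous block
    hi = (rank + 1) * n_elements // size
    local_stress = np.zeros(hi - lo)          # stress carried by the local elements

    local_stress += 1.0e-3                    # placeholder for a real stress update

    total = comm.allreduce(float(local_stress.sum()), op=MPI.SUM)
    if rank == 0:
        print("global stress proxy:", total)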

Data Assimilation and Model Calibration: Following, and in parallel with, the code development, GEM will build and test modules to compare observational data with the models. First will come tests for consistency between models; then data assimilation capabilities will be added so that the models can be calibrated and initialized using real data. Data assimilation techniques have been developed by the climate and ocean modeling community that start with a given model code, from which an "adjoint model" is calculated. The adjoint can be run backwards in time, beginning with current data, to predict the original initial and boundary conditions that led to the current conditions. This computational procedure allows a subset of model variables to be adjusted so that initial conditions, historical observations, and current data can all be optimally matched. Techniques to fit models to data will also be developed using standard methods, including least squares, evolutionary programming, and simulated annealing, among others. Self-adaptation methods will be based on the same kinds of back-propagation methods that have been useful in the analysis of neural network models. All of these methods pose unique problems that will require innovative new strategies to be developed.
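
As a small illustration of the model-fitting component, the sketch below recovers two parameters of a toy deformation model from synthetic noisy observations by nonlinear least squares. The model, data, and parameter names are illustrative assumptions, not GEM specifications.

    import numpy as np
    from scipy.optimize import least_squares

    # Fit a two-parameter toy deformation model (rate*t + offset) to noisy synthetic data.
    rng = np.random.default_rng(0)
    t_obs = np.linspace(0.0, 10.0, 50)                   # observation epochs (years)
    true_rate, true_offset = 0.03, 0.5                   # synthetic "truth" (m/yr, m)
    d_obs = true_rate * t_obs + true_offset + 0.005 * rng.standard_normal(t_obs.size)

    def residuals(params):
        rate, offset = params
        return rate * t_obs + offset - d_obs

    fit = least_squares(residuals, x0=[0.0, 0.0])
    print(fit.x)                                         # recovered rate and offset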

Field data of interest include GPS, InSAR and broadband seismic (TERRASCOPE) data, together with archived and newly developed paleoseismic information in the SCEC database. Relevant model parameters that these data can help to determine include the current geometry of faults; slip rates on any given segment; recurrence intervals and historic variations in slip during earthquakes, leading to estimates of frictional parameters; deformation data leading to estimates of elastic plate thickness and sub-crustal stress; relaxation times; poroelastic stress relaxation in the crust following earthquakes, leading to estimates of drained and undrained elastic moduli; and variations in seismicity, leading to estimates of the variable properties of friction and fault geometry at depth.

Validation via Laboratory Experiments: While the collection of significant new laboratory data is outside the scope of the GEM project, data from laboratory experiments are needed to validate the GEM simulations. Validation can take many forms, but there would seem to be two primary functions: 1) the use of experimental data in verifying numerical techniques and the physics included in (or excluded from) the models; and 2) the use of numerical models in connection with a concurrent, active program of laboratory experiments to further identify the physical mechanisms that operate in laboratory experiments and in earthquake fault zones. However, it must be recalled that laboratory experiments provide information only about a strongly limited set of space-time scales, for given materials and conditions. How these results scale up or apply to field situations is unknown.

For item (1), the question is what level of complexity should be required in the (fault-zone) constitutive law for a given problem. This is a standard approach in existing models of single faults and interacting faults (e.g., one asks: can Gutenberg-Richter scaling be reproduced by simple static-dynamic friction on a uniform planar fault for a given model and boundary conditions?), but few studies have addressed more complex issues. For models of fault zone weakness, one key task will be to study a range of boundary conditions, including those used in the laboratory, to verify that the model is capable of reproducing laboratory data at laboratory scales. Verifying that laboratory observations can be reproduced will be a significant step, since it will allow testing of specific assumptions about the underlying physics.

For item (2), GEM can uniquely contribute to a fundamental understanding of granular mechanics, fault zone physics, and the rheology of brittle fault zones. As part of the overall effort, models should be developed to study the scale-dependent effects of particle contact mechanics, the mechanical and chemical effects of compaction, and the effective frictional properties of fault zones. These results would have a major impact on understanding the implications and limitations of laboratory data as applied to faulting.

Organization

Project organization will include the following functions:

Planning and Coordination

Modeling and Analysis

Computations

Validation/Data Assimilation

Outreach and Information Dissemination

Because this document is a Strategic Plan, rather than a specific proposal to a funding agency, it does not advocate any particular management structure. However, it would be best if close coordination could be maintained between ongoing projects and their various participants. The optimal means of establishing such coordination is probably through the management structures associated with the Southern California Earthquake Center and its successor, the California Earthquake Research Center. For each individual GEM project, a Principal Investigator should be named who would have final authority for all decisions on project direction and fund expenditures. It might also be wise to appoint an Executive Committee to provide advice and expertise to the GEM project manager. In addition to this internal committee, it might also be advantageous to establish an Advisory Committee whose distinguished members would come from outside the GEM project, and whose function would be to provide oversight and suggestions that might not be obvious to internal members of the Team.

Priorities

Priorities will be determined by:

1) Intellectual maturity of the problem

2) Interests of potential funding agencies and programs.

Projects 1 to 4 described above have all been the subject of extensive research efforts during the last 5 to 15 years. Various groups, most of them disjoint, have developed numerical and analytical techniques and results that have been published extensively in the literature. Of the four projects, 1, 2, and 4 have been primarily the province of geophysicists and engineers. Problem 3 has attracted the interest not only of geophysicists but also of the condensed matter physics community, which has been drawn to the associated phenomena of scaling, critical phenomena, and nucleation.

Budget

While no detailed cost breakdowns have been formulated, it is possible to envision an extremely basic, "bare bones" GEM project funded at the level of $100,000/year. This level of funding would allow a small amount of model and software development, and coordination and integration of investigators funded from independent, but complementary sources.

Funding for a self-contained GEM project would begin at the level of $1,000,000/year, a level that would allow an active GEM Team to be formed and to substantively address a number of the tasks described here within a period of about 5 years. To make truly significant progress on all of the major tasks within 3 to 5 years would probably require a yearly funding level in excess of $2,000,000/year.

For comparison, the Japanese have initiated the CAMP earth simulator project, involving the construction of a new NEC-built supercomputer capable of a peak performance of more than 30 Tflops. The software budget for the earthquake part of that project is comparable to the higher funding levels listed above.