Budget Constrained and not quite enough; possible reduction in expected increases
$4.5B Program -- $100M to be extracted for NIF
NIF deployed in 2009 at best
Trying to do too much -- certification in trouble
NIF going forward -- money from base effort (total was $1.2B -- needs $0.75B more)
NIF Data plus simulation -- info for nuclear weapons
Vision of Reis; Implementation from Weigand
Leading edge of Scientific Simulation
ASCI healthiest part of stewardship program; most important draw for needed good people
100 Teraops is stopping point?
Too successful so vulnerable?
Is ASCI going faster than needed?
Unclassified standalone report
Morale terrible at LANL, Sandia (Security)
Are Milestones Good?
No champion left? Paul Messina is it now
No metric for value of simulation versus testing of nuclear weapons
Impact of DoE on computing
Messina Change -- only one lab will have a powerful computer
Applications driven
Platforms 1/3, Computer Science 1/3, Applications 1/3; Alliances, Human Resources etc. 10%
Mileposts
PSE 2001 and later (2006?)
Burn code Dec 31, 99 Milepost will be used in production by Livermore
Only ran on 1024 nodes as all components needed to be parallel
Sandia "Software Engineering"
40 offices at LANL -- High end visualization
40-70 Teraflop machine at Livermore removed from plan Dec 99
Easier to recruit for physics than computer science
ASCI must pay part of NIF overrun
Almost all of POOMA people gone; Reynders to Sun
"NTS" (Nevada Test Site) Model
What platforms 2005-2010?
Rosner -- uses HDF -- not available on Red
uses LLNL SAMRAI, ANL Jumpshot, Globus, ANL Visualization
OO Framework
Some Chicago students hired by Sandia
Juan Meza (at HQ on leave from Sandia Livermore)
security -- same people need to meet other milestones
SNL Computer Science Institute doing heterogeneous distributed computing
Computer Science Institutes focus on Visitors and sometimes permanent staff
Sandia LLNL
Sierra focussed on Finite Elements; POOMA more general
Sierra FEM focussed
Sierra funded as "applications" $1.6M (6 FTE for libraries)
Dave Womble (In charge of Institute) -- Good talk
Mesh
Trilinos framework
ARPACK
2001 Link algorithms to Sierra Framework
Heffelfinger -- Mesh Generation of PZT -- Want to use Hex meshes; only have tet capability
LANL ***********************************************
Koch -- Fortran v. Java is a religious war
Obviously doesn't get it
In general LANL talks low level
Physics people supply modules for "codes"
V/V
Formal Software Engineering at LANL for Blanca
Dendy discusses solvers as an example of algorithms
Use UPS -- Application groups uniform parallel software interface
Attract New People: salary, allow research
LLNL ************************************************
4 Integrated codes -- two full scale
One full scale accomplished Dec 1999 Burn milepost
Parallel visualization on Powerwall
Machine (Blue Pacific) Saturated
ALE3D: Uses MPI and threading -- no PSE
ParaDYN: more focussed than Sierra
DMF is some sort of I/O capability in PSE
ALE3D and ParaDYN are NOT using PSE's really
SAMRAI is a research effort
Applications: Physics and Materials good but no CS relationship
V/V relational database -- unclear if cross program metadata agreed (XML produced no recognition)
Need work on metadata
Using multigrid algorithms in new ASCI codes
Parallel Adaptive Algebraic Multigrid
Using Automatic Differentiation
20 Weapons designers at each lab
no RIF's allowed at laboratory so can't hire as budgets reduced
Software PSE well done but frozen at 1995 technology
No life-cycle plan for existing software
No new ideas in budget
Commercial contracts are sort of trivial (MPI)
MPI NOT allowed on each processor of IBM node
Kerberos+Globus from DoE Nov 00
Surprisingly low bandwidth between DoE sites
Parallel network
Sophisticated Computations
What is Process/Integration
May 5 **********************************************
3 to 12% Actual realized efficiency
Too expensive (human cost; ability to update) to improve
Sandia Red best balanced ASCI machine
CPLANT moving from hypercube to sophisticated grid
Both ASCI Blue machines had interconnect problems
Blue Pacific is 3 Sectors -- OK within sector
Blue Mountain is HIPPI connected clusters
Blue Horizon is San Diego SC ASCI machine
Using threading already on Blue Pacific (4 processors per node)
"Requirements" e.g. 16 bytes/OP for cache has not been realized on any machine
Best is YMP: 2 reads, one write, 2 FLOPs per cycle
VIEWS could be used in V and V -- not clear if it is
VIEWS is Visualization PLUS Data Management
No collaboration study
Parallel VTK
Parallel Visualization Framework
Collaborative Facility is a Shared Room
Computational Science Fellowship Program
Quality is improving
$2M to Blue Horizon for access
Retraining as no RIF allowed at Sandia
1 year at Sandia Livermore
2 year at Sandia New Mexico
Get certificates / Work with ASCI Scientist
More applicants than slots
CS Research Division at Los Alamos
Los Alamos lost 15% of staff since April
10% turnover at Sandia in CS
General turnover is 3%
$697M in 01 budget after $25M December 99 cut
This cut led to cancellation of 50-70 Tflop machine
NIF is another $24M reduction
Multiply by 5 as 5 year plan
What about Blue Gene
Do you need multiple codes?
Software life cycle cost
Technology Insertion
Research versus Engineering
Increased machine power translates into increased complexity and not increased resolution
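
Rough check of the memory-balance note above (16 bytes/OP "requirement" versus the YMP figure of 2 reads and one write against 2 FLOPs per cycle). This is only a sketch: the 8-byte word size is an assumption not stated in the notes, and the helper name bytes_per_flop is made up for illustration.

```python
# Assumption: one memory word is 8 bytes (64-bit); the notes give no word size.
BYTES_PER_WORD = 8

# Figure quoted in the notes as the "requirement".
TARGET_BYTES_PER_FLOP = 16.0


def bytes_per_flop(reads_per_cycle, writes_per_cycle, flops_per_cycle,
                   bytes_per_word=BYTES_PER_WORD):
    """Memory traffic per floating-point operation, in bytes per FLOP."""
    words_per_cycle = reads_per_cycle + writes_per_cycle
    return words_per_cycle * bytes_per_word / flops_per_cycle


# YMP figures as quoted in the notes: 2 reads, one write, 2 FLOPs per cycle.
ymp_balance = bytes_per_flop(reads_per_cycle=2, writes_per_cycle=1,
                             flops_per_cycle=2)
fraction_of_target = ymp_balance / TARGET_BYTES_PER_FLOP

print(f"YMP balance: {ymp_balance} bytes/FLOP "
      f"({fraction_of_target:.0%} of the 16 bytes/OP target)")
```

Under that word-size assumption the YMP comes out at 12 bytes/FLOP, short of 16, which is consistent with the note that the requirement has not been realized on any machine.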