Developing Virtual Clusters for Science Gateways and HPC Education

Project Information

Discipline
Computer Science (401) 
Subdiscipline
14.08 Civil Engineering 
Orientation
Research 
Abstract

The project will focus on the development of virtual cluster technologies to be used with HUBzero technology to allow earthquake engineering researchers to run parallel applications (such as OpenSEES) through the NEEShub for the NSF NEES project. Activities will be focused in four areas: 1) create configurable virtual clusters that can be defined and controlled through the NEEShub environment; 2) develop secure virtual cluster communications to facilitate live migration and the placement of virtual machines across campus; 3) develop strategies and technologies for proactive and reactive fault tolerance to improve the reliability of virtual clusters; and 4) create a development environment for parallel applications through the NEEShub. Students in undergraduate and graduate courses in high performance computing will also be involved in the virtual clustering research through projects and lectures developed as part of the research.

Intellectual Merit

The proposed activity seeks to integrate virtualization technology, science gateways, large-scale stores of experimental data from the NEES community, and existing and new parallel applications to provide a seamless, responsive, reliable, and easy-to-use parallel computing environment for the earthquake engineering community. The investigator is the NEES Co-Leader for Information Technology, has an NSF CAREER grant focused on developing more reliable parallel computing systems, is Co-PI of the Purdue CMS Tier-2 center, and has an established track record of publication and of developing production-quality cyberinfrastructure. The proposed activity will develop new approaches and technologies for reliable virtual clusters that can be used broadly across a wide variety of platforms and systems, and will leverage existing NEES activities at Purdue.

Broader Impacts

Undergraduate and graduate students in the investigator's high performance computing courses will use technologies developed in the project directly in course projects and research. Students this semester are developing individual virtual clusters based on KVM, OpenNebula, and OSCAR. Results from this work will give the earthquake engineering community access to parallel applications running through the NEEShub (http://nees.org) to speed discovery and facilitate the reuse of earthquake engineering experimental data available today in the NEEShub Project Warehouse. Research and education results will be disseminated through publication in conferences and journals (such as SC and ASEE), and made available through the NEEShub.

Project Contact

Project Lead
Thomas Hacker (tjhacker) 
Project Manager
Thomas Hacker (tjhacker) 

Resource Requirements

Hardware Systems
  • alamo (Dell OptiPlex at TACC)
  • hotel (IBM iDataPlex at U Chicago)
  • india (IBM iDataPlex at IU)
  • sierra (IBM iDataPlex at SDSC)
  • Network Impairment Device
 
Use of FutureGrid

Deploy VM images, set up a VPN (at the image level, not the network hardware level), emulate long RTT (using netem), and run parallel applications. An OpenNebula management environment would be very helpful.
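
As an illustration of the long-RTT emulation step, the following is a minimal Python sketch that builds a Linux `tc` command for the netem queueing discipline. It only constructs the command and does not run it, since applying it requires root privileges on the VM. The function name, the interface name `eth0`, and the choice to split the added delay evenly between the two endpoints are assumptions made for the example, not part of the project plan.

```python
import shlex


def netem_delay_command(interface: str, added_rtt_ms: float,
                        both_endpoints: bool = True) -> list[str]:
    """Build a `tc` command that adds netem delay on an interface's egress.

    netem delays outgoing packets only, so the added round-trip time is the
    sum of the delays configured on the two communicating hosts. When
    both_endpoints is True, each host is assumed to apply half of the
    target; otherwise a single host applies all of it.
    """
    if added_rtt_ms <= 0:
        raise ValueError("added_rtt_ms must be positive")
    per_host_ms = added_rtt_ms / 2 if both_endpoints else added_rtt_ms
    return ["tc", "qdisc", "add", "dev", interface, "root",
            "netem", "delay", f"{per_host_ms:g}ms"]


if __name__ == "__main__":
    # Hypothetical example: add 100 ms of RTT between two VMs,
    # running the printed command on each of them.
    print(shlex.join(netem_delay_command("eth0", 100)))
```

Running the generated command on both virtual machines of a pair would add roughly the requested round-trip time between them; passing `both_endpoints=False` produces the command for a setup where only one side is impaired.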

Scale of Use

I estimate that I will need access to between 48 and 128 cores on an intermittent basis. I currently have a lab at Purdue where I can gain access to up to 48 cores, which is useful for small-scale testing, but I need access to something a bit larger through the FutureGrid.

Project Timeline

Submitted
08/28/2011 - 08:18