Course: Cloud Computing for Data Intensive Science Class

Project Information

Discipline
Computer Science (401) 
Orientation
Education 
Abstract

A topics course on cloud computing for data-intensive science, offered in Fall 2011 as part of the Computer Science curriculum to 24 graduate students at the Master's and PhD levels.

Intellectual Merit

Several new computing paradigms are emerging from large commercial clouds. These include virtual-machine-based utility computing environments such as Amazon AWS and Microsoft Azure. There is also a set of new MapReduce programming paradigms, originating in the information retrieval field, that have been shown to be effective for scientific data analysis. These developments have been highlighted by a recent NSF CISE-OCI announcement of opportunities in this area. This class covers many of the key concepts with a common set of simple examples. It is designed to prepare participants to understand and compare the capabilities of these new technologies and infrastructure, and to have a basic idea of how to get started. In particular, the Big Data for Science Workshop website covers the background and topics of interest. Projects include bioinformatics and information retrieval.

Broader Impacts

This course will generate curricular material that will be used to build up an online distributed systems/cloud resource.

Project Contact

Project Lead
Judy Qiu (xqiu) 
Project Manager
Tak-Lon Wu (taklwu) 
Project Members
Peng Chen, Vignesh Ravindran, Santhosh Kumar Saminathan, Lilian Weng, Fei Teng, Nabeel Akheel, Kaushik Chandrasekaran, Arvind Dwarakanath, Dhairya Gala, Abhinav Gopisetty, Swathi Gurram, Shivaraman Janakiraman, Hui Li, Sankarbala Manoharan, Anand Mukundan, Vaibhav Nachankar, Priyank Shah, Prerna Shraff, Doga Tuncay, Magesh khanna Vadivelu, Bingjing Zhang, Bina Bhaskar, Nitya Shankaran, Ritika Sharma, Hemanth Gokavarapu, Prajakta Purohit, Anand Hegde, Xiaoyang Chen, Manish Kantamneni, Yuan Gao  

Resource Requirements

Hardware Systems
  • alamo (Dell OptiPlex at TACC)
  • foxtrot (IBM iDataPlex at UF)
  • india (IBM iDataPlex at IU)
  • sierra (IBM iDataPlex at SDSC)
 
Use of FutureGrid

This course will cover cloud computing programming models and tools that support data-intensive science applications. Students will learn about the latest research topics in cloud platforms and have the opportunity to understand some commercial cloud systems through projects using FutureGrid resources.

Scale of Use

Modest resources for each student

Project Timeline

Submitted
08/16/2011 - 12:13 
Completed
06/22/2012