Course: Cloud Computing for Data Intensive Science Class

Abstract

A topics course on cloud computing for data-intensive science, offered in Fall 2011 as part of the Computer Science curriculum to 24 graduate students at the Masters and PhD levels.

Intellectual Merit

Several new computing paradigms are emerging from large commercial clouds. These include virtual-machine-based utility computing environments such as Amazon AWS and Microsoft Azure. In addition, a set of new MapReduce programming paradigms has come from the information retrieval field and has been shown to be effective for scientific data analysis. These developments were highlighted by a recent NSF CISE-OCI announcement of opportunities in this area. This class covers many of the key concepts using a common set of simple examples. It is designed to prepare participants to understand and compare the capabilities of these new technologies and infrastructure, and to give them a basic idea of how to get started. In particular, the Big Data for Science Workshop website covers the background and topics of interest. Projects include bioinformatics and information retrieval.

Broader Impact

This course will generate curricular material that will be used to build up an online distributed systems/cloud resource.

Use of FutureGrid

This course will cover programming models and tools of cloud computing that support data-intensive science applications. Students will get to know the latest research topics in cloud platforms and have the opportunity to understand some commercial cloud systems through projects using FutureGrid resources.

Scale Of Use

Modest resources for each student

Publications


Results

See the class web page: http://salsahpc.indiana.edu/csci-b649-2011/

This class involved 24 graduate students, a mix of Masters and PhD students, and was offered in Fall 2011 as part of the Indiana University Computer Science program. Many FutureGrid experts attended this class, which routinely used FutureGrid for student projects. Projects included:
  • Hadoop
  • DryadLINQ/Dryad
  • Twister
  • Eucalyptus/Nimbus
  • Virtual Appliances
  • Cloud Storage
  • Scientific Data Analysis Applications

FG-143
Judy Qiu
Tak-Lon Wu
Indiana University
Closed

Project Members

Abhinav Gopisetty
Anand Hegde
Anand Mukundan
Arvind Dwarakanath
Bina Bhaskar
Bingjing Zhang
Dhairya Gala
Doga Tuncay
Fei Teng
Hemanth Gokavarapu
Hui Li
Kaushik Chandrasekaran
Lilian Weng
Magesh Khanna Vadivelu
Manish Kantamneni
Nabeel Akheel
Nitya Shankaran
Peng Chen
Prajakta Purohit
Prerna Shraff
Priyank Shah
Ritika Sharma
Sankarbala Manoharan
Santhosh Kumar Saminathan
Shivaraman Janakiraman
Swathi Gurram
Vaibhav Nachankar
Vignesh Ravindran
Xiaoyang Chen
Yuan Gao
