FG-120
FutureGrid education: Using case studies to develop a curriculum for communicating parallel and distributed computing concepts
Workshop: A Cloud View on Computing
Project Details
- Project Lead
- Jerome Mitchell
- Project Manager
- Jerome Mitchell
- Project Members
- Timothy Holston, Constance Bland, F Doswell, Willie Fuller, Candice Adams, Mohammad Hasan, Natarajan Meghanathan, Deshea Simon, Yenhung Hu
- Institution
- PTI, Indiana University
- Discipline
- Computer Science (401)
Abstract
June 6-10, 2011
Cloud computing provides elastic compute and storage resources for solving data-intensive science and engineering problems, but few students from under-represented universities are involved in or exposed to this area. To attract underserved students, we intend to train faculty members and graduate students from the Association of Computer/Information Sciences and Engineering Departments at Minority Institutions (ADMI) in cloud computing through a one-week workshop conducted on the campus of Elizabeth City State University. The workshop will enable faculty members and graduate students from underserved institutions who work with minority undergraduate students to learn about various aspects of cloud computing, and it will serve as a catalyst for passing that knowledge on to their students.
Intellectual Merit
The desired competencies for faculty and graduate students to acquire or refine in cloud computing are:
- Understand and articulate the challenges associated with distributed solutions to large-scale problems, e.g., scheduling, load balancing, fault tolerance, and memory and bandwidth limitations.
- Understand and explain the concepts behind MapReduce.
- Understand and express well-known algorithms in the MapReduce framework.
- Understand and reason about engineering tradeoffs in alternative approaches to processing large datasets.
- Understand how current solutions to a particular research problem can be cast into the MapReduce framework.
- Explain the advantages of using a MapReduce framework over existing approaches.
- Articulate how adopting the MapReduce framework can potentially lead to advances in the state of the art by enabling processing that was not possible before.
Broader Impacts
The curricula and tutorials can be reused in other cloud computing educational activities.
Scale of Use
15 generic users will need modest resources
Results
The hands-on workshop was held June 6-10, 2011. Participants were immersed in a “MapReduce boot camp” in which ADMI faculty members were introduced to the MapReduce programming framework. The five boot camp sessions had the following themes:
- Introduction to parallel and distributed processing
- From functional programming to MapReduce and the Google File System (GFS)
- “Hello World” MapReduce Lab
- Graph Algorithms with MapReduce
- Information Retrieval with MapReduce
An overview of parallel and distributed processing provided a transition into the abstractions of functional programming, which introduced the context of MapReduce along with its distributed file system. Lectures focused on specific case studies of MapReduce, such as graph analysis and information retrieval. The workshop concluded with a programming exercise (PageRank or the All-Pairs problem) to ensure that faculty members gained substantial knowledge of MapReduce concepts and the Twister/Hadoop API.
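To illustrate the step from functional programming to MapReduce described above, the following is a minimal single-process sketch of a word count, a common introductory MapReduce example. It is not the workshop's lab material and does not use the Twister or Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    emitted = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(emitted)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    print(word_count(["the cloud view", "the cloud"]))
```

In a distributed framework the map and reduce calls would run in parallel on different machines, with the shuffle and fault handling managed by the runtime; here everything runs sequentially so the data flow between the three phases is easy to follow.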
For more information, please visit http://salsahpc.indiana.edu/admicloudyviewworkshop/