Using a Cloud Computing Environment to Develop and Test Bioinformatics Applications

Project Information

Discipline
Biology (603) 
Subdiscipline
11.01 Computer and Information Sciences, General 
Orientation
Research 
Abstract

This project will use a cloud computing environment to process large-scale genomic and metagenomic data. It will also use the cloud to develop bioinformatics pipelines and workflows and to test existing bioinformatics software.

Intellectual Merit

The project will develop a bioinformatics pipeline for processing large-scale genomic and metagenomic data on a cloud platform. It will also explore and analyze bioinformatics algorithms implemented with MapReduce on the Hadoop framework.

Broader Impacts

The project will analyze genomic and metagenomic data using existing software (e.g., CloVR) and develop new software and pipelines on the cloud. This work will enable the development of new bioinformatics tools for cloud computing environments and will allow biologists to run computationally intensive bioinformatics tasks easily and cheaply.

Project Contact

Project Lead
Abhiram Das (adasgt) 
Project Manager
Abhiram Das (adasgt) 

Resource Requirements

Hardware System
  • I don't care (what I really need is a software environment and I don't care where it runs)
 
Use of FutureGrid

Create VMs, capture images of those VMs, and develop bioinformatics pipelines using MapReduce and the Hadoop framework. Analyze existing bioinformatics software on the cloud.
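
As an illustration of the kind of MapReduce-style pipeline step described above, the following is a minimal sketch of k-mer counting written in plain Python. It is not the project's code: the function names (map_kmers, shuffle, reduce_counts, count_kmers) and the toy reads are assumptions made up for this example, and the map, shuffle, and reduce phases run in a single process rather than on a Hadoop cluster.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every window of length k in one read."""
    read = read.upper()
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each k-mer."""
    return {kmer: sum(values) for kmer, values in groups.items()}


def count_kmers(reads, k):
    """Run map, shuffle, and reduce in sequence over a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


# Toy reads invented for illustration; real input would come from sequence files.
reads = ["ACGTAC", "GTACGT"]
counts = count_kmers(reads, 3)
```

On a real Hadoop deployment the map and reduce functions would run as separate tasks over partitioned input, with the framework performing the grouping that shuffle simulates here.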

Scale of Use

Initially, use will be minimal: a few VMs, tens of GB of provisioned data storage, and access a couple of times a week.

Project Timeline

Submitted
01/17/2012 - 16:40