Metagenome analysis of benthic marine invertebrates
Project Information
- Discipline
- Biology (603)
- Subdiscipline
- 40.05 Chemistry
- Orientation
- Research
We are carrying out deep sequencing of environmental DNA from benthic marine organisms that are important components of their community but that have not been extensively examined genomically. In these organisms, symbiotic bacteria are demonstrably critical to host survival. The metagenomes are extremely complex, yet robust assemblies can sometimes be achieved. These properties make benthic marine invertebrates excellent models for next-generation sequencing (NGS) technology. In this project, we will use Future Grid resources to carry out de novo assembly of marine invertebrate metagenomic sequence data, a process that requires large amounts of memory and CPU power due to the volume of data.
Intellectual Merit
This work will help determine the potential utility in metagenomics of NGS technology, which produces a large amount of data but as relatively short reads.
Broader Impacts
In the course of our work, we will determine the practical aspects of processing large and complex Illumina sequencing data to obtain de novo genome assemblies of very minor members of the metagenome. This will be of great use to the metagenomics community.
Project Contact
- Project Lead
- Malcolm Zachariah (mzachariah)
- Project Manager
- Malcolm Zachariah (mzachariah)
- Project Members
- Earl Middlebrook, Diarey Tianero, Thomas Waller, Thomas Kakule, Malcolm Zachariah, Jason Kwan, Russell Green, Zhejian Lin, Ashaimaa Moussa
Resource Requirements
- Hardware Systems
- india (IBM iDataPlex at IU)
- bravo (large memory machine at IU)
- delta (GPU Cloud)
Future Grid will be used for de novo assembly of metagenomic sequence data generated by Illumina technology. FG will also be used for analysis of the assembled data, including automatic annotation and large-scale BLAST searches.
Scale of Use
Assemblies using the program MetaVelvet require a single node with a large amount of memory (~150 GB). Ideally, we would be able to SSH into a single node to run the assembly. In the long term, we may explore more distributed workflows.
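Because the assembly needs roughly 150 GB of memory on one node, it can help to confirm a node's total memory after logging in and before starting a long run. The following is a minimal sketch, not part of the project's actual workflow. It assumes a Linux node that exposes /proc/meminfo, and it treats the ~150 GB figure as an approximate threshold in GiB. The function names and the threshold constant are hypothetical.

```python
import re

# Approximate requirement quoted in the text (~150 GB); treated here as GiB.
# This threshold is an assumption for illustration, not a measured value.
REQUIRED_GIB = 150


def mem_total_gib(meminfo_text):
    """Return the MemTotal value from /proc/meminfo-style text, in GiB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    # /proc/meminfo reports the value in kB (units of 1024 bytes).
    return int(match.group(1)) / (1024 ** 2)


def node_is_large_enough(meminfo_text, required_gib=REQUIRED_GIB):
    """Return True if the node's total memory meets the requirement."""
    return mem_total_gib(meminfo_text) >= required_gib


def check_this_node(path="/proc/meminfo"):
    """Read the local node's memory information and report the result."""
    with open(path) as handle:
        text = handle.read()
    total = mem_total_gib(text)
    verdict = "sufficient" if node_is_large_enough(text) else "insufficient"
    print(f"Total memory: {total:.1f} GiB ({verdict} for ~{REQUIRED_GIB} GiB)")
```

Calling check_this_node() on the node prints its total memory and whether it meets the threshold. Total memory is only an upper bound; other jobs on the same node may reduce what is actually available to the assembler.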
Project Timeline
- Submitted
- 09/02/2011 - 20:43