Hierarchical Multidimensional Scaling to Process Massive Metagenomics Data

Project Information

Discipline
Computer Science (401) 
Orientation
Research 
Abstract

Using various algorithms to process massive metagenomics data through the Twister pipeline.

Intellectual Merit

An advanced Multidimensional Scaling (MDS) interpolation algorithm that makes clustering tens of millions of sequences possible.

Broader Impacts

Introduces a new way to cluster metagenomics data.

Project Contact

Project Lead
Yang Ruan (yangruan) 
Project Manager
Yang Ruan (yangruan) 

Resource Requirements

Hardware Systems
  • alamo (Dell PowerEdge at TACC)
  • foxtrot (IBM iDataPlex at UF)
  • hotel (IBM iDataPlex at U Chicago)
  • india (IBM iDataPlex at IU)
  • sierra (IBM iDataPlex at SDSC)
  • xray (Cray XT5m at IU)
 
Use of FutureGrid

Use FutureGrid resources for large-scale data processing.

Scale of Use

Dozens of nodes, reserved occasionally for a few days at a time.

Project Timeline

Submitted
04/22/2011 - 17:44