Hierarchical Multidimensional Scaling for Processing Massive Metagenomics Data

Project Information

Discipline: Computer Science (401)
Orientation: Research

Abstract

This project applies several algorithms, run through the Twister pipeline, to process massive metagenomics data sets.

Intellectual Merit

An advanced Multidimensional Scaling interpolation algorithm that makes it possible to cluster tens of millions of sequences.

Broader Impacts

Introduces a new way to cluster metagenomics data.

Project Contact

Project Lead: Yang Ruan (yangruan)
Project Manager: Yang Ruan (yangruan)

Resource Requirements

Hardware Systems

alamo (Dell optiplex at TACC)
foxtrot (IBM iDataPlex at UF)
hotel (IBM iDataPlex at U Chicago)
india (IBM iDataPlex at IU)
sierra (IBM iDataPlex at SDSC)
xray (Cray XT5m at IU)

Use of FutureGrid

Use FutureGrid resources for large-scale data processing.

Scale of Use

Dozens of nodes, reserved occasionally for a few days at a time.

Project Timeline

Submitted: 04/22/2011 - 17:44


The FutureGrid project is funded by the National Science Foundation (NSF) and is led by Indiana University with University of Chicago, University of Florida, San Diego Supercomputer Center, Texas Advanced Computing Center, University of Virginia, University of Tennessee, University of Southern California, Dresden, Purdue University, and Grid 5000 as partner sites. This material is based upon work supported in part by the National Science Foundation under Grant No. 0910812.

FutureGrid is a resource provider for XSEDE.