Comparison of Architectures to Support Deep Learning Applications

Project Information

Discipline
Computer Science (401) 
Orientation
Research 
Abstract

Recent work in deep learning has shown improved performance on benchmark tasks by scaling neural networks to billions, and in some cases more than ten billion, parameters. Approaches have included massive CPU parallelism (Le et al. at Google used 16,000 CPU cores) and GPU computing (Coates et al. at Stanford trained an 11-billion-parameter network on 64 GPUs).

This project will use multiple FutureGrid resources to compare the time required to train very large neural networks using large-memory, GPU, and cluster approaches.
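
The core measurement in each case is wall-clock training time. As a minimal sketch of that methodology (not the project's actual experiment code), the pure-NumPy example below times gradient-descent training of a small one-hidden-layer network; the layer sizes, random data, and hyperparameters are illustrative placeholders, and the real runs would use far larger networks with platform-specific implementations on each system.

    # Minimal sketch of the timing methodology, assuming a pure-NumPy
    # baseline; all sizes here are illustrative, and the actual
    # experiments would use platform-specific code on each machine.
    import time
    import numpy as np

    def train(n_in=784, n_hidden=256, n_out=10, n_samples=4096,
              epochs=5, lr=0.01, seed=0):
        """Train a one-hidden-layer network on random data and
        return the wall-clock training time in seconds."""
        rng = np.random.default_rng(seed)
        X = rng.standard_normal((n_samples, n_in))
        y = rng.integers(0, n_out, n_samples)
        Y = np.eye(n_out)[y]                      # one-hot targets
        W1 = rng.standard_normal((n_in, n_hidden)) * 0.01
        W2 = rng.standard_normal((n_hidden, n_out)) * 0.01

        start = time.perf_counter()
        for _ in range(epochs):
            H = np.tanh(X @ W1)                   # forward pass
            P = H @ W2
            dP = (P - Y) / n_samples              # squared-error gradient
            dW2 = H.T @ dP
            dH = dP @ W2.T * (1.0 - H ** 2)       # backprop through tanh
            dW1 = X.T @ dH
            W1 -= lr * dW1                        # gradient descent step
            W2 -= lr * dW2
        return time.perf_counter() - start

    if __name__ == "__main__":
        print(f"training time: {train():.3f} s")

Running the same harness on each architecture, with the network scaled up appropriately, would yield directly comparable training-time measurements.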

Intellectual Merit

This project has the potential to advance knowledge in an important area of AI research. As readily available computational capacity continues to grow, such extremely resource-intensive applications could be applied routinely to a wide variety of tasks.

Broader Impacts

Deep neural networks have the potential to contribute to next-generation interfaces to computing and communication devices, such as speech and gesture recognition. Research in this area could contribute to that body of work.

Project Contact

Project Lead
Scott McCaulay (smccaula) 
Project Manager
Scott McCaulay (smccaula) 
Project Members
Scott McCaulay  

Resource Requirements

Hardware Systems
  • india (IBM iDataPlex at IU)
  • delta (GPU Cloud)
  • echo (large-memory system at IU)
 
Use of FutureGrid

This evaluation would not be possible without access to the FutureGrid Echo machine.

Scale of Use

Use of the full Echo system for several days at a time, along with GPU systems and an HPC cluster.

Project Timeline

Submitted
04/21/2014 - 16:05