MapReduce applications and environments
Time: 5:30pm - 7:00pm
Abstract: As the computing landscape becomes increasingly data-centric, data-intensive computing environments are poised to transform scientific research. In particular, MapReduce-based programming models and run-time systems such as the open-source Hadoop system have increasingly been adopted by researchers with data-intensive problems, in areas including bioinformatics, data mining and analytics, and text processing. Although MapReduce run-time systems such as Hadoop are not currently supported across all TeraGrid systems (Hadoop is available on some systems, including FutureGrid), demand for these environments from the science community is increasing. This BOF session will provide a forum for discussion with users on the challenges and opportunities in using MapReduce. It will be moderated by Geoffrey Fox, who will start with a short overview of MapReduce and the applications for which it is suitable. These include pleasingly parallel applications and many loosely coupled data analysis problems; genomics, information retrieval, and particle physics will serve as examples. We will discuss the interests of users, the possibility of using TeraGrid and commercial clouds, and the type of training that would be useful. The BOF will assume only broad knowledge and will not require or discuss the details of MapReduce variants such as Hadoop, Dryad, Twister, and Sector/Sphere.