Testing MPI_Barrier Optimized with NetFPGA against Mellanox Core-Direct

Abstract

Parallel programs written using the standard Message Passing Interface (MPI) frequently depend upon the ability to synchronize execution using a barrier. Barrier synchronization operations can be very time consuming. As a consequence, there have been investigations of custom interconnects and protocols for accelerating this operation and other collective operations in parallel MPI programs. We explore the use of hardware-programmable network interface cards utilizing standard media access protocols as an alternative to fully custom synchronization networks. Our work is based upon the NetFPGA, a programmable network interface with an on-board Virtex FPGA and four Ethernet interfaces. We have implemented a network-level barrier operation using the NetFPGA for use in MPI environments. We will compare our results with the state-of-the-art Mellanox ConnectX-2 NIC.
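To illustrate the operation under study, a minimal MPI program that synchronizes all ranks with MPI_Barrier might look like the following. This is a generic sketch for illustration, not project code.

    /* Minimal illustration of the MPI_Barrier call under study.
     * Generic sketch, not project code; compile with mpicc. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each rank does its local work, then waits at the barrier
         * until every other rank has also reached this point. */
        printf("rank %d of %d reached the barrier\n", rank, size);
        MPI_Barrier(MPI_COMM_WORLD);

        MPI_Finalize();
        return 0;
    }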

Intellectual Merit

This project will give us the opportunity to fairly evaluate our NetFPGA design against the widely used Mellanox Core-Direct HPC interconnect.

Broader Impact

We are planning to implement a timing model that precisely measures the time spent in an MPI_Barrier after it is offloaded to the Core-Direct NIC. We have a good model for the NetFPGA, and we welcome the opportunity to apply it to the Core-Direct environment and compare these two implementations of MPI_Barrier.
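As a rough sketch of how such a measurement can be taken at the host level (the actual timing model is more detailed), one could time repeated barrier calls with MPI_Wtime; the iteration count below is an arbitrary illustrative choice.

    /* Hedged sketch: average MPI_Barrier latency measured with MPI_Wtime.
     * The project's timing model is more detailed; this only shows the
     * basic host-side measurement. ITERATIONS is an arbitrary choice. */
    #include <mpi.h>
    #include <stdio.h>

    #define ITERATIONS 1000

    int main(int argc, char **argv)
    {
        int rank;
        double start, elapsed;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Warm up and align all ranks before timing. */
        MPI_Barrier(MPI_COMM_WORLD);

        start = MPI_Wtime();
        for (int i = 0; i < ITERATIONS; i++)
            MPI_Barrier(MPI_COMM_WORLD);
        elapsed = MPI_Wtime() - start;

        if (rank == 0)
            printf("average barrier time: %f us\n",
                   1e6 * elapsed / ITERATIONS);

        MPI_Finalize();
        return 0;
    }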

Use of FutureGrid

We are going to use it to test barrier-intensive MPI applications.

Scale Of Use

I want to run a set of comparisons on entire systems; each run will need a couple of hours, scheduled in different time slots.

Publications


FG-359
Omer Arap
Indiana University
Active

Timeline

1 year 9 weeks ago