Testing of Network Facing Services for the Open Science Grid

Project Information

Discipline
Computer Science (401) 
Subdiscipline
11.07 Computer Science 
Orientation
Research 
Abstract

The Open Science Grid (OSG) selects and distributes software for its user community. One of its activities is to validate the behavior of network-facing services beyond the currently deployed scenarios. One of the most difficult things to test is the effect of unreliable networking, so FutureGrid provides an excellent platform for this activity. We plan to make extensive use of the Network Impairment Device to verify the behavior of the main network-facing services used in OSG.

Intellectual Merit

The Open Science Grid (OSG) is a production distributed computing environment that relies on several network-facing services for its operation. The network traffic arriving at these services is chaotic in nature, being driven by O(1k) users who run O(100k) jobs containing user-provided applications and scheduled over extended periods of time. These services must therefore handle such traffic patterns gracefully in order to provide value to the users. OSG actively tests the software providing such services under both expected and edge conditions, to verify its behavior and report any significant problems to the software providers. One aspect of such testing is to verify the behavior in the presence of network problems. Such problems are very hard to simulate in production environments, but should be easy to reproduce with FutureGrid's Network Impairment Device. The net result will be a better understanding of how the network-facing services used in OSG behave under such edge conditions, and possibly improved software that addresses any problems found.

Broader Impacts

The Open Science Grid (OSG) provides computing resources for a wide breadth of sciences. By exercising the critical components of OSG under extreme conditions, and reporting any problems found to the software providers, we will significantly reduce the chance of those services experiencing downtime in the production environment, thus increasing the amount of science being produced.

Project Contact

Project Lead
Igor Sfiligoi (sfiligoi) 
Project Manager
Igor Sfiligoi (sfiligoi) 
Project Members
Igor Sfiligoi  

Resource Requirements

Hardware Systems
  • alamo (Dell OptiPlex at TACC)
  • hotel (IBM iDataPlex at U Chicago)
  • india (IBM iDataPlex at IU)
  • sierra (IBM iDataPlex at SDSC)
  • Network Impairment Device
 
Use of FutureGrid

I will install the services to be tested at one site and the client software at another site. I will then launch a test suite from the client side while varying the parameters of the Network Impairment Device.
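
The procedure above could be driven by a small parameter sweep along the following lines. This is only a sketch under stated assumptions: the parameter names (latency_ms, loss_pct) and both callables (apply_impairment, run_test_suite) are hypothetical placeholders, since this proposal does not describe the control interface of the Network Impairment Device or the test suite itself.

```python
import itertools
from typing import Callable, Dict, List, Sequence


def sweep_impairments(
    latencies_ms: Sequence[float],
    loss_percents: Sequence[float],
    apply_impairment: Callable[[Dict[str, float]], None],
    run_test_suite: Callable[[], bool],
) -> List[Dict[str, object]]:
    """Run the client-side test suite once per impairment setting.

    apply_impairment and run_test_suite are hypothetical hooks: the first
    would configure the impairment device, the second would launch the
    client tests against the remote service and report pass/fail.
    """
    results = []
    # Visit every combination of added latency and packet loss.
    for latency_ms, loss_pct in itertools.product(latencies_ms, loss_percents):
        settings = {"latency_ms": latency_ms, "loss_pct": loss_pct}
        apply_impairment(settings)   # change the network conditions
        passed = run_test_suite()    # exercise the service from the client
        results.append({**settings, "passed": passed})
    return results


if __name__ == "__main__":
    # Stand-in hooks for illustration only; real ones would talk to the
    # impairment device and to the services under test.
    outcome = sweep_impairments(
        latencies_ms=[0, 100],
        loss_percents=[0.0, 1.0],
        apply_impairment=lambda settings: None,
        run_test_suite=lambda: True,
    )
    for row in outcome:
        print(row)
```

Recording the settings next to each outcome makes it easier to report to the software providers exactly which network conditions triggered a failure.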

Scale of Use

I need the Network Impairment Device and a few nodes at each site. I will use the system for a few days at a time, possibly with significant idle periods in between.

Project Timeline

Submitted
08/29/2013 - 20:54