Alamo Network outage

FutureGrid Hardware Outage Information

Status: Resolved
Type: Network
Impacted systems: alamo
Start of outage: Tue, 28 Feb 2012, 01:20 EST
Anticipated end of outage: Fri, 02 Mar 2012, 12:00 EST

Description

Router issues are causing intermittent access to the FutureGrid resource Alamo. The TACC Network staff is working on a solution, which requires updating firmware on a number of switches. The login node is available, but compute resources are still inaccessible.

Resolution

The problematic switch has been identified and marked for replacement. It may be Friday before I get a replacement switch. The Alamo cluster is up and running, but the external interfaces on the Nimbus partition are not working. The Nimbus partition is still unavailable.

Removing the offending switch and re-installing the Nimbus partition seems to have fixed the problem with the network crashing. I will continue to monitor closely. The Nimbus partition is available again.