Alamo Network outage
Submitted by David Gignac on Tue, 28 Feb 2012, 17:00:49 GMT
FutureGrid Hardware Outage Information
- Status: Resolved
- Type: Network
- Impacted systems: alamo
- Start of outage: Tue, 28 Feb 2012, 01:20 EST
- Anticipated end of outage: Fri, 02 Mar 2012, 12:00 EST
Description
Router issues are causing intermittent access to the FutureGrid resource Alamo. The TACC network staff is working on a solution. We have to update firmware on a number of switches. The login node is available, but compute resources are still inaccessible.
Resolution
A problematic switch has been identified and marked for replacement. It may be Friday before I get a replacement switch. The Alamo cluster is up and running, but the external interfaces on the Nimbus partition are not working. The Nimbus partition is still unavailable.
Removing the offending switch and re-installing the Nimbus partition seems to have fixed the problem with the network crashing. We will continue to monitor closely. The Nimbus partition is available again.