Tue, 15 Nov 2005

Power grid update

On Saturday there was a scheduled power grid outage in the Faculty building in order to replace a power generator and UPSes. Whenever such work is planned, the electricians assure us that there will be no outage at all (or only a short one, as in this case), but reality usually turns out differently.

I came to work around 6am to reboot all servers and switch the most important ones to the temporary power supply. The outage itself was planned to start at 7am. We gradually rebooted our servers and plugged them into two extension cords which the electricians had prepared and declared sufficient for the expected load.

As we were reconnecting the last server, about half of the servers crashed: the circuit was overloaded and we had tripped the breaker. So we had to add another extension cord on a separate temporary circuit and move some servers to it. However, we found that the other of the original two cords was overloaded as well: the cable itself was warm, and the part of it still wound on the reel was so hot that it was impossible to keep a hand on it. It is pure luck that it did not catch fire.

Fortunately, once we moved another three servers to the third extension cord, the overheated cord started to cool down, and it held up for the whole outage. However, it seems the electricians misjudged the load their extension cords could handle by about 50%.

In the evening the main power grid was back, so we started to plug the servers back into the original cords. However, the electricians messed something up while switching on a breaker on a non-UPS circuit, and they blew one of the power supplies in our SunFire V880. A new power supply costs about US$1000 (in the U.S.; here in the Czech Republic it is probably even more).

Aside from that, by around 9pm all the servers were back on the main power supply. The reconstruction will continue on the next two Saturdays, this time hopefully without outages.

