Linx outage caused by upgrade
A mangled port upgrade caused an outage at Linx yesterday. A spokesman for Linx (the London Internet Exchange) said a routine port upgrade introduced a problem which caused processor usage in a router to surge, leading to instability on one of its LANs. The other LAN, provided by another vendor, took over and members had to do …
THN not LHC
The power problems were in one suite on the ground floor of Telehouse North, not the London Hosting Centre.
The power problems knocked our network connection out as well, but we're in Telehouse Metro.
One of the joys of London data centres really - only takes a small problem in one to cascade to loads of others.
makes sense
Might explain the problems I had around that time. I was going to blame BT, but...
Look at those stats
Linx is peaking at nearly 600Gbps! That's a lot of traffic. Twice as much as two years ago. Note the ramping up of data in late 2007... Anything to do with iPlayer and iPhones?
V.
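As a rough back-of-the-envelope check on the "twice as much as two years ago" figure, and assuming steady compound growth (an assumption, not something the Linx stats state), doubling in two years works out to roughly 41% a year. The function name below is made up for illustration.

```python
def annual_growth_rate(ratio, years):
    """Compound annual growth rate implied by growing `ratio`-fold over `years`."""
    return ratio ** (1.0 / years) - 1.0

# Traffic doubling over two years, as described in the comment above.
rate = annual_growth_rate(2.0, 2.0)
print(f"{rate:.1%} per year")
```

On the same assumption, a peak of nearly 600Gbps now would imply a peak of roughly 300Gbps two years earlier.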
Inter-AS ISP Stats
My daily inter-autonomous-system ISP top-talker stats suggest it is traffic to/from AS15169, aka Google = YouTube. And it has been for a long, long time.
What makes sense?
The topology is a redundant L2 ring. A shared IP subnet.
A port upgrade sounds more like a problem with a specific IXP router or Unix eBGP route reflector, which players will inter-connect via if their routing policy permits.
Tier-1 players are unlikely to inter-connect via this because they want to sell Joe Public Tier-2 IP Transit.
Sounds like a bit of a drama for a specific line card upgrade for the IXP itself. BGP takes care of re-routing anyway via another path for any customers exchanging prefixes via the route-server.
Only speculating...
Broadcast storm
It wasn't a port upgrade. A member's session was being turned up and, due to miscommunication between the member's engineer and the LINX engineer, a loop was somehow introduced into the network. That caused a broadcast storm and a switch's CPU to max out. Cue packet loss and dropped BGP sessions.
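For anyone wondering why a loop is so nasty at layer 2, here is a toy sketch in Python. It is not a model of the LINX network: the topologies, switch names and the `flood` function are all hypothetical, and it assumes no spanning tree or storm control is in play. Each switch simply floods a broadcast out of every link except the one it came in on, and Ethernet frames carry no TTL to kill them off.

```python
from collections import defaultdict

def flood(links, origin, steps):
    """Count broadcast frames in flight after each forwarding step.

    links: list of (switch_a, switch_b) pairs (a hypothetical topology).
    Each switch floods a received broadcast out of every link except the
    one it arrived on. No TTL, no spanning tree: a deliberately naive model.
    """
    neighbours = defaultdict(list)
    for link_id, (a, b) in enumerate(links):
        neighbours[a].append((link_id, b))
        neighbours[b].append((link_id, a))

    # A frame in flight is (switch it is arriving at, link it arrived on).
    in_flight = [(peer, link_id) for link_id, peer in neighbours[origin]]
    history = [len(in_flight)]
    for _ in range(steps):
        next_round = []
        for switch, arrived_on in in_flight:
            for link_id, peer in neighbours[switch]:
                if link_id != arrived_on:
                    next_round.append((peer, link_id))
        in_flight = next_round
        history.append(len(in_flight))
    return history

chain = [("A", "B"), ("B", "C")]               # no loop
ring = [("A", "B"), ("B", "C"), ("C", "A")]    # one loop
doubled = ring + [("B", "C")]                  # loop plus a parallel link

print(flood(chain, "A", 4))    # [1, 1, 0, 0, 0]: the broadcast dies out
print(flood(ring, "A", 4))     # [2, 2, 2, 2, 2]: frames circulate forever
print(flood(doubled, "A", 4))  # frame count grows every step
```

In the loop-free chain the broadcast is gone after two hops. In the ring the same frames go round indefinitely, and with one extra parallel link the count multiplies at every step, which is the sort of load that pins a switch CPU.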
Agreed
But it doesn't take 20 mins to figure out or switch to another topology.
Most of the players at LINX are large Tier-2 or Tier-1, and their engineers hopefully know better than to spew even CDP et al. onto that LAN.
