Fire at The Planet takes down thousands of websites

A fire at The Planet's H1 data center in Houston, Texas on Saturday has taken out thousands of websites. In messages posted on the web hosting firm's forum, the company blamed a faulty transformer for the fire. No servers or networking equipment were damaged, but the data centre remains without power, after The Planet shut …

COMMENTS

This topic is closed for new posts.
  1. John Taylor
    Stop

    Even the Status page seems to be down.

    Even The Planet's own site seems to be down now, as I can't access the status page link from this article either.

  2. Anonymous Coward
    Anonymous Coward

    the forums are up

    Try getting to http://forums.theplanet.com/index.php?&showtopic=90185&st=20

    It's slow, but I bet there are thousands waiting for news.

    I have around 25 sites hosted in the Houston centre; I'm surprised the phone hasn't started ringing yet!

  3. Anonymous Coward
    Anonymous Coward

    The latest update from the Planet is ...

    To keep you up-to-date, here is the latest information about the outage in our H1 data center.

    We expect to be able to provide initial power to parts of the H1 data center beginning at 5:00 p.m. CDT. At that time, we will begin testing and validating network and power systems, turning on air-conditioning systems and monitoring environmental conditions. We expect this testing to last approximately four hours.

    Following this testing, we will begin to power-on customer servers in phases. These are approximate times, and as we know more, we will keep you apprised of the situation.

    We will update you again around 2:30 p.m. this afternoon.

    ###################################

    2:30pm CDT is about 8:30pm BST, so an hour and a half till the next update.

    They have also said:

    "We absolutely intend to live up to our SLA agreements, and we will proactively credit accounts once we understand full outage times. Right now, getting customers back online is the most critical."

  4. Anonymous Coward
    Alien

    Data centres

    I once had a tour of a data centre in the UK and was shocked to find their fire extinguisher system was "water sprinklers", not powder or gas. Apparently the insurance company requested that system. You don't see that system on the starship Enterprise.

  5. Anonymous Coward
    Heart

    multiple sites

    People should spread their servers around colocation sites - use a provider in Edinburgh as well as London, for example.

  6. Damien Jorgensen
    Gates Halo

    lol oh well great network uptime

    So much for a superior network when you cannot keep the power on!

    Need I say SOFTLAYER at all! The Planet is run by muppets anyway

  7. ben edwards

    5 other datacentres?

    And not one of them has a recent backup to load Houston's data so the public can be served?

  8. Chronos
    Thumb Up

    Look on the bright side...

    Phorm's server(s) is(are) there, are they not? :-)

  9. heystoopid
    Pirate

    hmm

    Hmm, crap power supplier not adequately monitoring the grid delivery. Sounds to me more likely that the mains power transformer has been running at 115%-plus overload margin for far too long!

    Mind you, I have seen and heard of at least three go with a very spectacular bang in my area, causing massive local power outages for hours on end!

  10. Brett Patterson

    *twiddles thumbs

    We have a server there: http://www.cardesignnews.com

    Funny we also haven't heard from any users yet... but maybe the nameservers being down will affect (outsourced) mail too? (Our servers are up and running in H2 but our nameservers are in H1).

  11. Peter White

    oh dear, BT's webwise servers have gone down

    Oh dear, BT's Phorm-hosted/controlled server for the Webwise system seems to have been a casualty as well. Not a very resilient system, BT.

    peter

  12. tardigrade

    I'm Lucky

    My backup server in H1 is down. I noticed that it timed out last night when transferring from my primary servers. Damn lucky that all I use it for is backup and secondary DNS. I had a server down with 1&s****house1 last week; I briefly toyed with the idea of moving those server customers onto the backup server before setting up a new server elsewhere. Glad I bit the bullet and set up a new server straight away instead. Even more lucky is the fact that my experience to date with The Planet has been so good that I nearly set up the new server with them.

    I'm willing to bet that The Planet will get the entire Houston data center back on-line faster than 1&1 can get my server with them back on-line. But then that's not much of a bet - the 1&1 server has been off-line for 14 days now! Muppets.

    @Alan

    It's the weekend. Come 10:32am Monday, if you're still off-line you'll know about it - that's how long it takes customers to realise that it's not their Exchange server that's the problem.

  13. Steven Raith
    Joke

    DR/Redundancy?

    I'm not a webhosting or datacentre guy, but I was under the impression that there would be procedures in place to guard against this sort of outage - offsite stuff, redundant sites, etc?

    If that's the case, is redundancy what is going to happen to the DR guys? :-)

    Alan - maybe no-one likes your websites and doesn't care? ;-)

    Steven "Only joking Alan" Raith

  14. Gilgamesh
    Thumb Down

    this is a disgrace

    it's nearly monday morning and b3ta isn't working

    people may be forced to do some work if this isn't fixed soon

  15. Alan Potter
    Dead Vulture

    Could have been worse...

    http://news.bbc.co.uk/1/hi/world/americas/7424571.stm

  16. steve
    Flame

    It may be due to recent downsizing of power consumption

    According to http://www.thehostingnews.com/news-dedicated-server-firm-the-planet-data-center-manager-garners-award-4306.html, "Mr. Lowenberg conducted a six-month trial to reduce power consumption and increase data center operating efficiency. Initial results demonstrate that while critical server loads increased by 5 percent, power used for cooling decreased by 31 percent. Overall, the company experienced power reductions of up to 13.5 percent through a broad range of improvements. The new green initiatives were conducted across its six world-class data centers." It also says: "The Planet operates more than 150 30-ton computer room air conditioning (CRAC) units across its six data centers. In one data center alone, the company was able to turn off four of the units. The cooling requirement on two of the units was reduced to 50 percent of capacity, while another nine now operate at 25 percent of capacity. The company also extended the return air plenums on all of its down-flow CRAC units to optimize efficiency."

  17. Anonymous Coward
    Paris Hilton

    @By Chronos

    So if the Phorm/Webwise system was operational, does that mean the BT Broadband system goes t*ts up?

    Oh dear BT, this Phorm/Webwise system is going to do wonders for your customer satisfaction. ... NOT!

    BT customers could be leaving in droves!

    (They possibly will anyhow, once they get a handle on the invasive nature of Phorm/Webwise interception of all their HTTP traffic and find out the history of Phorm (121Media) and its nasty spyware products.)

    Paris, Because she loves a warm fire in her belly and she frequently goes t*ts up.

  18. Anonymous Coward
    Anonymous Coward

    It's not The Planet who should run backups of customers' sites

    We host a site of medium-high importance. It has been our plan that as soon as we can financially afford to set up a duplicate rack in a totally different location, we will. Both sites will be load balanced and data replicated in real time. It adds at least 150% to the cost, but if you need that level of resilience you have to pay for it.

    Data centres are better than hosting in your office but are vulnerable to outages, as many people know well. Our last outage was because some idiot (a data centre engineer?) switched off power thinking it was only going to affect someone else's rack. They may have fire-suppressing gas and the best security system, but there is no technology to prevent the employment of idiots. And there will always be idiots.

  19. Richard

    give them a break

    Come on guys... it was a fire, and at the end of the day it's better keeping generators off as instructed by the fire department. Servers are safer off and isolated while the fire and cabling are checked over.

  20. maryna
    Thumb Down

    what a lot of b***x

    While they save on their electricity bills, small businesses are going to pay heavily!

    All 3 of our servers are down!

    They will survive! But will we???

  21. Anonymous Coward
    Anonymous Coward

    you would think an isp...

    ...Would have installed a firewall.

    No 'coat' tag because I can't figure out which to click with the blackberry browser.

  22. Legless
    Unhappy

    B3ta

    Still offline.

    You never know how much you miss something until it's gone. Hope they get everything back, and soon...

    Cheers

  23. Glen Turner

    Fire fighting and computer rooms

    "I once had a tour of a data centre in the UK and was shocked to find their fire extinguisher system was "water sprinklers""

    A dry-pipe water system, Vesda particulate detector, and continuous staffing are the usual approach.

    The Vesda system sets off an alarm and a tech with a fire extinguisher goes hunting for smoke. This allows the usual sort of computer-based fire to be handled with little damage to surrounding servers (usually they just get the power dropped as the tech drops the rack's circuits prior to removing the smoking gear, taking it outside, then opening the box and applying the extinguisher).

    The water system is for the last resort, usually from a fire in another part of the building reaching the computer room. It's not unreasonable for the insurance company to sacrifice the computer room if that saves the building -- anyway, they are paying for the damage to both so it's their call.

    Gas got unpopular when CPUs got small, numerous and hot and computer rooms got very, very large. If you think through the consequences of a cooling gas hitting a modern hot CPU and the problems of venting released gas from a large space you'll see the problems.

    Fixed powder-based systems aren't a good fit to computers. An aerosol-based system would be a better fit.

  24. Dr Trevor Marshall

    I would appreciate help with replication

    Despite the double entendre in the title, that with which I need help is replication of a MySQL database over two servers at differing locations. I have read the MySQL manuals, but I would appreciate a pointer to a tutorial or a book which explains the procedures in more detail. Currently I am working with a 1 gigabyte DB, and I would like to mirror or replicate it so I don't lose everything next time a server-farm disappears...
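
    A rough sketch in answer to the above - not a full replication how-to, and the hostnames and credentials below are placeholders rather than anyone's real setup. It assumes the primary already has binary logging enabled and the replica has been pointed at it with CHANGE MASTER TO (the initial copy of a 1 GB database would usually come from a mysqldump taken with --master-data). A small script like this can then confirm that the off-site copy is keeping up:

        # Minimal replication-lag check; requires mysql-connector-python
        # (pip install mysql-connector-python). All connection details are placeholders.
        import mysql.connector

        REPLICA = {
            "host": "replica.example.org",   # placeholder: the off-site replica
            "user": "monitor",               # placeholder monitoring account
            "password": "change-me",
        }

        def replication_lag_seconds(conn_params):
            """Return Seconds_Behind_Master from the replica, or None if replication isn't running."""
            conn = mysql.connector.connect(**conn_params)
            try:
                cur = conn.cursor(dictionary=True)
                cur.execute("SHOW SLAVE STATUS")   # "SHOW REPLICA STATUS" on modern MySQL
                row = cur.fetchone()
                if row is None or row.get("Slave_SQL_Running") != "Yes":
                    return None                    # replication stopped or not configured
                return row.get("Seconds_Behind_Master")
            finally:
                conn.close()

        if __name__ == "__main__":
            lag = replication_lag_seconds(REPLICA)
            if lag is None:
                print("Replication is not running - check SHOW SLAVE STATUS by hand")
            else:
                print(f"Replica is about {lag} seconds behind the primary")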

  25. Seán

    B3333TA!!!

    I think you guys are missing the point, it's not about colocation or backups or power or any of that shite. It's about b3ta. What if there's no backup to the b3ta archive? It'd be like the library of Alexandria over again. 5:30AM on Monday and still nothing. I'm not a religious man but here goes.. Allah wu Akbar, Allah the digital, the compassionate please restore the purple cock and domo.

  26. Matt White
    Unhappy

    Do we have an update on B3ta yet?

    I'm in the office, and can't read the QOTW archive. I also have yet to see a magenta cock today. This is wrong for a Monday.

  27. riverghost
    Happy

    I do have to ask the question

    Why have you bothered to include the link to b3ta when you know full well it's unobtainable?

  28. Anonymous Coward
    Alert

    @AC: data centres

    That data centre with water wouldn't happen to be in Edinburgh would it? I think I've been there too.

  29. amanfromMars Silver badge
    Coat

    CyberIntelAIgent Alien Beings ....... PolderGeists in NetherLands

    "Allah wu Akbar, Allah the digital, the compassionate please restore the purple cock and domo." .... By Seán Posted Monday 2nd June 2008 04:27 GMT

    Amen and Hallelujah to That Passionate Restore Point of Immaculate Imperfect Relevance, Seán.

    Love ur dDutch. .... Real Get SMARTer IntelAIgents.

    Here's a Virtual IntelAIgents Swap Shop/Treasure Vault ....... http://www.ams-ix.net/

    I'll get my coat ....there's a CAB AI Called.

  30. Anonymous Coward
    Joke

    Damnitall!

    Work's blocking the Internet Archive's Wayback Machine; I can't even see if they've got older versions of B3TA squirrelled away anywhere!

    This is ridiculous. Surely The Planet have backups, disaster recovery, that sort of thing?! Can you imagine if the Emergency Services said "Well we can't actually man the 999 phonelines 24/7/365. We'll need a week off every so often but we'll compensate anyone financially suffering from our unavailability"? Well this is even more serious! 9.15 and B3TA is still down, people!

  31. Anonymous Coward
    Joke

    The sig of the guy in the forum doing the updates...

    ... has a slightly unfortunate link title to it, "How fast is youre network?".

    I don't know about mine, but parts of his network hit about 50mph recently!

  32. Anonymous Coward
    Anonymous Coward

    Rasberry ants

    Houston, it's those damned insulation-eating ants that did them in, I bet.

  33. Wil
    Paris Hilton

    Ahh the fools, the fools

    I was 3/4 of the way through coding a P2P message board that replicated the B3ta messageboard and would have mitigated this, but then I gave up through lack of interest. http://sourceforge.net/projects/b3ta

    Will they never listen, think of the children, apologies for length or lack of.

  34. Busby
    Black Helicopters

    10:37 still no b3ta

    Productivity across UK offices must be at an all-time high for a Monday morning; this must be some sort of conspiracy. Helicopter for obvious reasons.

  35. Wil
    Go

    For those about to shock

    There's a temp board set up by Rob here

    http://forum.robmanuel.com/viewforum.php?f=3

  36. Stuart
    IT Angle

    Remember off grid Rackshack?

    An explosion of the local utility transformer took Rackshack's main DC off grid for 4 days a few years ago. Not a minute of downtime was experienced by 17,000 servers.

    The subsequent write-up of the event showed both an amazing amount of pre-planning that initially kept everything going and fast adaptation to cope with unexpected consequences. A long list of lessons was learnt at Rackshack. Were these all passed on to The Planet when it acquired them?

    And anyone who has a mission-critical server without a geographically separate backup presumably doesn't understand the concept of backup - or why you have a minimum of two DNS servers. When those phones start ringing I hope they say "You are fired!". Putting clients' businesses at risk (like no email?) is just darn unethical as well as bad business.

    I look forward to hearing any excuses... from £60/month for a dedicated server, phrases like "a pennyworth of tar" come to mind.

  37. riverghost

    Shirley every host would have the same risk of this happening

    It wouldn't take too much for a co-operative to be set up distributing activities between a predetermined number of other hosts until the crisis is over.

    Incidentally, would I have a legal case against b3ta for making me actually have to do some work on a Monday morning, and for the mental anguish caused by this?

  38. Slaine
    Flame

    only if it be the will of Allah

    "Allah wu Akbar, Allah the digital, the compassionate please restore the purple cock and domo." .... By Seán Posted Monday 2nd June 2008 04:27 GMT

    Ensha Allah.

    Or, in layman's terms: the computers were built thanks to Allah, the data was put there by the hand of Allah, the colocation duplication systems were denied by the mighty will of Allah, the fire was started by the great and merciful Allah and the DNS servers are still down thanks to the esteemed and bountiful Allah. Allah be praised - and the rest of us thank fuck it wasn't organised by LizardGov.uk otherwise the data centre would probably still only be half built, at half the original spec for quadruple the cost.

    Can we go to stoning now?

  39. Haviland

    Got to love status reports

    Especially ones with "I would like to provide an update on where we stand following yesterday's explosion ..."

  40. Anonymous Coward
    Coat

    Data Centre Outage

    Well I guess that's why we don't have our power transformers indoors then!

    When they blow up they can do it peacefully in the car park while the UPS kicks in and prepares the generators for taking the load. When the generators kick in you see a mushroom cloud of diesel smoke - God knows what people think has happened when they see it!

    I guess this is just a bad-luck story; I can see they are working hard to repair this and get them back online. Would you like to be the one to reboot 9,000 servers? lol.

    Definitely think they could have had a better disaster recovery plan in place. Seems like they only had a basic one and that's it....

    The way to make it most resilient is to have two buildings kind of co-located (same business park etc.) but not physically adjoined. So if one gets nuked the other can continue.

    On the bright side I think they must have saved a bit on the leccy bill.... oops, where's me coat

  41. Smallbrainfield
    Unhappy

    Lunchtime and...

    ...still no b3ta. I may have to go out in the fresh air. Nice to see a few b3tans on this comments page here, though.

  42. Steve
    Boffin

    Unashamed plug for open source DR community

    If our El Reg moderatrix will permit it (pretty please, Sarah), may I invite Dr Trevor Marshall and other interested parties to join us for discussions at:

    http://www.opensolaris.org/os/community/ha-clusters/

    and/or

    http://blogs.sun.com/SC/

    and look particularly for entries related to the Geographic Edition.

  43. W
    Coat

    "...the company blamed a faulty transformer for the fire."

    So is this why the Heathrow security guy made that fella take his Transformers T-shirt off?

  44. Webcrawler2050

    Let's hope

    Given the seriousness, that no one was hurt or injured. :)

  45. Alan Davies
    Unhappy

    b3ta

    Productivity has increased tenfold. I hope b3ta/links is back on line soon. I have a rather humorous clip of Rick Astley to post . . .

  46. Anonymous Coward
    Alert

    I'm at a loss

    I've done far too much work today, and not enough skiving

    Rob's emergency board isn't quite providing the same fix

    lo, the greebo warrior

  47. Ian Ferguson
    Unhappy

    That explains

    why I spent the whole of Sunday furiously clicking 'refresh' on b3ta.com to no avail, failing to notice anything else around me, or remember to eat.

    It's the /talk people I feel sorry for - at least the rest of us can look at the pretty pictures on 4chan.

    (Peregrin)

  48. Anonymous Coward
    Coat

    Network failure

    I have nothing hosted with them, but it sounds like they were doing fine until the fire dept. forced them to shut down their generators. True, it's not as good as total redundancy, but again, it sounds like they could have coped. Probably the generators weren't even a slight part of the problem--just playing it safe.

  49. Greem
    Coat

    Leet!

    From the status page/forum post:

    1337 User(s) are reading this topic

    Thanks for the compliment, The Planet!

    Mine's the one hanging on that wall over... oh, bugger.

  50. Anonymous Hero
    Paris Hilton

    @AC: Data Centre & @Glen Turner

    The systems commonly used are of the HI-FOG mist fire suppression type. The pipe work and nozzles are often mistaken for 'sprinklers' but in fact discharge a very fine mist that puts out the fire and is safe for humans and the hardware. Very common for DC and Telecoms applications.

    Gas discharge systems are expensive and the older CO2 fueled systems can be lethal to humans in areas where there's no ventilation.

    Paris....'cos she enjoys a good sprinkling every now and again.

  51. Jason Law

    Watermelon

    The emergency board is pants.

    I need to see some badly shopped pictures of kittens damn soon - it's been more than 24 hours since I've seen a bandwagon, domo, teh quo or crudely drawn cocks.

    My productivity is through the cranberry roof.

    (linbox)

  52. Matt White
    Thumb Up

    @Matt White

    Argh christ there's another one of me!

    That said, I wholeheartedly agree with your b3ta & purple cock related statement dear clone.

  53. Richard

    and again

    Guys, this is nothing to do with poor disaster recovery. The transformers taking the power from the grid have blown up. This damaged the lines in the building and the floors going to the racks. No matter how good your disaster plan is... it won't allow for this scale of event. They have to replace power cables, etc., and the servers are offline until it is safe to turn them back on. This is a serious failure of power... not simply generators not working or a fibre cable being cut.

  54. anarchic-teapot

    Has no-one thought to enquire

    What the PFY was doing at the time? A transformer exploding and taking out three walls sounds deeply suspicious to me.

  55. Bagpuss
    Unhappy

    an entire day without magenta cocks

    that's an entire day at work with no magenta cocks or TOAP image challenge entries.

    I feel funny.

    I do appear to have done some work, but tomorrow I will re-check all my data and almost certainly find critical errors.

    Who should I sue?

  56. Matt White
    Flame

    @ Matt White

    Who are you calling a clone? I'm the original!

  57. Steve

    @Richard

    "nothing to do with poor disaster recovery"

    Yes it is. Good DR requires that you have a backup installation sufficiently far from the primary site to withstand events like 9/11, New Orleans, Chernobyl etc.

    If a few exploding transformers that take out some racks and cables put you off air, you do not have a valid DR plan. A UPS and generators might provide some measure of local high availability (HA), but they don't cut it for DR.

    And yes, DR costs more than HA. Just like insurance, you have to pay for adequate protection, or pay the price. There are a number of reports around which show that ~40% of businesses without a DR plan go bust after a disaster. The rest have a very painful few years.

  58. Robert Brockway
    Linux

    @Brett Patterson

    The nameservers for a particular domain really should be separated geographically and logically (network-wise). Getting a secondary nameserver is free or dirt cheap.

    I sometimes hear people say "it doesn't matter much anymore". This is rubbish. Having all of your nameservers down is much worse than just having a service like your website offline. With all of the nameservers down, mail to the domain won't queue, it will bounce, and people visiting the website will see a message akin to "This domain doesn't exist". Non-technical users might be excused for thinking the company had gone out of business.

    Run multiple nameservers in different parts of the world. It's cheap, easy and saves a lot of hassles.
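
    To make the point concrete, a rough sketch of a check for exactly this (the domain below is a placeholder, and the /24 test is only a crude stand-in for "same site, same failure domain"): it shells out to dig, which is assumed to be installed, resolves each listed nameserver, and warns when they all sit in a single /24.

        # Crude nameserver-diversity check: placeholder domain, /24 grouping as a
        # rough proxy for "same network". Needs `dig` on the PATH; everything else
        # is the Python standard library.
        import socket
        import subprocess

        def nameserver_prefixes(domain):
            """Map each nameserver of `domain` to the /24 its first A record lives in."""
            out = subprocess.run(["dig", "+short", "NS", domain],
                                 capture_output=True, text=True, check=True).stdout
            prefixes = {}
            for ns in filter(None, (line.strip().rstrip(".") for line in out.splitlines())):
                ip = socket.gethostbyname(ns)               # first A record is enough here
                prefixes[ns] = ".".join(ip.split(".")[:3])  # e.g. "207.44.242"
            return prefixes

        if __name__ == "__main__":
            prefixes = nameserver_prefixes("example.com")   # placeholder domain
            for ns, prefix in prefixes.items():
                print(f"{ns:30s} {prefix}.0/24")
            if len(set(prefixes.values())) < 2:
                print("Warning: all nameservers appear to sit in the same /24")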

  59. Anonymous Coward
    Black Helicopters

    @ all you B3tards

    It's perfectly obvious that this fire story is all a cover-up: what's REALLY going on is that the governments of the English-speaking world, having awoken to the very real dangers of the impending recession, have struck pre-emptively and taken steps to increase office productivity by shutting down all known havens of timewasting. Mark my words, icanhascheezburger is next

  60. Dick
    Happy

    @Steve

    "Good DR requires that you have a backup installation sufficiently far from the primary site to withstand events like 9/11, New Orleans, Chernobyl etc."

    DR does not mean uninterrupted operation; it means a plan to get back in business within an acceptable amount of time. You have to be realistic and match your DR plans to the level of service you are offering, otherwise you will be out of the highly competitive lower/mid-end hosting biz very quickly.

    This is a host with 50K servers; they lost 9K to this event. I believe they have 6 data centers; AFAIK they are all in the Dallas area of Texas, taking advantage of the low power costs there. Following your logic they should have one or more data centers sitting idle in another state just in case of a catastrophic event such as this one. There's no way they could do that unless they were selling a much higher grade of service.

    They are recovering from their disaster. Last time I checked, something like 2/3 of the servers were back up, or in the process of coming back up (and that's in less than 48 hours), and there's a plan in process to temporarily get power to the rest that were directly affected by the explosion.

    I have a website that's hosted at one of their other locations; I am critical of the design that put management servers for my location in the center that suffered the fire, causing unnecessary disruption of service that would not have happened if the centers were independent.

  61. tardigrade
    Happy

    My 1st floor server is back on-line.

    It came back at 8.32pm BST. Apparently there are not too many more to go now before all the servers are back. Looks like B3ta will be left until last. LOL.

    The second floor is running on mains power again, but due to damage to the underfloor power conduits the first floor is all on generator power and will be for the next 10-12 days. Ouch! Hope they've bought plenty of diesel and a mechanic.

  62. Steve

    @Dick

    I agree that appropriate DR needs to be matched to the service they are selling. If their customers are happy with an SLA that allows a 48+ hour outage then that's fine. The people I deal with will get upset (putting it mildly) over a 2-hour outage.

    There's no need to have an idle data centre elsewhere, though. It could be doing useful work with some spare capacity ready to pick up the load from a site that fails, giving reduced service rather than a full outage. As with all HA/BCDR solutions, it's a trade-off of cost versus RPO/RTO, matched to the service agreement that you're selling to your customers. The likes of the Nasdaq or the NYSE will have very different DR requirements to a small company that will be only mildly inconvenienced by a two-day outage.

    Personally I wouldn't trust my business to a company with all its data centres in one city, though. There are way too many possible common-mode failures there.

  63. Anonymous Coward
    Happy

    Time to skive!

    You can get to B3ta through http://207.44.242.20 so the server's definitely back online. Seems a bit slow, though!
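
    The same workaround expressed in code, for anyone whose resolver is still misbehaving - the IP and hostname are simply the ones mentioned above, and this assumes the box serves b3ta as a name-based virtual host, so the trick is to fetch the raw IP while sending the site's own Host header:

        # Fetch a site by raw IP while supplying the Host header by hand, so
        # name-based virtual hosting still picks the right site. Standard library only.
        import http.client

        conn = http.client.HTTPConnection("207.44.242.20", timeout=10)
        conn.request("GET", "/", headers={"Host": "www.b3ta.com"})
        resp = conn.getresponse()
        print(resp.status, resp.reason)   # e.g. "200 OK" once the box is really back
        print(resp.read(200))             # first couple of hundred bytes of the page
        conn.close()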

  64. Alan Davies
    Happy

    b3ta is back

    http://207.44.242.20

    Err WOO YAY! . . . .. ?

  65. Gareth

    Like a Parisian courtesan

    Firefox can't establish a connection to the server at www.b3ta.com.

    Firefox can't establish a connection to the server at 207.44.242.20.

    Balls.

  66. Tim
    Thumb Up

    Thank Dog for that ...

    Yes, I was hit by the H1P1 debacle... just happy enough to be back now!

  67. Josep Cabanes

    Nightmare is finally over for me...

    I was also hit by the H1P1 debacle... all my servers were affected. Half of them, on the second floor, were recovered on Monday, and the rest, on the first floor, were just recovered some hours ago. Kurt.-

This topic is closed for new posts.
