The silence of the racks is deafening, production gear has gone dark – so which wire do we cut?

Hit reset on the working week for Friday has arrived and with it another entry in The Register's long list of on-call shenanigans. Today's story, from a reader The Reg's patented anonymiser has elected to call "Jon", is a cautionary tale for those tasked with keeping the data centre lights on. Jon's employer had added some …

  1. Anonymous South African Coward Silver badge

    UPS plugged into itself

    We received two RTC UPSes (3kVA) which we needed to set up with new equipment that was going to be shipped out to a remote site.

    One UPS beeped to indicate that it was unplugged from the power, then a click and then silence as equipment stopped working.

    Some doofus plugged the UPS back into itself, causing it to trip.

    A quick Gibbs slap to the back of the head of the offender, a >clickety<>click< and an unplugging later everything was fine.

  2. Anonymous Coward
    Anonymous Coward

    The big red button

    Back when I was a PFY, a company I was working for had a server room with about 50 physical servers powered by a large UPS. One day, the UPS was having its annual maintenance. The resident BOFH and I were escorting the UPS tech. The BOFH encouraged me to press the big red button labelled "EMERGENCY POWER KILL SWITCH". The BOFH assured me that it would do nothing as the UPS was in bypass mode.

    So I lifted the guard and pressed the button. Not only did the server room go offline, the entire building (about 350 staff worked there) was shut down. We quickly hid, only to reappear when the rest of the IT team arrived to investigate.

    We were offline for about two hours as the 'Elf'n'Safety people would not allow us to reset the old fashioned breaker in the basement until an electrician had checked the system. Luckily we had a few sparkies on staff so it didn't take too long.

    They never discovered my dark secret.

    Anon because.....

    1. chivo243 Silver badge

      Re: The big red button

      A colleague and I were resetting the network card on an APC, used the wrong serial cable, darkness, silence, and "It wasn't me!" shouted back and forth... We powered everything back up, and owned up to our F/U once back to the office...we still laugh about it.

      1. GlenP Silver badge

        Re: The big red button

        used the wrong serial cable

        I'd forgotten that "feature" of APC UPS's.

        1. Doctor Syntax Silver badge

          Re: The big red button

          A client had a SCO box on a UPS with a serial link to start a shutdown. I can't remember the details but doing something which shouldn't have touched the UPS at all started the count down to power-off. The phenomenon was repeatable. I was never able to resolve it but it seemed as if something in the IP stack was also affecting the serial link.

          1. handle handle

            Re: SCO SLIP?

            Perhaps a SLIP interface?

        2. Antron Argaiv Silver badge
          Mushroom

          Re: The big red button

          9 pin sub-D connectors should be used ONLY for one thing: serial cables.

          When I do designs, I spend an inordinate amount of time trying to ensure that all the connectors are different, and that if some have to be the same, disaster will not ensue if they are interchanged.

          I also have a great reluctance to bring DC power out to connectors if it's not current limited in some way. I've seen too many sparks.

          1. NXM

            Re: The big red button

            I do that too, but it doesn't stop the site "engineer" (ie some bloke with a hammer and an NVQ in bashing things with it) from removing the header from the board and plugging it back in the other way round. A dead board ensues, which I won't fix under warranty.

          2. jj_0
            Happy

            Re: The big red button

            9 pin sub-D connectors should be used ONLY for one thing: Joysticks!

            1. SteveCoops

              Re: The big red button

              9 pin sub-D connectors should be used ONLY for one thing: CGA!

              1. Anonymous Coward
                Anonymous Coward

                Re: The big red button

                No, not CGA!

                Monochrome - choice of green or orange text

                1. J. Cook Silver badge

                  Re: The big red button

                  Ah, the memories of running WfW 3.11 on a hercules monochrome monitor...... Yep that's all of them.

                  AOL looked only slightly better on a 16 grey-scale display... (aka still shite)

                  1. The Dark Side Of The Mind (TDSOTM)
                    Pint

                    Re: The big red button

                    Have a pint, you brave soul. Hercules on WfW 3.11 was a nice trick to do back then, more so without proper documentation.

            2. Vincent Ballard

              Re: The big red button

              I used to know the pin-out by heart, but I'm drawing a total blank now. Remind me to make an appointment to talk to my GP about Alzheimer's.

              1. Fungus Bob Silver badge

                Re: The big red button

                Why would you want to remember that?

          3. the hatter

            Re: The big red button

            That's what makes the APC design all the more special. It *is* a serial connection, everything looks normal with the rx/tx/rts/cts/ground pins all where you'd expect on a DB9, with levels you'd expect. They just chose to do... something weird with one of the other pins (I think DTR, but it could be any of the remaining ones), which means that if you're not using their own serial cable, some common line is at the wrong level, and that signals the UPS to shut down the instant you connect a computer to it with a regular serial cable.

            1. mistersaxon

              Re: The big red button

              APC can have an optional module that isn't RS-232 but is 9-pin d-sub. Typically used by IBM AS400s which should therefore be the *only* thing plugged in to the UPS, as the design spec is that the system monitors UPS battery level until a certain level is reached then shuts down the server *and the UPS* so the batteries will recharge faster when the power is restored. It was a design that worked well when the AS400s had their own internal UPSes but didn't translate to the bigger world of "other devices".

              For the terminally curious: https://www.apc.com/bm/en/faqs/FA159551/

        3. Captain Scarlet Silver badge
          Mushroom

          Re: The big red button

          Bloody APC serial cables, yup, they got me too, when I suddenly found someone had swapped my APC cable (I used to keep a bag of connectors with one of each and their guides).

          Thankfully for me the secondary kit, which wasn't considered worthy of being powered by the UPS, was still online.

          Only found out after replacing the damn thing and testing stand alone. Replaced a perfectly healthy UPS because of a £2 cable.

          1. Anonymous Coward
            Anonymous Coward

            Re: The big red button

            Fricking APC. One series of UPS had a cable for Unix and a different cable for Netware/windows systems.

            Regarding TFA: this is why I like systems with dual power supplies. Put one on UPS, the other on utility power. The UPS line saves the day when the utility power drops, the utility power line saves the day when the UPS drops. In some areas the utility line is the reliable one in this equation.

        4. Stevie Silver badge

          Re: used the wrong serial cable

          See your UPS serial cable and raise you a box of visually identical but mutually incompatibly wired Sun SPARC null modem cables.

          None labelled.

          Every flippin' time remote console access was required there was a ten minute mix-n-match cable to server fiasco. Nearly drove me nuts. I offered to label the cables and was told in no uncertain terms not to do so "in case the labels caused confusion".

          Same crew kept the departmental dry erase markers in a ziplock baggie with permanent markers. Every presentation started with a ten minute shout fest as everyone warned everyone else in the room about the pens, while the one who wanted to draw a f*cking diagram madly searched for the one usable dry erase in a bag of the unfit-for-purpose.

          I eventually bought my own pens (4 bux from Staples) and an eraser (about the same) and my next visual presentation was given to an incredulous repeating chorus of "you bought your own pens????" that drowned out my narrative.

          All Unix SAs, of course; they regarded themselves as the bee's knees but were seen by everyone else as the Laurel and Hardy of the enterprise. I never met a less agile or more reactionary team.

          They achieved new highs of "popularity" when they informed the Grand High Muckety Muck that they could not shut down our Unix server farm "by application, in a rolling fashion" because "that was a Windows server methodology". Truth was, not a single person on that team had the faintest idea of who was running what applications on each server, since they never did any kind of assay when provisioning a new server.

          All changed now. New brooms at the top levels (along with that impossible rolling shutdown debacle) enforced more control and monitoring. You probably heard the wailing on the other side of the Atlantic.

          1. Francis Boyle Silver badge

            "in case the labels caused confusion"

            Didn't know the University of Woolloomooloo had an engineering department

            1. JJKing Silver badge
              Go

              Re: "in case the labels caused confusion"

              Re cables, I just had several rolls of different electrical tape and a chart stating what each colour was. We didn't have a label machine so the tape was the next best thing. One of the best presents I was given many years ago was a LapLink cable and the whole thing was coloured yellow. It was easier to locate than even the tape coloured one.

              Didn't know the University of Woolloomooloo had an engineering department

              Is it staffed by guys called Bruce?

        5. Anonymous Coward
          Anonymous Coward

          Re: The big red button

          It took us a few site relocations to discover the wonderful feature.

          We had installed networks in remote offices over a few months. When those offices were closed down we brought the kit back to the office, stacked the UPSes for future use, bundled the cables nicely and put them in a 'UPS cable' box. All very tidy.

          Sure enough when we came to re-install the kit in new locations some UPS's wouldn't fire up correctly, some worked fine and some had very strange failures.

          It was pre-internet days, our equipment supplier was a box shifter and APC didn't answer the phone, so it took trial and error to understand which cable really worked with which model of UPS. Needless to say we started labelling the cables with the UPS serial numbers, and the rule was not to mix cables.

      2. Anonymous Coward
        Anonymous Coward

        Re: The big red button

        Yup, done that too. I can't articulate how much I hate APC.

        All their UPSes seem to fail spectacularly, their data centre environmental monitoring products crash and give erroneous readings, their cameras have the quality of a waterlogged webcam from the 90s, and for that matter all their web interfaces seem to have the processing power of a Casio watch, and crypto standards from the early 90s too. Add to that, everything is hideously expensive.

        I can't understand how they still exist. Then again, my boss laughs when I moan about them, sooo...

        1. disgustedoftunbridgewells Silver badge

          Re: The big red button

          "all their web interfaces seem to have the processing power of a Casio watch"

          I assumed that was just the absolutely terrible APC IP PDU we had about a decade ago. It eventually decided to just switch everything off for no reason one early AM, which was fun. It ended up in the bin.

        2. katrinab Silver badge

          Re: The big red button

          And the competition isn't that much better.

          Can't we have something that is a bit like a laptop power supply except with 20+ x the power capacity?

          1. disgustedoftunbridgewells Silver badge

            Re: The big red button

            I've always wondered why you can't buy PSU's with built in batteries.

            It seems like it should be really simple, well integrated and a few minutes of runtime surely wouldn't be expensive.

            1. AnonymousCustard

              Re: The big red button

              http://www.cpspowertech.com/c14.html

              That do? :)

              1. SImon Hobson Silver badge

                Re: The big red button

                Interesting that they have a patent on that, I wonder what elements are patented ? Thing is, I can remember seeing ads for pretty much the same things, but something like 25 or more years ago - so any patents from then would have expired.

                1. The Dark Side Of The Mind (TDSOTM)
                  Coat

                  Re: The big red button

                  I once had an "internal" UPS on the ISA bus, with a lead-acid battery (industry standard back then) stuck inside the mid-tower PC case and secured with double-sided adhesive tape... It ran flawlessly for almost 6 years until the electrolyte dried out a few weeks before a power outage... And it had almost no monitoring capabilities (more than 20 years ago)... That piece got to rest a while in the "spare" equipment pile for 2 more years before some beancounter decided that the residual value was NIL.

              2. DropBear Silver badge

                Re: The big red button

                No, it won't. That does not look like an actual product. Concepts are a cent a googol. Where's the "buy now" button...?

              3. willi0000000

                Re: The big red button

                why do the cps tables compare so many things to the efficiency of the United States Post Office?

                [ i've worked there . . . efficiency isn't really their thing ]

                1. Kiwi Silver badge
                  Pint

                  Re: The big red button

                  why do the cps tables compare so many things to the efficiency of the United States Post Office?

                  Yes. I also wouldn't have one in my house, given their penchant for going postal and destroying valuable nearby things...

                  (--> Beat me to it by 11 hours!)

              4. Kiwi Silver badge

                Re: The big red button

                http://www.cpspowertech.com/c14.html

                Interesting.. Under "Backup Extendibility" , for the UPS column they list "Additional costs for unnecessary parts" (wonder what they mean by that?) but for their own device they just list "Yes". So they give you extra battery capacity for free?

                Would be neat if there's a standard plug on the back of it that'd let you plug into an external battery pack, esp if it works at standard voltages..

                1. No Yb
                  Facepalm

                  Re: The big red button

                  I liked the http://www.cpspowertech.com/about.html page claiming California, but showing pictures of factories in (probably) China.

            2. tcmonkey

              Re: The big red button

              Maplin in the UK (or maybe it was Jaycar in Australia, this was nearly 20 years ago...) used to sell a combined PSU/UPS unit that housed battery packs in spare 5.25" drive bays. You could daisy chain as many as you had case room for.

              I looked for it recently and could find no evidence of it ever existing, but I'll swear blind that it did. Or I'm insane, one of the two.

              1. Kiwi Silver badge
                Boffin

                Re: The big red button

                I looked for it recently and could find no evidence of it ever existing, but I'll swear blind that it did. Or I'm insane, one of the two.

                I think it existed. I recall finding some of the packs somewhere, my interest drawn by the fact that they clearly went into drive bays but were not drives. I also recall someone having a couple of very tall tower cases with more drive bays than could possibly be filled even with 3 mobos in the case (which I am sure it could fit!), and being told it was for an internal UPS.

                Really all you need to do is have the mains go into a charge controller that keeps the batteries topped up, then another circuit off the batteries that runs the internal voltages. Maybe I have a new weekend project...

                (Or just get a decent laptop which comes with its own inbuilt UPS anyway :) )

            3. Kiwi Silver badge

              Re: The big red button

              I've always wondered why you can't buy PSU's with built in batteries.

              Compaq Deskpro 386 - a couple of hulking great caps in its PSU, enough to keep the thing up for at least a few seconds of power failure.

              During the mid 90s I was running a BBS. Mt Ruapehu was having a hissy fit and as the national power grid runs nearby, the ash on the wires was causing some headaches for much of the rest of the island, including regular (several a day) momentary power outages of a second or two.

              We'd get a few hours of clean leccy, the users would start to trust it'd be OK and start connecting... The lights would flicker, the monitor go dark, the modem hang up as the other end's PC shut down, and a few seconds later the lights would be on again, I'd be able to turn the screen on, and the PC would still be happily running.

              (That computer had another great feature I really miss - the ability to lock out the mouse and keyboard via a KB shortcut that was built into the bios. The machine could boot and do everything normally but you couldn't use the KB without the PW - no screen saver or dropping back to a login prompt. Think I still have it somewhere stuck in a closet, wonder if it still runs...)

              1. Anonymous Coward
                Anonymous Coward

                I take your volcano and raise you East Midlands Electricity Board

                It did finally get me the budget for a data centre UPS (£80,000), but a week of intermittent power caused several days of downtime, as power spikes and dropouts caused individual mainframes and peripherals to drop out or fail. The pièce de résistance, having had the entire DC powered down for 5 hours waiting to be told the power was stable, escalating the call to a senior engineer and then starting the power-up process, was for them to drop the power just as several banks of HDAs were spinning up. These are the mainframe belt-driven disk drives of legend. Whilst I was fortunate enough that we didn't get a head crash, we got thrown/broken belts on several devices. This left me with 2 mainframes on their backs waiting for engineers to come and fix them, while the minicomputer/Unix sysadmins were preening about how reliable their machines were (for a change).

        3. Montreal Sean

          Re: The big red button

          Don't knock Casio watches, those Databank watches were the first smart watches!

          I loved my Telememo 30 (Casio DB-31) back in the late '80s.

        4. Bruce Ordway

          Re: The big red button

          >>UPS gear was removed.. hate APC

          I don't remember any occasion where UPS actually helped anything.

          I do however, remember many occasions where a UPS was the cause of a problem... and mostly APC.

          1. rototype

            Re: The big red button

            If your power was as prone to dropping the main breaker as ours is here, you'd be happy for ANY UPS - just added another on one piece of kit that wasn't previously protected and was a pain in the arse when it had to be restarted (sometimes 3 dropouts in a day, and at random times up to and including 3:30am). Last one was 11:20 last night.

        5. Anonymous Coward
          Anonymous Coward

          Re: The big red button

          Where I work now there are quite a few Armenian Potato Clocks UPS laying silently in stacks with their batteries removed. A brave soul decided that Eat-On (intended spelling) was a better choice. Not arguing with that...

          Anon because I (still) like that workplace.

        6. Alan Brown Silver badge

          Re: The big red button

          " I can't articulate how much I hate APC."

          Their racks are ok. The rest sucks.

      3. Anonymous Coward
        Anonymous Coward

        Re: The big red button

        Ah, the old 'wrong serial cable shuts down the APC UPS' problem. Yes, I've had that, followed by a very loud "WTAF?". Luckily (?!) it was in my own company offices and not that of a client, but still :-(

        A/C because they still think I was nowhere near it.

    2. macjules Silver badge

      Re: The big red button

      Great story.

      In the mid-eighties at a "secret government installation", or "Whitehall" as the rest of us called it, I was once asked to audit some of our security protocols for emergency shutdowns. The server room shutdown control was a suitably impressive big red button, on a bright yellow base with a sign saying "Do Not Touch Except in Emergency" plus various threats as to what would happen if you did. Back then there was a tendency to colour halon controls in green, so one might expect a separate green button on a yellow base with a similar sign above it.

      But no, this is government contractors at the lowest price we are talking about. Thankfully nobody had ever had to hit the red server room power button, as the halon door lock and gas release had also been wired into it. Checking the evacuation protocol for around 5 admins, it would appear that the halon lockdown would have killed all of them, since it gave less than 3 seconds to evacuate the room before engaging the door lock and releasing the flame suppressant.

      What most distressed the department was not the fact that it could have killed 5 admins, but that it was in clear breach of HSE guidelines, and most of all in breach of a new piece of legislation, Reporting of Injuries, Diseases and Dangerous Occurrences Regulations 1985, or RIDDOR. Typical civil service.

      1. Sgt_Oddball Silver badge
        Trollface

        Re: The big red button

        I didn't know Simon and the PFY did government contracting...

      2. Imhotep

        Re: The big red button

        You just weren't cleared to read the documentation on how to remove entrenched civil service employees, five at a time.

        "Fred, good news: You've been promoted to Admin! Now sort out this problem in the datacenter."

        1. macjules Silver badge

          Re: The big red button

          Well I certainly had the clearance. I just used to wonder why there was such a high turnover of "ICL-qualified" staff moving onto the commercial sector, often without saying goodbye. Now I know.

        2. Anonymous Coward
          Anonymous Coward

          Re: The big red button

          Ahhh, reminds me of that old British military axiom, "here's to wars and sickly seasons", as the fast ways to promotion.

      3. Swampie

        Don't die on the Job

        Big sign on the door of HR: Due to new Govt Regulations, do NOT die on the job. The paperwork is terrible, and we don't have the manpower budget to process everything. Please die at home!

        1. shedied

          Re: Don't die on the Job

          Don't forget

          Employees are not encouraged to donate any internal organs. You were hired with everything intact, so giving one of your kidneys away would mean a reduction in your salary (even if said organ was taken under unusual circumstances during your recent leisure trip to Vegas).

      4. Mark 85 Silver badge

        Re: The big red button

        What most distressed the department was not the fact that it could have killed 5 admins, but that it was in clear breach of HSE guidelines, and most of all in breach of a new piece of legislation, Reporting of Injuries, Diseases and Dangerous Occurrences Regulations 1985, or RIDDOR. Typical civil service.

        Indeed, to hell with the people, the paperwork is a nightmare for the equipment.

      5. John Brown (no body) Silver badge

        Re: The big red button

        "Checking the evacuation protocol for around 5 admins it would appear that the Halon lockdown would have killed all of them since it gave less than 3 seconds to evacuate the room before engaging the door lock and releasing the flame suppressant."

        Were they wearing red shirts?

  3. chivo243 Silver badge
    Facepalm

    When your stomach sinks to your shoes

    I was once next to a fully loaded rack that went into silent mode, I never knew the AC was so loud! And yes there was an Armenian Potato Clock at the bottom of it all...

    1. Phil O'Sophical Silver badge

      Re: When your stomach sinks to your shoes

      I never knew the AC was so loud!

      It's the way that my footsteps echoed so loudly off the floor tiles as I walked back to the door that used to get me.

      1. kev4d

        Re: When your stomach sinks to your shoes

        I've been in caves a few hundred feet down that were not as eerily silent as a recently running data center.

    2. Dr Dan Holdsworth
      Pirate

      Re: When your stomach sinks to your shoes

      There are worse things than noisy AC and silent racks. One of these is silent AC and noisy racks, because the blissful silence of the lack of AC is very soon punctuated by screams of panic and the sound of big unix kit being emergency shut down.

      Yes, this happened at a site I know of. It is quite an old centre for computing excellence, which once produced a book on why outsourcing was a bad idea right at the same time as an outsourcing attempt was going wrong...

      1. kev4d

        Re: When your stomach sinks to your shoes

        Worse yet if the chiller unit shut down because it iced up... and the DC "designer" thought putting it above the racks would be an efficient use of space.

        Tarps! Buckets! My kingdom for a sponge!

        We may hold the unofficial record for uninstalling and moving an EMC Isilon SAN.

        1. Richard 12 Silver badge
          Facepalm

          Re: When your stomach sinks to your shoes

          We used a shower curtain.

          Turned out there wasn't actually a roof on yet, so when it rained (once every few years) the ~1000A rated power distribution system and data racks got a nice sluicing down.

          Marble floors though, which was nice.

  4. big_D Silver badge

    Site services

    My desk was moved across the room. The raised flooring had power and networking in floor tanks. Site services moved the tank across to the new position, removed the tile that was there and rammed the tank into place. I didn't see them do it, but I'm pretty sure that they had used a lot of force, because they had rotated it 90° and, when I plugged in my PC and turned it on, a huge spark shot out the back and smoke wafted out of the fan grille.

    Somehow they had managed to swap live and earth and it went BANG! It is a shame I didn't turn the monitor on first, because it was blurry and flickered... Hey ho.

    Another time, a DEC engineer came out to do a memory upgrade on a Vax. There were several machines in a row. They moved all the jobs to the next one, ran the shutdown sequence and the DEC engineer disappeared behind the Vax to throw the mains switch on the wall. Only he "missed".

    The ops at the terminal looked up from the console with a puzzled look as he reappeared, whilst screams started emanating from the next machine in the line - the one that had received all the jobs and users from the machine being shut down. Yep, you guessed it, he threw the wrong switch!

    1. Steve Davies 3 Silver badge

      Re: Site services

      Ah yes, I remember those days very well.

      Being tasked with doing things like upgrading ram on Vaxen from time to time, I took a lesson I'd learned from an ex USAF Avionics fitter with me. In those days, there was a key switch on the front panel.

      Once the correct system was located (with site staff confirming that it was indeed the right machine) we'd switch the system off. Then I'd attach an 'Engineer working on system' sign to the front and tie one end of my 'get the right cabinet' device to the key. This was a 10ft long piece of string with one of those rubber quoits tied to the other end. It was tossed over the top of the cabinet so that you would know which was the correct one to work on at the back.

      There was hell to pay one day as we went to lunch (and to warm up as those DC's were cold) and came back to find that the 'Engineer' sign had gone as well as my bit of string/quoit tool. My Field Service toolcase was also missing. The system we'd been working on was also powered on.

      It turned out that a PHB was unable to play 'Adventure' in his lunchtime so came into the DC to investigate. He unilaterally decided that the job was done and tidied everything away.

      He was still pissed off when we returned as he still could not get to the server. Well, as it had zero RAM it couldn't run very well now could it. That didn't cut the mustard with him one little bit.

      I got a lot of verbal from him all the time I was trying to finish the job. I found my quoit tool in a skip outside the building.

      I refused to go back to that site. The company was taken over a few months later and that PHB was almost the first to be given his pink slip. (this was NYC)

      Such were the numbskulls we encountered on an almost daily basis.

      1. phuzz Silver badge

        Re: Site services

        I don't know which manufacturer first thought of those ID lights that most servers have these days - the ones that, when pushed, start flashing a light at the front and back of the machine - but they're a genius idea and have surely saved me from many an otherwise embarrassing screw-up.

        1. KLane

          Re: Site services

          IIRC, it was Dell PowerEdge series servers.

          1. Sandtitz Silver badge

            re:Identification light on servers

            "IIRC, it was Dell PowerEdge series servers."

            I'm pretty sure it was Compaq DL380 G2 that introduced the UID button, back in 2001. (?)

      2. Claverhouse Silver badge

        Re: Site services

        Both the PHB and the insolent scum who unilaterally threw away your quoit, if different, should be marooned around less-travelled ways of the Galapagos with enough tins of baked beans to keep them from starvation for 10 years.

      3. Alan Brown Silver badge

        Re: Site services

        "It turned out that a PHB was unable to play 'Adventure' in his lunchtime so came into the DC to investigate. He unilaterally decided that the job was done and tidied everything away."

        That would be grounds for closing off the job "as-is" and leaving the site citing unauthorised interference with work in progress as well as theft of tools (missing toolcase), along with abuse (the verbal)

    2. Stevie Silver badge

      Re: Wrong Server Back Panel

      So many racked $$$$ servers.

      So few $ label printers.

  5. A K Stiles Silver badge
    FAIL

    failure to think it through

    In a small org in the mists of history, they'd gone to the trouble of putting each of the servers onto its own UPS battery, which would allow enough time to determine whether a power cut was likely to be a short spike or a prolonged affair which required an orderly, but manual, shutdown.

    The screen for the server was not on the UPS, but no worry, as we could RDP in and run the necessary commands, thus also handling out of hours power cuts.

    Fine for a couple of brief instances (<5 seconds) of power loss, but the first time it looked like being a longer outage it was suddenly apparent that none of the network equipment was powered, not a switch, not an access point, and no spare UPS sockets available into which a screen could be plugged...

    1. Woza

      Re: failure to think it through

      Had something similar. Small company, "server room" was a couple of racks in a small room with no emergency light. Power cut happened, cue frantic beeping from our Potato Clock, and a mad scramble (using phones as torches) to try and plug a monitor in to perform orderly shutdowns on the servers before UPS drained completely.

      1. Venerable and Fragrant Wind of Change

        Re: failure to think it through

        Aha. Your self-training should include thinking through how to perform shutdown blind, using just relevant keystrokes. Login/unlock sequence, "get me into a shell", and shutdown itself. In the case of a server, the absence of any GUI simplifies things.

        Now where can I find a keyboard for this server? Damn, no USB?
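        For what it's worth, a sketch of such a drill as a rehearsal list (assuming a Linux box with a text console on Ctrl+Alt+F1 - the login and sequence are illustrative, not gospel):

```python
# Hypothetical rehearsal aid (not a script to run on the server): the
# keystrokes/chords to memorise for a blind shutdown of a Linux box,
# assuming a text console is reachable via Ctrl+Alt+F1.
BLIND_SHUTDOWN = [
    "Ctrl+Alt+F1",          # switch from any (invisible) GUI to a text console
    "Enter",                # flush half-typed input at the login prompt
    "root", "Enter",        # username
    "<password>", "Enter",  # password - there's no echo either way
    "shutdown -h now", "Enter",
]
print(len(BLIND_SHUTDOWN), "keystrokes/chords to memorise")
```

The point is less the exact keys than having rehearsed them before the lights go out.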

        1. Hans Neeson-Bumpsadese Silver badge

          Re: failure to think it through

          Aha. Your self-training should include thinking through how to perform shutdown blind, using just relevant keystrokes. Login/unlock sequence, "get me into a shell", and shutdown itself. In the case of a server, the absence of any GUI simplifies things.

          Reminds me of a machine that I had years ago with a dodgy display adapter - if you switched the monitor off, then when it was switched back on again the display output was garbled to b*****y.

          During his late night walkabouts, our security guard was wont to switch off any monitors left on...including mine, despite instructions not to.

          I often left jobs running overnight, so it was a problem when I came in the next morning as I couldn't use my machine.

          I was reluctant to get rid of the machine in question as it was the only Pentium in the office (I said this was years ago). Consequently I had to learn the keystrokes to do a blind orderly shutdown of WinNT, so that I could then reboot my PC which would make the display work again.

          Happy times.

          1. Claverhouse Silver badge

            Re: failure to think it through

            During his late night walkabouts, our security guard was wont to switch off any monitors left on...including mine, despite instructions not to.

            Was there any compelling reason his services weren't dispensed with on the third occasion?

            1. Alan Brown Silver badge

              Re: failure to think it through

              "Was there any compelling reason his services weren't dispensed with on the third occasion ?"

              I was thinking the same thing.

              We had a security guard who would turn off AC units left running - including in labs or server rooms.

              That stopped when we billed the (outsourced) company for the downtime.

    2. Nunyabiznes Silver badge

      Re: failure to think it through

      We in IT are (of course) in the basement. The facilities guru decided, in all his wisdom, that we are the only office without access to the generator or emergency lights. When the power goes out we have to fumble around for our cell phones (most of us keep our own rechargeable LED flashlight at our desks now) so we can find our way to the data center and shut down the whole shebang right quick - because although the racks go through the UPS to the generator, he refused to have the HVAC for the area added to it. The reason? The generator he specified didn't have enough power to run both the C-level HVAC and ours. At least they'll be comfortable twiddling their thumbs while waiting for the power to come back on so they can get back to work.

      1. J. Cook Silver badge

        Re: failure to think it through

        Heh. We had *just* finished migrating all our servers from one data center to a new one, when we had a power outage at the new site. The big 3 phase UPS worked fine, the generator kicked in, but we couldn't get into the data center. Why? because the door controllers that operated the keycards and locks were off-line, and the locks were all 'fail secure'. To add injury to insult, the cubicles that were put into the building just outside the data center were also not on generator power, so we were kinda screwed until we could get to a different site and remote into the servers to make sure they were OK.

        That got fixed rather quickly.

      2. Alan Brown Silver badge

        Re: failure to think it through

        "When the power goes out we have to fumble around for our cell phones (most of us have our own rechargeable LED flashlight at our desks now) so we can find our way over to the data center "

        Notify your HSE bod - and if that's ignored, kick it up to the local council HSE people.

        Watch how fast things get fixed when things like "criminal prosecution" and "jail terms" get mentioned.

        1. Nunyabiznes Silver badge

          Re: failure to think it through

          You would think. Our egress isn't legal for emergencies (the windows are too small and open out into window wells, which are not kept clear of snow for the 6 months that's an issue), but when that was reported to the local fire marshal our entity got a pass - apparently because our space isn't supposed to be used as office space, we don't rate egress?!? This is the same fire marshal who wouldn't sign off on a building occupancy permit until we had two different alarm systems with separate vendor phone lines, and the steel beams had been coated with flame-resistant coating - for a building that was an open gazebo made entirely of steel. He got quite upset when I suggested that anyone who couldn't figure out which way to escape from a burning gazebo probably should be cleansed from the gene pool.

          Something is rotten, but I'm not going to be able to fix it. Hopefully I get out of here before this rock pile falls into the hole I work in.

  6. seven of five

    Been there, done that.

    No shirt, of course.

    Few things in a datacentre are as loud as the silence of a rack where noise should be.

    1. stiine Silver badge

      Re: Been there, done that.

      Have had that happen. We discovered that the wall clock was very loud indeed when it was the only noise in the room.

  7. OGShakes

    Small business support

    I worked for a company that supported small businesses by choice. The boss had a nasty habit of forgetting to include a new Potato Clock - I mean UPS - or a battery refresh for the old one with every new server install. He also had a habit of thinking 'it's not that old' when he'd installed it 4 years before. This came to a head when we had a run of 3 or 4 UPS units fail due to their age, followed by a power cut at a customer who had completely outgrown the 750 unit they had protecting 2 large servers, a firewall, a PoE+ switch and, by extension, the 12 phones in the office. Somehow he managed to turn this into a positive with all the customers and sent out emails offering to do a power-requirement assessment on all UPSes, including an age check, for a small fee. The other 'new' policy was not to plug both power supplies, if there were 2, into the same UPS, making sure the one not in the UPS only ever went into a surge-protected supply.

    I lost track of the number of UPSes and surge protectors we sold in the following month; in the boss's defence they were sold almost at cost, since he was charging for our time. The person I feel sorry for in all this is the delivery driver who carried them up to our 2nd-floor office...

    1. SImon Hobson Silver badge

      Re: Small business support

      The problem with the "one PSU in the UPS, the other direct to mains" is that most people don't understand the details and what it means in terms of load capability and runtime.

      During normal operation, only half the power goes through the UPS, so when the UPS tests the battery and does its runtime calculations, those calculations will be out - not just by a factor of two, but maybe by a factor of 5. This also means that any "run until the battery reaches x minutes" shutdown config settings are at best questionable.

      But the real biggie is when people look at the lights, see that the UPS is only (say) 40% loaded, and add something to it. Everything still works fine - until the power goes off, the UPS goes into overload and turns off.

      I have actually witnessed someone I previously thought knew better, when adding a server to what could only euphemistically be called a rack, put one mains lead into a mains feed and try the other lead in each of the UPSs in turn till he found one that would take the extra load without beeping. Yup, all the UPSs were at or very close to 100% while carrying only half the draw of a significant number of servers, and were guaranteed to just go "clunk" if the mains went off. Mind you, they were all Armenian Potato Clocks (I'd never heard them called that before - love it!) with dead batteries...
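      The arithmetic is worth spelling out. A back-of-the-envelope sketch (function name and wattages are illustrative, assuming each server splits its draw evenly across its two PSUs, one on the UPS and one straight on the mains):

```python
# Why "one PSU on the UPS, one on mains" hides overload: with the mains up,
# the UPS only ever sees half the real draw.

def ups_load(servers_watts, ups_capacity_watts, mains_up=True):
    """Fraction of UPS capacity in use; on mains failure the UPS carries it all."""
    share = 0.5 if mains_up else 1.0
    return sum(servers_watts) * share / ups_capacity_watts

racked = [400, 400, 400]   # three 400 W dual-PSU servers
capacity = 1500            # a nominal 1.5 kW UPS

print(f"mains up:   {ups_load(racked, capacity, mains_up=True):.0%}")   # looks comfortably loaded
print(f"mains down: {ups_load(racked, capacity, mains_up=False):.0%}")  # doubles the instant it matters
```

A UPS that shows 40% loaded in normal running hits 80% the moment the mains drops - and one more server quietly added on the strength of those lights is what produces the "clunk".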

  8. Anonymous Coward
    Anonymous Coward

    Cisco 5500

    Once upon a time, we had to insert a new 48-port card into an almost-full C5500.

    It was of course running production for many systems.

    I was lifting the forest of already-cabled cables while the network dude slowly inserted the card.

    Then there was a big ZAP and the whole switch went silent!

    Two hours of unplugging the thing, unracking the switch and replacing it with a spare one, kindly provided by our DC provider (thanks, James!). Re-plugging everything and a reconfig of the switch, and we were back in business.

    The post-mortem analysis revealed that:

    - the power from chassis to card was provided by a male connector

    - since this was recycled HW, and apparently someone had unplugged it quite brutally in the past, said connector was bent

    - when we slid the card in, the bent male connector created a short, killing the chassis and its 2 PSUs

    I remember the then "CIO" (random idiot having the right nationality) requesting that these LAN incidents stop, to which I replied:

    "That would require using only new kit for the LAN and no more recycled kit, and therefore funding." Never got a reply.

  9. Anonymous Coward
    Anonymous Coward

    Many years ago in a previous life I used to set up servers, and one of the options was to have a UPS - an APC with a cable. We didn't really have any trouble with them till one day we got a call from an irate customer saying it hadn't worked. It seems someone thought they worked by magic: when reinstalling the operating system, they didn't add the software that does the safe shutdown. I wouldn't mind, but it was Linux, so you would have thought they'd know.

  10. TonyJ Silver badge

    Ahhh UPS issues...

    About 10/12 years ago, I was a regular on site to a particular council in Devon.

    It was a relatively small council in the grand scheme of things, but it does still have a higher-than-average anecdote count, compared to other sites I worked at.

    Anyway, the company I worked for had put in a load of new racks, with new servers and a meaty HP UPS - a 6.5kVA unit if I recall.

    The council sparky had run the requisite 40A commando socket and the UPS was installed, followed by the big-as-houses batteries (they certainly felt that big and heavy!).

    All final checks done, power to the UPS and... the main breaker to the entire building was tripped.

    Uh oh.

    Sparky does some checks. Nothing apparently out of order. Power back on, power to the UPS...tripped.

    Turned out to be a faulty UPS.

    But now the fun and games began, including tracking down the voicemail server.

    Eventually someone from the post room who had been there for years and years overheard, and happened to ask if we meant a computer? Yes... yes we did, why?

    He took us into the post room and pointed us to a huge pile of hessian sacks in one corner.

    Underneath it all (literally buried under about 4ft of them!) was an ancient Compaq ProSignia server.

    We powered it on - Netware 3, baby, which was old even then.

    Most things came up but the VM system was still down.

    The head of IT called the company who supported the machine only to be told they'd basically no idea who we were, and did they really support it? Gosh!

    Eventually the admin password was traced (a very techie ncc1701d) and we were able to log on and start the VM NLM.

    We never quite worked out how, but we believe that server had survived with near-zero airflow for around 7 years.

    I loved working at that place, even if the drive there and home was an absolute bitch.

    1. stiine Silver badge

      Re: Ahhh UPS issues...

      "a higher-than-average anecdote count"

      Have an upvote for that turn of phrase.

      1. The Oncoming Scorn Silver badge
        Pint

        Re: Ahhh UPS issues...

        Tell us more.... As you used the turn of phrase "commando" I'm thinking it's somewhere in the proximity of Lympstone.

        Icon - Pints In The Saddlers, The Maltesters (Woodbury), The Puffing Billy (Exton) & The Nutwell Lodge.

  11. Richard_Sideways

    The smoke is its soul leaving its body...

    Had an engineer swear blind that it was ok to hot swap memory on an HP-C240 (it wasn't...POP!).

    And once walked into the build room to the piquant aroma of cooked electronics, with the PFY claiming that Dell had sent us a dodgy batch of new workstations as none were firing up... PSUs were all switched to 110v not 230v.

    1. BigSLitleP Silver badge

      Re: The smoke is its soul leaving its body...

      I once worked with a PFY called Richard who blew a stack of Dells one after the other because they were set to 110v.........

      J'accuse!

      1. Antron Argaiv Silver badge
        Mushroom

        Re: The smoke is its soul leaving its body...

        I will only use universal supplies in my designs for that reason.

        Believe it or not, I ran across a switchable supply from a major vendor just recently. It was $10 cheaper than the universal one. Go figure.

        1. Alan Brown Silver badge

          Re: The smoke is its soul leaving its body...

          " I ran across a switchable supply from a major vendor just recently. It was $10 cheaper than the universal one. "

          The difference is frequently less than that and I won't have them if avoidable for exactly the same reasons others have pointed out.

          Amongst other things I've had systems "mysteriously fail" after staff were refused an upgrade - to find when arriving onsite that the 230/110V switch had changed position.

    2. J. Cook Silver badge

      Re: The smoke is its soul leaving its body...

      Heh - we had a contractor installing what amounts to a small television distribution system in our data center a handful of years ago. The satellite receivers were not industrial or commercial-style units, but rather the run-of-the-mill set-top boxes you'd find in a home. Said units were not auto-ranging, and 110VAC only. (We run 208V in our DC.)

      The contractor was kind of dense, and only noticed the smoke and sparks after plugging the third one in. We were just glad he didn't set off the pre-action or smoke alarms...

  12. fozzy73

    Working at a smallish provider.

    SSH'd into a server - big Sun stuff, 3500s, 10000s. Connection lost.

    Called the DC. Somebody answered - strange... but mainly, why could I understand him?

    A new tank was needed in the floor; to fit it, a drill was used. Too deep.

    Took 6 people the afternoon, the evening and a good part of the night to power everything back up.

    Still, good times.

  13. Anonymous Coward
    Anonymous Coward

    I always find it surprising how often items meant to ensure business continuity are the cause of business "in-continuity".

    From a RAID system borking, to an HA pair corrupting data as it tried to fail over, to a DR site not coming online properly during a test, or an Active/Passive pair unintentionally turning Active/Active and creating two independent sets of data.

    I ran a Netware server for about 12 years, with a single power supply and no RAID, with no loss of data. Even cloud services have given me more downtime in recent times.

    P.S. I'm not saying we should all switch back to single unprotected servers, btw.

    1. Doctor Syntax Silver badge

      There are a couple of reasons. One is that they involve processes and kit that are seldom used. When they do come to be used they haven't had the shakedown that daily use provides, so they are likely to be more fragile and, in the case of equipment, possibly expired from old age. Of course that leads to a situation where regular testing is avoided out of fear, which just makes things worse.

      The other is that they only come into play in extreme situations which might be so extreme as to exceed their capabilities, e.g. a lightning strike that took out all the thyristors in the building UPS leaving the non-UPS power unaffected.

      1. Anonymous Coward
        Anonymous Coward

        Three years of patching, updates and incidents all occur on the same day, just when you need it most.

  14. Oengus Silver badge
    Pint

    Spanner in the works

    Setting the Way Back machine...

    In the late '70s I was working shifts in the state data centre for a major Oz bank. This was in the days before ATMs were prevalent and there was no such thing as EFTPOS. This particular day I was on the morning shift (06:00 start). The manager of the centre had scheduled some work on the main power distribution board, so we only started essential systems. The ATM network was scheduled to come up at 07:00 and the branch on-line network was to be active by 08:00.

    The sparky arrived at 06:30 and was directed to the distribution board. He opened the doors and looked inside. He had a large spanner in his hand, and when he went to throw the main breaker he somehow managed to short the main power leads with the spanner. There was a huge bang, and the flash lit up the room brighter than the brightest welding arc I have seen; then the room fell into a deathly silence. The sparky was thrown across the room and slammed into the back of the tape units. The room went dark (except for the emergency lighting). All of the systems and the air conditioning shut down.

    We raced for the emergency torches and called for an ambulance. When we reached the sparky he was unconscious but breathing. While we waited for the ambulance he regained consciousness, but said he couldn't see anything. The ambulance guys said he was lucky to be alive. The hospital reported that he had received flash burns to his retinas; he did recover his sight eventually. The spanner was found later that day, missing a full half inch of metal from one of its lugs.

    Power was restored at 09:00 and the systems brought up. Distribution board maintenance was never again scheduled for a weekday.

    Beer - because after that shift we really needed it to settle our nerves.

    1. Anonymous Coward
      Anonymous Coward

      Re: Spanner in the works

      I remember a training video on arc flash, can't remember where I saw it.

      Security camera footage, looking from an off angle past a large industrial cabinet.

      Two guys thrown back when the flash happens, smoke, fire, arc, the works. "How many people were involved with this accident?"

      The correct answer was 3, the two thrown, with one vaporized/skeletal in the cabinet.

      Arc flash suits are no joking thing, even if they always smell like a locker room.

      1. Anonymous Coward
        Anonymous Coward

        Re: Spanner in the works

        I have that training every few years. Luckily I don't actually have to ever work on anything over 240 V. We had some sparkies here who made themselves a lovely secret break room with chairs and a table inside an equipment room, behind a steel door with a big "Arc Flash Warning! Authorized Personnel Only" sign on it. Yes, the entire inside of the room was technically in the zone where a protective suit is required.

        They're no longer employed here.

        1. Anonymous Coward
          Anonymous Coward

          Re: Spanner in the works

          1300V, de-energized at the substation, was the highest I messed with.

          480V polyphase I'm fine with, as long as the source can be unplugged, or the breaker is off and I can LOTO.

          220V/110V single phase, I have been known to work live wearing nitrile gloves if the branch is 10A or less - like a dimmer switch replacement.

          1. Doctor Syntax Silver badge

            Re: Spanner in the works

            Great respect for the guys who live-jointed the faulty 3-phase in our road a few months ago; that would rate at a huge amount more than 10A. About 12Ω showing in the neutral. It butts onto the section that was replaced with similar problems a few years ago. I think the entire cable is being replaced in 12-metre stages.

        2. Alan Brown Silver badge

          Re: Spanner in the works

          "Luckily I don't actually have to ever work on anything over 240 V. "

          50V DC (as used in telephone centres) is quite sufficient - some of those busbars can be 500mm each side, carrying upwards of 10,000A

          Stories of dropped spanners between busbars are legion - and from experience of witnessing one, they usually understate the fireworks.

    2. dave 76

      Re: Spanner in the works

      and when he went to throw the main breaker somehow he managed to short the main power leads with the spanner. There was a huge bang and the flash lit up the room brighter than the brightest welding arc I have seen; then the room fell into a deathly silence. The sparky was thrown across the room

      The hospital reported that he had received flash burns to his retinas. He did recover his sight eventually. The spanner was found later that day and was missing a full half inch of metal from one of the lugs.

      I had an almost identical situation working for a Visa data processing centre in NZ - those antipodean sparkies! We had a couple of massive generators outside for weeks while the power distribution room was completely gutted and rebuilt, but at least no one died.

  15. Stratman

    Now happily retired, I used to work for a world famous UK based broadcasting corporation, in TV outside broadcasts. Way back when we had live limited overs cricket on Sunday afternoons, we were happily well into the first innings when all the fans started rhythmically speeding up and slowing down and bulbs got brighter and dimmer. I glanced up at the incoming voltmeters which were swinging up and down like a fiddler's elbow. The AVRs on the input couldn't keep pace with the swings so there was only one thing for it. On a live, non repeatable broadcast I reached up and hit the breakers, turning off the technical supplies to the kit and removing cricket from thousands of tellies up and down the country. The dividing door between us engineers and the production gallery popped open and an inquisitive looking head poked through.

    We explained that if we hadn't done it we'd have lost the programme completely, as every fuse would have blown, whereas now, once we got to why it happened, enough kit would have survived to get something on air. Many OBs are powered by a portable generator (think the size of a bin lorry), so our capo di capi went off to pass the time of day with the genny man. It turned out he'd had a minor problem with the stabilisation control system on the generator, so he thought he'd use the time fault-finding. He bypassed the system completely, then grabbed hold of the engine speed controller and started to give it a good workout. While the error of his ways was explained to him, we'd been on our hands and knees scrabbling around behind dusty bays replacing fuses. We lost about five minutes of airtime in total and came out quite well in the subsequent enquiry.

    1. KarMann
      Facepalm

      Things you read wrong

      I must admit, since you were talking about cricket, when you said 'all the fans started rhythmically speeding up and slowing down,' I pictured something quite different from what you'd intended, and was trying to figure out what would cause that behaviour from the crowd.

  16. BigSLitleP Silver badge

    Not that many moons ago, I started at a small IT provider. In my second week, I get in to the office and I'm the first one in. Or so I think. I settle in, log myself on, and one of the directors walks in looking a little tired. "Do you know much about UPSes?" he asks. "A little bit, yeah" I reply. "Come with me", he says.

    We trundle off to the comms room. He walks over to the cabinet and points to an Armenian Potato Clock. There are many error lights flashing away, and he asks what I think is wrong with it. I walk around to the back of the unit and advise that it would probably be a good idea to unplug it. The director asks why, and I answer that the popping noises, crackling and smoke coming out of the back of the unit are probably bad signs.

    We unplug the unit and disconnect it from everything and make sure that it isn't going to explode. I enquire about a warranty on the unit, but the director says that's unlikely.

    "How come?" I ask

    "We bought that one second hand off ebay"

    ...

    ...

    ...

    1. Antron Argaiv Silver badge
      Facepalm

      I enquire about a warranty on the unit but the Director says that's unlikely.

      "How come?" I ask

      "We bought that one second hand off ebay"

      YOU WOULD THINK...that, having made it to Directorship, the bloke might have learned something along the way...like the old adages "you get what you pay for", or "nothing lasts forever" or, "there must be a reason they're selling this". You would be sadly mistaken.

      Apparently, from the stories above, Directors are a special breed, having been placed (by God?) in positions of responsibility without having been supplied with the tools required to do the job.

      1. Imhotep

        I once worked for a consulting company whose servers/switches/routers were all old units purchased on eBay. It was getting somewhat challenging to find and buy replacement SCSI HDs for the aged RAID controllers.

        On the bright side, those old Dell Proliant servers and Cisco routers/switches were really well-made workhorses. But eventually even the best workhorse needs to be taken out and shot - I mean retired out to pasture.

      2. Stevie Silver badge

        eBay

        I had a colleague who would tease our Sun engineer into apoplexy in hardware provisioning meetings by suggesting that we buy switches on eBay.

        He would have cost comparisons, printouts etc and could do the sincere suggestion thing with an absolutely straight face.

        Poor old Max (name changed) would melt down every time.

      3. Mark 85 Silver badge

        Apparently, from the stories above, Directors are a special breed, having been placed (by God?) in positions of responsibility without having been supplied with the tools required to do the job.

        Wrong thinking. Directors are there for the profit bonuses and keeping shareholders happy. So cutting costs is a priority.

      4. Doctor Syntax Silver badge

        "without having been supplied with the tools required to do the job"

        The sole tool may well have been enough cash to buy a slice of the business.

    2. Dyspeptic Curmudgeon

      Not that many morons ago.

      I read the first line of this as: "Not that many morons ago,...."

      It was a couple of lines later that it sank in, that I had mis-read something!

  17. WanderingHaggis

    I was working in France and had to call to get a replacement cell for our ageing APC unit. I phoned customer support and chatted with the person who answered in my Glaswegian French. Halfway through the call it registered in my brain that the APC person had given a distinctly Irish name, and that APC had a call centre in Dublin. I was sorely tempted to say thank you in Gaelic to her, but chickened out, feeling a bit daft talking French when I didn't need to. She seemed like a nice colleen.

  18. Tim99 Silver badge

    I took out a block of 20+ residential units with a cheap domestic level Potato Clock that had a built in block of domestic power outlets. I was tidying up our home office and thought that it would be a good idea to plug my mobile phone charger into the UPS to give me a spare power point above the desk. The charger was faulty - The resulting bang temporarily deafened me, tripped the house power board, blew the 50A fuse to the house, and took out the main circuit breaker for the site. Our electrical contractor fixed everything past the breaker in less than half an hour but we still had no electricity as the main breaker was dead. He called the power company. We had to wait for another 2 hours because the power company engineer did not have a spare breaker so he had to go back to base. I admitted what had happened, but I don’t think the engineer believed me until he saw what was left of the phone charger.

  19. This post has been deleted by its author

  20. Mine's a Large One

    Years back, our IT building (and the rest of the Regional HQ it was tacked onto) was backed by a couple of fairly substantial diesel generators, which could apparently power everything "almost indefinitely" should the need arise. One day everything in the office suddenly went dark and very quiet, then, before anyone could say anything, bright and noisy again. Yay. About 5 minutes later we're dark and quiet again, but this time it stayed like that. I looked up to say something to a colleague and noticed the thick plume of smoke from the generator house... We were sent home later when it was clear the power wasn't coming back!

    Talking to the maintenance guys the next day, I was told that workmen had killed a local transformer up the road, which had triggered the first gennie to start. When it wasn't fully delivering power a few minutes later (although it was running), that triggered the second gennie to start, which promptly started spewing fuel over everything in sight, including the exhaust of the first, which was by now quite hot, causing lots of smoke (they never said if it started a fire, but the general consensus was it had).

    1. Anonymous Coward
      Anonymous Coward

      The weekend shifts at the control room of a bit of national infrastructure did not know that the indicator light on the wall meant the diesel generator was active. It worked really well - so well they didn't notice any difference. Until it used up its tank of diesel.

      Apparently you can't just call the AA to get an emergency fill-up. The fuel has a special biocide blended in, as it might be sat around for months. Somehow Business Continuity managed to get the Army, which had a tanker of the stuff on standby.

    2. Anonymous Coward
      Anonymous Coward

      Back in the 1970s my old man worked at a large factory/works in Shrewsbury which had its own behemoth of a diesel engine on site to run the place during power cuts.

      The problem was, this vast diesel mill wasn't in its first flush of youth, so when it was run every now and again to test it, a little oil and unburnt diesel would make its way through the cylinders to the exhaust.

      Then, one day, the big beast was fired up in anger, and ran up to full temperature. This warmed up the oily crud in the exhaust to the point where the diesel/oil mixture self-ignited.

      The blowtorch-like flames were apparently over 10ft long!

  21. Mr Sceptical
    Facepalm

    Pretty Dodgy Units

    Never mind the Armenian Potato Clocks, I've had more issues with loose/poorly inserted IEC leads in PDUs suddenly dropping a rack of switches.

    Won't somebody think of the 'el cheapo' kettle leads!

    1. Alan Brown Silver badge

      Re: Pretty Dodgy Units

      Ideally IEC sockets should be locking ones. If not, then a collar will make sure the plugs fit tightly and a lockwire makes sure they STAY in.

      What? you don't have those? My condolences.

  22. Swampie

    Let out the magic smoke.

    There I was, it's 2003 in deepest, darkest Iraq: the rebuilding, the successful removal of Goddam Insane... you would think smart things were about to happen... but here we are.

    Rebuilding/cleaning up/sorting out an old Iraqi Army base, which had been bombed three times in three wars... twice by my own US Air Force! Well, the building was rebuilt: new power, new everything... And we built out an important (read: big budget) office building for the CPA (Coalition Provisional Authority!). Now, in Iraq they love England... so the electrical power and bits and pieces mirror old Blighty... which doesn't mix with US-trained techs and engineers... all the computers bought from and shipped from CDW in the USA... (read: US power supplies and cables!). Well, one emergency phone call later and a late-night flight from USAF/RAF Mildenhall... we had a box of 50 US <--> UK electrical plug converters. (Hint, hint... they only convert the physical connector!)

    Well, my buddy sets up three nice Dell computers on new desks... three monitors... all into one US power strip! Oh, and he uses a US/UK converter at the end of the power cord! Way cool, fire up this puppy!!! Pop, pop, pop... all the magic smoke gets let out, and the power strip catches fire to boot!

    The only saving grace was that the monitors were auto-sensing and didn't get zapped... but all the power supplies on the desktops were hard-switched 110/220 using a small flat-head screwdriver... and that was beyond his experience!

    Fun times!!!

  23. eldel

    Potato Clock alternatives

    Sometimes the El Reg timings are eerily aligned with my reality. One of my home office potato clocks died this morning and I feel disinclined to purchase another. Any advice from the commentariat regarding a suitable alternative?

    1. Anonymous Coward
      Anonymous Coward

      Re: Potato Clock alternatives

      A regular set of backups

  24. devilsinthedetails

    Remember to plug in the servers

    I recall visiting a customer site where the IT Director was very proud of his brand new computer room and shiny new equipment, with a UPS in each cabinet. That lasted until I pointed out that the large, expensive, multinational IT services company which had fitted it all for him had only plugged the monitor and KVM for each cabinet into the UPS: no protection at all for the servers and network gear.

  25. Hotears

    Silence of the fans

    Oddly enough, the times my mistyping or forgetfulness has brought down kit, either the customers were quite happy to be told before they noticed, or the resulting loss of smoke was not expensive enough to worry about. Green fire yes, but nothing worse than that.

    There was the night I arrived at work expecting to find that the flaky router we were trying to figure out had fallen over again, and tried to get on wireless. No luck. Very odd. New plan, prop open the door to the switch room and plug directly into the core - only the links to the server room in question were all dark.

    The sensation of walking through what should have been an operating server room, alone and at night, with the bit of sand you have under your shoes crackling loudly, was kind of neat - though it took four hours I would rather have spent in bed to get the Aperiodic Pixie Converter with its failed bypass relays bypassed.

  26. Ribfeast

    I have accumulated a stack of half a dozen potato clocks over my years working for various MSPs and enterprises. All were replaced due to calendar life or failed batteries, so I take them home and put in fresh batteries and power my rack with them. I had a 5kVA one pop and let out the magic smoke. Similar story a few years later with a 3kVA one. I've now put another 3kVA one in place, but I've cobbled together a homemade giant battery pack for this one, 48V, using every SLA battery I could find. What could possibly go wrong...

    I must say the runtime is now excellent, but I fear for the lifespan of the charger inside the unit.

    I think the best lesson from it all would be to have two potato clocks installed, one per power supply/PDU, at no more than 30% load per unit. That allows for 60% load if one pops, and preferably each UPS should be on a different switchboard circuit.
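    The sizing rule above is easy to sanity-check before you rack anything. A minimal sketch (the function name, ratings and power factor are my own assumptions, not from the post) that tests whether a load split across two units stays under the 30% ceiling, and whether a lone survivor would still be within 60%:

    ```python
    # Hypothetical N+1 dual-UPS sizing check: each unit should carry no
    # more than 30% of its rating, so that if one pops, the survivor
    # picks up the whole load at roughly double that (60%).

    def failover_ok(total_load_w, ups_rating_w, per_unit_limit=0.30):
        """True if two UPSes sharing the load each stay under the limit,
        and a single survivor would still be within twice that limit."""
        shared = total_load_w / 2 / ups_rating_w   # fraction on each unit
        survivor = total_load_w / ups_rating_w     # fraction after one fails
        return shared <= per_unit_limit and survivor <= 2 * per_unit_limit

    # e.g. a rack on two 3 kVA units (~2,700 W at an assumed 0.9 power factor):
    print(failover_ok(1800, 2700))  # each unit at ~33% -> over the limit
    print(failover_ok(1500, 2700))  # each unit at ~28% -> fine
    ```

    The same check works for any rating; the point is that "half the load each" is not the number that matters, the failover case is.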

  27. Dabbb Bronze badge

    And that's why kids

    you don't put your production stuff into Tier 1 datacenters.

    /story

  28. khenault

    New job

    We had a new computer operator start in the data center. First day, first hour, he hits the exit button near the door to leave the computer room. Only it's not an exit button like it was at his last job. No, it's the emergency power off button.

    An investigation was launched, his former employer was contacted, and sure enough there was a button near the door that you had to hit to exit at his old job. The next day a clear plastic cover was added over the EPO button. Oh, and the new operator did get to keep his new job.

  29. StargateSg7 Bronze badge

    For your UPS power needs may I suggest this one:

    https://www.ballard.com/fuel-cell-solutions/fuel-cell-power-products/backup-power-systems

    AND

    the 200 Kilowatt version of a Ballard Fuel Cell which can run off of Pure Hydrogen, Methanol OR with a bit of modification PROPANE !!!

    https://www.ballard.com/fuel-cell-solutions/fuel-cell-power-products/motive-modules

    The fuel cells are UTTERLY QUIET with NO generator noise at all !!!

    With the right fuel source they can act as prime power too!

    We modified the 200 KW Ballards into a 20 megawatt stack for the Northern British Columbia data centre running off propane. The proton-exchange membrane isn't really targeted at the much higher molecular weight of propane, since they USUALLY run off hydrogen, BUT some judicious mods on our part fixed that problem lickety-split! When you're running 160,000 125-watt CPUs/GPUs, ya kinda need that 20 megawatts to be fully independent of BC Hydro... what a yearly propane bill though!

    One litre of propane will power 7,124 watts for one hour, so 20 megawatts uses 2,808 litres of propane per hour and 24.6 million litres per year. At 25 Canadian cents per litre at a bulk purchase rate, that's around $6.2 million CAD per year to run the place, BUT IT IS FULLY OFF-GRID and has NEVER had to rely on outside power. And when you use high-pressure tanks (600 PSI) at 250,000 litres each, you only need 40 underground, fully insulated and isolated tanks to run for TWO YEARS fully off-grid!
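    Taking the poster's own figures at face value (7,124 Wh per litre and CA$0.25 per litre, both their claims, not verified here), the hourly and annual numbers can be reproduced with a few lines of arithmetic:

    ```python
    # Rough reproduction of the poster's propane arithmetic, using
    # their own claimed figures for energy content and bulk price.
    WH_PER_LITRE = 7_124      # claimed usable energy per litre of propane
    LOAD_W = 20_000_000       # 20 MW data centre load
    PRICE_CAD_PER_LITRE = 0.25

    litres_per_hour = LOAD_W / WH_PER_LITRE
    litres_per_year = litres_per_hour * 24 * 365
    annual_cost_cad = litres_per_year * PRICE_CAD_PER_LITRE

    print(round(litres_per_hour))            # ~2807 L/h (poster says 2,808)
    print(round(litres_per_year / 1e6, 1))   # ~24.6 million L/year
    print(round(annual_cost_cad / 1e6, 1))   # ~CA$6.1M/year, close to the $6.2M claimed
    ```

    The per-hour and per-year figures check out to within rounding; the two-year tank-count claim is harder to square with 40 × 250,000 litre tanks, but that's the poster's sum to defend.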


    Finance and science people are very weird! They blow hundreds of millions of dollars on huge data centres and the fully off-grid fuel systems to power them, BUT only put in a 10 gigabit Ethernet connection to Vancouver, New York, Berlin and Toronto? Are You Kidding Me? 260,000 CPUs and who knows how much global system RAM and enormous amounts of SSD storage, and you put in a TINY 10 Gbit connection? WTH?

