Sysadmin left finger on power button for an hour to avert SAP outage

Welcome to the seventh instalment of Who, me? The Register's new column in which readers share stories of the times they broke stuff without any help at all from users. This week, meet "Jeremy" who back in 1999 scored his first "real" IT job "as part of a team sent out to run the IT at a big publisher." Said team was working …

  1. highdiver_2000

    Why didn't you eject the CD player using Windows Explorer before touching the server?

    1. pstones578

      I will give you that one given the timeframe, but if this was SAP then I would guess it was not running on Windows. These days servers don't come with CD drives anymore.

      1. Lee D Silver badge

        Nope, but they do come with ID lights.

        It's a really dumb thing to press the button on the wrong server. And... if we're talking about an era where holding in the power button doesn't kill the machine hard in 5 seconds, and where NT is running, and where it doesn't auto-power-off on the Turn Off Your Computer screen, then we're back in the age of floppy disks and maybe even pre-CD in your average server.

        But whatever era, there will have been a better way to indicate what server you mean rather than just guessing.

        1. Anonymous Coward

          a) PL1000 / 1500 did not come with CD ROM per default - they were optional and expensive!

          b) the Y2K Updates were done by Floppy - ROMPaq Updates

          c) Hostnames were put on brownish labels and consisted of 8 positions: 2 letters for the city and then 6 numbers. No clue what kind of server it actually was - the lists were all you had.

          After several hours of too many dB, temperatures between 15 and 40 °C (Depending on where you were in the DC) - one tends to get a bit "unfocussed" - as was the case with my pal J here.

          (Yes, I post this anonymously - but I am said "Jeremy" ;)

          1. Emmeran

            Jeremy spoke in class today

            This same sort of thing happened to a friend and co-worker and we did indeed make him stand there holding the button in until we could get the users out and the apps shut down.

            I still recall that forlorn look on his face as he stood there alone in the data center, it brings a smile to my face.

          2. Marco van de Voort

            Moreover the default cdrom was not exactly standard. It was connected on the onboard SCSI (the IDE was connected to the floppy ?!?!) and the system firmware could only boot from devices that had a special (512byte sector emulation?) jumper on and used floppy emulation. In the mid 2000s the only distro that booted was Slackware 8.1

            1. Alan Brown Silver badge

              "Moreover the default cdrom was not exactly standard. "

              No, but ejecting it makes it clear WHICH server you want rebooted.

              1. Marco van de Voort

                ejecting

                I know, my point was more that it was not a given that would work. The first Proliants were mighty quirky beasts, and the CDROM BIOS support was minimal and ancient.

        2. Anonymous Coward

          "[...] there will have been a better way to indicate what server [...]"

          Head came round the door "All yours". So I headed off to the machine room to do my testing on the cold stand-by comms processor. The console was mounted on top of the unit. Hit the keys for debugger mode - machine stops. Sudden howl of anguish behind me.

          There were two comms processors. That day they had decided to use the stand-by one for the official acceptance time trials. Thankfully the presiding government official allowed that as a genuine mistake that did not affect the acceptance criteria - and the repeat run was ok.

          After that there was a large notice on whichever one was the live machine.

        3. Dave K Silver badge

          Well, I'm assuming that as these were Y2K updates, we're talking of a server from around the mid-90s. I still saw plenty of AT (ie, mechanical) off switches in those days.

          As it is, hindsight is a wonderful thing. However, mistakes do happen...

  2. Anonymous Coward

    The weather

    I did turn it back on, but we had to cut back to the sofa for a few seconds while I figured out what I'd done.

  3. Anonymous Coward

    I did that once, not on a production server but home computer back in the days when a power off could kill it forever or severely mess up your next reboot. It teaches you a lesson about computer placement that you never forget. These days I have two pieces of cardboard wrapped in black tape over the top of the buttons because they are on top of the case.

    1. TechnicalBen Silver badge

      New PC case.

      I have a wonderful mini-itx PC case. Only problem is the power button is on the top, just where I may rest something for a moment. Like a game controller or whatever.

      I only did it the one time. ;)

      1. Boothy

        Re: New PC case.

        I like my current Antec Tower case (had it years now, triggers broom ya knows).

        It has a full height door on the front, hiding things like 5.25 and 3.5 bays (all unused these days), but it also hides the Power and Reset buttons.

        I don't know if by design, or accident, but the edge of the door also has large (finger sized) air vent holes from top to halfway down, the bottom one of which lines up quite nicely with the buttons.

        So no way to hit them by accident, but you can still use them without having to open the door.

      2. d3vy Silver badge

        Re: New PC case.

        "I have a wonderful mini-itx PC case. Only problem is the power button is on the top, just where I may rest something for a moment. Like a game controller or whatever."

        I used to work in an office where there were two banks of desks fed from two sockets (with extension leads - but that's not the wtf here..). The sockets were at about the same height as the head rest on your average office swivel chair and positioned right behind someone's desk.

        The number of times that the power got knocked off started to get daft, so management's solution... Not to move the socket... not to change the sockets so there was no switch... Shove a few old UPSs under the desks to feed the PCs if the switch gets hit.

        1. JimboSmith Silver badge

          Re: New PC case.

          At home I have spike protected power strips on all my equipment. I did once have something that was fried when the power came back on after a power cut and am now slightly paranoid. Most of them have either no power switch or a recessed one to prevent you switching things off accidentally. The one place it does have a switch is the living room entertainment area (TV, DVD, Blu ray, Satellite receivers, CD, Amp etc.) My housemate was (out but) recording something she wanted to see and I didn't, so I thought I would tidy up the cables around the back. I was busy doing this when I heard the TV power off and standby lights which were reflected by the coffee table go out. I then spotted the switch that my knee had just hit which turned the blasted strip off.

          The recording was now stopped and it would take a couple of minutes to get everything back up and running. So I switched off and then back on the power to her room at the circuit breaker and claimed it was a power cut. She only lost approximately 4 minutes of the TV prog but I vowed then and there to replace that strip. I now have a 19 inch rack unit which has proper power distribution strips (with spike protection) screwed to the back. These have sunken switches to prevent accidental presses.

    2. ma1010 Silver badge

      Cats can be a problem

      Years ago in college, we worked in groups, and one of my group mates had a cat. One Sunday he was at home compiling all his hard work and watching the cat play around the computer, "Aww, how cute..."

      Then the cat nosed the big, red RESET button and trashed his work. "Get out of here, you damned thing!" Not so cute, then, apparently.

      1. DJSpuddyLizard

        Re: Cats can be a problem

        @Cats can be a problem.

        Toddlers too.

        On our PCs at home I had to bypass the power and reset buttons and install switches with a key.

        Good old cold-war missile bunker key switches on both PCs, just in case.

        1. elDog Silver badge

          Re: Cats can be a problem

          Yeah - animals and kids and red buttons. Sounds like the state of affairs in the US right now.

          Wonder if he'll press it, change his mind, and keep it pressed in until the B-52's can be recalled.

    3. Oh Homer

      Re: I did that once

      My only fatal "power off" incident involved a PC with a mechanically failing drive (bearings failure, I believe). It was borked anyway, but in my desperation to recover at least some data I made the mistake of turning it off to try some other method. It never spun up again.

      Then there was the classic where I formatted the wrong drive, back before I knew that drives could be "un-formatted". Lost everything that day. I also gained a greater appreciation for backups.

      There was one happy tale, though.

      I had an Amiga 4K with a CyberPPC SCSI controller, which I'd been using without issue for years, until one day I decided to meddle with settings I didn't understand at all, in CyberPrefs (the SCSI controller's firmware settings). I came in first place for the Darwin Awards that day, and ended up with an unrecognised drive.

      Just to absolutely prove how stupid I was, for some reason I didn't make the connection between my meddling and the fact that my drive had mysteriously disappeared. As my brain sunk deeper into hibernation mode, I gave up completely and bought a PC - my first ever in fact, from the sadly long defunct First Computer Centre in Leeds. So in a way, I have that SCSI controller (or my stupidity) to thank for many happy years playing Doom and Quake, then eventually getting fed up with Windows and switching to Linux.

      Years later I fired up that Amiga 4K, went straight into CyberPrefs, changed back the incompatible setting, and rediscovered my "missing" drive, along with my long-lost youth.

      1. ds6 Bronze badge

        Re: I did that once

        I wish I could have recovered my digital childhood, if even it was mostly poor attempts at MS-DOS scripting, those mail-order screensaver bundles, and directories of disgustingly quantized GIFs... Frankly, I was too dumb to know what a backup was.

  4. Anonymous South African Coward Silver badge

    Nowadays the "hold-the-button-in" trick won't work, because of ACPI, especially on a modern PC.

    1. Steve the Cynic Silver badge

      Nowadays the "hold-the-button-in" trick won't work, because of ACPI, especially on a modern PC.

      "A modern PC"? The first one I ever saw that had the necessary bits was in 1996. Twenty-two years ago.

      1. imanidiot Silver badge

        I guess that just goes to indicate his age. Now get off my lawn!

      2. Anonymous Coward

        1996? I doubt that. Probably 1997 or 1998 for you.

        ACPI only got released in December 1996, and the first PCs with ACPI sold in 1997.

        Widespread adoption was only in 1998/99.

        Windows 95 had no ACPI support, Win 98 came with disabled ACPI. Only Linux 2.6 and Windows 2000 and onwards supported ACPI. And those OS even disabled ACPI on pre-2000 hardware, as ACPI v1 was quite buggy.

        https://en.wikipedia.org/wiki/Advanced_Configuration_and_Power_Interface

        1. JBowler

          Huh?

          >1996? I doubt that. Probably 1997 or 1998 for you.

          You know Steve, the Cynic, then, Anonymous Coward?

          >ACPI only got released in December 1996

          Duh..... duh..... Like, someone developed it dude.

          Quoting from Wikipedia just proves you work in a troll farm for putin.

          I don't know Steve, the Cynic, but I do know what I was doing in December 1996 and it certainly wasn't released until some time in 1997.

        2. Steve the Cynic Silver badge

          Hmm. Well, it was around that sort of time, possibly 1997 and certainly NOT 1998, that I ended up with a computer at $JOB that switched itself off from Windows 95. Possibly not full ACPI, but not ordinary pre-ATX either.

          1. StargateSg7 Bronze badge

            My old Compaq 386-series did that and that was late 1987 or so. You could shut the computer itself down completely via software by pushing some values into the x86 registers and calling an interrupt which was NOT part of the MS-DOS or IBM PC-AT BIOS standard INT calls. AND if your terminal display was SCART compatible like ours were (they were basically industrial-grade 20 inch Sony Trinitron TVs used as computer displays with 800x600 pixels of resolution), we could even shut down the monitor from software in 1987! ..SOOOO.....this isn't new technology.

      3. Shart Tank

        I protest! Anything past trumpet winsock is modern!

    2. Anonymous Coward

      "Nowadays the "hold-the-button-in" trick won't work, because of ACPI, especially on a modern PC"

      It'll work fine. You just need the users logged out, and everything shutdown within 4 seconds. A challenge, but perfectly achievable for a true boffin!

      1. Montreal Sean

        It'll work fine

        "It'll work fine. You just need the users logged out, and everything shutdown within 4 seconds. A challenge, but perfectly achievable for a true boffin!"

        Bah! Users should be saving their work every 10 minutes or less.

        If they didn't, too bad for them.

        Ok, I may be a bit of a bastard...

        1. Anonymous Coward

          Re: It'll work fine

          I worked in a college with a CTO who thought like that. On multiple occasions I saw him cheerfully reboot terminal servers, kicking all users out without warning, to 'fix' stuck print queues. On examination days.

          I'd then get lumbered with the job of helping anxious teachers fill out the 'extenuating circumstances' paperwork for the roughly 10% of kids who'd basically been given an instant exam resit. Bafflingly, the college never lost its exam centre status, and the CTO was never disciplined in any way, as all the other bosses seemed to view it as an act of God.

      2. Alan Brown Silver badge

        Nowadays you don't need to hold the button in, unless you REALLY mean to switch it off.

        An accidental press won't shut things down. Unless you set it up that way.

  5. alain williams Silver badge

    Typed 'Reboot' where ... ?

    Telnetted into various Unix machines, wanted to restart the one in the server room. Whoops - I forgot which machine I was logged into and typed 'reboot' to a machine on the other side of the planet. It did not come up, had to wait until teatime for the guys there to come in and push a button :-(

    1. wyatt

      Re: Types 'Halt' where ... ?

      I can hold my hands up to that one as well. SSH to a server then a workstation, 1 letter (c/s) difference between them and I wasn't on the client.. Fortunately it was a reboot rather than a shutdown and it also happened during another major outage so the impact was minimal.

      1. Chris King Silver badge

        Re: Types 'Halt' where ... ?

        I was on the receiving end of that once.

        An academic had moved to another uni, and my opposite number at the new uni was helping him transfer his files from our OpenVMS machine to theirs.

        In another telnet window, the IT bod was logged into his test OpenVMS machine, preparing to test a patch for a nasty little crash bug that anyone with telnet/SSH access could trigger - no extra privileges required.

        Yes, he got the two windows mixed up, and our box dropped dead.

        Boom, crash dump and P00>>> prompt at the system console.

        Fortunately, it happened in the middle of a change window, ironically to install and test the very same patch. Still, it's not really the sort of thing you want to see when logged into SYSTEM at the console and installing patches.

        The other guy phoned me a couple of minutes later and 'fessed up to his mistake.

    2. Aitor 1 Silver badge

      Re: Typed 'Reboot' where ... ?

      A work colleague (admin) lost his job that way many moons ago.

      He thought he was putting my code (well, a version update of the project I led) into the integration environment.. but put it into production, as he had both terminals open, and made the huge error of pulling from command line. I had told him before to put into the server, and execute from the server.. as a friendly suggestion.

      He was lucky in the sense that there were no bugs in the code, so in a sense the systems kept working, unlucky in the sense that this was in the client/server era, so the decision was to push the updated client. 45 minutes down time for 50/100 ppl (don't remember well).

      He lost his job for a single mistake in two years, I am still a bit angry about that.

      1. Evil Auditor Silver badge

        Re: Typed 'Reboot' where ... ?

        He lost his job for a single mistake in two years, I am still a bit angry about that

        Angry that he made this single mistake or angry that he lost his job?

        Depending on the type of business 45 minutes downtime may or may not be reason for dismissal. Apparently, more than 5 hours of downtime for approx. 90% of the staff of about 20k (a bank) was no reason for dismissal. Then again, it wasn't due to an operating error but a management decision to implement a half-baked release.

        1. dotdavid

          Re: Typed 'Reboot' where ... ?

          He lost his job for a single mistake in two years, I am still a bit angry about that

          Indeed, seems a bit of a stupid decision to me, especially as the fired guy is definitely going to be the one person you are certain would never make *that* mistake again.

          1. Cynic_999 Silver badge

            Re: Typed 'Reboot' where ... ?

            "

            Indeed, seems a bit of a stupid decision to me, especially as the fired guy is definitely going to be the one person you are certain would never make *that* mistake again.

            "

            The same might be said of a driver who accidentally hits the accelerator instead of the brake and ploughs into a bus queue. But I guarantee he would lose his licence at the very least, and be lucky if he escaped jail. Usually it is the severity of the act that is punished, but sometimes the consequences of a simple mistake are so severe that they are taken into account as well.

            1. Adrian 4 Silver badge

              Re: Typed 'Reboot' where ... ?

              The punishment isn't for the guy who caused the problem. It's a warning for the rest of you.

              And it's only applied by the sort of manglement that values numbers, not individuals.

            2. Wayland Bronze badge

              Re: Typed 'Reboot' where ... ?

              Cynic_999 "accidentally hits the accelerator instead of the brake"

              It's not the same at all because the driver is using all the controls constantly with no problem. To catastrophically make three errors all at once and persist with those errors until people are run over is nothing like being a bit late on the brake pedal.

              There is a video where a man is chased into a layby onto the pavement by a bus which smashes through the front of a shop. The man managed to escape but the driver claimed he hit the wrong pedal.

              From what I can remember about driving (have not driven since yesterday) you don't hit the pedals with your feet, you gradually press them to cause the amount of acceleration or deceleration you need. You begin doing this in plenty of time and you can press the pedals harder if you need more effect.

              In a rack of servers it's an easy mistake to be looking at the wrong server, hence the little button that lights up so you can figure out which one you want to work on.

              1. Cynic_999 Silver badge

                Re: Typed 'Reboot' where ... ?

                "

                From what I can remember about driving (have not driven since yesterday) you don't hit the pedals with your feet, you gradually press them to cause the amount of acceleration or deceleration you need

                "

                It is easier than you might think.

                Imagine that you are closing slowly with the car in front. So you press gently on the brake but you see that you are still closing with the car in front. So you press a bit harder - and see the gap is now closing *really* fast, so you panic and jam the brake pedal full to the floor. Only later do you realise that your foot had been on the accelerator rather than the brake.

                Or while stopped you start reading a text on the phone in your lap when out of the corner of your eye you suddenly see that your car has started slowly rolling forward because you forgot to set the handbrake. Sudden adrenaline rush and panic, you stamp hard on the brake to stop the car before it rolls into something - except it isn't the brake.

              2. Cynic_999 Silver badge

                Re: Typed 'Reboot' where ... ?

                "

                There is a video where a man is chased into a layby onto the pavement by a bus which smashes through the front of a shop. The man managed to escape but the driver claimed he hit the wrong pedal.

                "

                You really think the driver did it deliberately? You have obviously never reacted in a panic.

        2. I am the liquor

          Re: Typed 'Reboot' where ... ?

          @Evil Auditor

          Depending on the type of business 45 minutes downtime may or may not be reason for dismissal.

          If 45 minutes downtime is that much of a problem, then sacking the tech who caused it by a simple finger fumble is nothing more than scapegoating. More reasonable would be to sack the executive who failed to put in place systems ensuring a simple human error couldn't cause such a serious problem.

          1. Mark 85 Silver badge

            Re: Typed 'Reboot' where ... ?

            More reasonable would be to sack the executive who failed to put in place systems ensuring a simple human error couldn't cause such a serious problem.

            In a perfect world, yes that would be the right thing to do. In the real world, the execs protect each other and everyone else is cannon fodder and/or scapegoats.

          2. Shart Tank

            Re: Typed 'Reboot' where ... ?

            You must be new to this....

          3. Evil Auditor Silver badge

            Re: Typed 'Reboot' where ... ?

            @I am the liquor

            I fully agree with you.

      2. Dave K Silver badge

        Re: Typed 'Reboot' where ... ?

        It's unfortunate if he was otherwise a good admin. After all, if there's one thing you do know afterwards is that you have an admin who isn't going to make that mistake again in a hurry...

    3. Anonymous Coward Silver badge

      Re: Typed 'Reboot' where ... ?

      Boss: "Client z has a problem, so I'm just rebooting server x"

      Me: "OK, so why have I just had a notification that server y has rebooted?"

      Boss: "Oh shit"

    4. Uplink

      Re: Typed 'Reboot' where ... ?

      apt-get install molly-guard

      Then you get asked: "you want to reboot what?"

      1. keithzg

        Re: Typed 'Reboot' where ... ?

        molly-guard has definitely saved me more than once. I don't have it installed on *all* the servers, but I sure as hell do on the servers where it would matter . . .

        (That being said, if things are fragile enough that a clean reboot is a big problem, things are probably too fragile.)
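The guard described in this sub-thread boils down to one check: before rebooting, make the operator type the name of the machine they believe they are on. What follows is a rough sketch of that idea only, not molly-guard's actual code. The names `confirm_host` and `guarded_action` are hypothetical, and the reboot itself is replaced by a caller-supplied function so the snippet is safe to run.

```python
import socket
from typing import Callable, Optional


def confirm_host(typed: str, actual: Optional[str] = None) -> bool:
    """Return True only if `typed` matches this machine's short hostname."""
    if actual is None:
        # Assumption: the local hostname is what the operator should type.
        actual = socket.gethostname()
    short_name = actual.split(".")[0].lower()
    return typed.strip().lower() == short_name


def guarded_action(action: Callable[[], None],
                   ask: Callable[[str], str] = input,
                   actual: Optional[str] = None) -> bool:
    """Run `action` only after the operator names the current host correctly."""
    typed = ask("Type the hostname of the machine to reboot: ")
    if confirm_host(typed, actual):
        action()
        return True
    print("Hostname mismatch, doing nothing.")
    return False
```

Passing the prompt function and the hostname as parameters is purely a testing convenience; a real wrapper would presumably sit in front of the actual reboot command.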

    5. phuzz Silver badge

      Re: Typed 'Reboot' where ... ?

      I have to admit to this one as well.

      Now I check very carefully which machine the prompt is for.

      (I've also been auto-logged out of a machine, and nearly run the command on the machine I was tunnelling from instead.)

      1. Boothy

        Re: Typed 'Reboot' where ... ?

        I used to use PuTTY a lot on Windows, into *NIX boxes; back then we had direct access to boxes. So we just set up each environment with its own custom colour and window title settings. Green, you're on a Dev box, Red - prod, etc. Nice and easy.

        These days we have to go via jump boxes, and are usually on Linux laptops. So it's all basically one shell, same colour for text etc. But someone did tweak all the Red Hat boxes, so if in a live environment, the user and server name all have a red background colour at the prompt. (The name@server: bit).

        Still doesn't help if you're on the wrong prod box, but at least you are less likely to run something not prod friendly.
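The colour-coded prompt described above can be sketched with plain ANSI escape codes. This is only an illustration under assumptions: the environment names, the colour choices, and the helper `colour_prompt` are hypothetical, not the setup used on those Red Hat boxes.

```python
RESET = "\033[0m"

# Assumed mapping: white-on-red for live systems, black-on-green for dev.
STYLES = {
    "prod": "\033[97;41m",
    "dev": "\033[30;42m",
}


def colour_prompt(user: str, host: str, env: str) -> str:
    """Build a 'user@host: ' prompt, highlighted according to the environment."""
    label = f"{user}@{host}"
    style = STYLES.get(env)
    if style is None:
        # Unknown environments get an unstyled prompt.
        return f"{label}: "
    return f"{style}{label}{RESET}: "


if __name__ == "__main__":
    print(colour_prompt("alice", "db01", "prod"))
```

If a string like this were used as a bash `PS1`, the escape sequences would also need wrapping in `\[` and `\]` so the shell counts the prompt width correctly.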

    6. Anonymous Coward

      Re: Typed 'Reboot' where ... ?

      Yeah, I feel your pain on that one.

      I once managed to balls-up a firewall rules tweak on a Linux machine in our new branch office in Sydney, accidentally removing a critical rule and therefore cutting off my own comms from said machine. Muppet.

      Had to wait for local staff to arrive in the office and talk them through restoring remote access via the console.

    7. GrumpenKraut Silver badge

      Re: Typed 'Reboot' where ... ?

      Been there, done that, sadly a "power off" in my case. But a machine I could (and had to) drive to. A Friday evening spoiled.

      Since that day remote access terminals have a color background.

      1. Stevie Silver badge

        Re: color background

        Agree. I have profiles for my terminal software for about a dozen or so foreground/background combinations. If I have something that cannot be the subject of a mistake, it goes in the white on red window.

        The youngsters in my office laugh at this and would rather use other (unapproved) software that either doesn't offer a way to store multiple profiles easily or that they can't be bothered to learn how to use properly.

        One of the bright young things, working in a forest of white on black consoles, restored pages from a test database over our production database and caused a complicated partial outage that lasted a week while we sorted it all out.

        Another Young Genius obliterated a QA cluster under the impression he was working on a dev system.

        Yep. The problem is The Old Guy doesn't "get it".

    8. regadpellagru

      Re: Typed 'Reboot' where ... ?

      "Telnetted into various Unix machines, wanted to restart the one in the server room. Whoops - I forgot which machine I was logged into and typed 'reboot' to a machine on the other side of the planet. It did not come up, had to wait until teatime for the guys there to come in and push a button :-("

      Who hasn't done this one, I wonder. Happened to me as well: wanted to reboot my SUN workstation, so typed "reboot", then I had "end connection" on that very window ...

      Got me quite pale for a moment: I didn't know which system I so rebooted and I was logged to quite a lot !

      Then colleagues told me every workstation had frozen: I was logged to the NIS server, which, fortunately came back 30 s after ...

  6. chivo243 Silver badge

    Little fingers

    Once I was working (playing a game actually) in the office, and my boy (I think he was 4 at the time) comes rolling by, and says "what does this light do, papa?" as he's pushing the power button!! Needless to say, my gaming session was ended, and that PC never seemed right after that incident.

    1. Anonymous Coward Silver badge

      Re: Little fingers

      Similar, but a little girl and the office UPS.

      Suddenly everything went very quiet.

      1. Anonymous Coward

        Re: Little fingers

        I think I can top that one - at security in the airport waiting to get on a plane - my little one accidentally shut off the entire scanning line when he turned off a power bar - had to wait 15 minutes while everything rebooted and then they had to rescan everything....when I left homeland security was looking at the power bar and talking about how to prevent such an incident again...

  7. &rew

    Fast fingers

    I recall for old PCs that used an actual mains voltage power button, if you pressed the power button in, and then really quickly popped the switch out and in again, there was enough smoothing in the power supply to cover the momentary blip. True, I would not be willing to attempt that on a company server, though...

    1. Oz

      Re: Fast fingers

      I have saved myself from a thorough dressing down doing just that back in the mid 90s. I held the power button down on a server to force a power off, realised it was the wrong server, thankfully before letting go again and, after several minutes of holding the button in and deliberating, was able to release and re-press the button before the power dropped out.

    2. Alan Brown Silver badge

      Re: Fast fingers

      "there was enough smoothing in the power supply to cover the momentary blip"

      On home PCs yes. Not so much on servers.

      Of course if it was critical it would have had redundant PSUs that had to be individually switched off.

  8. Tim99 Silver badge

    I guess that beats my post

    From last month: my idiocy was only going to trash my work...

  9. Remy Redert

    Ever since I had a cat induced computer outage when one jumped onto the case and sat on the power button, I've taken to the simple expedient of not connecting any of the buttons on the case, setting the machine to start when the power comes on. The big switch for the power bar is much less sensitive to cat induced failures.

    On a related note, which idiot of a designer decided that buttons should be put on the top of the case, where they're hard to reach if the case is in any kind of enclosure and easy to set off accidentally if they're not?

    1. DuchessofDukeStreet

      Which Idiot of Designer?

      The one who recognised that most office users would end up with a large box sitting beside their legs under their desk - buttons on top are the most accessible from a seated position (assuming you're talking about a vertical unit).

      For horizontal ones, on top still makes sense as it prevents them being knocked accidentally by objects being pushed around the desk surface.

      But also one who doesn't own/is owned by a cat, and doesn't recognise their tendency to jump onto any available (and inconvenient) surface, particularly one that's radiating heat.

      1. graeme leggett

        Re: Which Idiot of Designer?

        I have exactly that sort of machine (a Dell OptiPlex "designed" for office use) sat beside me and occasionally I nudge the power button with my knee. Fortunately this is set to initiate a hibernation rather than shutdown.

        I have experimented with putting some of those flippy lid button covers over the switch - held on with double sided tape due to location at the top corner of the front bezel. Short of dismantling the front and getting busy with glue and screws the fix is far from permanent.

        1. d3vy Silver badge

          Re: Which Idiot of Designer?

          "I have experimented with putting some of those flippy lid button covers over the switch - held on with double sided tape due to location at the top corner of the front bezel. Short of dismantling the front and getting busy with glue and screws the fix is far from permanent"

          Pull the side off and disconnect the button completely.

          Then either buy a replacement button that can be positioned at the back of the PC or set the machine to wake on keyboard so you no longer need a physical button on the case.

          I have mine set to boot on power resume and everything on the desk is plugged into a 5 way surge protector so that when I flick the mains switch everything comes on at once.

        2. Alan Brown Silver badge

          Re: Which Idiot of Designer?

          "Fortunately this is set to initiate a hibernation rather than shutdown."

          Assuming Windows, go into the power settings and change "when the power button is pressed" from whatever it's set to, to "ASK"

          It's not that difficult really - and there are similar settings in most *nixes (even if you have a CLI-only system)

          It won't help you if you have an old style single PSU server with a real power switch on the front, but the "switch" on ATX systems is merely an input device and you can change its functionality.

          Just don't do what someone I know did and swap "power" (big button) with "reset" (needed a pencil to press). Reset means RESET and having a wayward cat hit it is more of a problem than having the power go off.

      2. Prst. V.Jeltz Silver badge

        Re: Which Idiot of Designer?

        The one who recognised that most office users would end up with a large box sitting beside their legs under their desk - buttons on top are the most accessible from a seated position (assuming you're talking about a vertical unit).

        For horizontal ones, on top still makes sense as it prevents them being knocked accidentally for objects being pushed around the desk surface.

        Nah, sorry but all of that is bullshit. Buttons go on the front of things - end of. A user with a tower box under their desk will of course instinctively look on the front of it because that's where buttons go. Yes, it may be *physically* easier to put it on the top, but it's still bloody stupid, cos: can't put anything on top of it, what if there's a shelf above it, people don't look there, just as easy to accidentally push, etc, ad infinitum. (This is why top loading VCRs died out?)

        Your middle paragraph makes little grammatical sense but I think the gist of what you were getting at is covered above.

        Your 3rd paragraph is of course correct, cats will sit on warm things , they will also jump on the desk itself and get between you and your game of Farcry in an effort to get fed. The more fiendish ones will do this by standing on F5, which you have assigned to "Load last saved game" :(

        1. Prst. V.Jeltz Silver badge

          Re: Which Idiot of Designer?

          The power button on my home box, apart from being in prime position to get toed when resting a foot on the shelf it's on, has become a bit sticky and will tend to stick in when used, which causes a kind of hernia / stroke in the BIOS. It takes a skilled touch to use it now - I'm not looking forward to having to explain that to someone over the phone in some sort of emergency.

    2. John Stirling

      @automatic power on

      ...I've taken to the simple expedient of not connecting any of the buttons on the case, setting the machine to start when the power comes on. The big switch for the power bar is much less sensitive to cat induced failures....

      I used to do that, until the local power company decided to have an outage, which came back on 1 minute and 45 seconds later, and then went off again at the 2 minute mark, before repeating. For 26 hours over the weekend.

      Which taught me a couple of things;

      1) think hard before enabling auto on after power outage;

      2) always use UPS on anything you care about.

      3) Fridges also benefit from UPS.

      Surprisingly a large percentage of the dozen or so PCs survived that little incident, although a number did not - and the Fridge needed a new motherboard!

      1. Alan Brown Silver badge

        Re: @automatic power on

        "Which taught me a couple of things;"

        Due to many such episodes, $orkplace has trips on all the server room power to ensure that if the power goes off, it STAYS off until manually reset. There are similar setups on all the AC systems. You have to manually power up.

        In the old days I would have put any critical (must be up) systems on a startup timer of 5 minutes or so to ensure the power was stable before booting (that includes UPS inputs, I've seen a couple fried by dirty power when it was restored)

        Whilst you can do this using bios delay timers it's not ideal in a lot of cases (drives don't like being spun up/down repeatedly) and there are smart distribution panel controllers around these days which take it a few steps further, with things like a selectable startup delay coupled with longer lockouts if they detect several power failures in a row.
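
        The "startup delay plus longer lockouts" idea can be sketched roughly as below. This is only an illustration, not how any particular distribution panel controller works: the function name, the one-hour window and the 15-minute lockout step are all assumptions, and only the 5-minute base delay comes from the post.

```python
def startup_delay(failure_times, now, base_delay=300, window=3600, lockout_step=900):
    """Return how many seconds to wait after power returns before booting.

    failure_times: timestamps (in seconds) of recent power failures.
    base_delay:    the "5 minutes or so" settle time (from the post).
    window, lockout_step: assumed values, chosen only for illustration.
    """
    # Only failures inside the recent window count towards a longer lockout.
    recent = [t for t in failure_times if now - t <= window]
    # The first failure gets the base delay; each repeat adds a lockout step.
    repeats = max(0, len(recent) - 1)
    return base_delay + lockout_step * repeats
```

        With these assumed numbers, a single failure waits 5 minutes, while three failures in quick succession hold the power off for 35 minutes.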

  10. Bob Wheeler
    Facepalm

    Repetitive work on multiple servers

    I was working on a 16-node Novell Cluster, updating drivers. A process that had been done many times and non invasive and with no loss of service so deemed by management as safe to do in working hours.

    The process was simple, take a node out of the cluster - “CLUSTER LEAVE”, copy the new device drivers and then reboot that node - “SERVER DOWN”, wait for it to start up and re-join the cluster, and move onto the next node.

    By about the 14th or 15th node, after typing the same commands time after time, instead of typing “CLUSTER LEAVE” to take the node out of the cluster, I typed “CLUSTER DOWN”.

    It should be noted that Novell does NOT ask “Are you sure?” when you type such a command, and it does what the command suggests it does – instantly. All users, potentially some 4,500 of them suddenly lost their file shares, email, printing, internet access – the works.

    My only saving grace was it was late afternoon on a Friday so there was not that many users actually affected.
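
    A minimal sketch of the "Are you sure?" prompt that was missing, assuming a hypothetical wrapper sitting in front of the console. The command names are taken from the post; `guarded_run` and the retype-to-confirm rule are illustrative assumptions, not anything Novell provided.

```python
# Commands treated as cluster-wide and destructive (assumed list, for illustration).
DESTRUCTIVE = {"CLUSTER DOWN"}


def normalize(command):
    """Upper-case the command and collapse repeated whitespace."""
    return " ".join(command.upper().split())


def guarded_run(command, typed_confirmation, run):
    """Pass command to run(), unless it is destructive and was not retyped.

    Returns True if the command was executed, False if it was refused.
    """
    cmd = normalize(command)
    if cmd in DESTRUCTIVE and normalize(typed_confirmation) != cmd:
        return False  # refuse: the operator has not retyped the command
    run(cmd)
    return True
```

    The routine command ("CLUSTER LEAVE") goes straight through, so the fifteenth repetition costs nothing extra; only the slip to "CLUSTER DOWN" is stopped.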

    1. Anonymous Coward

      Re: Repetitive work on multiple servers

      wow , that would have qualified for a "who me?" article ! I think i read they are running short - send em in folks!

    2. Alan Brown Silver badge

      Re: Repetitive work on multiple servers

      "All users, potentially some 4,500 of them suddenly lost their file shares, email, printing, internet access – the works. My only saving grace was it was late afternoon on a Friday so there was not that many users actually affected."

      We have a policy of warning users when work is happening. They're a lot more forgiving if they've been given a heads-up

  11. JeffyPoooh Silver badge
    Pint

    What about Power Failures?

    "UPS" you scream.

    No, I'm referring to the power failure caused by the UPS catching fire, ...again.

    A well designed database would have journaling at the transaction layer, and more journaling again at the FS level. Oh, sorry. SAP.

    My buddy runs the IT for a company. He tells me that the server can have its power cord yanked out, and the backup server in his basement at home will complete the transactions, transparent to the users. They run in parallel and he's done something clever at the networking level.

    1. Yet Another Anonymous coward Silver badge

      Re: What about Power Failures?

      Compaq used to run an ad 20+ years ago of a cluster where you destroy one server (shotgun, drop a safe on it, wrecking ball etc) and the system keeps going

      1. Anonymous Coward

        Re: What about Power Failures?

        cluster where you destroy one server ... and the system keeps going

        HP Non-Stop. Check out the price, then come back after you've recovered.

        BTW us in telecoms have had active / active standby for a very long time, it's how we roll.

        Upgrades? No problem, upgrade "non-live", flip, upgrade old live.

        100% of calls and systems still live.

        1. imanidiot Silver badge

          Re: What about Power Failures?

          And then comes a Who, Me? with the basic storyline of

          Upgrades? No problem, Upgrade "non-live", goes wrong but don't notice, flip, upgrade old live, goes wrong and all hell breaks loose...

        2. Alan Brown Silver badge

          Re: What about Power Failures?

          "BTW us in telecoms have had active / active standby for a very long time, its how we roll"

          Which works really well, until it doesn't.

          At which point you may discover that whilst the running systems were ok, what's in the configuration (and has been backed up to tape for the last 2 years) is scrambled. So if you reboot one controller after the other when applying your y2k fixes, you find your NEAX-61E has forgotten that it's a telephone exchange - and that after spending 2 days finding a working backup (3 years old), you then have to replay every update made from that point - which takes 6 weeks - and means that a large number of your customers can't be sure from day to day what their phone number might be - or even if they'll have dialtone.

          Yes, it happened.

  12. OzBob

    Came close myself just today

    What bright spark decided to allow keypresses on VSphere Client to perform menu actions? So if you don't properly focus on the console, you can type away and get prompted for "do you want to shutdown"? Fortunately I looked up and saw that before I got too far, but it was close.

  13. ysgubor anhysbys

    database reboot

    Our sys admin was doing some maintenance on a replicated database. He had stopped the slave and made the necessary changes and then hit the power button to do a hard reset... unfortunately, the power button belonged to a different server - the live database master. Somehow we got lucky and our 3TB of data survived.

  14. Anonymous Coward

    Probably my fault for being unclear

    I used to manage "the UK's Most Dubious Beowulf Cluster", 80-some Pentium 4s running a scheduling job on each of them that waited for a text file to tell them what simulations to run. Not the world's most brilliant solution (especially since they used a regular user's account), but it worked well enough.

    One day, I was having trouble with my email, probably because Outlook Exchange was a delight back in the day, and our Scottish helpdesk were very helpful, doing all the things they needed to do to fix it until, without warning, they said "Right, your new password is...".

    While I was logged in to 80-odd Pentium 4s that suddenly had outdated credentials and thus, no LAN access. Cue me and a room full of KVM switches, re-logging dozens of machines and restarting failed simulations.

    On the plus side they did fix my email.

    1. Korev Silver badge

      Re: Probably my fault for being unclear

      Which uni was this?

    2. HPCJohn

      Re: Probably my fault for being unclear

      Talking about Beowulf clusters.... Several years ago I was at a customer site in a big UK company which may or may not build jet engines.

      Stood at the console of said machine, I wanted to reboot one of the servers in the cluster. I was telnetted into one of the servers in the cluster and wanted to reboot it. I go ahead and press the Vulcan Death Grip - ctrl-alt-del. Only the whole shooting match went down, not the server I was logged into. Cue red face from me. But they were very good about it.

  15. Greg Stovall

    Silence is NOT golden...

    Back in the 80s, I was on a coop term at a major telecommunications manufacturer. My assignment for the summer was to port a wire wrapping program from a DG Eclipse to an HP 3000. It was a very enjoyable exercise writing a converter from RATFOR to Fortran 77.

    The factory floor was quite a noisy place with all the manufacturing equipment. Since I was new to the HP 3000, I spent a little time exploring. Discovered that as administrator, I could actually poke any memory location directly. I experimented with this...then noticed it was quiet --- too quiet. Panic filled my soul when I realized that the HP3000 I was poking on was the same one that ran all the manufacturing equipment -- and I had crashed it in the middle of the work day.

    I learned NOT TO POKE memory on the HP 3000...

  16. Anonymous Custard Silver badge
    Mushroom

    The hardware version...

    I take your server shutdowns and offer you a colleague doing it on a semiconductor manufacturing machine (of course in the middle of running 150 production wafers). Needed to power down machine A in a bank of them to work on it, so goes around the back and accidentally hits the power button on machine B beside it. Bye-bye 150 product wafers towards the end of their production flow, in all worth many thousands of dollars.

    We are now strictly verboten from even touching any machine which doesn't have clear ID labelling (customer responsibility to add those, the ones above didn't) and even then we have to point and say plus buddy-check. This is not to say that it hasn't happened since these measures were introduced of course, given some of my colleagues and the old adage about idiot-proofing...

    1. Anonymous Coward

      Re: The hardware version...

      I know that feeling...

      I work for a supplier to a semicon lithography systems maker. They use 4-digit (hex) machine identifiers. They're not sequential but can be quite similar, and a typo is easy to make when remotely accessing a system. I may or may not have shut down the wrong system for service at some point... Luckily this was at the manufacturer's fab and not a field system though. Working on field systems always makes me nervous given the dollar amounts involved.

      From experience it's also not easy to explain to a customer's line manager at 9pm that you broke his system some more instead of fixing it like you were supposed to.

    2. HPCJohn

      Re: The hardware version...

      Do you work at ASML?

      1. Anonymous Coward

        Re: The hardware version...

        Not ASML itself, we supply several of the important sub-modules for several generations of systems. Including the new EUV systems. Pretty interesting stuff.

  17. Rufus McDufus

    Did this myself

    DEC AlphaServer, also around 1999, working for well-known internet-based retailer. Went to power cycle some server, accidentally pressed power button on adjacent server (probably a rather critical NFS server). Boss came in after 10 minutes and laughed at me stuck there.

    1. Anonymous Coward

      And me!

      Dec Alpha workstation, while visiting a physics department in Oregon in the early 90's. Having finished up late while running some simulations, I confidently reach down to the power bar and remove the power brick for my portable CD player .... and the workstation suddenly goes off.

      I'd also nudged the adjacent switch at the same time. Oops.

  18. Rufus McDufus

    Emergency power off

    First job working in the comp sci department at a well-known technology-focused university in London. Annually we'd show prospective students around the facilities including the server rooms. There were big red emergency power-off buttons in various places. A particularly tall budding student decides to lean back against the wall and... These were the days of IBM 4331s, various DEC servers, a big ICL mainframe and others. Generally things didn't tend to work well after a sudden power-off.

  19. Anonymous Coward

    April 01

    haha

  20. Anonymous Coward

    Toggle power switch

    This story is strikingly similar to an anecdote from colleagues at a previous job, when one of the ops guys went to power off a server and was informed as he pressed the switch that it was the wrong machine. Although in roughly the same time period, I'm certain it's not the same incident because our site never ran SAP.

    The box in question was running end-of-day batch processing, so could not be allowed to power off otherwise carnage would be caused.

    Unfortunately the recessed nature of the switch meant that nothing could be jammed in to replace his finger without also releasing the button at the same time - so he was forced to stand there in the comms room for the next two hours or so, with the end of his finger going blue, waiting for the batches to finish so that the machine could be gracefully shut down.

    In a separate incident, another colleague at the same site had apparently stepped into a hole in the floor of the comms room where a tile had been removed ('elfin safety??) - reached out instinctively to stop his fall, but found he'd hit the emergency power-off button on the side of the AS/400 ... oops!

  21. AndersBreiner

    The Big Red Button

    I was once working on a web application, back in the .com boom. We had a production server which was heinously unstable. We'd test on our dev server for a week and then send stuff over to the production one. The production one was administered by another company and we'd call them and tell them how to do stuff. Either this was before the days of VPN or they didn't want to allow that, for reasons that will become clear.

    Anyhow I was in an interminable call with them.

    "Ok, you've got the files unzipped"

    "Yep"

    "Right click on the .reg file and add it to the registry"

    "It crashed"

    "What do you mean crashed? Did you add it?"

    "No it crashed when I right clicked"

    "How did it crash?"

    "It said explorer.exe performed an illegal operation"

    "Well that's odd, isn't it. Try to open up this folder"

    "It crashed again"

    "Ok press the Windows keys and R and type"

    "Crashed again"

    "Let's try a restart"

    At this point I hear a loud clunk, then a pause, then another loud clunk

    "What was that?"

    "We're restarting"

    "Don't you do that through the start menu?"

    "No, it always hangs when we do that, so we just use the big red button"

    And then I worked out why nothing new ever worked in production and why things that used to work stopped - the server was so addled at this point that it couldn't reboot without someone power cycling it. It'd probably gone through hundreds or thousands of hard power cycles. This was NT so it was somewhat robust but you did lose data on a hard crash - any files that were open for writing would be corrupted and sooner or later you corrupted something vital.

  22. Joseph Haig

    Going Dutch?

    Is this what the Little Dutch Boy is doing now?

    1. ssharwood

      Re: Going Dutch?

      Yeah I did think of that but couldn't remember which dike would get me in trouble

  23. wallyhall

    I remember doing that once

    It wasn't on a server though - just when we were kids at secondary school. Guy sat next to me thought it'd be funny to "hold my work to ransom" by pressing and holding the power button on my PC. I quickly pressed my finger onto the button next to his, and discovered that if you release and press it again *really* quickly, the charge in the PSU survives the very brief outage without the machine turning off. :-)

  24. EddieD

    Me, that's who...

    Many years ago I went to the small server room we had (this is the late 90s), and there was a KVM that connected to all the machines.

    I pressed the relevant button for the NT server I administered, and, as I always did, hit <ctrl><alt><del> to give me a login prompt, as the monitor was waking up. Unfortunately, the KVM had been "rationalised" by my boss, who hadn't updated the switch labels, and I'd just connected to a Linux server console session, which immediately shut down.

    The team that were using it for data analysis were a tad miffed.

    1. Trixr Bronze badge

      Re: Me, that's who...

      Belatedly, but whoever made the decision that Linux should simply restart with no prompt if CTRL-ALT-DEL is done in a console session should be shot. It's not as if Windows hadn't been around a fair while when Linux actually became more than a bare kernel.

  25. Jay 2

    My first proper job, I was on the console of a test server (running UNIX) which I needed to reboot. So I typed in the shutdown command, pressed return and then wondered why instead of seeing the running commentary of the server shutting its services down I was faced with a disconnected telnet session message and the prompt of the test server...

    ...I then realised one of my colleagues had rather stupidly logged into the prod server on the test server's console. So I'd just rebooted the prod server, as stupidly for me I hadn't checked which server I was typing the command on. I got away with it as we were a bit of a law unto ourselves; it was a stunningly good education in how not to run a data centre. Once I moved elsewhere I realised things were very different!
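
    The missing check ("which server am I actually typing on?") can be sketched in a few lines. This is an illustration under assumptions: the helper name and the host names are hypothetical, and it simply compares the short hostname reported by the session against the one you meant to reboot.

```python
import socket


def confirm_target(expected_host, actual_host=None):
    """Return True only if this session is on the host we intend to reboot.

    actual_host defaults to the name reported by the local machine;
    it can be passed in explicitly, which also makes the check testable.
    """
    actual = actual_host if actual_host is not None else socket.gethostname()
    # Compare short names so "test01" matches "test01.example.com".
    return actual.split(".")[0].lower() == expected_host.split(".")[0].lower()
```

    A wrapper around the shutdown command would call `confirm_target("test01")` (hypothetical name) first and refuse to continue on a mismatch, which is exactly the case of a forgotten telnet session to prod.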

  26. sisk Silver badge

    getting approval for emergency downtime.

    Isn't emergency downtime something you normally ask forgiveness for rather than permission?

    1. I ain't Spartacus Gold badge
      Happy

      I suspect they were foolish enough to be honest. "Person has made error is holding button we must reboot." So people act slowly. And argue.

      If they'd said, [clickety excuse-o-matic clickety] "NASA has reported an incoming solar flare - we expect to lose computer performance in 8 minutes emergency shutdown and reboot to solar-wind hardened crisis mode." Maybe they'd have had more luck.

      Or they could have just said they had to reboot to reverse the polarity of the neutron flux...

      1. Shart Tank

        "Forced windows update"

      2. Anonymous Coward

        "Jeremy" here.

        To be honest: I have no clue, what our service manager told our customer.

        I just recall me running out of the DC and through the building to alarm our team lead and the service manager.

        No arguing, within seconds people dropped everything they were working on and started sending out emergency alerts. It even went out via the PA!

        (That only happened once after that: We got the "all clear" after a bomb threat, due to an American VIP visiting us to present her biography)

        I tend to think the arguing and discussing futilities started a few years later, where everybody started "managing" instead of working.

  27. Anonymous Coward

    Wrong server

    Back in the days of IBM SP/2s, each shelf in the frame could take one 'wide' node, or two 'thin' nodes.

    The numbering of the nodes was such that if only wide nodes were installed, only the odd-numbered nodes (1, 3, 5 etc) were present. You only had even nodes if there were thin nodes present.

    My team leader at the time remotely shut down the OS on node 3, and then went to physically power off the node. Starting at the bottom, he counted up the shelves one, two, three, click... and powered off node 5 (because of the missed numbers, node 3 was on the second shelf up not the third).

    As this was a commercial bank, he had turned off the main trading server for one of the main trading rooms. He kept his job, because at the time the applications occasionally took the systems out due to paging issues, so he passed it off as an instance of that.

    Anon, to protect the guilty.
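
    The numbering trap is easy to see as arithmetic. The sketch below is inferred from the description above only (two node numbers reserved per shelf, shelves counted from 1 at the bottom); the function names are made up for illustration.

```python
def shelf_for_node(node):
    """Shelf (counted from 1 at the bottom) that holds a given node number.

    Each shelf reserves two numbers: an odd one (wide node, or first
    thin node) and the following even one (second thin node).
    """
    return (node + 1) // 2


def wide_node_on_shelf(shelf):
    """Node number of the wide node occupying a given shelf."""
    return 2 * shelf - 1
```

    Node 3 lives on shelf 2, and counting up to the third shelf lands on node 5, which is the mistake described.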

    1. Alan Brown Silver badge

      Re: Wrong server

      "Starting at the bottom, he counted up the shelves one, two, three, click... and powered off node 5 "

      THIS is why I insist on labelling Front and _REAR_ of systems along with their power and network cables (at both ends).

      Seriously, if you think you (or anyone else) might be stumbling around an unfamiliar rack, then it's worth spending the time to make sure everything's labelled.

      I'd rather have the place looking like the 1960s Batcave than have people not knowing what they just reset.

  28. Paul Hovnanian Silver badge

    Was his name ...

    ... Hans Brinker?

  29. Anonymous Coward

    Many moons ago a colleague phoned a small Scottish school to diagnose a problem with their dial-up modem link. Nothing obvious, so he got the woman to try a 3-pin reset... "turn the power off, count 5, then turn it back on again". She put the phone down while he waited on the phone. He suddenly went white as a sheet and swore as he heard in the distance "well the man told me to do it"... yes folks, she had gone to the consumer unit and turned off the power to the whole building... luckily it was a one-room school house and only the lights and a single PC were affected (PC just rebooted)

  30. 2Nick3 Bronze badge

    You can do this with NIC settings as well

    Had a customer complaining that the admin (and only user-reachable) interface NIC in our appliance had autonegotiated at 100Mb, where it was on a 1Gb switch. He was making a big fuss about it, getting the account team involved (they were negotiating another purchase), being a bit of a pompous arse overall. I told him if it had dropped the speed there was likely a real reason for it (the call-homes showed it changed outside of a reboot), and we should look at the logs on the system, and probably the switch, to figure that out before he did anything.

    Like hard-code the speed to 1Gb. Which he did.

    I had to send a tech onsite to console into the box to reset the NIC to autonegotiate. The next day the port on the switch was replaced and the interface went back to 1Gb.

  31. vincent himpe

    Cleanroom suits and power breakers...

    Picture this. A cleanroom where integrated circuits are made. A massive multimillion dollar ion implanter. High voltage, deep vacuum, ion beams, cryopumps. Magnet power supplies feeding 3000 amperes...

    All hanging off a three phase lever switch mounted on the wall. One of those big 'clunk' type rotary levers that are gas-spring operated to shoot the contacts open.

    Plant and facilities is called for a small water leak in the service area. The tech goes in and looks at the leak and gets ready to put a small pan underneath while he goes out to get a new piece of teflon tubing to replace it. Before crouching down he adjusts his cleanroom bunny suit (those are uncomfortable if you have to bend over or kneel down). While doing so his belt snags on the big power breaker handle.

    As he kneels down he feels the snag but it is too late. Ka-lunk: the whole machine goes dark

    Vacuum isolation valves lose control pressure and pop open. The 6 meter long beamline sucks in air, pulverizing the poor wafer that sat in the interlock. Ion gauges blow their filaments when exposed to the inrushing air. The cryopumps lose vacuum and immediately freeze over, shattering the traps.

    The tech, scared witless by all the banging and clonking, turns around and does the unthinkable...

    He grabs the big lever of the switch and re-engages power to the machine...

    ...

    It took 2 months to overhaul the machine and get it back up and running.

  32. Anonymous Coward

    Big cooling tower cluster burns down, hundreds of servers shut down by hand

    At a TOP2000 company in 1999. Three on-site datacenters (one main, two backup) were powered by a big on-site oil power plant, and additionally connected to the power grid with two (redundant) power lines. The data centers had a cooling tower cluster building nearby. The cooling tower ran out of water and the rotating parts in it caused the towers to catch fire. To aid the on-site firefighters, the internal power plant had to be shut down. They basically shut off all electricity on site, but completely forgot about the data centers. The data centers automatically fell back to battery power that would last for 15 to 20 minutes. The whole complex got evacuated; the admins refused to obey the order and stayed in to shut down hundreds of servers one by one, to prevent serious problems with SAP R/2, SAP R/3 clusters and Oracle databases. The cooling tower building burned down to the ground, and with it many cars in the car park nearby.

    Unfortunately, they learned little from the incident. They rebuilt a carbon copy of the cooling tower cluster building, and the admins kept the non-automatic monkey patching method. 15 years on, almost the same incident happened again. This time the power lines to the grid got overloaded, the lines burned through, the power plant turned off automatically, and the data centers switched to battery. The on-site telephone system was now Cisco IP phones, and had no backup battery. So no phones. The cell phone tower was on-site, powered by the same power line, so no cell phone coverage either. So the only communication method was a few analog walkie-talkies belonging to the firefighters. Admins had to shut down servers by hand one by one again, this time in even more of a hurry, as the backup batteries would only last for 15 minutes and lots more servers and virtual servers had been added in the meantime. Obviously, this time only some servers were gracefully shut down, including what the admins thought was important (SAP) but not Oracle, Microsoft, Cisco, etc. Ups.

    Have they changed a thing? Probably not.

  33. cutterman

    Been there, got the T-shirt…

    Replaced the press-button switches with switches that need you to insert & turn a key.

    All fine until you lose the keys…

    Mac :-(

  34. Anonymous Coward

    Negotiation tactic

    Had a similar event back in the late 90s. Worked for a small computer/networking shop that handled IT for larger companies. One had a fleet of servers (mostly Novell, but a few SCO and other odds and ends). Their "rack" was a large multi-shelf workstation with multiple KVMs that was loaded up with tower style PC cases for their servers. This rack had grown over the years and developed a massive ratsnest of power cables, phone cables, CAT5 and even a smattering of Twinax at the back.

    We decided to pull an all-nighter to re-cable the entire rack. Redo network wiring with better lengths, tidy everything up, label all the wires, etc. Around 3AM I was in the back of the rack and I heard an "oh shit" followed by a long pause at the front, followed by a request to come over to that side of the rack. My boss was standing there with a finger on the (AT style) power button of a rather critical Novell server.

    We were working out of production/business hours, so downtime was fine, but some of those machines didn't take kindly to losing power abruptly. Our process up to that point had been to down the server, then power down. He was doing that, but, being a bit punchy at that time of day, pressed the button on the wrong server. Worse, the KVM was out of his reach.

    He asked me to down the server for him... I hinted that I might need to go get a coffee and take a break first. If I were smarter, I would have renegotiated my hourly wage at that point instead!

    Anon, cause we got through it (among other incidents) without the customer suffering at all. No harm, no foul.

  35. TheSkunkyMonk

    If it was me and I had requested a short break from my colleague and was told no, I would have slipped. Sorry, what's the point of being a team if there is no teamwork to speak of... and punishing people for a mistake is just mean... Never read Moby Dick.

  36. Clunking Fist Bronze badge

    My Cooler Master Silencio case has a reset switch on the top. Colleagues would come over for a natter, and my machine would begin to shut down. They would rest their elbow on the edge of the case, right on the reset button.

  37. The Oncoming Scorn Silver badge
    FAIL

    On a (Happier) Contract A Few Years Ago

    Already told the story of....

    the offshore support insisted on the former plant Sysadmin hitting the plants BRB, pictures were sent to the remote guy via email, he confirmed that was the button he wanted to be pushed & goodness gracious me it was going to be pushed. He was advised again of what it was that would be pushed & the consequences, the plant manager dutifully informed of what was required, what the offshore wanted & what would be the fallout.& so it came to pass that the BRB was pushed on the word of the Technically Competent Support representative (Paraphrasing here........)"Goodness gracious me, Why s your plant disappearing from network?""

    The other was the building shutdown for the energy traders (same company just in downtown Calgary). I was advised from on-high that there was no need for me to be on site to do anything (Mistake number one), that anything that was needed like putting the very very large UPS into "Passthrough" before they did a controlled shutdown could be done (but wasn't) by the site contact.

    3am messages from the Sysadmin team in India, left on my desk phone at my normal work location, didn't get to me until Monday morning, for some strange reason of me not sleeping at my desk on a Saturday night (as opposed to my normal Monday - Friday mid-afternoon stupor, no booze involved).

    On arrival at the site Sunday morning & ringing into the bridge as the sound of silence from the servers was deafening, the building power was on, but the surge had tripped the UPS breakers & each battery had to be checked by the third party techs, before we got the go ahead to bring up the UPS & then bring everything else back up.

    Still, on Sunday double time, my minimum 4 hours turned into 6 hours or so, which wasn't too bad.

  38. aldolo

    installed os/2 2.0 with a finger on the power button

    22 disks, and 1 finger in the wrong place.

  39. Black Betty

    RECOVER C:\*.*

    Having zero knowledge of FAT disk structure, I managed to use a copy of Norton Disk Doctor to find the directory entries on the disk and partially reverse engineer how files were stored. Enough to find the lost sub-directories but not properly suss out how deleted files were actually marked.

    So I rebuilt the root directory by hand, and hand-patched the rest of the disk back onto it the hard way. Come the Monday this APPLE ][ fondler lashed out on a copy of Understanding MS-DOS and discovered just how much easier recovery could have been. But by then the machine was already back in service.

  40. Murray C

    Booby-trapped Cisco

    Can remember a few near misses with remote work on Cisco branch office routers - if we were changing say, the crypto-maps on the WAN interface, we would typically do a 'reload in 10' command at the console to reload the router with the last saved config in 10 mins time if we somehow managed to bork the change & lose the link.

    ...just don't get distracted & forget the 'reload cancel' command afterwards if all goes well!

  41. nwillc

    Back in the day

    I was giving a new admin a tour of our "state of the art" machine room. There was a red button about the size of a cantaloupe marked only FPO. They asked what that was, and I jokingly said "I don't know, hit it." They did.... and the room... yes the entire room proceeded to "Full Power Off". Even the UPSes... they asked "What do we do now?" I could only offer "run".

  42. Anonymous Coward

    sysprep wrong server

    My biggest cock-up was when I was remotely logged into a hypervisor and setting up a virtual machine, but I was inadvertently running the commands against the hypervisor instead of the VM, and took the entire hypervisor and all the VMs offline as a result. The worst thing was it was a SYSPREP command I was running.

  43. Goit
    FAIL

    My second day working as a Systems Admin for a large law firm back in the mid-2000s, I was asked to take out a couple of servers that had been decommissioned and free up the rack space for a SAN that was going in.

    4 power cables, going into two servers. I walked around the back of the rack and yanked all the cables out. Except... I had gone to the wrong rack and yanked the cables out of both of the Exchange cluster servers, bringing down the mail system for about an hour >.<

    It was so tidy and symmetrical it looked identical to the rack I was supposed to take the servers out of... Surprisingly they kept me on for a third day :D

  44. Anonymous Coward

    Workstation Power buttons

    Workstations -- real workstations, that is -- typically had power buttons on the front. Press it and the box gracefully shuts down. I did this by accident once. No damage, of course, but a lot of lost time. Interestingly, the later releases of the OSes put up a box asking if you really wanted to shut down.

  45. bexley

    Label your servers!

    This is one of the reasons why you have to label servers but almost nobody who works in corporate IT ever does.

    Label the damn things and then you can tell what it is at a glance.

    1. HPCJohn

      Re: Label your servers!

      I totally agree.

      Just flagging up - Dymo labels from office-type label guns are useless and will dry up and flake off.

      You need proper labels on cables and servers.

      Server manufacturers - listen up. Put a transparent plastic fixture on the front. 1cm x 5cm

      The user can slip a printed label behind this.

      The original Sun-designed pizza boxes had lights-out controllers which had a dot matrix display.

      Designed so you could display the server name on them.

      I installed racks of them at Nottingham Uni. I never did have the guts to print out rude words on the displays.

      1. imanidiot Silver badge

        Re: Label your servers!

        From experience there is a HUGE difference between the cheapy Dymo labels from the "around the office" handheld label printers and the proper Dymo cable labels. The proper ones will last for eons but require the proper desktop printer.

      2. Alan Brown Silver badge

        Re: Label your servers!

        "Dymo labels from office type label guns are useless and will dry up and flake off"

        Laminated type labels work well, stick like shit to a blanket and stay put for decades. There are even anti-tamper types, ultra sticky ones for hot areas and ultraflexible types for putting on cables.

        At an installed cost of around 6p each they're cheap insurance

        Brother and Dymo both make them (I prefer the Brothers) and there are a few others floating around.

Biting the hand that feeds IT © 1998–2019