10 things you need to avoid SNAFUs in your data centre

Despite my apparently youthful good looks, I've been in the IT industry since 1989. Which means I've been around the block a bit, and have learned rather a lot of lessons – some of them the hard way. To avoid you having to find them out yourself, here are ten to be going on with. 1. Always carry a torch in your laptop bag …

  1. fruitoftheloon
    Thumb Up

    my ha'penny

    Very helpful indeed!

    My fave relevant war story relates to 2001ish when I was a PM responsible for upgrading a region of HMRC (the UK version of the IRS) desktops to the latest whizz bang version etc.

    Every morning I would lug my coffee machine into the relevant office and would dish out the (usually) still warm pastries to my crew that were swapping out the kit, it helped to get them in at f'ing early o'clock EVERY DAY.

    At a regional progress review meeting a senior HMRC manager had a major strop about public money being used to provide breakfast for contractors.

    I pointed out that it was my machine, and I paid for the coffee and pastries out of my own pocket...

    The look on her face as most of her colleagues laughed at her was a joy to behold.

    Cheers,

    jay

    1. AMBxx Silver badge
      Windows

      >>desktops to the latest whizz bang version etc

      I don't suppose they've changed anything since!

    2. Fatman
      Joke

      Re: my ha'penny

      At a regional progress review meeting a senior HMRC manager had a major strop about public money being used to provide breakfast for contractors.

      Was that mangler named Ima. Grouche, by any chance?

      Does she bear any resemblance to Rosa Klebb??? (http://www.imdb.com/name/nm0502322/?ref_=tt_cl_t4)

      1. fruitoftheloon
        Thumb Up

        @Mangler: Re: my ha'penny

        Mangler,

        I don't recall exactly but she lost an argy-bargy with a whole forest of ugly-trees..

        Cheers,

        Jay

  2. Velv

    #11. Baseline before you make changes. Test it's working before you change it. Reboot, restart, power off/on or similar. You don't want the blame for a failed update because someone else has been messing with it before you.
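
    One way to make the baseline concrete, as a rough sketch only: run the same health checks before and after the change, then separate what was already broken from what broke on your watch. The check names and the `classify_failures` helper are made up for illustration; nothing here comes from a real tool.

```python
def classify_failures(before: dict[str, bool], after: dict[str, bool]) -> dict[str, list[str]]:
    """Split failing checks into 'already broken' and 'broke during the change'.

    `before` and `after` map a check name to True (passing) or False (failing).
    """
    # Checks that were failing before anything was touched.
    pre_existing = sorted(name for name, ok in before.items() if not ok)
    # Checks that passed in the baseline but fail after the change.
    regressions = sorted(
        name for name, ok in after.items() if not ok and before.get(name, False)
    )
    return {"pre_existing": pre_existing, "regressions": regressions}


if __name__ == "__main__":
    baseline = {"ping": True, "dns": False, "web": True}
    post_change = {"ping": True, "dns": False, "web": False}
    print(classify_failures(baseline, post_change))
```

    Anything in the `pre_existing` list is evidence that the fault predates your update.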

  3. Anonymous Coward
    Happy

    Dual Port NIC'd

    What is this, the year 2000?

    3, or more please.

    1. Anonymous Coward
      Anonymous Coward

      Re: Dual Port NIC'd

      Maybe but more than one physical card please, because if the only physical card you have takes a dump it doesn't matter how many ports it's got, it's off the network.

    2. TheNeonSpirit

      Re: Dual Port NIC'd

      Redundant interfaces on a server = Yes.

      Massively over engineering connectivity = NO!!

      I have lost count of the number of times someone has said "I need to connect this server into the switch" to which the reply is "which switch", ohh you mean the one that was specced for 6 servers and currently has 8 with no spare capacity.

  4. Anonymous Coward
    Anonymous Coward

    Or maybe you've never done the task on 1 March in a leap year before

    Oh, how true.

    Back in the 80s the old BT X.25 packet switching network fell over the first time it encountered February 29th. Took the UK cheque processing system with it.

    My wife told me that she once filed a bug about new software that didn't work on Wednesdays. The developers laughed at her, until she demonstrated. After investigation they shamefacedly admitted that Wednesday is the only day of the week with 9 letters, and a buffer somewhere only had room for 8.
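
    Both stories come down to edge cases that a quick test would have caught. A minimal sketch, assuming an 8-character field like the one in the Wednesday story; the `FIELD_WIDTH` constant and both helper names are hypothetical:

```python
import calendar
from datetime import date

# English day names spelled out literally so the result does not depend on locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Assumed buffer size, taken from the story above.
FIELD_WIDTH = 8


def names_too_long(width: int = FIELD_WIDTH) -> list[str]:
    """Return every day name that will not fit in a field of `width` characters."""
    return [name for name in DAY_NAMES if len(name) > width]


def leap_edge_dates(year: int) -> list[date]:
    """Return the awkward dates worth testing in a leap year (empty otherwise)."""
    if not calendar.isleap(year):
        return []
    return [date(year, 2, 29), date(year, 3, 1)]


if __name__ == "__main__":
    print(names_too_long())
    print(leap_edge_dates(2016))
```

    Running the first check against an 8-character field flags Wednesday and nothing else.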

    1. Anonymous Coward
      Anonymous Coward

      A prototype new computer model had passed all its engineering tests. The instruction set etc was identical to the existing models that had been in production for some time. It was then tested successfully using a copy of the operating system generated for the existing model. At last the day came when it received its own official copy of the operating system - and it crashed.

      It turned out that there was a critical difference - the name of the program file that contained the operating system. The naming convention was "AAxJ1000" - where "x" was a letter representing the model type. The existing model was "AAGJ1000" - the new model was "AAKJ1000".

      A programming error had been addressing the wrong, but consistent, byte in memory since its creation - and testing a particular bit. The difference between "J" and "K" was all it needed to send it down an untested code path.

    2. kain preacher

      How the hell do you make that mistake? Not working on Wednesday.

    3. Bob H

      I worked for a large European telco, we had a control system (PC) which managed some complex broadcast encoding and multiplex systems. The software was written by the vendor because the customer was huge but the vendor subsequently decided the software wasn't what they needed and they depreciated it. It took several years of using it before we found out, to the cost of the poor Ops Director (covering for the Ops Manager's holiday) who was doing some configuration changes, that if you made any changes on the last Friday of the month it would corrupt the database! Then if you reset the PC it would also reset the attached equipment until the control software was back up, but the database was corrupt so millions of people missed their Friday night TV while the Ops Manager rebuilt the database!

      1. Pookietoo
        Headmaster

        "deprecated"

        (no body)

  5. DJV Silver badge
    Joke

    Arse!

    "I've had LAN switches delivered to US data centres with European-style plug on the cables, which was a pain in the butt"

    Conversely, LAN switches delivered to UK data centres with US-style plugs on the cables are a pain in the arse!

    1. Flocke Kroes Silver badge

      Re: Arse!

      That is because you are doing it all wrong. You are not supposed to insert the plugs there.

      1. Sir Sham Cad

        Re: not supposed to insert the plugs there.

        Been tempted on many occasions to insert the plugs in that very location of the idiot who shipped us kit with the wrong bastard power cables.

        1. TRT Silver badge

          Re: not supposed to insert the plugs there.

          And wrong bastard sized threads on the 19" rack nuts...

          1. Will Godfrey Silver badge

            Re: not supposed to insert the plugs there.

            Argh! Don't remind me - to say nothing of the occasional self tapper.

            Then there's the vast range of head profiles that all seem to appear on one sodding rack case.

            NE-way, good article.

    2. Sir Sham Cad

      Re: US-style plugs on the cables

      I have been saved on many an occasion by having some spare C15 female - C16 male power cables left over from a UPS refresh project. We've got tons of old C15 kettle leads about so with the cunning combination of the two plus some electrical tape = my 3750-X can now be switched on despite the worst intentions of the tin distributor.

      1. Richard Altmann

        Re: US-style plugs on the cables

        My bloody apprentices, while I was on leave, threw away all "outlandish" power cables. They are toothbrushing the server room floor right now.

    3. Bob H

      Re: Arse!

      I was once babysitting a vendor who was doing an install at a client site, and as I knew I was just supposed to be supervising the vendor I got VERY drunk the night before. The vendor turned up and they had put a UK BS1363 plug instead of an IEC60309 (Caravan) connector! The vendor's engineer said she wasn't allowed to change the power connector so the day was a washout, I called my boss, told him and he said "get it done". Fortunately I know what I am doing with electrics but unfortunately I had a royal hangover. Not a problem, I'm just wiring a plug aren't I? It all went well until I was about to flip the breaker and the vendor engineer said to me: "You're pretty hungover, are you sure you wired that right?". My confidence was immediately dealt a blow, but I was too hungover to de-construct the connector again to check, so I just plugged it in and decided to let the breakers deal with the consequences! Luckily I am apparently a good engineer even when hungover and it all worked fine.

    4. Jason 24

      Re: Arse!

      It's ok, the IT lot before me here had the marvelous idea of ramming a 2 prong EU plug into a 3 prong UK socket.

      Amazingly the firewall ran for 8 years before it finally burnt out (literally smoking) last week.

    5. Alan Brown Silver badge

      Re: Arse!

      I make a point of keeping a few cartons around of power leads in various lengths. Even when supplied with the "right" plug, they're often the "wrong" length.

      Anyway, a decent rack power bar provides C13/14 and C19/20 connectivity, so what the vendor supplies is usually irrelevant. Just make sure you use the locking versions of the plugs/sockets, as a partly-out C14 is difficult to spot.

  6. Sir Runcible Spoon
    Mushroom

    Tidy Cabling

    This is a real pet peeve of mine.

    Not that long ago I had to perform an audit in preparation for a live system migration during the run up to Christmas for a major drinks distributor (yeah yeah - not my idea - but it certainly focuses the mind! :) )

    The rack cabling was so bad that at the end of the audit there were four cables which proved totally impossible to trace!

    1. John Brown (no body) Silver badge

      Re: Tidy Cabling

      "The rack cabling was so bad that at the end of the audit there were four cables which proved totally impossible to trace!"

      I spent hours installing a new cab, cables all routed and labelled, it was a work of art! Went back a month later and the bloody door wouldn't even close. It gradually got worse over time as they messed about with it more and more. I have NO IDEA why they were continually swapping cables around.

  7. Stevie

    Bah!

    Re: Number 10.

    I had a new furnace and water heater put into my house on the Wednesday before Thanksgiving. Thanksgiving morning, about 12 hours into the new furnace's life, no heat.

    I called the vendor and told them I'd gotten it to work by fiddling with the electromechanical bits, but someone needed to come round and sort it out. They grumbled but said they'd send someone as soon as they could. I countered with another offer: since I could manually start the thing I was willing to wait until Friday so their staff could have their Thanksgiving dinner in peace, provided they were ready to start fixing it at 9am Friday. They agreed.

    10:30 am Friday I called the vendor, and was connected with a disagreeable woman.

    Me: My furnace isn't working. Where are the people who promised to come and fix it?

    DW: Do you have a service contract with us?

    Me: No.

    DW: Well, we don't service furnaces unless

    Me: I have a furnace installed by your crack team which broke down less than a day after it was installed. I, out of the goodness of my heart, agreed to wait until today to get the thing properly installed. I expect someone around here before noon to do that.

    DW: I don't think

    Me: The work done by your company is under warranty. The furnace itself is under warranty. My wife works in an office with fifteen attorneys and we can get all the legal muscle we need for free as long as we need it.

    I think it was the last one that swung the deal myself. I had this same chat every year for the duration of the warranty, until I got a young guy who simply disconnected the never-working but legally mandated electromechanical "damper" that went wrong every 12 months.

    Now I have to reboot the furnace if we have a *really* bad windstorm while the furnace is idling (the CO monitor shuts it down because of backdraft) but at least it starts when the weather gets nippy without the need for a Brummy Screwdriver.

  8. Anonymous Coward
    Anonymous Coward

    Ever threaded the old thick Ethernet AUI cables under the false floor or in the ceiling - then after restoring all the tiles you found the connectors were the wrong way round?

    1. TRT Silver badge

      *shudders*

  9. Message From A Self-Destructing Turnip
    Boffin

    Easy fix, lash a spare cable on one end with gaffer tape and pull through. You didn't lift all the tiles again did you?

    1. Anonymous Coward
      Anonymous Coward

      "You didn't lift all the tiles again did you?"

      This was back in the days when the cables were very thick and relatively inflexible. Add to that they snaked through a forest of supporting pillars and other equipment's cables and filter boxes. No way could you risk pulling a cable back and leaving a trail of destruction in its wake - even assuming the connector didn't get stuck at some point.

      1. Yet Another Anonymous coward Silver badge

        That's the nice thing about thick ethernet.

        Just pull the cable (a tractor is helpful here) and the tiles and desks/chairs/any server not made by DEC will lift along with it.

  10. Anonymous Coward
    Anonymous Coward

    Two slight modifications from a network guy

    6. Spend the money and double-connect everything

    If you are going to dual-connect your servers to multiple network switches (good idea), make sure the switches support the method you use. Switches don't like seeing the same MAC address appear in two different parts of the network - the step you took for high availability may have just created a problem that delivers a few hours of hard to trace poor performance.

    7. Record every change

    Don't just record the changes you make, have a system (config backups, tripwire, monitoring tools, management software etc) that automatically grabs configs regularly and ideally provides a way of running diff on them so you can find what has changed when your initial checks don't show anything unusual. In my experience, the change that Roger made to system X for VIP Y didn't go through change control because it was really critical and Roger will be busy elsewhere when it breaks.
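
    The "running diff on them" part can be very small. A minimal sketch using Python's standard `difflib`, assuming the config snapshots are already collected as text; the `config_diff` name and the `device.cfg` label are placeholders, not any particular tool's interface:

```python
import difflib


def config_diff(old: str, new: str, name: str = "device.cfg") -> str:
    """Return a unified diff between two config snapshots ('' if identical)."""
    # keepends=True preserves newlines so the diff lines join back cleanly.
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=f"{name} (previous)",
            tofile=f"{name} (current)",
        )
    )


if __name__ == "__main__":
    yesterday = "hostname sw1\nvlan 10\n"
    today = "hostname sw1\nvlan 20\n"
    print(config_diff(yesterday, today), end="")
```

    Scheduling the grab and keeping the snapshots somewhere other than the device itself is the part that actually catches Roger's change.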

    1. Anonymous Coward
      Anonymous Coward

      Re: Two slight modifications from a network guy

      "make sure the switches support the method you use. " - Please Mr Network Guy, make sure you are stacking your switches and my bonding will work ... I do both (networks and systems) and I know exactly where you are coming from. It's even more "amusing" seeing people simply plugging multiple NICs into the same VLAN and expecting a magical speed up without bonding or worse only doing one end. Also that e on the end of Cat5e is important. On the other hand that a after Cat6 gives me flashbacks to things like Twinax.

      I instantly (when I've found some scissors or whatever) cut the ends of any cables I find at Cat5 only and any missing/broken clips - it pisses me off to see cables hanging out or delivering awful performance. Sorry, I fill in an emergency change req, bollock the customer and then go in with the scissors.

      1. Alan Brown Silver badge

        Re: Two slight modifications from a network guy

        "expecting a magical speed up without bonding"

        Or expecting to see a magical speed up for traffic on one process, running between 2 systems.

        LACP bonding doesn't work the way many people think it does. It's not equal-cost multipathing, it's "this traffic between these hosts/ports goes down that link cable and stays that way until one of the cables gets broken".

        To go faster than the wire speed between any 2 given hosts you need to go to a faster wire speed.
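
        A toy illustration of why a single flow never gets faster, assuming the link is picked by hashing the source and destination. Real switches and bonding drivers use their own hash policies; CRC32 and the `pick_link` helper here are stand-ins for illustration only:

```python
import zlib


def pick_link(src: str, dst: str, links: list[str]) -> str:
    """Map a (src, dst) flow to one member link of the bundle."""
    # Deterministic hash of the flow identity; the same pair always hashes the same.
    key = f"{src}->{dst}".encode()
    return links[zlib.crc32(key) % len(links)]


if __name__ == "__main__":
    bundle = ["link0", "link1"]
    # The same pair of hosts lands on the same cable every time it is asked.
    for _ in range(3):
        print(pick_link("10.0.0.1", "10.0.0.2", bundle))
```

        Extra links help when there are many different flows to spread around, not when one pair of hosts wants more than one cable's worth.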

  11. Anonymous Coward
    Anonymous Coward

    The System Test area at English Electric's Kidsgrove factory was known for having its 3 phase mains cables incorrectly colour coded. Unfortunately this was only discovered after it came into use for the production line of KDF9s. It was deemed too disruptive to change - so it stayed that way even when the next generation System 4-70 computers were being built there.

    The first prototype 4-70 worked very well - as did the second. Then it was found their exchangeable disks were incompatible. The mystery persisted for a while until someone noticed that the disks on the first prototype were spinning up in the wrong direction. The heads were symmetrical so a disk formatted on that machine worked perfectly - on that machine - and all the data had been primed from tapes.

    The reason was that the first set of disks were installed by engineers from the disk factory at a weekend - and no one told them about the idiosyncratic 3 phase colour coding.

    1. Phil O'Sophical Silver badge

      Ah, the days when disk drives needed 3-phase motors :) Probably held all of 100MB?

      1. Anonymous Coward
        Anonymous Coward

        "Probably held all of 100MB?"

        8MB in 1967 - although ICT eventually had 200MB ones.

        By 1970 ICL had 600MB in a fixed disk unit. Two stacks of platters each holding 300MB side by side - whose heads operated simultaneously in opposite directions to balance the forces. The cabinet was probably over 2 metres high and nearly 3 metres long and over a metre wide - sitting on a specially reinforced section of the false floor. It had water-cooled bearings. It took 8 hours to archive the data to magnetic tape.

      2. Anonymous Coward
        Anonymous Coward

        Probably 20MB...

  12. Terry 6 Silver badge

    Spot on

    Even doing my small scale support job with a dozen or so PCs on a tiny network all this has been true for me and the engineers who came in to do the big jobs.

    Yes the torch.

    Yes the cables tangled.

    Yes the two unrelated jobs that made the whole network fall over (luckily not me, that one).

    But alongside the bit about the documentation being on the device that's causing the problem, for small workplaces I'd splice in knowing who is responsible for retaining the docs (especially where more than one team is concerned) as well as where.

  13. Doctor_Wibble

    Alternative torch and other things

    I got one of those cheapie extendable mirror things with a light on the end from the local hardware shop and it's been absolutely fantastic - though I did need to add a small strip of plastic to stop the lights (two of them, it's dead fancy) from dazzling me - if it's dark this is enough to stop you seeing that all-important mis-set jumper.

    Also can be done with a radio aerial taped to a small torch (as the handle) pointing at a sufficiently light makeup compact (mirror with handy hinge for angling it) at the far end and of course prodigious quantities of the proverbial sticky-backed plastic. If you want to be posh then use two aerials to stop it turning to point at the floor every time you move it.

    And all these years I hadn't twigged that 'kettle lead' was actually an ambiguous term with serious career-limiting capability! I had only ever seen the old-style round ones (on actual kettles) and the modern ones just seemed to be everywhere as well as kettles so I only ever saw them in terms of 'obsolete' and 'standard'.

    Edit : and cabling - if it's not possible to get it neat then at the very least make damn sure you label BOTH ends before you plug it in and have a 'cables page' in the cabinet AND in the on-call folder. We aren't always in the position to tell others how to do their cabinets but we can limit the accidental damage by making sure we get our own stuff right at least.

    1. Anonymous Coward
      Anonymous Coward

      Re: Alternative torch and other things

      "And all these years I hadn't twigged that 'kettle lead' was actually an ambiguous term [...]"

      There are two variants to the modern "kettle" lead. One is intended for low power equipment like PCs. The other is intended for high power devices like kettles. The difference is that there is a key notch that stops you plugging a low power one into a high power device.

      I was caught out on that when I needed a longer than usual lead for a hot stone cooking device that you put on the table. It's a bit like a serve-yourself miniature barbeque. A PC supplier was advertising 3 metre "kettle leads" - perfect! Except it didn't have the notch that is required to mate with a high power device like a kettle.

      1. This post has been deleted by its author

    2. Anonymous Coward
      Anonymous Coward

      Re: Alternative torch and other things

      "[...] label BOTH ends before you plug it in [...]"

      Mains adapter "blobs" are used for quite a few devices these days. Always label them with their device type - often they are not branded. Saves frying a device - or blowing an inadequately rated "blob".

      If the device doesn't indicate its voltage, current, and any polarity then label it with that information. You may need to use a generic multiple adapter with it in an emergency.

      1. Will Godfrey Silver badge

        Re: Alternative torch and other things

        It's not so much the power rating, it's (far more importantly) a temperature spec. Kettle leads have to withstand being regularly cycled to a much higher temperature than the one you plug in your rack.

        1. TRT Silver badge

          Re: Alternative torch and other things

          A borescope comes in handy sometimes.

  14. Jock in a Frock

    Cable labelling

    I see people label one end of a cable with the local end details, not the far end (and the same at the other end!)

    We now mandate that the label at both ends has details of both ends.
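
    A minimal sketch of generating that kind of label, assuming a rack/device/port naming scheme; the `Endpoint` fields and the label layout are invented for illustration, so adapt them to whatever your site uses:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    rack: str
    device: str
    port: str

    def __str__(self) -> str:
        return f"{self.rack}/{self.device}/{self.port}"


def cable_labels(a: Endpoint, b: Endpoint) -> tuple[str, str]:
    """Return the label text for each end: local end first, far end second."""
    return (f"{a} <-> {b}", f"{b} <-> {a}")


if __name__ == "__main__":
    end_a, end_b = cable_labels(
        Endpoint("R1", "sw1", "p3"), Endpoint("R2", "srv2", "eth0")
    )
    print(end_a)
    print(end_b)
```

    Whichever end you are holding, the label tells you where you are and where the other end should be.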

    1. Anonymous Coward
      Anonymous Coward

      Re: Cable labelling

      If there is any potential ambiguity between adjacent racks then use different shaped and coloured labels so you only have to match them on the plugs and sockets.

    2. Christian Berger

      At vocational school...

      ...we had to cable one of the computer rooms at the school (which was rather questionable). Well we pulled in the cable and labelled them in 2 teams, both armed with duct tape. One took a cable and made rings around it. One ring for the first cable they got, two for the second and so on. The other team also took some cables and made little flags on them. One flag for the first cable they got, two for the second and so on.... So you had both ends labelled... just not in a consistent way.

  15. Anonymous Coward
    Anonymous Coward

    Another suggestion - throw a few packets of peanuts (or similar long-keeping, space-efficient, and filling) snacks in the laptop bag too...you never know when "just one quick thing" is going to turn into an extended mission. And when it does, you can absolutely guarantee that you can't leave your post and you'll be in the middle of bloody nowhere.

    1. Anonymous Coward
      Anonymous Coward

      Jordan's cereal bars are a more balanced, slow release food. Always carried a few with me.

      A hotel in Manchester near Piccadilly Station was one of the few places to stay where you could still get a proper meal very late in the night. The one with the over-the-top decor and a reputation for accommodating other late night services.

      In another hotel - just as dinner time arrived they announced the place was in lock down. No one was allowed out of their room because the Prince of Wales was due to pop in for some refreshments.
