Proceed with this nonsense at flank speed!
Great read and great idea for a column, El Reg. Makes me feel better about my on-call experiences.
In the spirit of our popular eXpat Files column, we've decided to chronicle some of the weird things that have happened to readers who work on-call. This week's story came from Reg reader, Paul, whose story goes like this: I used to work in telecoms in London and we had quite a few major foreign embassies and overseas banks to …
I worked in a gilts interdealer broker at the time of the Big Bang. We didn't have problems with flooding, although the office was just off the south end of London Bridge. Our problem was rats chewing the cables from the trading stations. I don't blame the rats; it was the traders throwing their used food containers under the desks.
We had an issue with rats chewing through fiber 8 or 9 years ago here at the University where I work. They got into the fiber patch/distribution box in a building and just had a party, chewing through a dozen or so fiber pairs one night. What was ironic about the whole thing was that this particular fiber box was in the back of the kitchen of the campus dining hall. It's still a joke in this department that the dining hall food is so bad that even the rats would rather eat plastic and fiberglass.
So, basically any hour after midnight beginning with T or F, hope your pay is good!
In a different industry, I used to get burned more often than not: a good band, swanky party or poker night always fell on my "on call night". No beer, couldn't travel outside a predetermined radius; some criminals have more freedom! Glad those days are over ;-}
It never fails to amaze me that such things as stand-by generators, servers and all forms of comms equipment are put in the basement by those who design buildings, when they know full well that it will be the first place flooded.
Yes, there is a use for the basement - parking for top management cars.
All that was in the basement - as required by building regs - was the fuel tanks and pumps.
IIRC[1], reading elsewhere, one of the companies so afflicted ended up getting a bucket brigade of workers to haul 5-gallon buckets of diesel fuel up 17 flights of stairs...
[1] Now I remember: http://www.computerworld.com/article/2493111/data-center/huge-customer-effort-keeps-flooded-nyc-data-center-running.html
One large bank - initials start with "J" - had tens of thousands of servers in the basement, which all went under 12 feet of water. Fortunately they did have a backup server plant in New Jersey so actual operations were transferred over pretty quickly. But the cost of lost equipment was $millions. If I recall correctly they did not replace the basement server room, but built a new one somewhere else.
Seems like typical management learning process.
1) Make a bad decision based on non-technical criteria (convenience), not taking into account any possible risks
2) Get caught when bad decision leads to millions in lost money
3) Finally make the right decision when the risks previously ignored are now deemed too serious to continue ignoring
I was once involved in a machine room that was in a basement underneath a swimming pool. It was a former chemical lab, and the idea was that if something went badly wrong with a spillage, a small explosion could empty the pool into the lab and very quickly dilute any chemicals.
When the lab got re-purposed the explosive charges were removed but the pool remained for some time.
One of my own stories:
Background: I work as a support engineer for a manufacturer of silicon photolithography equipment. This involves work on equipment using hydrogen. Each machine has a separate cabin with support equipment and triple redundant hydrogen sensors. Any of them detect a level over 0.4% and the alarm goes off.
Imagine this. I'm wearing full cleanroom gear. Cleanroom hood with a "surgical mask", white antistatic coveralls, socks over that, cleanroom shoes, and a set of gloves.
That particular day we're having some difficulty and I'm extracting a wafer from a load port manually. Just when I'm finally ready to get the blasted thing out the evacuation alarm goes off.
This means taking the nearest fire exit. Straight outside. So there we were, on a sunny spring day, about 20 of us in full cleanroom gear, looking like bleached martians. Strolling along the freshly mowed lawn. I wish I could post the pictures... (client confidentiality, NDAs and all that...)
They're even more fun when some fucker puts the wrong kind of sticky floor* down in the airlock
- imagine a game of Twister where anyone who touches the tape can't move the bit that touches it.
Never did find out if the culprit** got caught as the clean room had to be re-done after the airlock had to be busted open to get people out and I found other work in the meantime.
*meant to be a sort of post-it note type glue that pulls any ingrained dirt off the shoe covers as you walk over it.
**unless it was a supply error but whoever put the floor in must have known there was something wrong
One I heard of: the new clean room a/c failed, and a management guy went in, thought it was too hot and threw open the fire doors... only for the icy blast of fresh winter air to cause the veneer worktops to peel off the tables... not to mention the cleaning...
In my first job, as a programmer on a VAX-11/780, we had the first purpose-built computer room in the Old Trafford part of the company, along with its very own AirCon system. I was there early, went into the machine room to change something and immediately noticed that rather than being icy cold, it was damn hot. Cue call to the guy responsible for building the room and project managing the full install, who rushed in and diagnosed the problem instantly: it was the desiccators that had locked the wrong way, and instead of pumping cool dry air in, they were pumping warm wet air.
He fixed the problem straight away, reversed whatever was going wrong, and cool air flooded the room. Only problem was it created a fog in the room! The picture of the room through the glass windows, full of fog, with us two unable to see a foot in front of our faces, must have been wonderful.
I bet there will be plenty of "botched aircon" stories in this thread, so here's another one...
Early on in my career, I worked for a company that took over a building and needed to keep a bunch of MicroVAXen cool.
Solution: Stick 'em in a broom cupboard and use a bathroom extractor fan with a thermostat to draw in air from the corridor.
Problem is, they wired up the thermostat so that it ran until a certain temperature was reached.
The MicroVAXen got nice and toasty, THEN the fan cut out - so they got even more toasty pretty quickly.
Oh, did I mention this was an ELECTRONICS company?
Flooded basement iz Urban Legend in as much as it might have happened and it will be repeated but obviously no-one is paying much attention anyway. Eeeee when I wert lad 75% of NORAD went down because the bloke who wrote Access moved desk and they did not terminate his T-junction. Everyone else went ape because their Excel spreadsheets, where the real analysis was done, could not connect to his shit-base to get any data. Everyone is flapping about like expert flappy birds trying to diagnose the wrong problem and I get back from a bit of R&R to have them flap at me. I leave the meeting to go take a piss and on the way back take a detour via my cardboard box of 50ohm terminators and twist one in, wait a while and then check my e-mails. Gregory wants to know if his boss can have his submarines back 'no questions asked'. I return to the meeting and say things seem to be working now mentioning some sort of incompatibility due to an upgrade.. quick check and everyone is happy.
I saw a flooded (well, two inches of standing water) network cabinet room on a fourth floor. My employer had the fourth floor of the building, new clients moved into the fifth floor and (without telling us) the landlord let them put a new toilet above our cabinet room. One plumbing installation accident later I got a late-night call from our remote ops saying "the cab room is flooded!" I was very confused when I (still more than half asleep) checked the main cab room in the basement and found it dry, but everything was down! It wasn't until I noticed water coming down the rear stairwell that I thought to check our main floor.
I spent 3 months on the 10th floor of 1 Canada Square (*the* Canary Wharf tower, for those of us old enough to remember it in the Long Good Friday era and immediately afterwards). The whole floor was being used as teaching/lecture space, so only had about 2 PCs on it. Unfortunately, the building and associated systems had been built on the basis that every desk would have a PC & CRT monitor pumping out heat - so there was an overabundance of cooling, and no heating at all. As a result, we were sat there in an August heatwave, clustered around 3 hastily purchased fan heaters, because every floor in the building was cooling for all it was worth and the "natural" temperature had settled at about 10 degrees Celsius.
Around that time, there were constant grumbles from the management company, because all the excess heat was vented out of the roof, resulting in plumes of what one local resident thought was smoke. So every other morning, he rang the fire brigade & triggered a callout, which by this point was costing the owners about £20k a pop.
We still make fun of our Security Officer over this one.
When I first started at my current employer, we had NO BCP at all, so we started doing a weekly team meeting and decided to introduce a failure scenario and talk over what we would do. The first one was flooding of the server room.
Fast forward about two weeks: our Oracle DBA gives the Security Officer a call on Sunday. She was in running start-of-day tasks and noticed the floor was wet on our level. The Security Officer turns up and they notice that the small pipe on the back of the water cooler has popped off. It's less than 2mm in diameter, but it's right near a power socket, so the Security guy has the DBA push him into the kitchenette on a chair, with his feet in the air, to power off the cooler and turn off the water. Except halfway in they realise the broom isn't long enough, and he walks back out after sorting out the power problem.
The next day, I get there and the Reception roof is all caved in. We lost only two PCs and four monitors with associated keyboards and mice. Looking in the PCs, there was a whitish film from the roof tiles that had disintegrated. We then spent the next four days trying to hear each other over multiple fans trying to dry the carpet, which isn't good for a company that is predominantly filled with call centre staff.
We never did another disaster scenario ever again.
Server - or machine rooms - above ground level are easily flooded when the A/C's condensation outlet is plumbed into any old down-pipe. Take one excessive storm (or a simple blockage), the water backs up the pipe and then spews over your racks on an upper floor.
And never put your machine rooms in the basement; that's where the sewage goes when anything's wrong.
Although we had a machine room in the basement of our hospital we at least had the sense to have raised floors and stepped doors to stop the frequent sewage floods. For some strange reason they didn't consider it worthwhile doing the same for the morgue which meant grieving relatives visiting were almost always greeted by the essence of turd.
My first night on-call, ever. I'd been reassured that it was easy money, nothing ever went wrong. So when the phone rang at 2am, I was all "haha, shift ops hazing the new guy, good joke". Unfortunately there genuinely was a problem; fortunately it was an easy fix.
Same place, other times. Because "nothing ever goes wrong", the old-timers have on-call divvied up between them. I occasionally get the nights no-one else wants. And something ALWAYS goes wrong on my shift, such as a filesystem that's been filling up with error logs that hits critical on MY night, so it's not just a cleardown task, it's fix-the-root-cause-of-someone-else's-screwup as well. Or a script that's worked a thousand times but fails when I'm on call. Certain suspicious minds think I'm creating the problems so I get the overtime, whereas I wonder if the Ops just don't bother calling the old lags.
Another time, another place, after the office party - An A/C failure takes most of a machine room down, hard. The on-call guy escalates to management, who call all hands to the pumps. Which sounds like a recipe for disaster, but I discover an ability to metabolise alcohol into brain cells and recover a knackered HP-UX server everyone else had given up on. Not as impressive as D though - D fixes a bunch of seemingly-terminal AIX systems. Come Monday morning, management are full of praise for D's skills and team spirit. D is like "Huh? Whut?" - doesn't even remember getting called, let alone the reanimation magic!
The company I worked for built a new building and put the server room on the second floor, along with lots of cubicles including those for the IT department. About two weeks after we moved in we had a heavy rain and I got a call. When I got to work there was water running down the stairs. The second floor had about four inches of retained water that didn't run down the stairs. Right next to my cube the sheet rock on the wall was busted out and a ten-inch PVC pipe was visible. The designers had put in a 90 degree bend, then about a foot over another 90 degree bend to angle back down. It was the roof drain. The water had backed up and the weight on the junction caused it to burst. The server room was higher than the rest of the floor on that story as it had a raised floor. The water level didn't get to the wiring under the floor, so the equipment was fine. We worked remotely for about a month as we had to have mold removal and other time-consuming operations performed. But the server room stayed live.
Another story.
Call comes in at stupid o'clock in the evening for service to a system at stupid early o'clock the next day. Oh, and could we bring service tool xyz, because theirs would take too long to get out of offsite storage.
Yeah sure. I'll just throw the 900 kg of hoisting tool into the back of my Fiat Panda and be on my way...
The worst part of it? THEY designed the bloody tool and knew what it was. But suddenly it was OUR fault I couldn't arrange transport for a tool I would have to take off the production line.
My first job was at a web house that hosted a number of Fortune 500 companies' websites. They had just moved into a new building they had bought and were very proud to show it off to the new hires. While taking my tour they stopped at the new server room they had just completed. We stood looking through a huge plate glass window at all the flashing lights of the switches, listening to the HR lady tell us of the cost of all the new SparkBoxes, racks and equipment. As I'm taking this all in I notice in the ceiling of the new server room all the bright and shiny sprinkler spigots for the fire system. When I mentioned the idea that, perhaps, water and electronics might be bad, the HR lady stood and looked like a deer in headlights. So after I started I went to my new boss and said the same thing to him. The look on his face could have matched the HR lady's. He ran into the server room and stared at the ceiling. After cursing a blue streak in Chinese (he was Chinese), the quick fix was to hang a tarp over all of the equipment until the pipes for the sprinklers could be cut. Fun times...