Could have been worse ..
Bruce could have been working at Hawaii Emergency Management Agency last January.
Welcome to Monday morning, dear readers. We’ll try to make it bearable for you by offering you a new instalment of “Who, me?”, The Register’s column in which readers share stories of having screwed things up. This week meet “Bruce” who told us that “Many years ago I was a junior sysadmin for a large battery manufacturer and I …
Except for the whole "hey, at least I am in Hawaii" thing, which is also why they seldom turn up to vote.
If someone had "accidentally" powered off that system it might not have been a bad thing, under the circumstances...
I've known more than a few people to do this.
I once told a soldier the portable version of a server was ready to be shut down and packed up for deployment. He dutifully walked into the server room, up to a (very) non-portable 42U rack, and shut down the servers in that. Cue calls to my phone from across Blighty asking why systems were down. Thankfully, they didn't take too long to bring back up, but I did have to explain what had happened to some much higher levels.
That was before the days of molly-guard, but I now make sure it's on everything to help avoid accidents (not sure it'd have helped in that case though).
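For anyone who hasn't met it, molly-guard asks you to type the machine's hostname before a shutdown or reboot goes through on an SSH session. A minimal Python sketch of that idea, not molly-guard's actual code: `hostname_matches` and `guarded_shutdown` are hypothetical names, and the function only returns a decision rather than powering anything off.

```python
import socket


def hostname_matches(typed: str, actual: str) -> bool:
    """Return True only if the operator typed this machine's short hostname."""
    # Compare short names case-insensitively, ignoring any domain suffix.
    typed_short = typed.strip().split(".")[0].lower()
    actual_short = actual.strip().split(".")[0].lower()
    return typed_short != "" and typed_short == actual_short


def guarded_shutdown(ask=input, actual=None) -> bool:
    """Ask for the hostname before allowing a shutdown.

    Hypothetical helper: it reports the decision (True = allowed) and
    never shuts anything down itself.
    """
    actual = actual or socket.gethostname()
    typed = ask("Type the hostname of the machine to shut down: ")
    if hostname_matches(typed, actual):
        return True
    print("Hostname mismatch, refusing to shut down.")
    return False
```

As noted above, a check like this only catches the wrong-window mistake; it would not stop someone standing at the wrong rack who reads the name off the label.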
"I once told a soldier the portable version of a server was ready to be shut-down"
Soldiers take things very literally. Never EVER label anything as "BOOT"
One of my wonderful moments was working with a group testing a prototype communications analysis system that I had designed and was responsible for testing in the field. It was a very wet and cold day in the middle of the British 'wilderness' (as it is).
Performance was good, all of the teams identified safe places to site their equipment and there were no obvious complaints about (a very prototyped) UI designed to be usable with gloves and in a somewhat hurried manner.
But I was there for the last week of trials and thought that, after watching people using it, I would take myself off to talk to some of the remote groups and ask them about their experience (hey, I was all that could be described as a usability assessor as well as technical designer). The last team I saw were perched on the side of a hill with a bare gap between protecting woodlands. It was wet and very cold. I was pleasurably surprised by their enthusiasm. Universally approved, enthusiastically.
I tried to drill down on why and was somewhat chagrined by their response. "It runs so hot that we take turns to warm our feet on it!"
Needless to say we did manage to get the power down, and the prototypes were deployed in Bosnia 4 months later......
> Soldiers take things very literally. Never EVER label anything as "BOOT"
Yeah, to be fair to him he was just having a bad day. He knew more than enough about the systems to have not made that mistake, just wasn't really with it that morning.
Not that that made it any easier to explain up the chain, of course.
A soldier would consider everything to be portable.
Would they not assume anything smaller than a bridge *is* portable?
"the middle of the British 'wilderness' "
Bridges are portable. We took them to bits, moved them somewhere else and put them back together again.
Just because we needed a load of lorries etc does not make them any less portable. Nowadays, they would probably sling larger chunks beneath Chinooks and spend less time stuck in muddy places.
I am not sure I can define what the RE officially did not consider portable but the limits will have only increased in the intervening decades!
Holes are also portable. Especially when a staff sergeant tells you that you dug his hole 3 inches too far to the left, pulls out a measuring stick and demonstrates that you dug it in the wrong place and 2 inches too deep.
This in driving sleet on a mountain in the Brecon Beacons in February.
"There was I, diggin' this 'ole,
"Hole in the ground,
"So big and sorta round it was...
"And there was 'e, standing up there
"So big and important with 'is nose in the air"
They don't write songs like that anymore.....
bridges ARE portable
"I once told a soldier the portable version of a server was ready to be shut-down and packed up for deployment, he dutifully walked into the server room up to a (very) non-portable 42u rack and shutdown the servers in that"
In fairness to him, soldiers tend to get used to carrying 30kg packs. He might have a different idea of portable to you and me.
Have a look at EWBB - _that's_ a bridge.
In that case the Soviets had portable factories. In the face of a German invasion in WW2 they completely dismantled many factories in European Russia and reassembled them in the Ural Mountains and beyond. I've always been impressed by that.
Railways were apparently the key to moving the factories.
Also never describe anything to them as ruggedised, unless you want it back with tyre marks on the case
I have created tiled bitmaps with the server's name on it (eg NODE1, PRIMARY DOMAIN CONTROLLER etc), so if you log in to a server via RDP you can instantly see which server it is that you're working on.
And, yes, this was preceded by me rebooting the wrong server. Now I can instantly see which server I'm working on, and this avoids mistakes.
Face it, a slew of open RDP sessions on your desktop will invariably cause you to issue the wrong command in the wrong window. Fun.
For SSH I use different terminal background colours for different environments, but when your brain's addled nothing works apart from red (production), and I can't change everything to red otherwise I'd start to ignore that too.
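A rough Python sketch of the colour-per-environment idea, for anyone who wants to script it. The hostname prefixes and the `environment_of` and `banner` helpers are made up for illustration; the numbers are standard ANSI background colour codes, with red reserved for production as above.

```python
# Hypothetical naming convention: the environment is guessed from the hostname prefix.
ENV_BACKGROUNDS = {
    "prod": 41,  # ANSI code for a red background
    "test": 43,  # yellow background
    "dev": 42,   # green background
}
DEFAULT_BACKGROUND = 44  # blue background for anything unrecognised


def environment_of(hostname: str) -> str:
    """Guess the environment from the start of the hostname."""
    lowered = hostname.lower()
    for env in ENV_BACKGROUNDS:
        if lowered.startswith(env):
            return env
    return "unknown"


def banner(hostname: str) -> str:
    """Return the hostname in bright white on the environment's background colour."""
    code = ENV_BACKGROUNDS.get(environment_of(hostname), DEFAULT_BACKGROUND)
    return f"\033[{code};97m {hostname.upper()} \033[0m"


if __name__ == "__main__":
    # Print one banner per example host so the colours can be compared side by side.
    for host in ("prod-db01", "test-web02", "dev-box", "node1"):
        print(banner(host))
```

The banner spells out the server name as well as colouring it, so the colour is never the only cue.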
That works as well - until you get a colourblind appy (or sysadmin)... :)
I've forgotten which Linux it is and when, but on this one when you logged in as root the screen background was red with pictures of bombs on it.
I always check the window title, but I also use multiple desktops despite having three monitors. Desktop one is for local systems, desktop two is for production customer-facing systems.
I had that on an openSUSE installation, 10 I think.
IIRC it was OpenSuSE that does it when you log in as root.
"I've forgotten which Linux it is and when, but on this one when you logged in as root the screen background was red with pictures of bombs on it."
Don't know about others - but Conectiva Linux did this.
"I have created tiled bitmaps with the server's name on it"
We tried that at a customer on their RDP servers, so users can quickly look at the desktop to tell techs which server they are on.
Turns out roaming profiles will cache the background image, even if it's set by GPO at the computer level.
I just rename it to the server name. After installing the "Feature" these days to turn on certain desktop icons
Now that's lovely.
> background was red with pictures of bombs on it.
Suse Linux had this. IIRC it was brought in after people did things as root, not recognizing they were, often enough with serious consequences.
After some major terror attack (don't recall which) it was removed.
SuSE 7 and 8 had bombs too, way before OpenSUSE.
DesktopInfo is a wonderful tool.
Just come up with a template INI file and stick it somewhere all RDP users can read it, create a shortcut in ProgramData...\Startup to launch desktopinfo.exe for all users, and bake that into your gold image. Easy to package and distribute as well.
Then you get the name of your system as big as you want on screen - colour code for prod/non-prod if you're fancy, and some cute at-a-glance statuses if you want those as well.
In late 1977 I managed to take down all the PDP10 kit at Stanford and Berkeley with a software upgrade. Effectively split the West coast ARPANet in half for a couple hours. Not fun having bigwigs from Moffett and NASA Ames screaming because they couldn't talk to JPL and Lockheed without going through MIT ...
Taking down half the internet (as was) should be the winner here, but the number of machines involved in 1977 is probably of the order of accidentally knocking an average size school offline these days.
Taking down TOPS10 was so easy a luser could do it by assigning too many disk name aliases.
Mostly done for shits and giggles on last day of term with the added entertainment of super-lusers going to the computer centre to wrongly claim "I've just crashed the system".
These days that would probably be terrorism or some serious offence.
In late 1977 I managed to take down all the PDP10 kit at Stanford and Berkeley
I hope they knew what wonderful halcyon days they lived in.
There was a PDP-6 or 10 at MIT that users were always trying to find a way to crash. So they installed a non-privileged "crash" command that would instantly crash the system. This took all the fun out of it and the system reliability was greatly improved.
...your leader is worth following. Screwups will happen. But will grace and a second chance happen as well? If you find these in a leader, make sure you follow that person.
Bet the admin here never, ever made the same mistake again; performance across the board probably amped up as the lesson drove home the seriousness of the job.
I encountered a great leader once, in my first year of college working in a copy and print shop. The owner - a recent immigrant from Lebanon working three jobs at once to get enough cash to bring his family over - always seemed to be a hard man. But after one all-nighter running a $10,000 job I realized all too late that I'd screwed up the whole thing, and lost a major client. Margins are razor thin so we ate something like $9,600. When Mr. Hammad came in, I just had to press my "man up" button, tell him what I'd done, and wait to be fired. Instead he stared at me for a very long time, and took me in the back for a cup of tea. His one question - that still stings across the years - was "So... tell me exactly why you are so careless with our money? Our paper and supplies and our customers? Did you respect our customer? Is that what you want to be?" Then "I should fire you but instead I want you to stay here and show me who you really are." I wasn't fired and ended up running the business.
Guys and gals like that are tough to find, but the world really needs them. So try to be one.
One way of thinking about it is that you'd just had $9,600 of training. If they had let you go, somebody else would've got the benefit of that experience.
Good way to look at it. Been many decades now and I haven't forgotten the lesson. Spent a lot more for a lot less elsewhere. (Should I slap in a gratuitous reference to DevOps forced fun or would that kill the thread?)
Everybody on the team from server admins right through to data input should know that if they screw up then there will be no negative consequence on their career if they own up and alert the rest of the team immediately it happens.
Because fear can cause cover-up and attempt to hide the problem, and then the problem can compound out of control the further in time you get from the error.
Agreed. Still, I reserve the right to tease the person!
I once killed someone's SQL server (and hence the app that was accessing it) by running a not-inconsiderable query. Ordinarily, this would have been fine, if slow.
The kicker was that the server had a dying RAID drive. This would have been picked up in the normal course of events by the engineering bods and replaced, however that hadn't happened yet.
The extra load combined with the slowdown from the dying drive ground the system to a halt.
Cue some frantic work to get things back up, and a replacement drive sent back out ASAP.
Slightly better than my moniker of "Server Fucker".....
"That's NOT what an I/O port is used for!"
That's NOT what an I/O port is used for!
Damn it! I just got this keyboard. Now look at it!
My old boss stood on both the power leads to our AS/400. They don't like unexpected shutdowns. 4 hours later it was still booting...
We were very sympathetic and supportive
This type of thing is so easily done the only real safeguard is a fully redundant system with fault tolerance. It still baffles me today that major transport operators, banks and so on experience outages when a correctly architected and implemented solution should keep outages at bay, even taking disasters into account.
You forget the first rule of being a major corporation. Redundancy is extra money that could be better spent at the golf course.
Biting the hand that feeds IT © 1998–2018