2199 posts • joined Monday 31st May 2010 16:59 GMT
Being expendable is no bad thing.
Expendable people get vacations. The kind where they don't have to remote in every day for four hours to deal with email/various fires. I am okay with being expendable.
One thing that needs to be brought up in these discussions is that, in IT as with many occupations, heroics aren’t rewarded. You won’t get a pat on the back, a bonus or any respect from your peers. (Certainly not if you blog about your experiences!) Hard work isn’t rewarded. What you might get for your trouble is something approaching job security. Maybe. Not because you cannot be replaced (everyone can be replaced), but because replacing you with a sufficient number of 9-5ers is financially unpalatable.
That ever shrinking niche is SME systems administration. It’s a world where the resources (time, money, manpower) are tight, competition is fierce and sacrifices are made not only by the rank and file but the business owners as well. Part of the social contract in place is that the business provides non-monetary incentives to keep staff on. Allowing the nerds their own coffee pot, ignoring the plushies and geeky posters and putting up with the quirks, tics and lack of social graces. In exchange, we give 110%. We go above and beyond with the expectation and understanding that we won’t get fired for wearing a binary tie and inadvertently causing the dress code nazi an apoplexy.
In larger enterprises, union shops or government IT departments, the social contract is different. I think that so long as one is aware of which social contract they are signing, they are not being taken advantage of.
Now, when one side reneges on their part of that bargain…that’s a whole other story. What I find far more common (and despicable) than one lone individual taking advantage of a colleague’s enthusiasm is the corporate culture that says “the job market sucks. It’s time to rescind our side of whatever social contract we have with our staff because they have no available alternatives.” For some companies that is cutting pay/hours. For others it is removing non-monetary incentives from the equation. For all of them it involves removing job security. (Rule by fear!)
When the market picks up, and EVERYONE starts heading for greener pastures, the corporate world bemoans the fickle nature of employees and calls for greater immigration.
Technology and best practices aside, we’re really having a debate about the value of LOYALTY. Corporations don’t tend to have any towards their employees. What value then for an employee to have loyalty to their company? I don’t have an answer to that. Some part of me finds it important. I think that in the SME space, the concept still retains some tattered shreds of value, but it is here that my experience hits a wall.
I’m good at this here computer fixing business. I’m good at research and I can invent new and innovative ways to solve almost any technical problem, even on impossible budgets. I work hard, and obviously have a self motivated (driven?) nature that is fairly rare amongst folk my age. I have identified my strengths. Hurrah.
What of weaknesses? Well, I obviously work too damned hard...to a detrimental point, most likely. Somewhere in there though is the knowledge and realisation that I need to learn when to risk saying “nyet”, even if it runs the risk of breaking the social contract an SME admin has with his employer. Always there are limits. I am still learning mine. I honestly don’t think this project was one of those cases…but the reaction of my peers here on El Reg shows that it was probably on the border.
That’s a good thing though. Growing, learning...even these very debates and discussions in the comments: where are the limits? Where do the responsibilities of the business owners, management, IT staff, and the user base begin and end? I suspect there is no one true universal answer. Each environment is different; each admin must make their own independent judgements. By reading articles and comments from other admins, real life stuff that isn’t the sterile perfection of a whitepaper, we can see where we stand in relation to our peers. Who knows, we might all learn something.
I sure have.
Except that there was a contingency plan, and it isn't so simple as "the business side of the equation withheld funds." Moreover...the project didn't fail. In fact, it occurred more or less as I figured it would. Some unexpected things went wrong, but no more than I figured would...given the circumstances. A bad couple of weeks were had...but they are over with, and everything is working grandly.
For all the wailing, gnashing of teeth and commenters with axes to grind...the project in question did what it was supposed to do, with predicted parameters and frankly, it could have gone a lot more sideways than it did. If it had, I still had an abort option.
I agree with you: IT as a whole, myself included, accept too many failures. We are very quick to accept compromise. Truth is, doing IT "properly" is bloody EXPENSIVE. Outside the reach of some organisations, like mine, trapped in the mid market. It's easy to say flippant tripe like "well just quit and work elsewhere." That's bull. I've a family, mortgage payments...what's more, I've a sense of loyalty and professionalism that prevent me from screwing over folk I work with, and for.
I understand your overall frustration with how this same story of failed or skin-of-one's-teeth IT projects is continually repeated. I honestly do. The truth of the matter though is that life is never so black and white as it can be made to seem by post-operative diagnosis of someone else's issue over the Internet. That's the point of talking about the elephant in the room and doing articles not about how things SHOULD be done, but rather how they sadly end up being done in the real world. Learning, thinking...growing our minds beyond just our own experiences by taking those of others into consideration.
Is there no value in your world for innovation beyond “network management by whitepaper?” Should all companies that can’t refine/upgrade/purchase/manage/whatever their IT to some arbitrary standard simply close up shop and go out of business? Who gets to decide? Why should any business owner/manager prioritise IT over other business units? In the SME world, there is almost never enough to go around. Why should IT be immune to the concept of “good enough?” What makes IT so special?
I’d love to know, because as someone who works in IT, knowing what supposedly makes me more important than the rest of the company would be a fantastic boost to my ego. Who knows, maybe you have a valid answer with concrete reasoning. Then our entire industry can use it during our various budget talks each year.
Re: Google Ops
I've been doing SME networks for so long now that "interesting" is cognate with lack of sleep. On any of the networks I oversee, if I hear someone say "that's interesting" I check the coffee supply. I realise that working for Google Ops as the lowly blue collar drone that my creds would get me would probably be boring.
Oddly enough, I could do with a short spell of boring. :)
There's truth overall to what you say. I thrive on the challenge of making something out of nothing. Sadly, it's a dying art. Computing is heading in the direction of appliances at all levels. Unless you have an iron ring, HR types feel you are unqualified to innovate. (Or apparently be paid a living wage.) What then for dinosaurs like me?
If the future is fairly inevitably either managing appliances or herding clouds…cloud herding actually seems the less boring of the two. Much as I’d love to, I’ll never be able to afford to get an iron ring. So dreaming of being a Google Cloudherder seems as reasonable a future as any other.
I’d take rewarding over buzzword bingo. I’d also choose “pays a living wage and will exist in 10 years” over rewarding. Both combined would of course be the best possible option.
This is exactly what I have been wanting for /years/. Please, please, PLEASE kick down to the SME market!
Love to "just say no." Issue is that I'm not just a numpty with no knowledge of the stakes. I can't simply put the onus on the folks "up thataway" and wash my hands of it. There aren't any other good choices. There aren't more resources to be had. It's not that the brass are being too skinflint, it's that the resources that might have made this doable the "right" way absolutely had to be redirected to other areas.
It sucks and it meant a really bad couple of weeks. It meant having nothing but a series of bad alternatives to choose from. Still, it had to be done, and “just saying no” really wasn’t a viable option. It isn’t a matter of anyone’s ego, just of the cold reality of the times. As to “the answer is obvious”…finding another gig isn’t exactly easy. Even if I could, I wouldn’t throw the other admins here under a bus like that; there is a certain minimum necessary to get done before I would feel okay with heading for greener pastures.
The hellish weekend was one that bought me probably five MONTHS of trying to fix the old domain/users/security settings/permissions/GPOs/whathaveyou. Given the horrifying backlog of work I’ve got on my desk, I still think it was critically necessary. Yeah, I’ve got a pile of work in front of me, and yeah it wasn’t pretty…but if I want to keep this place glued together then I have to make some pretty hard choices.
For what it’s worth, I absolutely refuse to accept new projects for the next 18 months. The manhours/month required for operational support and regular maintenance has to drop dramatically before any new projects change things up. When a company/network grows too fast there is always a good risk of hitting exactly this wall. The methods and processes in place are too ad-hoc and informal to support the extant structure. Thus I need to take the next year and a half to rationalise and formalise everything I can.
Saying “no” to a project like this would have been cutting off my nose to spite my face. Political implications aside, the project benefits IT Operations just as much as anyone else. Not in the short term certainly…but definitely in the long term. From my point of view, if that means a crappy couple of weeks then so be it.
Written down and placed in a sealed envelope. Given to CxOs along with other critical information (documentation paths, directory restore passwords, etc.) In the unfortunate event I get hit by a bus it is required that at least some attempt be made to provide business continuity. In a way, that really is what the whole project has been about. The old network was too fragile by far. If I got hit by a bus then the chances someone could simply have walked in and taken over were pretty slim. That’s some bad juju right there, so most projects over the past year have been about addressing exactly that concern.
Sadly, when you are removing band-aids upon band-aids upon band-aids, periodically you run across one that just has to be ripped off. The worst of it however is over. We’re not out of the woods yet, but I can definitely see daylight. There is a visible future wherein the network complies with as many standards and best practices as is possible given the budget…and all divergences are carefully and thoroughly documented. The % of this network for which the configuration exists only in my mind is decreasing steadily every day.
I couldn’t be happier about that.
Your take is correct. I am simply not quite that cold blooded. I recognise that things are unsustainable over the long term. The issue is that I won't simply drop everything, quit, and leave everyone else holding the bag. If I am going to keep on here, then the network needs to be rationalised and the workload brought down to a more sane level. If I am going to leave here then the network needs to be rationalised and the workload brought down to a more sane level. Stay or go, what needs done is no different. To my mind simply upping sticks and leaving would be a dick move. That’s not an excuse…it’s a moral choice.
I might be that person in the proverbial corner sometimes…but still…this is my network. Stay or leave, my professional ethics require me to get the network into a serviceable enough state that no one else will have to put in these kinds of hours on it ever again. Erase the mistakes of the past and start it fresh. That was kind of the whole point of the overhaul…
I find "pristine" networks have one of two things in common:
1) They are block upgraded on a regular basis. All tech replaced at around the same time, tested to work together and then phased in side by side with the old network. Hardware, software…everything. Basically a completely new tested network is rolled out every few years with nothing harder to accomplish than a small data migration.
2) The network admins in question have access to two of the following in spades: Time, Money, Manpower. They can then do “organic” network growth by throwing their two plentiful resources at the network ensuring everything goes to plan.
Life is significantly more interesting when the network never stops growing and you never quite have enough resources. In the medium term the only way to survive is to reduce the resource consumption. Of course everything seems so easy when you are reading it on a website and not actually living through it.
I have to admit to some jealousy when reading the comments to these articles. So many people with such unquestionably pure networks! The number of people who apparently are in the position to simply put their foot down, say no and go elsewhere without financial hardship or feelings of guilt is extraordinary.
That said, I’ll be honest when I say that El Reg’s readers are on the whole a fantastic bunch. I’ve read several really good ideas in these comments. I’ve also had quite a few really worthwhile e-mails from readers also bearing things for my mind to chew on. Life’s a learning process; I don’t remotely claim to know it all. I have learned as much from the reaction of readers as I have from the project itself.
The articles have sparked debate; both here in the comments thread and in the real world where my responsibilities lie and I have to get things done. I am optimistic that the result of this debate will be positive. I have learned things. With luck some of my readers have learned things. What more could I ask?
I've not attempted to "trash [JaitcH's] linguistic skills." I merely seized a very rare opportunity to prove him wrong on something. He's right more often than not and he seems to enjoy proving others wrong as much as any of the regular commenttards around here, so...
...I get one narrow and ultimately hollow victory in. Also: is etymology a linguistic skill? Or does that count as a study of some form? Sort of like History, but with more of an OCD bent to it?
Pint because no offence was intended.
Time to refocus on security, Adobe.
Microsoft refocused on security and we went from "ha ha" default security in Windows 2000 to "not impenetrable but arguably within reach of the competition" in Windows 7. It’s not just Flash; it’s Adobe Reader too.
I second Dan's noscript call:
And I raise a few more plugins besides:
Adobe, Pull your damned socks up.
The computer attached to the human's skin varies greatly depending on the human. Some have deficiencies in hardware, software or both that lead to a complete inability to perform tasks such as unloading a dishwasher simultaneous with any form of contemplation.
The neat part about robots is that once you get the programming and hardware design right the failure rate is significantly lower. As is the performance variability of the operating unit. The downside is that you tend to get a shorter cumulative operational lifespan given the limitations of current materials technology. Additionally; wetware regenerates minor to moderate damage. Robots do not.
A quick Bing of "Danica Patrick" reveals quite a young lady. I thusly am forced to conclude that you have some high standards, sir. Unless you are the chiselled image of a Greek god yourself I would venture perhaps unreasonably so.
I’ve not seen her act, so that part I can’t really comment on.
I love you.
Avarice I can see being derived from avaritia. Avarice comes to English by way of some older interlinguistic mixing with French. French of course shares a common Latin root with all romance languages and they all head from there towards proto-indo-European.
Greed? Greed derives from "greedy" which follows a chain of language into the middle English through proto-Germanic all the way through to proto indo-European.
The proto-indo-European words for “greed” and “avarice” as far as I know had very different meanings. Contrary to avarice, which means what it does now, in virtually all antecedent languages greed means "hungry" or "to hunger for.” This is not precisely cognate with "avarice" except in certain dialects of modern English. (Even this is still debatable as even in certain modern dialects avarice is seen as a choice in behaviour where greed is still seen as a behaviour performed from necessity.)
Pedant for obvious reasons.
...cannot be good.
Ever more the reason to continue trying to replace everything I use with open source. I agree with JaitcH. What I buy I OWN. If you want to rent me software; show me a rental agreement. (I will promptly not rent your software.) Creators deserve to be compensated for the creation of their works…but regular citizens deserve to own what they purchase.
Your assessment (like many others) makes some big presumptions. First that we had the resources to do it any other way. Secondly that the company would have had the money or inclination to provide the proper resources if only I had thought ahead and told them they were required.
We didn’t have the resources required. In fact, no matter how loudly I complained we would never have had the resources required; the funding simply didn’t exist. Not only that, but as bad as it all sounds, two days of disruption (the Monday and Tuesday immediately following the changeover) during what is typically a very slow month for the company was significantly cheaper than the cost of the equipment required to do the change properly.
As to getting reamed out by C-level executives, bring it on! I would love an actual review of how IT in my company is (or is not) organised. Formalised (preferably hierarchical) management structures, direct and clear lines of authority and responsibility as well as concrete budgets, project timeframes and resources allocations would be a significant asset.
Put another way: it’s a 75 man company. There is a fantastic amount of “seat of the pants” /everything/. The crazy part is that we do actually succeed in getting it all done, with next to nothing and keeping it all running. I happen to agree that proper planning, reasonable timeframes and adequate resources are the proper way to do any project. I also don’t happen to live in a world where that’s /ever/ possible.
Telling the powers that be “that is impossible” will only earn you a “make it possible.” Everyone will sit down and discuss the consequences; if we push forwards with X, what will have to suffer to reach destination Y? In the case of a network changeover like what we went through, if you don’t have the resources and time to do it, then you have to accept exactly what happened.
We knew it going in. It wasn’t a surprise. We accepted that it would turn out this way. There was no available alternative. It’s the reason I write about my experiences; documented examples of how badly things can go are good case studies of why “doing it the right way” is necessary.
I thusly can’t agree with your assessment of “biting off more than I can chew.” I knew what I was getting into. I knew how miserable it would be. I also knew that all the alternatives were worse.
Well, I knew going into this that I would be up all weekend. Given the total lack of resources for this project it was inevitable. The contingency plan (reverting to the old network) would only have kicked in had we been unable to get 100% of the critical systems and 80% of the secondaries up and running by start of business Monday. As it was, we just slid in under that line.
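For anyone who wants that go/no-go rule spelled out, it can be sketched roughly as below. This is only an illustration in Python: the `SystemStatus` record, the `should_revert` function and the example system names are hypothetical and were not part of the actual changeover; only the two thresholds (100% of critical systems, 80% of secondaries) come from the comment above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SystemStatus:
    """Hypothetical record for one system at the Monday-morning checkpoint."""
    name: str
    critical: bool  # True for critical systems, False for secondaries
    up: bool        # True if the system is running on the new network


def fraction_up(group: list) -> float:
    """Share of systems in the group that are up (an empty group counts as fully up)."""
    if not group:
        return 1.0
    return sum(1 for s in group if s.up) / len(group)


def should_revert(systems: Iterable[SystemStatus],
                  critical_required: float = 1.0,
                  secondary_required: float = 0.8) -> bool:
    """Return True if the contingency plan (revert to the old network) should kick in."""
    systems = list(systems)
    critical = [s for s in systems if s.critical]
    secondary = [s for s in systems if not s.critical]
    return (fraction_up(critical) < critical_required
            or fraction_up(secondary) < secondary_required)


# Illustrative checkpoint: all critical systems up, four of five secondaries up.
checkpoint = [
    SystemStatus("domain-controller", critical=True, up=True),
    SystemStatus("email", critical=True, up=True),
    SystemStatus("wsus", critical=False, up=True),
    SystemStatus("ocs", critical=False, up=False),
    SystemStatus("bes", critical=False, up=True),
    SystemStatus("teamviewer", critical=False, up=True),
    SystemStatus("print", critical=False, up=True),
]
print("revert" if should_revert(checkpoint) else "carry on")
```

With exactly 80% of the secondaries up the check passes, which is about as close to "just slid in under that line" as it gets.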
As to why we did this in the first place…that’s pretty complicated. There were several requirements at play. The IT folk (myself and the other sysadmins) needed a new domain; the old one's Active Directory schema was pretty beat up for various reasons and it was (shockingly) less work and hardship to block replace than to repair. (With more than 75 users, that would not have been true.)
Secondly, in order to facilitate things for some planned future single-sign-on jiggery-pokery, we needed to have all our users transitioned to the new username naming scheme.
Thirdly, the CTO is big on “using the newest software simply because it’s the newest software.” The pressure was on to upgrade all the software to Microsoft’s latest and greatest.
Lastly, this absolutely had to be done by September. Due to budgetary considerations, we weren’t going to get the software licences until at least July, during which I was on site at other locations preparing server hardware and deploying Wyse thin clients. This essentially meant that despite all the planning earlier on in the year, we didn’t actually have a chance to start knocking together the domain controllers, template userspace VMs, email server or anything else until the beginning of August. The changeover occurred the weekend of the 20th of August.
Due to resource constraints, there were some pretty severe limits on what we could build beforehand, but we did manage to get most of the user creation out of the way. We got all five DCs built, the email server built, and the BES/WSUS/OCS/Teamviewer server built. (Though OCS ended up having to be reinstalled.) We got the template userspace VM built, but only so much could actually be put into the template or pushed via GPOs…a great deal had to be punched in manually.
There were additional considerations, but those are the big ticket items. Many companies go through such changeovers, but in general they take 3-6 months to do it. To do so, you have to have the gear to run both networks side by side. You establish trusts between the domains, allowing users on the new network to access resources on the old and vice versa. With the right resources and enough time, these sorts of migrations don’t have to be nearly as painful as what we went through.
Indeed, the bits discussed were only the painful parts. There’s plenty of stuff still active on the old network. (Our site-to-site VPN servers for example.) Since they don’t actually have to be part of the network to do their job (for example flinging bits at the right subnet) they haven’t been moved/replaced yet. The painful stuff is the “live, in your face” elements that we had to move from one net to the other without the ability to phase stuff over gradually. There’s still plenty of stuff for which we do indeed have that opportunity and so we are taking our time.
It’s hard to tell the whole tale from every possible angle with every possible scrap of information in five 500-word(ish) articles. When I sat down to write all of this immediately post Doomsday Weekend I had somewhere in the neighbourhood of 12000 words. Some of that information got cut into other articles (not directly Doomsday Weekend related) such as my recent crackberry and Exchange 2010 articles. The rest has languished in my “potential material for future articles” folder. There is a vast gap between 12000 words worth of story and the 2500-ish I was asked to tell it in. (I am not yet so good a writer as to accomplish that level of compression without zipping the document.)
That is why things like “the reason for the move” or “what the contingency planning was” got left out. When choosing what to make the articles about, I chose to talk about the worst of the worst; lay my mistakes and errors bare for others to learn from. I finished it with a bit of a happier article for no particular reason other than to change up the total “doom and gloom” feel of the set.
Thusly I get to spend time in the comments answering everyone’s questions about the who what when where how and why. All is good so long as somewhere along the line some newb administrators learn from my mistakes and avoid them. That would be the best outcome that I can think of for these articles.
Comments on El Reg often have as much or more information as the articles themselves anyways. In any case, thank you for the nice comment!
The message you should be taking away from this is that it isn't the way to do things. Comments sections on previous articles have beaten this to death, but I'll go for another round.
First off, these articles are about how things go badly when you don’t have the time, resources or opportunity to plan for every contingency. In a perfect world, no one should EVER do a huge cut-over such as was documented in these articles. It’s a monumental pain in the ***, both for the sysadmins and the users. I didn’t exactly have a lot of choice in the matter; the reasons behind which are also detailed to death in the comment threads of the previous four articles.
What was the contingency plan? The contingency plan if this all went horribly sideways and couldn’t be reverted was that the vast majority of everything was virtual machines. While we didn’t have the resources to keep everything online at the same time (I.E. run the two networks in parallel) storage space is not something we lack. Thusly we had (and still have, since I haven’t deleted the old VMs yet) the ability to simply turn the new network off and fire up the old one if the world truly did end.
The heartache there would have been disjoining the physical computers from the new domain and rejoining them to the old. That wouldn’t be the worst thing; those systems would still have all the old profiles from the old network; things would simply have carried on the way they were before. At any given point in the operation a complete reversion to the old network was no more than half an hour out. If you want to take away one positive message from this all, it is the awesomeness of Virtualisation in that regard.
As to signing off on this project, people don’t “sign off” on anything where I work. You are given a series of deadlines and requirements then left entirely up to your own devices. There aren’t formalised processes for anything unless you create them. There is rarely enough time to create any formalised processes nor the resources to do so. Every chance I get, I sneak a little organisation into the mix, but I am not often afforded the opportunity.
Again; you’d have to read the comment threads from previous articles, I went over a lot of this territory already. Long story short; we’re undermanned, with too few resources and in addition to keeping the network running, its size is growing at a fairly rapid pace. I don’t put in 80+ hours because it’s the best way to do it. I do it because I have no choice. No one should be taking the Doomsday Weekend series as a shining example of what to do, but as a series of lessons in how to avoid making the same mistakes I made.
Additionally, they are a lovely set of articles to think about where you work, and be thankful if you happen to work in a place with enough resources to pull these kinds of changes off (or do anything, really) “properly” and “by the book.” Where I work really isn’t the worst there is either; there are plenty of sysadmins out there with fewer resources, less time and more demanded of them. These articles are intended not only as lessons in avoiding my mistakes, but in not taking what you have for granted.
I doubt that.
An Arab male might have been tackled and wrestled to the ground in a very undignified manner - people are less worried about tackling men than they are women - but it seems to me that aircraft staff are supposed to be trained to recognise things like panic attacks. A great many people suffer from them, and they are /ENTIRELY/ involuntary. Nobody CHOOSES to have a panic attack. The really lucky sufferers of them can sense them coming and take a pill to try to ward them off, but the very involuntary nature of them makes panic attacks all the more devastating.
I would never put a good bout of xenophobic racism past our American overlords…but panic attacks really would have to be something that airline staff (like bus drivers and other such mass transit personnel) would have to be trained to recognise. They are simply a fact of life.
To be really highly pedantic about it
You have no joy in your soul. How lovely for you that you must sneer at what other people enjoy. Who is sadder? The people who get a moment's passing interest and amusement from debating the pedantry related to a series of television shows they found interesting, or the individual who feels so threatened by this that they feel they must lash out at these harmless nerds?
TL;DR: It’s not a playground and folk ‘round here pity (rather than respect or fear) bullies. Grow the fnord up.
Almost too much information for 2:40am...
...but I'm glad I stuck through to read it all. Fantastic article with much juicy brain informations. More like this!
...and oddly enough, Bing is getting really useful. I converted to Bing about a month ago. I have used it pretty exclusively since, but I will be honest when I say I have had reason periodically to go back to Google. I am batting around the idea of writing a Sysadmin Blog article on it. Anecdotal comparison of both engines as relates to IT and non-IT searches. (I consider IT searches as those required to help me discover the source of errors, either in an IT Operations context or a PHP development context.) Additionally I could talk about user uptake of Bing (or lack thereof) during the latest network overhaul.
I do have two slots for articles left for which I don’t have topics…I am unsure if I want to burn one on Search Engines though. What do you guys think?
Qo'noS is the Klingon homeworld, but it is only one planet among many. Kling or more properly Klinzhai would refer to the "Citizens of the empire." Mostly Klingons, but some subjugated races that had been integrated into Klingon society as full citizens as well. To invite the totality of the Klinzhai would be to invite members of many various species that made up an empire spanning hundreds of light years and dozens of habitable planets.
Mine’s the one with the misspent youth in the pocket…
I'm sure they'll get away with "opt out after the fact" as well. "The vast majority of people want Google in their home. Tighter integration between consumers and the third half of their brain helps Google to deliver quality advertisements that people want to see. If you don't want Google in your home, then maybe you shouldn't be doing anything in your home that you don't want other people to see."
You are debating prototypes
on an IT website. IT is an industry where we are regularly subjected to "prototypes" being sold to us as full version product. Just look at the entire Microsoft 2007/Vista generation of product. "Alpha" level product is almost unused anymore. The line between "beta" and "release" blurs significantly...
...what exactly does prototype mean to such a crowd? In the hardware world, I am sure it has great value. (Or not. iPhone wot?) Developers have cheated around the edges of the concept, and management/marketing types have blatantly ignored it. Operations guys have suffered the realities of this product development apathy for decades now.
Prototypes? How droll. They don’t contribute to shareholder value!
Just print it on a piece of silicon and it’ll fit in the palm of your hand! Simples!
Seriously though...when they do miniaturise the thing…that tech + VTOL spy droids they already possess make for some cheap and efficient "canary in a coal mine" droid mappers. Screw the military, these blighters would be excellent for civilian use. Hostage situation? Send in the droid. Suddenly you not only have the building blueprints to plan any necessary assault, but also a comprehensive mapping of any defences the hostage-takers have put in place, office furniture that might impede you, etc.
Could be a hell of a thing for folks who need to check sewers/abandoned subway lines/what-have-you for structural stability before sending wetware down the rabbit hole to shore the thing up for another few decades of wear. This tech has some real possibilities.
Of course, it could also be the start of those flying “Minority Report” compliance droids. Let’s just ban Google from ever using it up front and save ourselves the grief and enrichment of lawyers worldwide, shall we?
Band-aids on top of Band-aids on top of Band-aids until, guess what, the only path through is to burn it all down and start from scratch. "Organic growth." Go from one server in one site to 40 physical servers and 200 virtual in five sites in seven years. Not exactly Google levels of growth...but then Google doesn't only have three Ops guys, either. IIRC, Google still does burn it all down and start from scratch every few years anyways. They just have the manpower and the resources to transition rather than cut over.
Man, wouldn't being an Ops guy in one of their DCs be neat? Has to be one of the coolest Ops jobs on the planet.
A little too tl;dr, actually. As much as I would love to dump it all on the devs' laps...they are in a similar situation. Not enough devs, many demanding customers and a rapidly changing IT landscape. I do have to give them the benefit of the doubt. Having worked with devs from the companies that sell us their software for years now…they seem like good, but dramatically overworked, folk. There are no easy fingers of blame here.
Please read the part where I said that I believed the passes should be non-revocable. The idea is to create a barrier to entry. You are required to shell out a non-trivial amount of money to become registered...but that registration is irrevocable. I would recommend that the registrations be renewed every five years, similarly to keep the signal-to-noise ratio down.
The difference between a REGISTRY and a CERTIFICATION LIST is that a registry is simply "we are journalists and we are telling everyone so." A certification list is "we are applying to be journalists, please let us be so." That’s not thought control. It’s more like a tax. It ensures that only people willing to put up money (and thus be serious about being journalists instead of simply single-cause fly-by-night Internet griefers) will be able to be recognised as Journalists.
In Canada, we could set this barrier around $500 for five years. A completely trivial amount for a news agency, as this wouldn’t be per PASS…but a registration number that allows the agency to issue as many of its own passes as it should choose. For a dedicated blogger/hobbyist…this might be a slightly painful amount, but not out of the question if they were committed to their hobby. (It would still be among the cheaper hobbies I know of.)
$500 is however a lot of money for someone to pay in order to issue themselves a press pass to be considered a journalist for the sole reason of barging in on an event and causing trouble.
If you are worried it keeps citizen journalists out...nothing would prevent a group of citizen journalists from banding together to form a sort of “freelancer blogger’s union.” That entity would be responsible for policing itself internally, but could apply for the $500 registration and issue press passes to all of its members. Whether or not that entity were taken seriously by anyone who wasn’t FORCED to take it seriously (because it is a registered entity allowed to issue press passes, the government and police would have to allow its journalists the same freedoms as any other news organisation) would depend on how judicious it was in giving out those passes.
An entity that was formed by troublemakers for the sole purpose of being able to get press passes to harass people would most likely still have to be allowed by police to take pictures/conduct interviews/etc. It would however be blackballed from EVERYWHERE ELSE by event organisers/companies/etc. Unless it was completely formed by sociopaths, that social pressure would force it to weed out the worst of its membership. Again, improving the signal-to-noise ratio of those who have press passes via an economic incentive.
Understand I say the above as a die-hard socialist. I may disagree with capitalist methods most of the time, but in a true democracy, I believe the financial incentive here is the only way to increase the signal-to-noise ratio.
It was indeed a farce.
I happen to agree. We have three sites not located in the same city as myself. I took the time on the company dime to go to two of them. I trained up the store manager and assistant store manager in the new software insofar as was possible given the dearth of test equipment. When I attempted to do so in the second store I visited, any attempts at training the manager were brushed off. "Just install it, we'll figure it out when you do." The assistant manager was on holidays.
As we have a fellow in the third city who is capable of swapping a hardware component with about a 90% success rate, I was strongly discouraged from spending the time or money to visit the third city. Surely I could simply tell the pseudo-tech in the other city how the software worked, no? Predictably, this is the city where most of the wailing occurred. Not for lack of superhuman and heroic effort on the part of the pseudo-tech at the other site – he tries his best and I thank him greatly for it – but he’s not an IT body. (Let that in no way diminish the excellent resource he provides. I couldn’t administer that city without him.)
Risk mitigation isn’t something that’s considered until after folk have experienced that particular kind of failure. I went into this weekend knowing /exactly/ how bad it would be. I told everyone how bad it would be. I had a conversation with the CEO in which I explained to him that this weekend would be hellish on the IT Operations staff and the whole week following would be shaking out the bugs. We were going to miss things. We were going to fail at things, or some things would go wrong unexpectedly at the last moment.
Oh, and I didn’t want to roll out the newer versions of the software. I’d have been quite happy to sit on XP and Office 2003 until 2012. Though to be honest I would have upgraded Communicator to 2007 R2 because it actually /is/ better than 2005 R2.
As to staff periods of rest and rotation…they were enforced. I ensured that the other sysadmin and the bench tech got rest. As close to a full eight hours sleep per night as we were able to provide them. I personally was the one who put in the 82 hours straight because I will ask no man to do what I am not willing to do myself. I believe in leading by example and if I can make the lives of those I work with easier by working a little harder…then so be it. Besides, I needed them bright-eyed and bushy-tailed to handle the pile of crap that was going to hit us Monday morning.
It wasn’t an ideal situation. It was compounded by the fact that I made mistakes and misjudgements along the way. I should have had contingencies planned for a few things I didn’t and I probably had a few too many contingencies for things that didn’t blow up. Throw in the things I couldn’t control or predict, as well as nonexistent budget and impossible deadlines and you have a Doomsday Weekend. (I don’t use the term lightly.) Learned a lot though…some of which I am hoping to pass on through my blogs.
Maybe to the very ill informed.
Every time I think of Africa I generally think of the African Union (http://en.wikipedia.org/wiki/African_Union) and the phrase "maybe there's hope yet..."
Trevor Pott. No s.
I don't view myself as in any way heroic. Simply backed into a corner. I do think that the other two souls involved in this are somewhat heroic: they have many other choices, but chose to put the shoulder to the grindstone anyways. I wish there was a way to big them up something fierce; I think they have done the impossible under exceptional circumstances.
Myself on the other hand, I am theoretically in charge of IT Operations here. To my way of thinking, anything good that occurs in IT operations is because of the valiant efforts of my staff and everything bad falls on my shoulders. I didn’t have the resources to do this right and I was inadequate at MacGyvering a solution to pull it off given the limited resources available. That is indeed my fault; I will attempt in no way to dodge that responsibility.
This isn’t an article about heroics. This is an article explaining how it all goes horribly, horribly wrong when you're backed into a corner. When you get backed into a corner, learn from what's talked about in the article and find a way out of it. If nothing else then go back to El Reg on that day, print out these articles and present them to your manager. Whatever you do, don't let yourself get trapped in the same situation I got trapped in.
I knew exactly how bad this was going to be before I hit the weekend. I told everyone how bad it was going to be. I also had no other choice. Under no circumstances should anyone view what occurred as heroic or anything to look up to. I would be saddened beyond words if someone who read these articles was unable to learn from them and avoid the mistakes made.
This is the case here in Canada. My interpretation of the laws in the UK is that this would also be true there; however, IANAL, so take my thoughts with a spoonful of salt.
"It's not up to you to decide what you want. Apple will take care of that for you."
Apple dictate your desires to you. Google will predict your desires before you knew you desired them. So is Google telling Apple what to tell you to do?
Neat stuff for Mac users. Since I will have to support a few here soon, this is great information. Excellent article, Kelly!
As regards getting a project manager, your sentiments on this are something I can’t agree more with. Unfortunately, my opinion on the matter (that systems administration and project/departmental management are separate occupations by necessity) is not shared. I can do one or the other, but trying to be both results in bad things.
As to "upgrading" a department at a time...sadly it’s only possible when you have the gear. When you are switching Active Directory forests, it's very much an all-or-nothing proposition. It's not simply adding a domain to an existing forest, but actually a completely new forest. Now, if we had had the hardware to run both systems in parallel, I could have done some fancy things with trusts, and redirected Outlook for people on the new network back to the old e-mail server until we switched that, then directed people on the old network to the new e-mail server etc.
If we could have run the two networks in parallel I could have taken all the time I wanted to change over and this would have gone smooth as you please. When you have to do it using only the resources you have, then a forest change absolutely requires a complete cutover.