Reasons to be cheerful

It’s easy to complain about a tough weekend. It’s not so easy to remember the little things that made it easier. Not everything during a huge network overhaul is negative. I was thoroughly impressed with Exchange 2010. Teamviewer 5 was simply rolled out to replace Teamviewer 4 with no negative effects. I am always pleasantly …

COMMENTS

This topic is closed for new posts.
  1. JimC

    Good grief...

    Makes me glad I'm not at a Microsoft shop!

    1. Davos Summit

      P45

      This isn't a Microsoft problem. This whole series of articles reads like a list of what not to do. No contingency, terrible planning, non-existent testing and consistently going against best practice. It's amateur hour. I have to say if it was my company I'd be looking at the way IT is staffed and managed.

  2. Jason Bloomberg Silver badge
    WTF?

    One thing ...

    Others have said it already in posts to the previous articles and I've got to echo the sentiment; this huge system change had all the hallmarks of a potential disaster in the making. Was there anything which did not change drastically? Could you have made it any harder for yourselves?

    One thing you haven't mentioned is what was the contingency plan?

    Eighty-odd hours without sleep may be an impressive personal feat but is perhaps more indicative of the risk involved while flying by the seat of your pants. What if it had gone wrong? What if come Monday morning it was all tits-up and going nowhere fast? Put in another 80 hours straight?

    I find it hard to believe that anyone signed-off this project or that anyone would want to be involved in it. The message I'm taking away is that this is not the way to do things.

    1. Trevor_Pott Gold badge

      @Jason Bloomberg

      The message you should be taking away from this is that it isn't the way to do things. Comments sections on previous articles have beaten this to death, but I'll go for another round.

      First off, these articles are about how things go badly when you don’t have the time, resources or opportunity to plan for every contingency. In a perfect world, no one should EVER do a huge cut-over such as was documented in these articles. It’s a monumental pain in the ***, both for the sysadmins and the users. I didn’t exactly have a lot of choice in the matter; the reasons why are also detailed to death in the comment threads of the previous four articles.

      What was the contingency plan? The contingency plan, if this all went horribly sideways and couldn’t be reverted, was that the vast majority of everything was virtual machines. While we didn’t have the resources to keep everything online at the same time (i.e. run the two networks in parallel), storage space is not something we lack. Thusly we had (and still have, since I haven’t deleted the old VMs yet) the ability to simply turn the new network off and fire up the old one if the world truly did end.

      The heartache there would have been disjoining the physical computers from the new domain and rejoining them to the old. That wouldn’t be the worst thing; those systems would still have all the old profiles from the old network; things would simply have carried on the way they were before. At any given point in the operation, a complete reversion to the old network was no more than half an hour out. If you want to take away one positive message from this all, it is the awesomeness of virtualisation in that regard.
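
      For anyone wondering what that rejoin step actually looks like, it’s nothing exotic. Here’s a rough sketch of the idea only (Python wrapping netdom, with hypothetical domain and machine names; not the actual process we had ready, and the netdom flags are from memory, so check netdom’s help before trusting it):

          import subprocess

          OLD_DOMAIN = "OLDCORP.local"   # hypothetical old domain name
          NEW_DOMAIN = "NEWCORP.local"   # hypothetical new domain name

          def revert_to_old_domain(machine):
              """Drop a physical box out of the new domain and rejoin it to the old one."""
              # Unjoin from the new domain; the '*' makes netdom prompt for the password.
              subprocess.run(
                  ["netdom", "remove", machine,
                   "/domain:" + NEW_DOMAIN,
                   "/userd:NEWCORP\\Administrator", "/passwordd:*"],
                  check=True)
              # Rejoin the old domain. The old machine accounts and local profiles still
              # exist, so users simply carry on with the profiles they had before.
              subprocess.run(
                  ["netdom", "join", machine,
                   "/domain:" + OLD_DOMAIN,
                   "/userd:OLDCORP\\Administrator", "/passwordd:*"],
                  check=True)
              # Each box still needs a reboot before the join takes effect.

          if __name__ == "__main__":
              for box in ["RECEPTION-01", "ACCOUNTS-02"]:   # hypothetical machine names
                  revert_to_old_domain(box)

      The VMs themselves were even simpler: power the new ones down and power the old ones back up, which is why a full reversion was never more than half an hour away.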

      As to signing off on this project, people don’t “sign off” on anything where I work. You are given a series of deadlines and requirements and then left entirely to your own devices. There aren’t formalised processes for anything unless you create them. There is rarely enough time to create any formalised processes, nor the resources to do so. Every chance I get, I sneak a little organisation into the mix, but I am not often afforded the opportunity.

      Again; you’d have to read the comment threads from previous articles, I went over a lot of this territory already. Long story short; we’re undermanned, with too few resources, and in addition to keeping the network running, its size is growing at a fairly rapid pace. I don’t put in 80+ hours because it’s the best way to do it. I do it because I have no choice. No one should be taking the Doomsday Weekend series as a shining example of what to do, but as a series of lessons in how to avoid making the same mistakes I made.

      Additionally, they are a lovely set of articles to think about where you work, and be thankful if you happen to work in a place with enough resources to pull these kinds of changes off (or do anything, really) “properly” and “by the book.” Where I work really isn’t the worst there is either; there are plenty of sysadmins out there with fewer resources, less time and more demanded of them. These articles are intended not only as lessons in avoiding my mistakes, but in not taking what you have for granted.

      1. The Unexpected Bill
        Go

        It doesn't quite read like that though...

        I find that interesting. Not to be too critical (generally I've enjoyed your articles), but while I was reading this article I kept thinking "this is all going very well" and I thought that was the point...look at how well all of this is working. It's almost going *too* well, especially when the only thing that won't work afterwards is the installation of a driver. Hmmmm.

        All the while, I *was* wondering just how it is that you have such a charmed life that exempts you from the two rules of system administration* and how bold it was of you to do this without mentioning a backup!

        * all hardware sucks / all software sucks

        1. Trevor_Pott Gold badge

          Long story

          Not much space to shove it in.

          When looking for things to cut, I figured two things would hopefully not have to be explained in detail. 1) NOBODY does this sort of thing without an abort option. 2) I talked oodles about how everything was in virtual machines. I kind of hoped the readers would make the connection to "and the old VMs were kept around." Cutting 12000 words down to 2500 requires (as my editor puts it) ruthlessness.

  3. Charles Calthrop
    Thumb Up

    I really like these articles

    I think these articles are great. Just to echo the previous points though, surely the contingency plan should have kicked in after you'd been up for, say, 24 hours straight?

    My background is not in this area at all, so forgive what also might be a stupid question, and apologies if you've already covered this; but why all this bother in the first place? Is it to get a better ROI, or because of end-of-support periods, or for better functionality?

    Anyway, as I said, I really like the tone and coverage of these posts; thank you very much.

    1. Trevor_Pott Gold badge
      Happy

      @Charles Calthrop

      Well, I knew going into this that I would be up all weekend. Given the total lack of resources for this project it was inevitable. The contingency plan (reverting to the old network) would only have kicked in had we been unable to get 100% of the critical systems and 80% of the secondaries up and running by start of business Monday. As it was, we just slid in under that line.

      As to why we did this in the first place…that’s pretty complicated. There were several requirements at play. The IT folk (myself and the other sysadmins) needed a new domain; the old one’s Active Directory schema was pretty beat up for various reasons and it was (shockingly) less work and hardship to block replace than to repair. (With more than 75 users, that would not have been true.)

      Secondly, in order to facilitate things for some planned future single-sign-on jiggery-pokery, we needed to have all our users transitioned to the new username scheme.

      Thirdly, the CTO is big on “using the newest software simply because it’s the newest software.” The pressure was on to upgrade all the software to Microsoft’s latest and greatest.

      Lastly, this absolutely had to be done by September. Due to budgetary considerations, we weren’t going to get the software licences until at least July, during which time I was on site at other locations preparing server hardware and deploying Wyse thin clients. This essentially meant that despite all the planning earlier in the year, we didn’t actually have a chance to start knocking together the domain controllers, template userspace VMs, email server or anything else until the beginning of August. The changeover occurred the weekend of the 20th of August.

      Due to resource constraints, there were some pretty severe limits on what we could build beforehand, but we did manage to get most of the user creation out of the way. We got all five DCs built, the email server built, and the BES/WSUS/OCS/Teamviewer server built. (Though OCS ended up having to be reinstalled.) We got the template userspace VM built, but only so much could actually be put into the template or pushed via GPOs…a great deal had to be punched in manually.

      There were additional considerations, but those are the big ticket items. Many companies go through such changeovers, but in general they take 3-6 months to do it. To do so, you have to have the gear to run both networks side by side. You establish trusts between the domains, allowing users on the new network to access resources on the old and vice versa. With the right resources and enough time, these sorts of migrations don’t have to be nearly as painful as what we went through.
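
      To make that concrete: the side-by-side approach leans on a trust between the two domains. A minimal sketch of what establishing one looks like (again Python wrapping netdom, hypothetical domain names, flags from memory rather than anything we actually ran given our resources; check netdom’s documentation before relying on it):

          import subprocess

          OLD_DOMAIN = "OLDCORP.local"   # hypothetical
          NEW_DOMAIN = "NEWCORP.local"   # hypothetical

          def create_two_way_trust():
              """Create a two-way trust so users in either domain can reach the other's resources."""
              # netdom trust <trusting domain> /d:<trusted domain> /add /twoway
              # The '*' arguments make netdom prompt for credentials in each domain.
              subprocess.run(
                  ["netdom", "trust", NEW_DOMAIN,
                   "/d:" + OLD_DOMAIN,
                   "/add", "/twoway",
                   "/userd:OLDCORP\\Administrator", "/passwordd:*",
                   "/usero:NEWCORP\\Administrator", "/passwordo:*"],
                  check=True)

          if __name__ == "__main__":
              create_two_way_trust()
              # With the trust in place, users, mailboxes and file shares can be moved
              # over group by group instead of in one eighty-hour weekend.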

      Indeed, the bits discussed were only the painful parts. There’s plenty of stuff still active on the old network. (Our site-to-site VPN servers for example.) Since they don’t actually have to be part of the network to do their job (for example flinging bits at the right subnet) they haven’t been moved/replaced yet. The painful stuff is the “live, in your face” elements that we had to move from one net to the other without the ability to phase stuff over gradually. There’s still plenty of stuff for which we do indeed have that opportunity and so we are taking our time.

      It’s hard to tell the whole tale from every possible angle with every possible scrap of information in five 500-word(ish) articles. When I sat down to write all of this immediately post Doomsday Weekend I had somewhere in the neighbourhood of 12000 words. Some of that information got cut into other articles (not directly Doomsday Weekend related) such as my recent crackberry and Exchange 2010 articles. The rest has languished in my “potential material for future articles” folder. There is a vast gap between 12000 words worth of story and the 2500-ish I was asked to tell it in. (I am not yet so good a writer as to accomplish that level of compression without zipping the document.)

      It is thusly that things like “the reason for the move” or “what was the contingency plan” got left out. When choosing what to make the articles about, I chose to talk about the worst of the worst; lay my mistakes and errors bare for others to learn from. I finished it with a bit of a happier article for no particular reason other than to change up the total “doom and gloom” feel of the set.

      Thusly I get to spend time in the comments answering everyone’s questions about the who, what, when, where, how and why. All is good so long as somewhere along the line some newb administrators learn from my mistakes and avoid them. That would be the best outcome that I can think of for these articles.

      Comments on El Reg often have as much information as the articles themselves, if not more, anyway. In any case, thank you for the nice comment!

  4. Fatman
    FAIL

    RE: reasons to be cheerful

    My God man, my boss would never allow me to perform such drastic network remodeling over such a short period of time.

    In her mind, and in the CEO's (the owner's) mind, we (IT) exist to `make the company run`. There is no way that such a disruptive change would be tolerated, specifically because IT would have its ass reamed by each and every C-level executive for incompetent planning.

    We have had to re-locate and MERGE numerous branch locations into the main campus, and into the main computer network. (Branch in this case being located at another PHYSICAL site, running its own INDEPENDENT systems.) Networking determined which sub-blocks of address space would be assigned to specific organizational units, etc. Much of this s--- was done long before any physical move. The only thing left to the moving weekend was the physical relocation of equipment. And none of us worked 82 hours straight.

    Have you ever heard of the expression: "Biting off more than you can chew?"

    Methinks you didn't.

    1. Trevor_Pott Gold badge

      @Fatman

      Your assessment (like many others) makes some big presumptions. First, that we had the resources to do it any other way. Second, that the company would have had the money or inclination to provide the proper resources if only I had thought ahead and told them they were required.

      We didn’t have the resources required. In fact, no matter how loudly I complained we would never have had the resources required; the funding simply didn’t exist. Not only that, but as bad as it all sounds, two days of disruption (the Monday and Tuesday immediately following the changeover) during what is typically a very slow month for the company was significantly cheaper than the cost of the equipment required to do the change properly.

      As to getting reamed out by C-level executives, bring it on! I would love an actual review of how IT in my company is (or is not) organised. Formalised (preferably hierarchical) management structures, direct and clear lines of authority and responsibility as well as concrete budgets, project timeframes and resource allocations would be a significant asset.

      Put another way: it’s a 75-man company. There is a fantastic amount of “seat of the pants” /everything/. The crazy part is that we do actually succeed in getting it all done, with next to nothing, and keeping it all running. I happen to agree that proper planning, reasonable timeframes and adequate resources are the proper way to do any project. I also don’t happen to live in a world where that’s /ever/ possible.

      Telling the powers that be “that is impossible” will only earn you a “make it possible.” Everyone will sit down and discuss the consequences; if we push forwards with X, what will have to suffer to reach destination Y? In the case of a network changeover like what we went through, if you don’t have the resources and time to do it, then you have to accept exactly what happened.

      We knew it going in. It wasn’t a surprise. We accepted that it would turn out this way. There was no available alternative. It’s the reason I write about my experiences; documented examples of how badly things can go are good case studies of why “doing it the right way” is necessary.

      I thusly can’t agree with your assessment of “biting off more than I can chew.” I knew what I was getting into. I knew how miserable it would be. I also knew that all the alternatives were worse.

      1. Anonymous Coward
        Anonymous Coward

        Making it possible --- and the word "No"

        It is my experience that once they have picked themselves off the floor, listened to it echo around the room a few times, and been caused, by the shock of it, to turn their own egos down just a tiny notch, even the most autocratic bosses can accept that something is impossible or unreasonable. If they can't, and if the economy allows, the answer is obvious!

        Don't say I've obviously never been there: I've been there with directors of a Japanese giant, not to mention the local British staff lower down the chain but still higher than me. Sometimes it has to be done.

        That you succeeded at an eighty hour stint (I nearly said stunt) is incredible, and praiseworthy. Have a beer. That anyone should expect an employee to be capable of doing difficult work over such a period of time is ridiculous.

        There cannot be a system manager who has never worked long hours, and it is often to get ourselves out of trouble we ourselves created. Thirty hours was my longest and that was neither fun nor expected. I had to pick up the pieces after an AIX upgrade that a contractor screwed up. The ridiculous part was that I'd done the course on it, but as the new asst. sysadmin at that time, the company didn't know me well enough to trust me rather than the contractor they did know. That was one occasion on which I should have said "no", but I'd been there a couple of weeks, and I didn't know the contractor didn't know what they were doing!

        We got to know each other better over the years (and I had a good time with most of my Japanese bosses).

        (my mistake, on that occasion, was repairing the screwup rather than reverting and starting again, which would have taken less than half the time).

        1. Trevor_Pott Gold badge

          @Thad

          Love to "just say no." Issue is that I'm not just a numpty with no knowledge of the stakes. I can't simply put the onus on the folks "up thataway" and wash my hands of it. There aren't any other good choices. There aren't more resources to be had. It's not that the brass are being too skinflint, it's that the resources that might have made this doable the "right' way absolutely had to be redirected to other areas.

          It sucks and it meant a really bad couple of weeks. It meant having nothing but a series of bad alternatives to choose from. Still, it had to be done, and “just saying no” really wasn’t a viable option. It isn’t a matter of anyone’s ego, just of the cold reality of the times. As to “the answer is obvious”…finding another gig isn’t exactly easy. Even if I could, I wouldn’t throw the other admins here under a bus like that; there is a certain minimum necessary to get done before I would feel okay with heading for greener pastures.

          The hellish weekend was one that saved me probably five MONTHS of trying to fix the old domain/users/security settings/permissions/GPOs/whathaveyou. Given the horrifying backlog of work I’ve got on my desk, I still think it was critically necessary. Yeah, I’ve got a pile of work in front of me, and yeah it wasn’t pretty…but if I want to keep this place glued together then I have to make some pretty hard choices.

          For what it’s worth, I absolutely refuse to accept new projects for the next 18 months. The man-hours per month required for operational support and regular maintenance have to drop dramatically before any new projects change things up. When a company/network grows too fast there is always a good risk of hitting exactly this wall. The methods and processes in place are too ad-hoc and informal to support the extant structure. Thus I need to take the next year and a half to rationalise and formalise everything I can.

          Saying “no” to a project like this would have been cutting off my nose to spite my face. Political implications aside, the project benefits IT Operations just as much as anyone else. Not in the short term certainly…but definitely in the long term. From my point of view, if that means a crappy couple of weeks then so be it.

          1. Anonymous Coward
            Pint

            Well, of course

            If you sit down with "them upstairs" and jointly come to the conclusion that this has to be gone through, then I guess that's a different matter. I just hope they made it worth your while. When I did my 30-hour weekend I was still low enough on the ladder to get paid overtime. In that respect it was worth it. Later on I used to expect days off in return for extra days worked, or hours way beyond the call of duty.

            Yes... I was bolshy. Very bolshy. At the same time, I used to say that being a systems manager was like being a parent: you just had to do what was needed when it was needed. I also recognised that I was fantastically lucky to walk into my job (which I kept for 11 years) the week after I was made redundant. I really was bolshy, but not *that* bolshy :)

            Your plan for the future sounds pretty healthy.

            Have a beer. You need it!

            1. Trevor_Pott Gold badge

              @Thad

              I got to take the following Thursday and Friday off. Past that, well...it's a tough job market. It doesn't matter how hard you work if the resume papers don't have the right letters after your name.

              It's why I've taken up writing.

              That said, a beer sounds bloody grand. Bloody grand indeed...

  5. PaulWizard
    Joke

    What could possibly go wrong....

    "The administrator password on all joined systems, newly built or ported from the old domain, was reset to the new company standard."

    I'm guessing it was something easy to remember like "password"

    1. Trevor_Pott Gold badge

      *shock*

      13 char randomly generated! :P

      Oh. Joke icon. D'oh. More coffee...
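
      For the record, there’s nothing fancy behind those passwords. Something along these lines would do it; this is just an illustrative sketch in Python rather than the script we actually use, and the character set is only an example:

          import secrets
          import string

          # Characters allowed in generated admin passwords (illustrative only).
          ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*"

          def random_password(length=13):
              """Return a cryptographically random password of the given length."""
              return "".join(secrets.choice(ALPHABET) for _ in range(length))

          if __name__ == "__main__":
              print(random_password())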

    2. Notas Badoff
      Joke

      Very easy to remember, if it's from real life

      While it depends on whether the managers need to know the password, I'd think a simple "admin password == admin experiences" relation would work.

      Wouldn't even have to be nasty word(s), merely descriptive, unless you think "jiggery-pokery" is nasty (that is, _we_ do, but maybe management wouldn't recognize it?)

      1. Trevor_Pott Gold badge

        Admin passwords

        Written down and placed in a sealed envelope. Given to CxOs along with other critical information (documentation paths, directory restore passwords, etc.). In the unfortunate event I get hit by a bus, it is required that at least some attempt be made to provide business continuity. In a way, that really is what the whole project has been about. The old network was too fragile by far. If I got hit by a bus then the chances someone could simply have walked in and taken over were pretty slim. That’s some bad juju right there, so most projects over the past year have been about addressing exactly that concern.

        Sadly, when you are removing band-aids upon band-aids upon band-aids, periodically you run across one that just has to be ripped off. The worst of it, however, is over. We’re not out of the woods yet, but I can definitely see daylight. There is a visible future wherein the network complies with as many standards and best practices as is possible given the budget…and all divergences are carefully and thoroughly documented. The percentage of this network for which the configuration exists only in my mind is decreasing steadily every day.

        I couldn’t be happier about that.

  6. Anonymous Coward
    Megaphone

    Just me, then?

    Was it just me thinking "Good man! That sounds like a hell of a meaty project and something that I'd like to do myself" then?

    I've pulled these sorts of oh-fuck-here-we-go projects, where there's little planning and even less contingency, and nothing other than good intuition and skill gets you to your intended destination.

    Well done, anyway. Good write-up. Ignore the skeptics; they're just jealous that they don't have a seat-of-their-pants to fly by any more...

  7. Guido Esperanto
    Thumb Up

    I like this article

    It's the first of the "sysadmin" ones that I've read.

    From some of the comments I'm amazed at the number of fully compliant, totally kosher IT systems out there.

    Goes to show that my 10+ years of experience have been in minority places.

    I've had experiences like Trevor's and while it would be absolutely great to have all the support and abilities around, it's just not always possible. So you make do with what you've got in the time you've got.

    Some of these scenarios echo my own. Sure, you can cast righteous GOD OF SYSTEMS judgement on my attempts, but you know what, the job got done and stayed done, right up until I handed over. That was what was important.

    I think this and the other (if they are similar) articles highlight issues all around the world in companies that want maximum return on absolute minimal investment, and sometimes it does mean cutting corners.

    The good news is that it's a fantastic place to learn; nothing like being in the deep end to boost your knowledge.

    The downside is that when it goes wrong (and it does) there is nothing worse than the sinking feeling in your stomach about whether you can find a resolution quickly and, more importantly, whether it will actually resolve the issue.

    Kudos Trevor for spilling the beans on some real IT departments. There are far too many sanctimonious documents and people telling you how (in green fields) to do things right. Truth is, it's not always like that.

    1. Trevor_Pott Gold badge

      @Guido Esperanto

      I find "pristine" networks have one of two things in common:

      1) They are block upgraded on a regular basis. All tech replaced at around the same time, tested to work together and then phased in side by side with the old network. Hardware, software…everything. Basically a completely new tested network is rolled out every few years with nothing harder to accomplish than a small data migration.

      2) The network admins in question have access to two of the following in spades: Time, Money, Manpower. They can then do “organic” network growth by throwing their two plentiful resources at the network, ensuring everything goes to plan.

      Life is significantly more interesting when the network never stops growing and you never quite have enough resources. In the medium term the only way to survive is to reduce the resource consumption. Of course everything seems so easy when you are reading it on a website and not actually living through it.

      I have to admit to some jealousy when reading the comments to these articles. So many people with such unquestionably pure networks! The number of people who apparently are in the position to simply put their foot down, say no and go elsewhere without financial hardship or feelings of guilt is extraordinary.

      That said, I’ll be honest when I say that El Reg’s readers are on the whole a fantastic bunch. I’ve read several really good ideas in these comments. I’ve also had quite a few really worthwhile e-mails from readers, bearing things for my mind to chew on. Life’s a learning process; I don’t remotely claim to know it all. I have learned as much from the reaction of readers as I have from the project itself.

      The articles have sparked debate; both here in the comments thread and in the real world where my responsibilities lie and I have to get things done. I am optimistic that the result of this debate will be positive. I have learned things. With luck some of my readers have learned things. What more could I ask?

      1. Guido Esperanto

        one other "pristine network" aspect

        one other "pristine network" aspect, which tends to be more of a negative than a positive.

        Ideal networks (those that are governed by an ITIL framework, have oodles of political bureaucracy and where nothing gets approval without several signatures and ass covering) often do NOT have the ability to react to a scenario as fast as those companies whose IT bods hold all the cards.

        For example, a new 0-day exploit appears for a particular brand of DB, say MS SQL; MS will release a patch (eventually) to plug the hole.

        The corps will perform testing to ensure compatibility and maintain performance, while the non-corps will just install the bloody thing and hope for the best.

        Sure it could go wrong and ass-holes start a puckering (those that know will understand!) but nothing gets the heart racing and blood pumping quite like unplanned server downtime.

    2. xj25vm

      @Guido Esperanto

      "I think this and the other (if they are similar) articles highlight issues all around the world in companies that want maximum return on absolute minimal investment., and sometimes it does mean cutting corners."

      Really? That doesn't exactly go along with the following from Trevor's own comment:

      "Thirdly, the CTO is big on “using the newest software simply because it’s the newest software.” The pressure was on to upgrade all the software to Microsoft’s latest and greatest."

      The reality of the matter is that, if you look carefully, many of these extreme feats are wholly avoidable by looking sideways at a project, planning differently, and being realistic about what is truly required and when. Sounds more like a combination of business ego from the CTO, who naturally has little understanding of IT systems, coupled with a failure of his/her IT people to firmly educate him/her on the merits, needs and downsides of a full upgrade.

      I do know the thrill of getting the job done on the edge. The feeling that you are out there, at the limits of the possible. Feeling that breeze - the breeze of the frontier, the great thrill of pushing your personal limits as far as they will go. And it is great stuff nevertheless. I also can see that Trevor, like others, has enthusiasm for his work. And that is absolutely great. Every industry, not only IT, has benefited from enthusiasm. We all need it - that's how great work gets accomplished.

      But may I suggest stepping out of those IT contractor/sysadmin/technical guy shoes for a moment, and slipping yourself into the shoes of a well-heeled business person. The fact of the matter is that we are in an industry where, according to some statistics, 50% of large projects fail. Can you imagine this happening in any other, more mature industry? What if 50% of the buildings the construction industry put up never made it to delivery day, or collapsed afterwards? And this high failure rate is caused in large part by poor management and planning for those projects that failed.

      The fact is that running projects on high risk/low contingency ratios will work a few times; you will be the hero for a while. But then the inevitable will happen; reality and statistics will catch up with you and bite you hard in the backside. All those screw-ups you hear and read about, where contractors/sysadmins/DBAs lose entire email databases at ISPs, entire business databases get purged by mistake, the wrong plug gets pulled, the wrong storage device gets emptied. All those disasters costing thousands, tens of thousands or hundreds of thousands of dollars/pounds/euros - it's just stuff that happens to others? You will be surprised, if you look closely at those stories, how many of them were contractors/systems administrators/programmers filled with enthusiasm for their work, pulling extremely long shifts (like they did many times before), running through a poorly planned, no/low contingency procedure (like they did many times before). They had been heroes plenty of times before - but it eventually caught up with them. And everybody will rate them as incompetent - when the high-stakes gamble didn't pay off.

      Professional drivers are only allowed to work a limited number of hours before they are required to rest. Indeed, they push around tons of metal which, out of control, would endanger lives. But they operate, by comparison, a far, far simpler set of controls than a server administrator. Speaking as a hypothetical hard-nosed business person, I wouldn't let employees/contractors in the workplace I am in charge of operate a coffee machine after 80 hours of work - never mind a business critical server. I have to admit that, as an IT guy - I'd get a kick out of the feat - but one and the other are two different matters. The potential for human error in such an advanced state of exhaustion is astonishingly high. Never mind the lack of backup/contingency planning. Just having anybody working so tired on IT systems constitutes, on its own, taking business risks way beyond any sense.

      1. Trevor_Pott Gold badge

        @xj25vm

        Except that there was a contingency plan, and it isn't so simple as "the business side of the equation withheld funds." Moreover...the project didn't fail. In fact, it occurred more or less as I figured it would. Some unexpected things went wrong, but no more than I figured would...given the circumstances. A bad couple of weeks were had...but they are over with, and everything is working grandly.

        For all the wailing, gnashing of teeth and commenters with axes to grind…the project in question did what it was supposed to do, within predicted parameters, and frankly, it could have gone a lot more sideways than it did. If it had, I still had an abort option.

        I agree with you: IT as a whole, myself included, accepts too many failures. We are very quick to accept compromise. Truth is, doing IT "properly" is bloody EXPENSIVE. Outside the reach of some organisations, like mine, trapped in the mid market. It's easy to say flippant tripe like "well just quit and work elsewhere." That's bull. I've a family, mortgage payments…what's more, I've a sense of loyalty and professionalism that prevents me from screwing over folk I work with, and for.

        I understand your overall frustration with how this same story of failed or skin-of-one's-teeth IT projects is continually repeated. I honestly do. The truth of the matter though is that life is never so black and white as it can be made to seem by post-operative diagnosis of someone else's issue over the Internet. That's the point of talking about the elephant in the room and doing articles not about how things SHOULD be done, but rather how they sadly end up being done in the real world. Learning, thinking…growing our minds beyond just our own experiences by taking those of others into consideration.

        Is there no value in your world for innovation beyond “network management by whitepaper?” Should all companies that can’t refine/upgrade/purchase/manage/whatever their IT to some arbitrary standard simply close up shop and go out of business? Who gets to decide? Why should any business owner/manager prioritise IT over other business units? In the SME world, there is almost never enough to go around. Why should IT be immune to the concept of “good enough?” What makes IT so special?

        I’d love to know, because as someone who works in IT, knowing what supposedly makes me more important than the rest of the company would be a fantastic boost to my ego. Who knows, maybe you have a valid answer with concrete reasoning. Then our entire industry can use it during our various budget talks each year.

  8. xj25vm

    Excuses

    All these comments and replies on this article, and the previous ones, remind me of some stuff I read (and learned out of my own mistakes) a while ago. And that is the fact that the higher you are on the IT ladder - the more political skills you require. IT skills alone are not enough anymore. I don't work at this sort of level or scale - but one thing I learned early on when dealing with clients is that if you don't have enough balls to rein in the clients'/managers' decisions when necessary - sooner or later you will find yourself in a very tight corner, with half of the world collapsing around you, working insane hours - and sometimes even having to take all the blame for decisions that weren't even yours in the first place.

    It is a whole different set of skills, but the sooner you learn to walk away from others' wrong decisions, firmly refuse to get involved in what you know is bordering on the impossible and constitutes awful planning, the better. Even if it means losing a lucrative contract, or risking a profitable or promising business relationship. The costs otherwise tend to be a lot higher when everything goes wrong. For one's personal health, business reputation and even finances. Sometimes out of greed and the lure of a well paid project, sometimes out of technical enthusiasm and the attraction of a fresh technical challenge, IT people find it hard to put their foot down and say "Enough is enough, this is a bad idea, and if you want to proceed with it like this, good luck with that but I'm not going to be part of it".

    Just my two cents.

    1. Trevor_Pott Gold badge

      @xj25vm

      Your take is correct. I am simply not quite that cold blooded. I recognise that things are unsustainable over the long term. The issue is that I won't simply drop everything, quit, and leave everyone else holding the bag. If I am going to keep on here, then the network needs to be rationalised and the workload brought down to a more sane level. If I am going to leave here then the network needs to be rationalised and the workload brought down to a more sane level. Stay or go, what needs done is no different. To my mind simply upping sticks and leaving would be a dick move. That’s not an excuse…it’s a moral choice.

      I might be that person in the proverbial corner sometimes…but still…this is my network. Stay or leave, my professional ethics require me to get the network into a serviceable enough state that no one else will have to put in these kinds of hours on it ever again. Erase the mistakes of the past and start fresh. That was kind of the whole point of the overhaul…

      1. xj25vm

        Title

        "Stay or leave, my professional ethics require me to get the network into a serviceable enough state that no one else will have to put in these kinds of hours on it ever again."

        It is very much an interesting question. I respect your obvious professionalism, and passion for your work. The following is not really a direct counter-argument to your statement(s). More like a follow-on rhetorical musing. I have at times wondered myself where the line should be drawn. How much of it is my personal responsibility, and how much is others'. Some points to ponder are:

        1. "this is my network". In way, it is your baby. Like any professional who has worked hard on a project, we all know the feeling. On the other hand there is always the danger of failing to realise that this is a business environment, that all that network and everything that it contains legally belongs to somebody else (unless you are one of the owners), and at the end of the day, as history taught us, even in the highest ranking positions, requiring the greatest amount of skill and talent, no one is truly non-expendable. I know, it's harsh, but it's just the way it works.

        2. On occasion, there are employees of a company, in management positions or otherwise, who will willingly take advantage of colleagues who have a passion for their work and high principles. It then becomes tricky in these circumstances to draw a line between personal ethics and responsibility, and being taken advantage of. For example, one might see pulling a network into shape on a minimum finance and time budget as the responsible thing to do, while their superior might look at it merely as an opportunity to spend less than what should be spent on a project.

        As I was stating above, I am not aiming these matters wholly and directly at your situation and your articles. It's more the fact that everything that was talked about reminded me of some of my own musings.

        In all eventuality - the spirit in which you do your work has to be commended.

        1. Trevor_Pott Gold badge

          Being expendable is no bad thing.

          Expendable people get vacations. The kind where they don't have to remote in every day for four hours to deal with email/various fires. I am okay with being expendable.

          One thing that needs to be brought up in these discussions is that, in IT as with many occupations, heroics aren’t rewarded. You won’t get a pat on the back, a bonus or any respect from your peers. (Certainly not if you blog about your experiences!) Hard work isn’t rewarded. What you might get for your trouble is something approaching job security. Maybe. Not because you cannot be replaced (everyone can be replaced) but because replacing you with a sufficient number of 9-5ers is financially unpalatable.

          That ever shrinking niche is SME systems administration. It’s a world where the resources (time, money, manpower) are tight, competition is fierce and sacrifices are made not only by the rank and file but by the business owners as well. Part of the social contract in place is that the business provides non-monetary incentives to keep staff on. Allowing the nerds their own coffee pot, ignoring the plushies and geeky posters and putting up with the quirks, tics and lack of social graces. In exchange, we give 110%. We go above and beyond with the expectation and understanding that we won’t get fired for wearing a binary tie and inadvertently causing the dress code nazi an apoplexy.

          In larger enterprises, union shops or government IT departments, the social contract is different. I think that so long as one is aware of which social contract they are signing, they are not being taken advantage of.

          Now, when one side reneges on their part of that bargain…that’s a whole other story. What I find far more common (and despicable) than one lone individual taking advantage of a colleague’s enthusiasm is the corporate culture that says “the job market sucks. It’s time to rescind our side of whatever social contract we have with our staff because they have no available alternatives.” For some companies that is cutting pay/hours. For others it is removing non-monetary incentives from the equation. For all of them it involves removing job security. (Rule by fear!)

          When the market picks up, and EVERYONE starts heading for greener pastures, the corporate world bemoans the fickle nature of employees and calls for greater immigration.

          Technology and best practices aside, we’re really having a debate about the value of LOYALTY. Corporations don’t tend to have any towards their employees. What value then for an employee to have loyalty to their company? I don’t have an answer to that. Some part of me finds it important. I think that in the SME space, the concept still retains some tattered shreds of value, but it is here that my experience hits a wall.

          I’m good at this here computer fixing business. I’m good at research and I can invent new and innovative ways to solve almost any technical problem, even on impossible budgets. I work hard, and obviously have a self motivated (driven?) nature that is fairly rare amongst folk my age. I have identified my strengths. Hurrah.

          What of weaknesses? Well, I obviously work too damned hard...to a detrimental point, most likely. Somewhere in there though is the knowledge and realisation that I need to learn when to risk saying “nyet”, even if it runs the risk of breaking the social contract an SME admin has with his employer. Always there are limits. I am still learning mine. I honestly don’t think this project was one of those cases…but the reaction of my peers here on El Reg shows that it was probably on the border.

          That’s a good thing though. Growing, learning...even these very debates and discussions in the comments: where are the limits? Where do the responsibilities of the business owners, management, IT staff, and the user base begin and end? I suspect there is no one true universal answer. Each environment is different; each admin must make their own independent judgements. By reading articles and comments from other admins, real life stuff that isn’t the sterile perfection of a whitepaper, we can see where we stand in relation to our peers. Who knows, we might all learn something.

          I sure have.

          1. Anonymous Coward
            Anonymous Coward

            An article?

            This would make an article in itself. It is a great reply.

            1. Trevor_Pott Gold badge
              Happy

              @Thad

              Tell my editor that!

              Actually, I am quite shocked they let me comment around here, given that I write articles for them. I am sure I cause some people much angst. I am also not sure I can help myself. Writing is my catharsis. That there are people somewhere who will pay for it is something I consider extraordinary.

              It’s a good way to talk, debate…work out problems and ideas. The fiancée is back from her acting job soon. I am sure that once my regular conversational companion has returned, the volume of comments from me will drop off rather precipitously. Until then, driving Sarah into an early grave by writing long comments to everything I can find has made the past five months actually survivable.

              Counting the days until the end of the month…

              1. Sarah Bee (Written by Reg staff)

                Re: @Thad

                Yes, I trust you'll be sending flowers.

                1. Trevor_Pott Gold badge
                  Heart

                  Of course.

                  It is the very least I can do.

                  <3 Sarah.

    2. foxyshadis

      @xj25vm

      According to Trevor, the window for the best possible time to migrate was rapidly approaching with preparations barely done. That leaves a few alternatives. Skip it and put it off for another year, tell upper management to go stuff themselves, deal with the inevitable consequences, and approach your work without passion. Hire outsiders to make it happen, robbing you of the experience, the money, the adulation, and the pride, not to mention introducing even more unknowns and mistakes that you'll have to support later. Or bite the bullet, take prep to a fever pitch, get the hands-on, and build it exactly how you want it during the short time you have left. (I've done all three when I felt they were warranted.) There is no "oh, just duplicate the hardware, then buy double the software licenses and step up to the most expensive version of each product to enable the clustered changeover" option.

      I don't really believe anyone who hasn't worked in an SME environment has any right to denigrate the decisions of passionate IT from the standpoint of their govt/enterprise background. Give advice, point out mistakes, offer alternatives, yes. But claiming that they should spend and best-practice their way out of every jam shows an incredibly woeful misunderstanding of an economic sector with constant cash shortages. Doing things under budget and early _is_ supporting business, in environments where IT is a business enabler, not a cost center always at odds with the rest of the company.

      As the staff had not had previous experience with a massive switchover to 2008/Ex2010 etc, putting it off to another year wouldn't magically give them the experience to deal with every problem and make the switchover perfectly seamless. The only way to become deeply proficient is to make some mistakes and learn from them while you get everything working. That's the best possible outcome of any IT project.

  9. This post has been deleted by its author

    1. This post has been deleted by its author

This topic is closed for new posts.
