There is no statement quite so eloquent, nor action so elegant,
as a smack in the mouth!
If you meet him, Bill, let me know; I'll back you up.
And that's one of the easier chores our reader found himself faced with in a new temp job. Most weekends, our On-Call feature looks at the odd situations readers find themselves in when called to do something on a client site or in the dead of night. This week we're making an exception for reader “Bill”, who rates himself as “ …
I've been in this situation several times, and in each case the predecessor was not a sysadmin. He was a member of staff with some technical literacy who started with the system when it was around five workstations using workgroup sharing. The guy looked after it while doing his normal day-to-day job. Over the years the number of desktops increased, and the guy, while still doing his day job, figured out how to install Windows Server and Exchange; it probably took him a few attempts. His day-to-day job didn't relent, and he found himself working later and later into the evening. As the number of users increased he became the help desk for every jammed printer and power cycle. Fast forward a couple of years: his normal work is suffering, he no longer has the time for IT, and he's mentally exhausted from the stress of keeping a multimillion-pound company running. Then he leaves. The company takes on a new person to do his non-IT job, and we get called in to look at the IT.
The owners of the company probably have no idea what he was doing to keep the system going; after all, he'd just always done it for years, and they never needed to spend money.
"The owners of the company probably have no idea what he was doing to keep the system going; after all, he'd just always done it for years, and they never needed to spend money."
That's exactly the problem I see repeatedly, along with the mentality that everything can be done on a desktop windows box.
"Idiot boy" quite likely burned out, and he'd never learned decent practice as this wasn't his area, nor was he paid enough to do it properly. At some point he either signs off sick, or says "fuck it" and goes on a richly deserved vacation.
This happens with volunteer stuff too. At some point people expect that things get done and can be pretty rude when they're not. At that point the volunteer might well say "sod this" and walk away from it.
"This happens with volunteer stuff too. At some point people expect that things get done and can be pretty rude when they're not. At that point the volunteer might well say 'sod this' and walk away from it."
That ties in with an earlier discussion where I stated I would never take on work for a charity. This is basically reason 2, after "unlikely to ever get paid": you tend to inherit a mess which is a bit like a house of cards or an erect member: it only stays up as long as you don't f*ck with it (my apologies for being crude here, but I suspect that anyone who has ever been near such a situation must be a saint not to utter a few words that would get bleeped out on daytime TV).
The problem is then that YOU were the last one to touch it, so it's your fault now and you end up fighting to get this mess into some shape before you depart - without any expectation of payment, or appreciation of the valiant battle you fought.
60 users? It would have been easier to make a new up-to-date golden image and redeploy via FOG. Office updates alone will swamp a PC for an hour or so "in the background", never mind service packs (four years of updates will probably include a service pack). WSUS for 60 users isn't that network-intensive; same for SQL: if it has enough RAM and disk space then it's probably fine. Exchange really needs its own server.
I inherited a network 10 years ago running a pair of 2k3 servers; both were P4 Xeons (PowerEdge 2800s) with 2GB of RAM each. One ran IIS, SQL and profiles (it was a domain controller); the other had Exchange 2003 and was a domain controller AND the enterprise cert server, plus shared drives.
No central AV, each of the 200 desktops updated individually.
My god, you have never experienced pain until you have seen Exchange trying to run in 2GB of RAM. (It was a school with 80 staff and 600 pupils.) Apparently they had spent 100k upgrading the network three years previously. I wanted to see receipts for what he had spent. The majority went on new cabling (from cat5e to cat5e) and new network cabinets, oh, and a silly great massively overprovisioned tape drive for each server.
At least the 100GB RAID 5 server drives were backed up onto tape each night.
That, and also trusting Windows to free up space automagically was probably not the brightest move, especially in that case. On old boxen (even relatively well-managed ones), this leads to disaster more often than not.
I can understand the state of mind that led to the decision, though.
Some IT services companies aren't any better. I took over at one company where the IT company had never patched the Windows XP machines and had never run virus scans on any of the PCs. Better still, my employer had Exchange Server, with every account externally available via Outlook Web Access. The problem? The service provider had reset all of the passwords for every user and stopped them from setting their own, so every user in the company had the same password: 12345!
The first thing I did was disable OWA, set all passwords to require a change at next logon, and only turn OWA back on as users requested it. Out of 200 employees, two users actually knew they had access.
The reason why they had never run AV? The Windows XP machines still had SP1 and 256MB of RAM. When I turned on local scanning, the PCs were unusable for nearly two days, scanning nearly empty 40GB drives! I managed to get most of the machines upgraded to 2GB of RAM and patched them.
I also had an IT department who thought Linux servers didn't need patching because, well, Linux. I mean SUSE Enterprise from 2001 isn't going to be vulnerable to anything, is it?
> Some IT services companies aren't any better
But you can only say that if you know the whole story.
I work for what some future poster might call "one of those IT services companies that's no better" - but with some customers you just can't do things right. They won't pay you for your time to do stuff (especially at out-of-hours rates), they won't permit the downtime to do it, you send emails to the responsible director pointing out that they've had no backup for months, that servers need patching, etc, etc, etc, and it still has no effect.
With some customers you get used to adding a footnote to all emails along the lines of "and we take no responsibility for any data loss or disruption". So is it still the IT services company's fault that the server hasn't been patched for five years?
Anon for obvious reasons - we have a customer that has begrudgingly allowed some patches to be installed and a backup taken. But only because the age of the SQL server has made something break with another package - one where there's a "if you don't upgrade, we won't support you" clause in the agreement, and that package effectively is their business (business sector specific system - without it and the data it holds, they are gone). And the backup only got done because we point blank refused to apply any patches without having a full backup first.
But that's still only the patches needed to fix the SQL problem - not any other SQL or OS patches !
And if that sounds like a bit of whining about getting the blame for other people's faults - then that's how it should sound.
Chances are he had a motivation for his final conduct. To top it all off, after years of asking, the company refused to reclassify him as a sysadmin and give him the better salary, and instead gave him grief for his underperforming original job. So he was expected to be a sysadmin and do another job, at a lower pay rate. I can't say I blame him, as I have seen a similar situation. It all began with the best intentions, but over the years he became taken for granted and marginalized in the company structure.
You were probably the first qualified IT person to ever enter that room. No doubt the company only hired you after failing to find another sucker to take on the task in house.
He left our place (a school district) under a cloud; within a week the historical grades for all of our high school students disappeared in a puff of virtual smoke. Backups were consulted and found to have last been completed nine months earlier. Anti-virus software had been subscribed to on a site-license basis and left uninstalled for 18 months, and a full license audit found massive discrepancies between the licenses held and the number of installations. All in all, a big mess. Hardware-wise it was just as bad; it took the replacement nearly two years to figure it all out and get it into a fairly decent working state.
Sounds like a temp job I had at a non-profit school district 3-4 years ago (which got me to quit IT altogether).
Their prior IT director (a guy who barely knew how to install Windows) quit, and someone I know got the job, so he called me up saying he needed help fast. I decided to help him.
Get on site to the one school... network cables are thrown on the floor where students walk on them and roll their chairs over them. One room actually had a friggin hub flung on the floor, all bent up with footprints on it, and that was the best part of their network design...
They were wondering why their server was acting up and randomly failing, and why the backup system was not functioning. The backup system was some proprietary automatic tape library whose manufacturer had disappeared, and the last known software for it was for Windows 95... Best part: this was all stored in the friggin JANITOR'S CLOSET with absolutely zero ventilation. When I opened the door I felt like I was in some sauna that reeked of dirty mop water. No reason at all, then, for a server to randomly shut down and fail.
OK, the main organization's office server completely failed (their power line got hit by lightning two days prior and took the server with it)... This one had a working backup system, though. Problem is, the only thing that had been set to back up for five years was the backup software's directory... Yup, the last admin/IT director did every-other-day backups of the backup software's directory, and nothing else. They got mad at me because I couldn't restore five years' worth of financial data...
Next school... This one actually had a decent layout, as it was done by an IT company. The last IT director signed a contract with a tech company to have a T1 installed. They needed it running for when school started; problem is, the last IT director signed a contract for it to be ready two months AFTER school started. Guess whose fault that was? Yup, mine, because some idiot signed a contract four months before I got hired... It also didn't help that the principal was a complete fuckin idiot who seemed to enjoy making my life hard. He complained that he needed a bigger monitor (he had an 18-inch), so we ordered him the next size up from the distributor we had credit with, which was a 22-inch. How many people would complain about getting a 22-inch monitor? Instead of telling IT anything, he went and complained straight to the head of the organization, claiming we weren't doing our work. I wanted to beat this guy upside the head so hard it wasn't funny.
Next school... The server room was a coat closet with zero ventilation in the main office. The difference is this school had a principal smart enough to know computers don't like heat, so he had a door stop holding the "server room's" door open. This place didn't cause me much trouble until the projector they had never tested didn't work. The last IT director had it installed by a company, BUT had the wrong cable run for it... He had RGB run for it while the projector they had only had a VGA port. Had to buy a long-ass VGA cable and re-run it. Still nowhere near the pain of the other buildings I mentioned, as like I said the principal wasn't completely clueless about technology and understood it took more than 25 seconds to get stuff working.
The last school was built by me and my buddy from the ground up. We picked the servers, routers, and did everything to get it running. The only issue I had was that some software they demanded for their library wouldn't work on the server, so I grabbed a spare workstation, installed the stuff on that, and put the computer in the bottom of the server cabinet (yes, we bought a real cabinet, unlike the others), which was in a climate-controlled room, and let it run there. I also made a script to auto-backup all its data to the main server so it would get onto the tape nightly. I liked working in this building, as the only issue that ever went wrong was teachers not remembering their passwords (which were all 12345...)
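A nightly "copy everything to the main server" script like the one mentioned can be as simple as a tree mirror. This is a hypothetical sketch of that idea, not the original script; the function name and example paths are mine:

```python
# Hypothetical sketch of a nightly backup script: mirror a workstation's
# data directory onto a share on the main server, so the server's own
# tape job picks it up. Paths below are invented for illustration.
import shutil
from pathlib import Path

def mirror(source: str, dest: str) -> int:
    """Copy every file under source into dest, preserving the tree.
    Returns the number of files copied."""
    src, dst = Path(source), Path(dest)
    copied = 0
    for f in src.rglob("*"):
        if f.is_file():
            target = dst / f.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target)  # copy2 keeps timestamps for the tape run
            copied += 1
    return copied

if __name__ == "__main__":
    # e.g. mirror(r"D:\LibraryData", r"\\MAINSERVER\nightly\library")
    pass
```

Scheduled nightly (Task Scheduler, cron, whatever the platform offers), something this simple is enough to get standalone-box data onto whatever the main server backs up.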
Outside of the last school (which I built), the workstations were all donations from companies, so they were old, beat-up machines. I would have to respond to an inquiry from the head of the place EVERY single time I needed ANY part, as the IT dept's budget was blown. Thing is, it wasn't blown by us: the secretary and her friends decided they all needed $5k laptops and took the cash for them out of our budget.
Then, after we finished, they demoted my buddy without notice and hired some other dude to be IT director. We both quit the next day, as I only did the job to help him, and they kept blowing off hiring me full time (it was supposed to be temp-to-full).
This should be above the final paragraph, as it's why we quit without warning.
Ohh, remember the turd I said signed a contract and I got the blame? Well, he was good friends with the secretary that ordered the ludicrously overpriced laptops. She didn't trust me and my buddy, so she gave him remote access to the network (yes, the secretary was required by the organization to have the admin passwords) and let him dig around and change settings. He screwed some stuff up and made a few backdoor accounts (which we promptly deleted, and we banned his ISP), and when we went to complain to the head of the organization he accused me and my buddy of crap, as he trusted his idiotic secretary because she claimed her friend knew more (yup, hub-in-the-middle-of-the-room guy).
For the finale, we found out about a year later that the organization lost the faith of all its backers (their main backer really liked me and my buddy) and got broken up, and most of the idiots who gave me hell lost their jobs :D
Firstly: Kevin, I'm sad to hear that you had such a bad time, and it's unfortunate that you left the IT field, as it sounds like you had the commitment and drive to do a professional job with limited resources.
Secondly: everyone, please forgive me for going off topic. Can anyone please explain what is meant by 'non-profit school district'? I first took it to mean something similar to state-run schools here in the UK. Then I thought it may be like a collection of academy schools run by a charity. Are my guesses anywhere near the mark?
Well, it's a non-profit company which runs charter schools (I should probably have clarified that a little).
Essentially (to quote Wikipedia), a charter school is a school that receives public funding but operates independently of the established public school system in which it is located. To show how varied they can be within one organization: two of the schools (which were also the most backwards, design-wise) had religious classes as part of the curriculum; they were also 100% privately funded. The other two were partially funded by the city, so they couldn't have any religion classes.
In the UK, from what I read on the wiki page, the similar version is something called a foundation school.
As for dropping out of IT, that was just the end of it. I just fell out of love with it: I've worked for a few companies, and pretty much every one treated the IT guys like crap, even though without them there's no way they could do their jobs. Add in competing for jobs with newcomers to IT whose skill sets only seem to be being able to color with a box of crayons, and it gets quite annoying, also due to the low pay. For instance, at that last job the pay was so low I could have made similar cash flipping burgers. The pay would have been a tiny bit lower, but without the level of stress that helps cause health issues. The pay and treatment could just be a regional thing, though.
Sounds like you got away easy.
This is just for my present position, but I was hired by word-of-mouth as I happened to be in the job market at exactly the same time that a disaster befell this particular workplace. I was snapped up after they'd done most of the initial firefighting (please bear that in mind) and have thus far witnessed the following:
1) One server. Literally. One. Running 500 users. That setup was in the process of being replaced when I started by the following: one server running all the user stuff; another running the SQL server (including payroll), the print system, the phone system, some shared areas, backup software, all kinds of junk. Ironically, they had some of the most powerful servers I'd ever seen running Windows thin clients, powerful enough to run 50+ user sessions. They never got used and everyone hated them, but those servers outclassed everything else in the server room (though they were sadly quite old, floppy-disk era, and we've just replaced them; back in their day they must have been TOP of the line). They sat idle while the one server did all the work, until it fell over.
2) A set of data-recovered failed RAID disks, in a box. Previously resident in the single server. £10k to recover and they never got all their data back. User profiles and documents had been recovered from CLIENT ROAMING PROFILE COPIES! The recovered drives I had framed and hung on the wall with a plaque reading "Cogito Ergo Facsimile" (excuse the Latin - hopefully "I think therefore I make copies"?)
3) No backups. None. The guy was still getting emails about a freeware backup utility but hadn't even bothered to deploy that. There was nothing. No tape, no NAS, nothing except what was on the server hard drive. And he had been there to ignore BOTH RAID failures. By the time I inherited it, there were some NAS boxes, but also an illegal and unlicensed copy of Backup Exec on every server.
4) No WSUS at all.
5) No client images (not even WDS, they just bit-for-bit copied existing machines!).
6) Exchange installed on the DC, thus making an unfixable and unsupported combination (officially, you cannot remove Exchange that's been on a DC because you shouldn't be able to do that in the first place - and demoting a DC server that's running Exchange is dangerous and likely to break both!).
7) Every cable measured TO THE INCH to the patch panels and crimped by hand. And often going through the centres of the racks so you couldn't actually insert anything more into the rack without de-patching EVERY CABLE and re-patching it. For one cabinet we had to pull an all-nighter just to rewire 24U. And we rewired EVERY cable in there.
8) I found a switch hidden in a radiator cabinet powered by a socket inside the floor (near a cellar hatch). That switch ran all the main office and wasn't documented anywhere. The uplink for it was Cat5 over 150m using internal cable that went externally and was thoroughly destroyed by the time I got there. Apparently that had been in place for several years and nobody knew about it. Until it went off.
Needless to say, I got triple-normal-IT budget in order to fix the problems. We bought a proper set of redundant blade servers, spread them over the site, multiple backup strategies, proper backup software, full virtualisation and service separation, a complete re-cable (including redundant links around the site and to the Internet) and it's now... well, quite impressive.
My boss has also indicated that next month we will have a full, live, in-service failover test. I think because I've made all these assertions about what should happen on a modern system and he wants to see if it's true. As in, he will "pull power" (not literally, but simulated by turning off machines gracefully) to one entire server location in the middle of the working day to see what happens. We are merely expected to provide "business continuity" (i.e. We don't lose data and thus bankrupt the company! Shouldn't be hard! Shows you what kind of IT they had previously!) but I'm actually expecting "service continuity" (i.e. nobody but us notices that anything has happened).
But that's not even the worst I've inherited. Hell, I refused to touch one charity's network that I was invited to work on. I had to literally say to them "I can't touch that", and they knew I was doing them a favour by saying so. It wasn't fully backed up, the backups were at a remote site they didn't have access to, nothing on the desktops was in a state where I thought I could safely play with it, and they dealt with the medication records of dying children, etc. Sorry, I have no qualms about fixing it for you, but it's really in your interest to get a proper firm in, because given the state it was in, the responsibility was so great you wouldn't have been able to afford the price I'd have to put on carrying it. Start again, get a proper firm in, and get some ongoing support while you're at it. It will cost the earth, but that's nothing compared to staying on that precipice of losing the data. I did make sure they had at least one sufficient backup before I left, but that was all I could do in the time.
I'm sure people have worse stories too, but by comparison some "neglected" server settings and a single non-booting server (sorry, your solution of a note not to reboot it is NOT a solution, even temporarily) is nothing.
"Every cable measured TO THE INCH to the patch panels and crimped by hand. And often going through the centres of the racks so you couldn't actually insert anything more into the rack without de-patching EVERY CABLE and re-patching it. For one cabinet we had to pull an all-nighter just to rewire 24U. And we rewired EVERY cable in there."
Master troll is masterful!
Awesome story, but...
"(sorry, your solution of a note not to reboot it is NOT a solution, even temporarily)"
Of course it is. Every workaround is temporary and necessary until a permanent fix is done. Putting a note on a server to remind yourself not to touch it until the replacement is done is perfectly sensible, the alternative would be shutting down backups entirely until new hardware is sourced, OS is installed, and software set up. At least it could mostly work in the days or weeks until the replacement was prepped, depending on the priority.
Only he liked to take things apart just to "see how they worked"...
After being called in to take over from this guy because he'd called in sick, I was confronted with all kinds of equipment (servers, switches, routers etc.) strewn around the place in pieces, just because this guy was better at taking things apart than putting them back together again!
So after spending half a day putting all this stuff back together, I told my boss that if I ever had to work with this guy I would probably give him a "boot to the head" and quit my job there and then!
1. Not all patches work, so they should be removed from the count.
2. Not all patches are to fix problems. Microsoft seems to regularly issue patches to harvest bytes of yours their previous versions of spyware may have missed. These too, should be removed from the count.
3. Some patches are patches of patches - remove again.
4. Some patches are for stupid devices no-one in their right mind should be using in this century - e.g. fax drivers for cars. Remove these from the count.
5. Some patches actually open attack vectors people use to get into systems. Probably very old versions aren't even probed. So some of these patches can be removed from the count.
6. Some patches are to prepare you for other patches - e.g. Windows 10. Definitely remove these.
7. Updating anti-virus software is a waste of time and bandwidth because all anti-virus software is rubbish.
So after all that, he probably only really missed around 15,000 patches, which is much better.
Besides, as long as there's a reliable firewall (say the BT Home Hub 4) between his systems and the internet, then there's absolutely nothing to worry about.
"3. Some patches are patches of patches - remove again."
And how does one know without manually auditing every single patch? Let's say a whizz with awesome powers of concentration can check a patch in 20 seconds. That is about 139 hours, or fifteen and a half days of nine-hour days. And making no mistakes. And taking no more than 20s per patch. And not counting any time for actually applying patches, reboots, etc.
The icon is surely how anybody would want to feel.
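For what it's worth, the back-of-the-envelope sum holds up, using the 25,000-update figure mentioned elsewhere in the thread:

```python
# Sanity check on the estimate: 25,000 patches at 20 seconds each,
# audited in nine-hour working days.
patches = 25_000
seconds_each = 20
hours = patches * seconds_each / 3600   # total audit time in hours
nine_hour_days = hours / 9              # working days at nine hours a day
print(f"{hours:.1f} hours, {nine_hour_days:.1f} nine-hour days")
```

Which comes out at roughly 139 hours, or about fifteen and a half working days, exactly as claimed.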
"3. Some patches are patches of patches - remove again."
And how does one know without manually auditing every single patch?
How many people manually audit every single patch?
The update list probably contains updates for every IE version between 6 and 11. If the clients can be updated to the latest version then the earlier patches can be dismissed.
25000 updates in WSUS sounds like it includes drivers and probably way more languages and products than were needed. Yes, I've seen cases where every product was selected for no good reason...
I'd first remove the drivers, since if the computers already work you shouldn't try to fix them! Then I'd deselect all the unneeded products and languages, uncheck tools/feature packs and other classifications, create automatic approval rules, and run the WSUS cleanup.
And how does one know without manually auditing every single patch?
WSUS tells you whether patches are standalone, or if they supersede or are superseded by (or both) other patches. It's very easy to select all superseded patches and decline them, as a starter for ten...
Also, given the job this useless tit had done, it wouldn't surprise me if he hadn't selected the correct product types/languages and appropriate levels of patching, which probably would have reduced the 25,000 considerably. Additionally, older versions of Windows included patches for Itanium/IA64, which a quick search-and-decline in WSUS would knock a fair few off the list too (guessing, on a hunch, that they weren't running Itanium infrastructure).
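The "decline everything that's superseded" triage is simple enough to sketch. This is a toy model of the logic only; the data structures here are assumed, not the real WSUS API (in practice you'd do this from the WSUS console or its own tooling):

```python
# Toy model of WSUS supersedence triage: given a catalogue of updates,
# each of which records which older updates it supersedes, list every
# update that some other update in the catalogue has replaced.

def updates_to_decline(updates):
    """updates: list of dicts with 'id' and 'supersedes' (list of ids)."""
    superseded = set()
    for u in updates:
        superseded.update(u["supersedes"])
    # Anything another update supersedes is safe to decline.
    return [u["id"] for u in updates if u["id"] in superseded]

catalogue = [
    {"id": "KB100", "supersedes": []},
    {"id": "KB200", "supersedes": ["KB100"]},  # a patch of a patch
    {"id": "KB300", "supersedes": ["KB200"]},  # the latest cumulative
]
print(sorted(updates_to_decline(catalogue)))   # → ['KB100', 'KB200']
```

Declining the superseded entries collapses long patch-of-a-patch chains down to the newest update in each, which is why this step alone knocks so much off a 25,000-item list.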
1) Button your lip until you fully understand the whole environment
2) Ask lots and lots of questions without passing judgment
3) Do a total audit. If that means lifting floor tiles then so be it.
Write up your findings (warts and all) and present them, being prepared for one of the following to happen:
1) Be told that it is none of your business (so you quit on the spot)
2) Be told that there is no money to fix anything (as above)
3) Be ignored (guess what?)
4) Be given carte-blanche to fix things (Do the job with a smile on your face)
5) Something else.
The situations described so far in this article and subsequent comments are not that uncommon IMHO.
I've seen the very expensive software I use on a daily basis configured in production with everything as it comes OOTB.
It was only by luck that those systems hadn't been crashing a hundred times a day before the problems were uncovered. Even so, it took management more than three months to authorise even the slightest change. Needless to say, I left that contract as soon as I could.
I'm sure there are countless Dilbert strips that could amply describe the sort of problems that will be revealed here. In far too many it is the actions (or more likely the inactions) of the PHBs that caused the problem in the first place. Their 'face saving' can cause even more damage.
Two years of skipped patches, updates, and basic maintenance at an accounting firm with just over 100 PCs is the worst I've seen. Most of the Windows 7 computers had never had updates run, ever. Same with a bunch of the 2008 servers. The Exchange server had one patch level, maybe.
Everything worked, somehow, and of all things the backups worked. I do feel sorry for the previous tech: the company had become so change-averse that he was so hamstrung by the fear that something might go wrong that he had stopped doing any updates. Unfortunately this built up a huge maintenance debt, things started going wrong, he couldn't keep up, and they fired him because he didn't do his job.
They had another firm come in for a few weeks, and I assume it told them they needed to change the way they did everything, and that, yes, downtime had to occur. They got rid of them and I ended up on the project. I told them the same thing; this time it clicked and they figured out there was some kind of structural problem. I've worked quite a few weekends since then getting everything caught up.
"They got rid of them and I ended up on the project. Told them the same thing, this time it clicked"
You were lucky. Such companies usually just keep firing consultants until they get one who tells them what they want to hear.
I've seen some multimillion-dollar ballsups as a result.
"You were lucky."
It may not be luck. See the Rules post above. If you have documented what the current situation is, what's wrong (& what's right) and WHY (compare with currently accepted good practice) and what needs to be done to recover, complete with priorities, then it becomes difficult for them to argue. Difficult as in not having a legal leg to stand on if it turns nasty.
"Such companies usually just keep firing consultants until they get one who tells them what they want to hear."
Following up on my own post.
One or two consultants make a career of telling companies what they want to hear, doing what's asked (usually installing a setup decreed by manglement which patently will not work as designed) and then putting in exorbitant extra charges for "adjustments" or "changes".
This can be a great wheeze if you're ethically challenged - I know one firm which made more than $5 million this way. In the end (after a "forced management change") the whole mess was declared unfit for purpose and replaced with an open-source solution that cost about $50k to implement and deploy.
The people responsible (manglement and the consluting company) both ended up getting paid more in other gigs and creating even more spectacular ballsups. Be very careful if you see someone's references to "highly successful rollouts" and/or glowing employer references. Often they're mandated by lawyers as part of the cost of getting rid of the offenders.
"2 Years of skipped patches, updates, and basic maintenance skipped ... is the worst I've seen. Most the Windows 7 computers had never had updates run, ever. Same with a bunch of the 2008 servers. The Exchange server had one patch level, maybe.
Everything worked, somehow, and of all things backups worked."
What I find interesting from some of the comments is how many so-called professionals have swallowed the idea (or is it an urban myth?) that systems need to be constantly patched just to work. They don't! In fact they can be very, very stable, which is why it is normal practice to disable auto-updates on a server and only install them as part of scheduled maintenance. The problem in many smaller companies is that, without a well-staffed professional IT function, scheduled maintenance gets pushed onto the back burner and forgotten.
What is particularly interesting is that, of the examples written about here, none had a malware problem worth commenting upon that was attributable to the absence of patches...
IT guys should be accountable for their work. Their work should be peer-reviewed, just as ANY technical work in ANY field should be. In firms where there is no one else who can do this, an external organisation should be engaged to audit and report on status/risk level/recommendations.
If as IT workers we find that we are not being held to account, ourselves, then we must be brave and honest enough to speak up and recommend to our employers that they get our work checked by a third party.
Just one thing: who is paying for all that?
Sorry, but pie-in-the-sky intentions will never overcome the clueless manager whose hands are on the purse strings.
And that is the problem in every post on this kind of article. Issues cropped up because the managers put the budget into something that seemed more important, until the amount of trouble was just too big to ignore - by which time, of course, things were much, much worse than they needed to be.
A proper manager should at least have an up-to-date list of logons and passwords, implying an accurate knowledge of what is plugged in where. Anything less than that and you're not negotiating helping them with their IT, you are in point of fact becoming the IT manager. Without the authority required for the job, you are doomed to either fail, or put in a lot more effort than you are being paid for.
Biting the hand that feeds IT © 1998–2019