I'm sorry, but with Laura DiDio's record in the SCO farce (google for it or check Groklaw) I would want some additional corroboration to anything she says, even if it's "the sky is blue".
Server vendors make a lot of noise about how reliable their systems are, but how do they really stack up? It's hard to say. Getting qualitative information out of vendors is easy enough - they all seem to have the most reliable machines ever built - but what about some objective quantitative information that puts these claims …
... that Mac servers have less downtime despite generally being looked after by admins with little experience. Does this count as evidence in favour of the hypothesis that Apple is targeting its product at the ... less savvy? And that therefore Mac users may thus be assumed to be prats?
LEAVE IT ALONE
Get your server, configure it properly, put it in a corner and leave it there.
Don't fiddle with it, patch it, upgrade it, add new applications, refresh the O/S or change its networking environment. Let it get on with its job and it'll just run.
Of course, this all presumes you have chosen a server that doesn't need new security updates every Tuesday, that you have the ability to set it up correctly in the first place, that your organisation has the foresight to predict workloads years in advance (and specify the hardware accordingly), and the vision to put in place a network that doesn't have to be constantly messed with every time a new user wants another PC on his/her desk.
The most stable system I have ever seen was (is?) a Solaris 2.6 box running a telesales campaign app. No-one is allowed to touch it. The application vendors refuse to allow it to be upgraded ("I'm sorry, that configuration is not supported"). The root password is known only to the Head of Department and its IP address is hard-wired. It's been like that for 8 years: never failed once. (Barclays? Are you listening?)
Of course, if everyone hardened their servers to be this reliable, there'd be far fewer jobs for system admins. So one conclusion is that the IT departments which perform 10,000+++ changes per year on their production estate (of which a significant proportion are reverting / fixing / re-patching previous, failed changes) do so in order to preserve their jobs and ensure their overtime pay.
An interesting study, however I wonder to what extent the results are influenced by the different types of application the platforms tend to be used for. If I'm prepared to pay the substantial premium necessary to run an application on AIX or HP-UX as opposed to Linux or Windows, then two things are clear: (a) this is an important (to me) application and (b) having swallowed the camel of the substantial additional hardware and software costs, I'm unlikely to strain at the gnat of a few competent sysadmins to run it.
My point is that you can't ignore the direct and indirect costs of the kit. In my experience, if you're prepared to spend half the costs of the big boxes on a Linux or Windows implementation, the latter can be made just as dependable and resilient. The trouble is that many Linux and Windows implementations are done on the cheap and reduced resilience is the result.
"7 years of experience for Windows admins..., and three years for Mac OS server admins"
...and yet despite this Microsoft servers had MUCH more downtime.
What this tells me is that no matter how much Windows training you have, it's still going to bork up in new and unusual ways that will take you a long time to fix. Personally, I'd go with AIX or HP-UX since I know more about them but MacOSX seems quite friendly as well.
Why do people still use Microsoft? The more I find out, the less I understand it!
The report author is careful to make it clear that like is not always being compared with like here. She points out that the definition of 'unplanned outage' used by the IT managers questioned for the report covered the "Oops - bang! Is that meant to happen?" situations, as opposed to the "it's Tuesday, so we reboot" cases.
Also, there's an element of perception. Different departments might be able to cope with up to 4 hours without email, whereas an IBM mainframe going off-line for more than a few minutes each year could have a much more serious impact on the business. The high scores for the Power series may reflect the absolute need to keep those systems up; so all credit to any sysadmin for achieving that, but comparing the management of a farm of file and print servers with running a company mainframe is like highlighting the difference between successfully landing a Cessna and putting a 747 on the deck without killing everyone.
The report author acknowledges that, for some measurements, a z990 running the workload of a thousand or more equivalent Wintel or x86 Linux boxes is counted as 'one unit' in reliability terms.
Also, it is a matter of perception. The iSeries, and its ilk, may rate badly simply because - although in a lower league than the mainframes - few AS400-type boxes are running applications or databases that businesses can do without for long. They occupy that middle ground in the must-have stakes. Any perception of an unplanned outage on iSeries is likely to be weighted much more heavily than an Exchange Server being unavailable for an hour or so.
Finally, as the author also notes, the networked world we now live in means that comparing platform with platform is not always the most robust method of determining business impact. If an AS400 is offline, a z990 that interacts with it may be unable to do a lot of its work. The z990 is fine, but until the AS400 is back up and running, it's just a big, black box. From a user perspective the job still isn't getting done. IT departments would do well to remember this fact.
Sorry, but these figures are just plain nonsense. It seems to me that they should be per year and per server, or something. Per year and per shop does not mean anything, as you will probably have more incidents if you have 1 million boxes than if you have only ten.
Or are the figures implicitly per year and per server? Meaning a shop with 100 AIX boxes experiences about 42 "tier 1" incidents and 34 "tier 2" incidents per year, per box? Even without performing any system administration you can't be that bad.
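The per-shop versus per-server ambiguity is the whole objection here. A minimal sketch of the normalisation (the function name and the sample numbers are my own illustration; the report does not publish a per-server breakdown):

```python
# Illustration only: hypothetical numbers, not figures from the survey.
def incidents_per_server_year(total_incidents, servers, years=1.0):
    """Normalise a shop-wide incident count to a per-server, per-year rate."""
    return total_incidents / (servers * years)

# Read 42 tier-1 incidents as a shop-wide annual total for a 100-box shop
# and you get a plausible 0.42 incidents per server per year.
print(incidents_per_server_year(42, 100))   # 0.42

# Read the same 42 as already being per server, and every box is failing
# nearly once a week, which is clearly absurd.
print(incidents_per_server_year(42, 1))     # 42.0
```

The same raw number supports either a "pretty reliable" or a "hopeless" reading, which is why the per-what denominator matters.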
"... that Mac servers have less downtime despite generally being looked after by admins with little experience. Does this count as evidence in favour of the hypothesis that Apple is targeting its product at the ... less savvy? And that therefore Mac users may thus be assumed to be prats?"
No, but that comment, alone, counts as evidence that you're probably one of those mouth-breathers who cannot make it from one end of a 24-hour period to the other without hijacking a comment thread on The Internet to make some random brain fart about <insert favourite hate-target here />.
Why don't you go and install a patch, or something? Your tone suggests that you work with the kinds of systems that need them.
There's a lot to be said for cheap sys administrators: it's why we let your lot have your own server room, in the first place, remember? If your own pet OS can't match up in the qualifications/downtime stakes, I'd be worried, if I were you, not smug.
I still remember visiting a council office somewhere in Chelsea, clutching a brand new Windows 2000 server and asking to be pointed in the direction of the server room. "We don't have a server room, or indeed a server," they confusingly replied. I found it a few hours later - the server, that is, not the room. It had long since been buried under stationery boxes and books in a cupboard under a desk. The console fired up, and the uptime showed 6 years 11 months. I was sorely tempted to leave the site well alone. However, it being council money, we had to replace it.
Nice information. Whatever basis you choose - smarter techs, bigger iron, n-unit racks counted as one unit - it's a lot more effective to run Unix/Linux and the derivatives than Windows: much less support required.
Now, how am I going to get my clients off Windows Server 2003 and prior?
...up and down like Paris's undergarments.
It only gets worse when you factor in the downtime to patch a p570's firmware when it's piled high with 10 LPARs of 24x7x365 services... or SDD, or a TL, or the latest SP where it's promised it's fixed.
Like an earlier commenter said, chuck it in a corner and leave it. The highest uptimes in the company I work for are on Solaris 2.6, AIX 4 and HP-UX 9 boxes, with uptimes of 3+ years. AIX 5.x? Days, maybe months.
I work at a large investment bank and I love the IBM Power servers! They are rock stable and highly performant. But recently we benched 3 of the Power 570 servers against one Sun T5440 on Siebel 8.0 and the Sun machine crushed them all (see the Oracle web page for exact numbers). Since then we have been migrating away from Power servers to Solaris and Sun. I love the Power servers, but they don't cut it when it comes to performance or price. One P570 costs several times more than one T5440.
Kebbie, are you serious?
are you talking about that benchmark again?
we all know you love T5440 Niagaras.
I didn't see any Niagara crushing a POWER server when running OLTP workloads.
Anyways, nobody buys Sun anymore.
I think you are spending money in the wrong place.
If you are really into TCA and need small cheap boxes go for Nehalem or Opteron.
Sun is dead...
I just wanted to bring balance to all the FUDing going on here. Lots of people write:
"I work at a large bank / exchange / whatever and we love SUN and Solaris, but now we are migrating to Power because of bla bla".
I am just doing the same thing you guys are doing. Maybe I should drop the /Kebabbert and only post anonymously, just as you guys are doing?
As for the benches, there are many benches where Niagara utterly crushes Power. Do you want to see some? I can post many of them. And don't forget that one T5440 is 76,000 USD and one slooooow P570 is 413,000 USD. One T5440 is several times as fast as three P570s, and it is much cheaper. To me, it is a no-brainer which one to buy. And Oracle will reprice the database to punish Power now. :o) As I said, no-brainer.