Press releases issued by software companies are one of the more common sources of myths and legends in the database world. No real surprise there, you may think, but therein lies a paradox. We all know that press releases are highly partisan, so we expect everyone to treat them with suspicion; yet we aren’t surprised when they …
MTBF versus lifetime
You really, really need to learn the difference between MTBF and lifetime.
Still I am concerned with SQL Server reliability
Because it has to run on Windows, SQL Server is not as reliable as any of the enterprise-class databases.
It may well be as reliable as any other database when both are running on Windows, but it is a factor or two less reliable than databases running on a Unix variant or an IBM mainframe OS.
I must admit I am confused as to the point of your article. Oracle was pointing out the fact that SQL Server is not as reliable as Oracle. At the time this was published, especially, this was true to a great extent.
Microsoft can hardly cry foul over massaging figures to match marketing objectives. Just look at its claims regarding the costs of managing Windows vs Linux etc.
FUD runs deep
The most publicized failures in the industry always seem to happen on Solaris/Oracle, whether it be Amazon, Google, Salesforce.com, etc., yet Unix/Oracle is still perceived as the superior product.
"Unbreakable" proved to be a farce. Oracle had many more patches than SQL 2000.
For reference, Microsoft.com has the HIGHEST availability rating in the industry according to 3rd party monitoring (Keynote and Gomez). Higher than ANY and ALL sites hosted on Unix/Linux, including Google and Amazon. JetBlue airlines has the best on-time rating, customer satisfaction, etc. and it runs on a Unisys/Windows Datacenter SQL cluster.
FUD, or more likely 1995-era ignorance, is getting old. "Windows sucks" is a Windows 95 argument. If it still sucks, you need a new admin.
I wonder, in all this, what the _actual_ longest-running Oracle cluster is now.
I don't see anything in the article that would warrant the MTBF comment. Lifetime isn't even mentioned. It was about the average uptime (one form of MTBF) of two very specific forms of cluster configuration, not about the nodes.
It's about debunking a myth that had a great deal of traction in 2001, thanks to sleazy PR, whether it was true in the general case or not. That's it. It's for historical purposes only, and not here to present Microsoft as immune to sleazy PR or in any way to argue about what the real reliability is.
I guess that overall point is, don't believe anyone's claims until you've investigated the source yourself. Most of the world's misinformation spreads around because few bother, even when it's not difficult or impossibly expensive to do so.
But the point...
The point is, surely, that Mark was criticising shoddy PR figures - which often don't get questioned - not commenting on the reliability of any databases per se.
When you are looking at real-world reliability then, certainly, a requirement for "planned downtime" in the operating system must be factored in. And MTBF often doesn't mean what you think it means anyway. Perhaps, if one server fails, the other 12 are more likely to fail at the same time; if the failure is due to faulty hardware, a whole batch of hard disks or chips is likely to be faulty.
And how a database fails is important. If a database fails without corrupting the database in any way, you can probably tolerate more database failures than if you have to repair the database after a crash. And, yes, mainframe DB2, say, really doesn't fail often - and probably doesn't corrupt itself if it is forced to fail either - but that isn't the point of the story, as I see it.
Yes, it DOES still suck...
"For reference, Microsoft.com has the HIGHEST availability rating in the industry according to 3rd party monitoring (Keynote and Gomez). Higher than ANY and ALL sites hosted on Unix/Linux, including Google and Amazon."
Jeez. Any Web site will stay up for a long time if you provide enough failover Web servers and "round robin" DNS, but we want the SERVERS to stay up - and Windows ones just don't.
I laughed when I saw MyCrapSoft's own ads for SQL Server giving a 99.97% uptime. Har-har-har... we all know why "the five nines" is what we need for reliability, but if you don't, do the maths: 99.97% uptime means you could be down for eleven days a year. And in 2006, eleven days down means that, commercially, you are dead... :-)
Apples and Oranges
This all seems like a silly word game.
A SQL Server Cluster (Failover Cluster) is about redundancy. What is described here as a 'cluster' sounds more like Federated Servers which is all about performance - for a TPC test rig I know which I would choose.
Thinking about these 2 discrete concepts in terms of RAID levels would show a Cluster as effectively Mirrored and a Federation as Striped.
So congratulations Oracle - your 12-server optimised-for-redundancy cluster has a higher MTBF than the competition's optimised-for-throughput 'cluster'!
Next we will be hearing from disk manufacturers how their 12-disk RAID 1 has better MTBF than the competitor's 12-disk stripe and therefore their disks are better...
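The mirror-versus-stripe analogy can be put into rough numbers. The sketch below is an idealised model, not a description of either vendor's configuration: it assumes every node has the same availability and fails independently, that the "striped" setup is up only when all nodes are up, and that the "mirrored" setup is up as long as at least one node is up. The function names and the 99% figure are made up for illustration.

```python
def striped_availability(node_availability, nodes):
    """Availability when every node must be up (throughput-style setup).

    Assumes identical nodes that fail independently.
    """
    return node_availability ** nodes


def mirrored_availability(node_availability, nodes):
    """Availability when any single surviving node is enough (redundancy-style setup).

    Assumes identical nodes that fail independently.
    """
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    a = 0.99  # hypothetical per-node availability
    for n in (1, 2, 12):
        print(
            f"{n:2d} nodes: striped {striped_availability(a, n):.6f}, "
            f"mirrored {mirrored_availability(a, n):.6f}"
        )
```

With the same nodes, adding more of them pushes the striped figure down and the mirrored figure up, so comparing the two says more about the topology than about the product running on it.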
Comment on "Yes, it does still suck"
Forgive my uncertainty here, I may well be wrong. But I can't see how 99.97% uptime can give 11 days of downtime a year. Surely 99.97% means that it is up for 99.97% of the time. So, in 100 days it would be up for 99.97 days and down for only 0.03 days. In 10,000 days there would be 3 days down. I make that, very roughly, 1 day every 10 years.
Even if we assume a cluster of 12 machines set up for speed rather than redundancy, that still only means just over one day a year, not 11.
But perhaps I am wrong.
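The arithmetic in this comment can be checked with a short Python sketch. The helper name and the 365-day year are assumptions made for illustration, and the multi-machine case assumes independent failures with the whole cluster counted as down whenever any one machine is down.

```python
DAYS_PER_YEAR = 365  # assumption: leap years ignored


def downtime_days_per_year(uptime_percent, machines=1):
    """Expected days of downtime per year for a given uptime percentage.

    Hypothetical helper: with more than one machine, the system counts
    as up only when every machine is up, and failures are assumed to be
    independent of each other.
    """
    availability = (uptime_percent / 100.0) ** machines
    return (1.0 - availability) * DAYS_PER_YEAR


if __name__ == "__main__":
    cases = [
        ("single server at 99.97%", 99.97, 1),
        ("12 servers, all required, at 99.97% each", 99.97, 12),
        ("single server at 97%", 97.0, 1),
        ("single server at 99.999%", 99.999, 1),
    ]
    for label, percent, machines in cases:
        days = downtime_days_per_year(percent, machines)
        print(f"{label}: {days:.4f} days/year")
```

Under these assumptions, 99.97% uptime comes to roughly 0.11 days (under three hours) of downtime a year, or about one day per decade, and the 12-machine case to about 1.3 days a year. Eleven days of downtime a year corresponds to roughly 97% uptime, not 99.97%.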