Oh Dear
I appear not to have updated my VMWare server for several months. It also appears to still be running happily. Perhaps I'll leave it a little longer before doing anything to it.
We need a smug git icon.
Irate VMware customers were left unable to power up their virtual servers this morning because of a bug that killed their systems when the clock clicked round to 12 August. The bug was sent out to customers in ESX 3.5 update 2, VMware's latest hypervisor, which went out on 27 July. The version could have been downloaded and …
OK, so maybe it wouldn't have been noticed in time if the Source Code had been out there with the customers. We have no way to know.
But the fact remains that there is absolutely no good reason that will stand up to the briefest moment's scrutiny why customers should not be given full access to the Source Code of any applications that they intend to run on their computers. Not one single reason.
Therefore I stand resolutely by my position, calling for mandatory Source Code disclosure and stiff penalties for non-compliance. Vendors who have nothing to hide, have nothing to fear.
Note, I'm NOT saying people should necessarily be allowed to distribute copies of software at will; although since the absence of Source Code has done nothing to prevent this, it is unreasonable to suppose that the presence of Source Code will make this any easier. I AM saying that people should be allowed to examine and modify the Source Code to any software they are properly authorised to use, to delegate such activities to third parties and to pass on details of any modifications they may make to other authorised users of the same software without let or hindrance from the vendor.
This would open up a lucrative secondary market, creating jobs within the IT sector: certifying software as fit for a particular application, and adapting it to the way people do business, as opposed to vice-versa.
Nobody would eat a cake if it didn't have on the packet a list of the ingredients and how much fat, protein and carbohydrate it contained, would they? And I don't think many people would buy a car if the manufacturer refused to allow them to fit fluffy dice, transfers, beaded seat covers or anything that plugs into the cigarette lighter, but forced them instead to trade in their car for a brand-new model with ever-so-slightly-different controls because the old one would not drive down roads that had already been driven on by one of the spiffy new ones.
I am convinced that the only reason anybody puts up with this sort of behaviour around computer software is that most people just haven't been around computers long enough to have seen that there used to be a better way.
(Oh, and by the way: I don't download pr0n. When you've seen one naked body, you've seen them all; and when you've actually seen a real one, computer graphics don't cut it anymore.)
I just get this feeling when using VMware that it's buggier than it should be and the company just seems to accept that state of affairs. So something like this probably has to be expected. Their reaction that, basically, it's not that big a deal just confirms for me that I wouldn't want to run anything really critical on it. That's why I don't.
Rarely do I get a good laugh out of the commenters on El Reg, but today, my thanks go out to The Other Steve with this down-to-earth comment:
"For a start, not one in a thousand people have the skills, shit, not one in a thousand programmers have the skills, to read through the source listing of a hypervisor and spot a bug like this, unless it's something really glaringly obvious like a great big commented section that says "THIS CODE WILL CAUSE THE SYSTEM TO FAIL ON AUGUST 12"."
Virtualization needs to be done well, with the same level of planning and preparation as any other deployment solution. Some of you seem to think VM is a cop out for lazy or unskilled admins to not have to tune their boxes, but anyone effectively using VMs will tell you they had to do plenty of tuning to get the VMs humming.
The benefits of virtualization are obvious to people who are good candidates for virtualization. Not a good fit for you? OK. I run my shop with half the people I would need without VMs and I save my company tens of thousands of dollars every quarter and can deploy new servers in minutes.
Bottom line, we spend a lot less on hardware, electricity, and payroll. And our response times have gotten better every month. And our disaster recovery could not possibly be simpler and faster. We do a full DR simulation every year. Full rebuild from backup tapes. It's a breeze. Can't imagine doing this without VMs. Or maybe I can and don't want to.
Totally agree with AC here. In my organisation, we have 300+ physical boxes, and we are undergoing a "consolidation" regime to move them ...to 300+ *virtual* boxes. It totally flabbergasts me - we still have to pay umpteen squillion for each of the Windows licences, and then there is the VMware licence on top of it. Sure, there are some hardware savings, but there is no cost reduction for installation and maintenance. The hardware savings would be the same (if not better without the overhead) if you did a normal kind of consolidation.
When I mentioned such arcane things as running more than one web-app on a box, or more than one server app (with a local client) on a server, I got mutterings of "compatibility issues" and "performance". Hello? You sort out your compatibility problems by installing apps on the same box that play nicely with others, and as for performance, if you double your RAM, CPU and disk spindles (after eliminating obvious memory leaks and the like), your performance will no doubt improve and cost less than all those stupid VM and Windows licences. Gah!
Sun preps xVM Server for release. And most delicious it's looking too.
Time-related bugs are indeed hard. This may well not have been a bug in the code so much as a bug in the specification, if indeed it is a licence issue. Inspection of the licence code would have revealed a perfectly working sub-system.
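To make that concrete, here is a minimal sketch of what such a "perfectly working" licence check might look like. It assumes the failure came from an expiry date baked into the build, which this thread only speculates about; `BUILD_EXPIRY`, `can_power_on` and `first_refusal` are invented names, and none of this is VMware's code. The check does exactly what it says, so reading it reveals nothing wrong. Only a test that walks the clock forward shows the day it starts refusing.

```python
from datetime import date, timedelta

# Hypothetical: an expiry date baked into the build (the date reported in the article).
BUILD_EXPIRY = date(2008, 8, 12)


def can_power_on(today, expiry=BUILD_EXPIRY):
    """Licence check that behaves exactly as written: valid until the expiry date."""
    return today < expiry


def first_refusal(start, days, expiry=BUILD_EXPIRY):
    """Step a pretend clock forward and return the first date power-on is refused."""
    for offset in range(days):
        day = start + timedelta(days=offset)
        if not can_power_on(day, expiry):
            return day
    return None  # no refusal inside the window that was swept


if __name__ == "__main__":
    # Sweep 60 days forward from the release date given in the article.
    print(first_refusal(date(2008, 7, 27), 60))
```

A date sweep like `first_refusal` is cheap to run in release QA, which is probably where a problem of this kind would have to be caught.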
Time issues are caught in New Zealand. Companies used to ensure that they had very good relationships with customers in the land of the long white cloud. Because that is the first place the bugs come to light. Gives them nearly 12 hours before it hits Europe and up to 18 hours for the US. (Note to earlier poster - today is the 13th in Australia - we get the bug 2 hours after New Zealand and well before most of the world - NOT after.)
Very very often bugs are not in the code. They are in the specifications. The majority of very well known big time bad bug examples can be traced to the specification, and thence to perfectly correct implementation. The Patriot Missile is a perfect case in point. There was no bug in the Patriot code. The specification called for a missile system that was intended to be highly mobile, and would be set up roughly once a day in a new location. The required drift spec for the clock was derived from this. During Desert Storm the missiles were set up in fixed locations and no-one realised that this would result in the system remaining operational for longer than the time the clock was specified to remain within bounds. The fix was as simple as rebooting the system daily.
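Whichever way you assign the blame, the arithmetic of uptime against clock drift can be sketched roughly as follows. The figures are assumptions taken from the commonly cited account of the incident, not from anything in this thread: a 0.1 second tick held in fixed point with 23 fractional bits and chopped rather than rounded. `stored_tick`, `drift_after` and `hours_until` are invented names.

```python
from fractions import Fraction

TICK = Fraction(1, 10)       # intended tick length in seconds (assumed)
FRACTION_BITS = 23           # assumed fixed-point precision of the stored tick
TICKS_PER_HOUR = 3600 * 10   # ten ticks per second


def stored_tick(bits=FRACTION_BITS):
    """The tick length as actually representable: 0.1 chopped to `bits` binary places."""
    scale = 2 ** bits
    return Fraction(int(TICK * scale), scale)


def drift_after(hours, bits=FRACTION_BITS):
    """Seconds the clock has fallen behind after `hours` of continuous uptime."""
    per_tick_error = TICK - stored_tick(bits)
    return float(hours * TICKS_PER_HOUR * per_tick_error)


def hours_until(tolerance_s, bits=FRACTION_BITS):
    """Uptime in hours at which the accumulated drift reaches `tolerance_s` seconds."""
    per_hour_error = TICKS_PER_HOUR * (TICK - stored_tick(bits))
    return float(Fraction(tolerance_s) / per_hour_error)


if __name__ == "__main__":
    for hours in (8, 24, 100):
        print(hours, "h uptime ->", round(drift_after(hours), 4), "s behind")
```

Under these assumptions the drift is linear in uptime, so a clock that stays in bounds over a day of operation drifts about four times as far after 100 hours. That is why a daily reboot is enough to keep it inside whatever tolerance was specified.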
Building and managing large code systems is hard. There is a lot of snake oil out there that claims to provide magic (silver) bullets to cope. Most are a waste of time, or only useful in very constrained environments. Building an accounting system is a very different beast to an operating system. But it sounds as if VMWare need to get their release QA process sorted. This one should have been caught.
i was at a customer site today in Oz where 50% of their infrastructure was DOWN HARD. from what i overheard, they couldn't get through to the Virtual Support Drones and spent the majority of the entire Oz business day rebuilding literally hundreds of servers to remove the patch.
regardless of platform, imagine most of your servers down all day. and imagine the urge to not piss off to the pub at 0900 when they came in and found Nightmare on Virtual Street.
i just happened to have to hang around most of their day as we were trying to get some apps installed, and i'd be the first to say that this won't be the last that the VM guys hear of this. it smelled worse than a bad Blooper Patch Foolsday from Redmond.
For anyone that doesn't want to use VMware but wants virtualization (or consolidation more accurately), they should take a look at OpenVZ. It's free and works very well. It only runs Linux, so don't plan on using Windows with it. That's actually a feature: you save a ton of money on Windows licenses.
VMware is for the birds.
except maybe the single point of failure ... and unknown hardware contention but we'll see that one later.
I still can't believe production systems are running using this thing.
One extra layer of crap that is very handy during functional/business testing ... but heavy load, stress & volume, network & disks. I'll be watching from the sideline.
Storage virtualisation ... when you are used to precisely locating data on your spindles to max the perf? Trix, let us know when it goes tits up, so you don't feel alone when the "I told you so" moment comes.
Of course they're using it - It's ESX - it's the enterprise virtualisation system with a proven track record... thousands of businesses rely on it.
When the free one came out I even moved to it at home because it was amazing how much more efficient ESXi was than VMWare Server (it's able to do far better resource management as it's a custom OS with a tiny footprint).
VMWare haven't been exactly forthcoming on this bug though. They started OK.. emailed everyone on their list and said they'd update 'every two hours'.
Somewhere between sending that and actually working on it they changed their mind.. not only did they not update every two hours, they deleted their kb article referring to the bug, so it's impossible to find out what the state is now. Not even Microsoft attempts that kind of news management.
Programs have bugs. Linux has bugs. Firefox has bugs. VMWare has bugs.
If you sysadmins really believe that all software on your systems is 100% bug free you should be fired. Why is this 'a really big deal' for VMWare? They're embarrassed, the developer responsible is embarrassed, the code reviewers are embarrassed, the static analysers are embarrassed. Apart from that they really couldn't give a shit.
Shit happens. It'll happen again.
@benefits of virtualisation
yeah, good URL: it states:
RESOLUTION
To resolve this problem, use the appropriate method:
>Back to the top
(as in there is none... meh - it's for Win95/Win98...)
@VMs why!!!
Hooray for someone who has their head screwed on... this AC distilled it all in one bucket: iron is dirt cheap, standardise on one OS, plan+prep, and use applic layer balance Xrs to mitigate the user load - dude, you one top man! Welcome at my place any time for a beer....
lastly... did VMware screw itself by itself, or was it change of date by an app that did it? Either way what a hilarious fuck-up... what else is waiting in the wings - system call to query the OS type and it deletes the boot partition? ... caveat-emptor for you boys+girls who want to cut corners....
[hint: never employ Ex-MS execs - MS implant funny devices in their heads b4 they leave....'resistance is futile' is replaced by 'prudence is infantile' as a disguise]
skull+crossbones coz you gotta have a pirates mentality to survive in the IT marketing world of utter bullshit... (and quoting 'parlez' when you're caught out will only attract a quick walk down the plank)
My company was looking (was being the big word) at using VMWare, had a salesman in yesterday telling me how great VMWare was and how stable etc (usual salesman bit).
Weirdly he didn't mention this bug, anyone care to draft me a response to his email today asking when we were thinking of signing up for VMWARE ?
You can only use the following words - Hell, Freezes, Over, When....
Why do you think so many companies have long adoption cycles for new operating systems and software?
Cliches exist for a reason. "Fools rush in", and so on.
When Windows 98 came out, a company I worked for decided it was finally time to upgrade all of their desktops and laptops to Win95. The company I am at now, and all of our clients, are still using XP. Why? Because you don't need the latest and greatest updates for everything.
Whether it's open or closed source software, there will be bugs.
As for the immediately previous poster's comment (unless some got squeezed in before I clicked 'Post'; I am referring to the Paul who hates salesmen), if you were to take the same policy towards every piece of software (rejecting it due to one bug), you would at best be using MS-DOS if not manual typewriters or pen and paper for everything.