In the summer of 2008, Google flipped the switch on its App Engine, letting outside developers build applications atop its state-of-the-art online infrastructure – and it soon got a lecture from Jason Hoffman. Hoffman – the founder and chief technology officer of Joyent, a San Francisco outfit offering a (somewhat) similar …
Faster than Bare-metal?
"If anyone uses or ships a server, the only reason they wouldn't use SmartOS on that box would be religious reasons," he says. "We can actually take SQL server and a Windows image and run it faster than bare metal windows. So why would you run bare metal Windows?"
"Bare-Metal Windows" meaning what exactly? Windows running on a bare-metal hypervisor or a normal non-virtualised windows installation?
If the latter; how the hell do you do that?
I was wondering the same thing then more carefully re-read the section.
I believe he means that an IO-bound VM can run faster than on bare metal, presumably because ZFS is so much better than NTFS and you have some spare RAM for caching. Truth be told, Windows is not great at disk caching.
Still it's a pretty bold claim, one that I think has a few strings attached.
Anyway, live VM migration without a SAN *is* exciting. All hail ZFS!
Faster than Bare-metal?
All through the article he talks about this being for 'Legacy' stuff. That's usually code for the stuff that no one understands but we dare not touch because we know the company depends on it.
Often the reason for virtualising this kind of load is to get it onto faster, more supportable HW without changing the OS.
In this situation, by making better use of the new HW than the underlying guest OS can, we can see better performance than running the guest OS on native HW.
For example, if you virtualise a 32-bit Windows OS with a memory limitation of ~3.5GB on a server with a large memory, then you could conceivably use the extra memory as a cache and reduce the I/O considerably.
If you have a legacy server running some old version of Windows that can't make use of modern 10Gb Ethernet or 8Gb SAN HBAs, then again a virtualisation layer can get around this.
The ultimate example would be an old OS that simply won't boot on new HW, where the existing HW is failing and can't be replaced because it's no longer available.
The performance on existing hardware is none at all because the existing HW has died.
The performance on new HW is 'none at all' because the crappy old OS can't boot.
The performance on a VM on the new hardware is something, therefore better than 'bare-metal'.
Presumably the SmartOS container is better at managing IO, resources, etc. Given that MySQL doesn't scale well over multiple CPUs, it doesn't sound that far-fetched.
Sometimes emulating is indeed faster
FreeBSD sports something called a 'linuxulator', a shim that wraps around Linux binary processes, catches the syscalls and translates them to FreeBSD ones as available. That means you can run the ordinary stuff, but really Linux-specific stuff will barf or bail. Incidentally, this is the same mechanism used to support previous major versions, allowing for backward compatibility, as breaking API changes are reserved for major version number changes. Anyhow. It used to (and might still) be the case that certain syscall implementations on FreeBSD were just a tad faster than the same calls on Linux, even counting the translation overhead, allowing FreeBSD to run Linux binaries faster than Linux.
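To make the "catch and translate" idea concrete, here is a toy sketch of such a shim. It only models the dispatch step: every syscall number, handler name and return value below is invented for illustration and has nothing to do with the real Linux or FreeBSD tables.

```python
# Toy model of a syscall translation shim. All numbers and names here are
# made up for illustration; they are not real Linux or FreeBSD values.

class UnsupportedSyscall(Exception):
    """Raised when the foreign binary uses a call the shim cannot translate."""


def native_write(fd, data):
    # Stand-in for the host kernel's own write implementation.
    return len(data)


def native_getpid():
    # Stand-in for the host kernel's own getpid implementation.
    return 4242


# Foreign syscall number -> native handler (hypothetical numbering).
TRANSLATION_TABLE = {
    1: native_write,
    20: native_getpid,
}


def shim_dispatch(foreign_number, *args):
    """Catch a foreign syscall and forward it to the native equivalent."""
    handler = TRANSLATION_TABLE.get(foreign_number)
    if handler is None:
        # The "barf or bail" case: nothing native to map this call onto.
        raise UnsupportedSyscall(f"no translation for syscall {foreign_number}")
    return handler(*args)
```

The table lookup is the only overhead the shim adds, so if the native handler is quicker than the foreign kernel's own implementation, the translated call can still come out ahead.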
Something similar might be the case here. Note the mention of IO-bound being a factor. If they have very good IO handling, block caching, and so on, and Windows (still) doesn't, then that's a fairly easy win. There might be other things that are just more efficient when done through several translation layers rather than having the redmondian code go at it alone.
Up to 50 times performance gain for a JVM?
The JVM is already reasonably nippy. Is he really saying that through magic he can make my 100ms process take 2ms? Because that seems to be the implication of that sentence ....
Sounds amazing, but waaay too good to be true. Maybe this is true in some bizarre I/O-bound corner case; exaggeration on this scale doesn't inspire confidence.
but your 3 hour process may well be significantly sped up...
Maybe he is talking about throughput? Not necessarily lowering latency?
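A quick back-of-the-envelope sketch of that distinction, using Little's law (throughput = requests in flight / latency). The 100ms figure is the one quoted above; the assumption that a 50x gain would come from handling 50 requests at once is mine, not anything Joyent has stated.

```python
def throughput_per_second(latency_ms, in_flight):
    """Little's law: completed requests per second for a given concurrency."""
    return in_flight * 1000 / latency_ms


# Each request still takes 100ms; only the number handled at once changes.
one_at_a_time = throughput_per_second(latency_ms=100, in_flight=1)
fifty_at_once = throughput_per_second(latency_ms=100, in_flight=50)

print(one_at_a_time, fifty_at_once, fifty_at_once / one_at_a_time)
```

So a "50 times" claim could hold for throughput while a single 100ms request stays a 100ms request.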
The only way you're going to run faster than bare metal (i.e. an OS running straight against the hardware) is if you're cheating on things like filesystem virtualization, e.g. holding a honking great big chunk of fs in memory. As such, what's to stop the bare-metal OS doing the same? And what happens if the power goes?
see post above
E.G. a 32-bit OS which will only support up to 3.5GB RAM (Windows, I'm looking at you) so can only have a tiddly disk cache. Never mind that the stupid OS (Windoze again) constantly tries to free RAM by swapping stuff to disk.
Stuff that on a server with >4GB of RAM and you are wasting it.
Virtualise it on a server with >4GB of RAM and even if there's some overhead the hypervisor can use all that extra RAM as a disk cache. Shimples.
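For anyone who wants to see the size of the effect, here is a rough simulation of an LRU block cache. The numbers are arbitrary assumptions (a 1,000-block working set read uniformly at random, a "guest-sized" cache of 100 blocks versus a "hypervisor-sized" cache of 1,000), and real disk caches are far more sophisticated than this.

```python
import random
from collections import OrderedDict


def lru_hit_rate(cache_blocks, accesses):
    """Replay a list of block numbers through an LRU cache; return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses)


# Assumed workload: uniform random reads over 1,000 distinct blocks.
rng = random.Random(0)
WORKING_SET = 1000
accesses = [rng.randrange(WORKING_SET) for _ in range(10_000)]

small_cache = lru_hit_rate(100, accesses)    # cache limited by the guest's RAM
large_cache = lru_hit_rate(1000, accesses)   # cache using the host's spare RAM

print(f"small cache hit rate: {small_cache:.2f}")
print(f"large cache hit rate: {large_cache:.2f}")
```

Once the cache can hold the whole working set, only the first touch of each block goes to disk, which is the whole trick.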
Cue the old 16MB barrier...
... and swapfiles on ramdisks (through EMS and lots of shuffling memory regions around) being used to speed up windows 95. Redmond's finest never held much of a cutting edge.
Windows Server 2003 and up support up to 64GB through Physical Address Extension (PAE). So unless you were using some crappy ancient version of Windows with ancient hardware and limited physical memory, you wouldn't have been restricted to 3.5GB to start with. It seems likely to me they're stuffing 2-3x the amount of physical memory into the box over what the virtual machine sees and using it as a honking big file cache in some way. Which IMO is cheating, really. If a hypervisor does that, then so too could the actual OS, just by letting it see the memory in the first place and tuning it accordingly.
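The arithmetic behind those limits, as a quick check: a plain 32-bit physical address reaches 4GiB, and PAE widens physical addresses on x86 to 36 bits, which is where 64GB comes from. The function name is just for illustration.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_gib(address_bits):
    """Physical memory reachable with the given number of address bits, in GiB."""
    return 2 ** address_bits // GIB


print(addressable_gib(32))  # plain 32-bit physical addressing
print(addressable_gib(36))  # 36-bit physical addressing with PAE
```

The commonly quoted ~3.5GB figure is lower than 4GiB because part of the 32-bit address space is reserved for devices; the exact amount varies by machine.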
PAEing the price
Sure, Windows can do PAE from 2003 and up, but then you have to pay M$ for the Enterprise version of the OS. :(
This even applies to the 2008 version, which I wouldn't describe as ancient.
Keyboard Video Mouse got to do with this?
Er, you might like to have mentioned that SmartOS is based on Illumos, and that Illumos is coming along at a very rapid and impressive clip.
"and with DTrace, they can do things he says they've never done before."
Personally DTrace enabled me to see things I've never understood and probably never will!
They're probably young and have never seen a real OS level debugger before*.
* where the REALLY arcane stuff happens whether you want it to or not even if you can understand it.
More probably: you have not seen DTrace, nor are you familiar with what it can do. There is a reason DTrace is so hyped, and why Mac OS X and FreeBSD have ported it. And why IBM AIX is copying it and calling it ProbeVue. And why Linux is trying to copy it.
Mozilla developer switches from Linux to Solaris:
“I don’t think I can work on Mozilla without DTrace ever again. Too useful.” — Rob Sayre
Why do you think every developer has hyped DTrace? Because they don't know better?
Maybe I'm a bit thick, but consider this sentence:
"On Monday, Joyent announced that it is open sourcing a version of the KVM hypervisor"
Given that KVM is already licensed under the GPL then any "version" of it by definition *has* to also be released under said license.
So they can hardly "open source" something that was already open source to begin with.
ZFS is yesterday's news
I for one will be welcoming our Btrfs filesystem overlords very soon
You must be kidding. Btrfs is immature. There is not even a way to fix corrupted btrfs filesystems yet! Sure, such basic functionality will come in time.
But do you really think ZFS will stand still? Do you think that ZFS development has stopped? There is an enormous gap between btrfs and ZFS today; ZFS is many years ahead. Do you really think that btrfs will narrow that gap anytime soon? Sure, if ZFS development stopped today, it would take some years to catch up.
There are sysadmins who refuse to use filesystems younger than a decade because they have bugs. When btrfs is released as v1.0, it will take many years before it is let into the server halls.
Sure, Btrfs might have some high ambitions, but ZFS exists today, and ZFS protects your data, while btrfs does not and might corrupt your data.