Fault tolerance in virtualised environments

In this, our final Experts column in the current server series, our reader experts look at fault tolerance in virtualised environments. As ever, we’re grateful to Reg reader experts Adam and Trevor for sharing their experience. They are joined by Intel’s Iain Beckingham and Freeform Dynamics’ Martin Atherton. Server …

COMMENTS

This topic is closed for new posts.
  1. Anonymous Coward
    Happy

    Virtually speaking...

    The Reg is really trying to push this virtual stuff this week, no?

    "...Machines that would have turned end-of-life this year can now run a handful of virtual servers with ease and the huge savings in terms of costs of equipment"

    I'm sorry - why does running a VM suddenly make a machine faster or have more capacity? The machine won't get any faster (it will actually get slower because it's now running with the overhead of the VM), and you can throw more disks at the machine to increase capacity without employing a VM.

    "...Virtualisation, when handled properly can lead to some pretty amazing uptimes...bla bla ....clusters ....failover ....etc"

    I can do this with an OpenBSD box running pfsync and CARP, and I'm sure other OSes have similar capabilities. And my current box has been running (with OpenBSD) 24 hours a day for the last 5 years without a hitch, and that's WITHOUT any kind of failover or redundancy. "Pretty amazing uptime" is standard if you choose the correct platform.

    You may have guessed, I'm still not convinced by the VM stuff. Whenever anyone talks about VM, it is usually in the same sentence as "Windows". If you are going to run critical services on something as flaky as Windows then no wonder you feel the need to cluster it all and run multiple redundant servers, even for basic stuff. As has been pointed out previously, VM seems to be mostly a sticking plaster over (almost exclusively) Windows to make the thing work with any kind of reliability. If you use a proper, reliable OS (and use appropriate chroot / jails), you simply do not need all this crap.

    1. Dan 10
      Happy

      New comment system still needs titles

      Some reasonable points made there, although I think the sticking plaster bit is a slight oversimplification. The fact is that Windows is the dominant platform for many companies. Microsoft were happily bumbling along, being criticised for poor HA measures etc, when along came the upstart VMware, which turned the world upside down and allowed Windows shops to acquire the improved management/availability etc without a wholesale shift to a completely different OS.

      Incidentally, although you're right about having to run the overhead of the hypervisor etc, the memory management model in virtualisation products is partly what allows a number of lightweight-ish VMs to co-exist on a previously mediocre-spec server. Shared code pages between VMs and a decent optimisation model for private iterations of copy-on-write memory pages mean that a basic server may consume a small fraction of the 512MB that it had when it was physical. Combine that with the fact that most Windows servers are only utilised at <15% and yes, you can have 5 servers on a relatively old bit of kit, running at maybe 60%+ util.
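
The consolidation arithmetic above can be sketched in a few lines of Python. This is a rough estimate only: it assumes every guest previously ran on hardware identical to the host, that CPU load simply adds up, and that hypervisor overhead is a flat fraction of the host. The function name, the 12% per-server figure (standing in for "under 15%") and the 5% overhead are illustrative assumptions, not measurements.

```python
def consolidated_utilisation(per_server_util, n_servers, hypervisor_overhead=0.0):
    """Rough host utilisation after consolidating n lightly loaded servers.

    Assumes identical hardware and purely additive load (a simplification).
    """
    if not 0.0 <= per_server_util <= 1.0:
        raise ValueError("per_server_util must be a fraction between 0 and 1")
    return n_servers * per_server_util + hypervisor_overhead


# Hypothetical figures: five servers at 12% each, plus 5% assumed overhead.
without_overhead = consolidated_utilisation(0.12, 5)
with_overhead = consolidated_utilisation(0.12, 5, hypervisor_overhead=0.05)
print(f"{without_overhead:.0%} without overhead, {with_overhead:.0%} with")
```

At the quoted ceiling of 15% per server the same sum gives 75% before overhead, so the headroom depends heavily on how far under 15% the servers actually sit.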

      Oh, and finally, Trevor Pott has one serious moustache.

      1. Anonymous Coward
        Happy

        ...but...

        I hear what you're saying but shared code pages, copy-on-write stuff, virtual memory models... this is what an OS does for a pastime. I know I am repeating what many others have said but this is what they are designed to do - you run your apps on the OS and get the rest of it for free.

        Also, you say that "sticking plaster" is an oversimplification and then go on to explain that it is BECAUSE Windows is so bad that VMs are gaining interest and make an otherwise unviable platform half usable. What is that if it's not "sticking plaster"? Just to hammer home the point, you then say that running Windows in a VM means that you end up using "a fraction of the 512MB that it had when it was physical". All this demonstrates is that Windows has an appallingly bad memory management model, which should come as no news at all to anyone who has opened a Word document, never mind tried to run any kind of service on it.

        I respect what you're saying and I can see that VM techniques are interesting (they have been around for donkeys, so there must be something of interest in them). It's just that they are the wrong solution in most cases. The correct solution is to use a platform that works correctly in the first place and doesn't need scaffolding to stop it falling flat on its face.

        1. Trevor Pott o_O Gold badge
          Pint

          @AC 16:29 GMT

          "The correct solution is to use a platform that works correctly in the first place and doesn't need scaffolding to stop it falling flat on its face."

          This is absolutely 100% true. Now, to make this a reality, can you please (pretty please?) provide the tens of millions of dollars required to have the software vendors for my industry rewrite all their apps for Unix or Linux? Perhaps you can toss a few extra tens of millions in there to get them to make it all cluster aware, and while you're at it we'll have to get them to code everything to run on something like Hadoop.

          x86 virtualisation is a kludge. I know this, you know this, and almost everyone who uses it knows this. It was a way to deal with bad programmers, and poor platform choices. It's also a way to graft HA and cluster functionality onto things that simply don't have it without rewriting them. The more mature it gets the more features get built into the virtualisation stacks that, let’s face it, should be part of the OSes and apps we are using.

          Just as an example, in my industry, all the critical applications have code bases 15 years old or better. It's an industry in which the client-side of things will always be Windows. (Photoshop, amongst many other critical applications, only runs on Windows.) There isn't a choice to simply "not use Windows."

          For the many people in my situation x86 virtualisation is an absolute godsend because it allows us to try to compensate for the bad decisions made by entire industries over the course of literally decades.

          No matter what level of evangelising is thrown about, "right solution" versus "wrong solution," Apple versus Windows, Linux distie A versus Linux distie B...we don't all get to choose what we use. Even if we did get to choose the platform we wished, there’s a religious zealot around the corner screaming that our choice is “wrong, wrong, wrong” for the entire internet to hear. You make a fair enough statement that the best path is to choose a path that doesn’t involve vulnerable unstable platforms. The problem is that this insight doesn’t help the vast majority of us get the actual applications we must use to keep our businesses running migrated to these “stable platforms.”

          So what is presented seems a lot like bellyaching and evangelising without helping or offering a workable solution. Unless of course you’re stumping up the several Mil required to get the devs to port the applications we use. In that case I'm your new best friend for life. If not, I’ll stick to my virtualisation, and smile because it cuts my workload in half, and does about the same to our downtime.

          For now, though, there's work to be done, and soon it will be pub o'clock.

          1. Anonymous Coward
            Happy

            Ok, but...

            OK, I'm with you on that. You have some apps that will only run on Windows. There's not much you can do about that. But are we talking about running VM on the desktop or on the server? Correct me if I'm wrong, but most VMs are running on servers. Does Photoshop really care what system is providing network-based storage (I don't know - what other network services might Photoshop use)?

            It seems to me (and again, I am happy to stand corrected) that many people take apps (email, web servers, database servers, etc.) and run each of these in a VM instance. These are not desktop applications, and viable (and (almost?) always, MUCH better) alternatives are available that will run on non-Windows platforms.

            I'm not an evangelist; it is no skin off my nose what you run on your machines. But I DO have to work with Windows on a daily basis and I find it incredible that many people and companies persist with it, and (in the shape of VMs and other techniques) constantly prop it up and then sit back and say "hey - that's really cool", when in reality it is (as you say yourself) a kludge, and a pretty awful one at that. In fact, it's not "cool" at all; it is a distinctly backward step.

          2. Anonymous Coward
            Thumb Up

            One last thing

            Kudos to you for admitting that the VM thing is a kludge. You must be the first person I have (virtually) seen that uses (and indeed supports the use of) VM but admits it's a kludge. Every other VM person seems to treat it as a holy grail.

            1. Trevor Pott o_O Gold badge
              Pint

              re: VMs

              Oh **** yes it's a kludge. The ideal system would be something small, yummy and stable that I could run on a vast array of disposable lightweight servers. (See: mini racks of Atoms or CULV servers that are becoming a "niche thang.") Problem is that huge numbers of workloads require Windows, and Windows is both shite at high availability and bad with the not dying.

              As to Photoshop (and similar client-side apps necessitating Windows) and what they have to do with the price of rice here...client-side Windows means server-side Windows. Don't bother with the "that's bollocks, you can use server-side Linux with your Windows clients." Been there, done that, went back to Windows. For all the alternatives, Windows clients talking to Windows servers (and the nice stack of vertically integrated goodies Microsoft sells CALs for) really are just way easier to use.

              There is a point where the time and sanity of the admin who has to run everything has to be considered, and (sad but true,) if there is a significant Windows estate deployed, you will probably be better off with Windows Servers running the show behind the scenes.

              Still, kludge or not, x86 virtualisation *is* the greatest thing since sliced bread. It solves a gigantic pile of problems that used to give me ulcers, and I am one of them folks who are far too poor to even buy the management tools. (Let alone the blue crystals!) The world is full of crummy programmers, and great programmers restrained by crummy project managers. Virtualisation helps the man in the trenches keep it all running.

              For everything else, there’s Mastercard.

      2. Trevor Pott o_O Gold badge

        @Dan 10

        You should have seen last year's beard...

  2. Pantelis
    WTF?

    Correction for Adam Salisbury

    "There are vendors who specialise in fault tolerant hardware on which to run your hypervisors but these are expensive to buy and even more expensive to code software for, an arguably cheaper option would be to invest in a blade frame for perhaps the greatest resilience and even cheaper option than that is fault tolerant software".

    That statement is unfortunately false with regard to the "expensive to code software for" part. There are FT systems that can run VMware ESX, and then you can run on top of ESX other OSes and any application that can run on those OSes without any extra coding, let alone "expensive" coding. Both NEC and Stratus have such Xeon-based FT systems and, though pricewise they are more expensive than standard Xeon servers, they do offer complete hardware redundancy and uninterrupted operation even if they experience a failure on any component (CPU/memory/chipset/video card/network card etc) without the need for any extra or special configuration, or coding, or software, because the fault tolerance is built into the hardware.

    Keep in mind that they don't offer cluster type continuous operations because in a cluster you have what is called failover time. For such FT systems there is no failover; it is true uninterrupted operation in the event of failure.

    Get your facts straight please before posting erroneous articles which may mislead others!

  3. Trevor Pott o_O Gold badge

    @Pantelis

    Can you give specific examples? Preferably cheap examples?
