It is every IT administrator’s worst nightmare. All the employees’ desktops have been virtualised and are running on a server. The pilot project worked well and everyone was happy, but then the team tried to scale it up and now it’s Monday morning and 3,000 users have just walked in with their lattes and croissants, sat down at …
Only a surprise to noobs
This effect has been around since the beginning of time.
Whether it's hundreds of users coming in on a Monday morning and all trying to access their email at the same time (off the one server that was only capacity-planned for a steady-state load).
Or the hundreds of call-centre staff who all go <click> when their shift starts: like the email- or Windows-server "storm", but much more intense, as they all start within a few minutes of each other.
Or (worst of all) bringing up a system after a crash, when EVERYONE tries to log in repeatedly, just as soon as they see the login screen.
It was even a problem in the days of mainframes, when everyone tried to fill in their weekly timesheets at 16:30 on a Friday afternoon. (They had to be done before you left, and you couldn't fill them in earlier, 'cos you didn't know what you'd be doing. Well, you did: you'd be waiting for CROVM4 to respond for about 15 minutes.)
But, of course, no manager is prepared to shell out for a system that's specc'd at 500% of their steady-state capacity requirements, to handle a workload that will only exist for a few minutes once or twice a week.
Re: Only a surprise to noobs
That's pretty much what I was thinking - you'd have to have been pretty crap in your initial project to *not* think "Hmm, how're we going to deal with peak load situations?" I mean it's not like they don't already happen even with domain-bound workstations that use networked authentication and home directories...
Yes, there'll still be idiots who get caught out by it, but then that's the nature of idiots, isn't it?
I wonder if there's mileage in caching some of the OS components on the client machines. Sure, you'd need slightly more expensive, complex and powerful clients, but on the other hand, if they booted using a local kernel and userland and then accessed files and applications over the network, think of the bandwidth saving!
Just think about it..
..you could even load up the entire OS locally, saving huge bandwidth and server power. Some kind of "personal" computer.
Why hasn't anyone thought of that before?
So what you're saying..
..is that virtualisation only works if the machines concerned are under-utilised in the first place :)
What he's saying is that VDI only works if you have an obscene budget and/or more money than sense, or if your users access the system sequentially, and we all know how quick sequential access is, right chaps? ;-)
More proof this cloud BS is simply another method of trying to shift more server units
Everything's under-utilised and that's a good thing
All computers (servers, PCs etc.) are under-utilised. When they ARE fully utilised (i.e. running at 100% capacity), the response time is so bad that everyone complains. The basic problem is that t'management buy hardware, not a service, and baulk at the idea of spending £-mega on a machine that will sit around rusting for most of the time.
Although if you look in any company car park, you'll see more £-mega of cars sitting around rusting, which nobody is worried about. That's because the owners of those fine vehicles are prepared to pay for the utility they get (i.e. not having to sit/stand beside, or be groped by, some smelly stranger on public transport) rather than for the utilisation they get, measured in minutes of use per day.
The basic problem is that people fixate on the reports from performance monitoring tools, rather than the quality of the service they are getting.
Using one shared disk image is not always enough; this is where technologies like Linux/KVM's KSM (Kernel Samepage Merging) can drastically cut down the memory requirements on the host server. Red Hat claim to have had 600 virtual machines running on a server with 256GB of RAM and 48 cores. That's almost certainly overdoing it, but it can be done with KVM/KSM.
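To see why sharing matters, here's a back-of-envelope sketch (the one-model, 1GB-per-guest figure and the "shared across all guests" assumption are mine for illustration, not Red Hat's numbers) of how much of each guest's memory KSM would have to merge for 600 VMs to fit in 256GB:

```python
def min_sharing_fraction(num_vms, ram_per_vm_gb, host_ram_gb):
    """Smallest fraction s of each guest's pages that must be merged
    (assuming merged pages are identical across ALL guests) for the
    fleet to fit in host RAM. Resident memory is modelled as one
    shared copy plus each guest's private remainder:
        resident = ram_per_vm_gb * s + num_vms * ram_per_vm_gb * (1 - s)
    """
    naive_total = num_vms * ram_per_vm_gb
    if naive_total <= host_ram_gb:
        return 0.0  # already fits without any sharing
    return (naive_total - host_ram_gb) / (ram_per_vm_gb * (num_vms - 1))

# 600 one-gigabyte guests on a 256GB host: KSM would need to merge
# roughly 57% of every guest's pages for this to fit at all.
print(min_sharing_fraction(600, 1, 256))
```

Identical guest images booted from one template are exactly the case where that kind of page duplication actually exists, which is why KSM and VDI get mentioned in the same breath.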
Perhaps we could develop an Intelligent Thin Client™. The author touches on the subject of direct-attached storage, but what if we could somehow extrapolate that concept further and have a client which could process its own boot sequence, not just by utilising local storage but by utilising client-side memory and processing capacity.
Such a magic bullet could and would massively alleviate the burden on network and server resources, and even allow users to continue working should back-end infrastructure fail. It'd also cost a fraction of upgrading to a fibre-core LAN/WAN/SAN and doubling the number of servers in the VDI cluster to provide sufficient failover capacity (IMHO).
Actually, the compelling case for VDI is security and support for large enterprises with multiple sites. Thin clients require little skill to maintain and support, and upgrades can all be managed from the data centre. A corrupted VM can be blatted and a fresh one started in moments; an upgrade requires a few images to be modified offline, not an estate of thousands of machines.
The more I read about new technology the more I feel as though I've fallen through a time warp.
We are about to deploy a large VDI implementation, and we have been told from day 1 by our vendor that you always maintain a configurable pool of pre-booted instances on each server, and that you need to be very careful with bandwidth between your storage and servers. They also suggested that holding the VM image on a flash disk in the server might not go amiss.
To me that was "No Sh** Sherlock", but then I've been doing this a long time, and remember all about session start-up on our VAX servers at 9am on a Monday morning. Mind you we did only have to worry about RS232 connectors to VT220s and the odd terminal concentrator. Nor did I have to cope with managers who thought that VPNs increased the available bandwidth on a link.
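That configurable pool of pre-booted instances is simple enough to sketch. Here's a toy Python version (the class and names are illustrative, not any vendor's API): hand out a warm VM on login and top the pool back up behind the scenes, so the cold boot never sits in the user's login path.

```python
from collections import deque

class WarmPool:
    """Toy pre-booted VM pool. Cold-booting is the slow path;
    handing out an already-booted instance is cheap."""

    def __init__(self, target_size, boot_vm):
        self.target_size = target_size
        self.boot_vm = boot_vm  # the slow operation: cold-boot one VM
        self.pool = deque(boot_vm() for _ in range(target_size))

    def acquire(self):
        # Fast path: a warm VM is waiting. Otherwise the storm has
        # outrun the pool and this login pays for a cold boot.
        vm = self.pool.popleft() if self.pool else self.boot_vm()
        self.refill()  # a real broker would do this asynchronously
        return vm

    def refill(self):
        while len(self.pool) < self.target_size:
            self.pool.append(self.boot_vm())

# Tiny demo with a counted stand-in for the real boot:
boots = 0
def fake_boot():
    global boots
    boots += 1
    return f"vm-{boots}"

pool = WarmPool(target_size=3, boot_vm=fake_boot)
desktops = [pool.acquire() for _ in range(5)]  # every login hits the fast path
```

The sizing question the vendor was really answering is "how big does `target_size` have to be to cover the 9am arrival rate", which is where the storage bandwidth warning comes in.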
Load- and StressTest during and after PoC
VDI is a complex matter, not just due to performance but in many other ways (as can be read in the comments). But the basics are the same as with Server Based Computing (SBC), which we've been doing for the last ten years.
During the design phase you should always consider the capacity, and therefore the scaling requirements and possibilities. The impact of booting a machine is huge, and a user logging in generates about 150% of the load of a regular working user (as a guideline). Jim Moyle wrote an excellent article about IOPS in a VDI environment which I highly recommend: http://www.jimmoyle.com/2011/05/windows-7-iops-for-vdi-deep-dive/
The thing is though, you should always test the performance during and after a PoC (or in the development phase). What is the impact of one machine, one user or certain processes? What happens if you increase the users? What happens if all users write their billable hours at the same time?
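Using that ~150% login guideline, the peak-versus-steady arithmetic can be sketched in a few lines of Python (the per-user IOPS figure and the function name are placeholders; you'd measure your own numbers during the PoC):

```python
def peak_load(users, steady_load_per_user, login_factor=1.5,
              concurrent_login_fraction=1.0):
    """Total load when some fraction of users are mid-login (at
    ~login_factor times a working user's load) and the rest are
    working normally. All numbers here are placeholders."""
    logging_in = users * concurrent_login_fraction
    working = users - logging_in
    return (logging_in * steady_load_per_user * login_factor
            + working * steady_load_per_user)

# 3,000 users at a nominal 10 IOPS each: steady state needs 30,000
# IOPS, but everyone logging in at 09:00 pushes that to 45,000.
print(peak_load(3000, 10, concurrent_login_fraction=0.0),
      peak_load(3000, 10))
```

A boot storm is worse still, since booting costs far more than logging in; the same arithmetic with a measured boot factor tells you whether the storage tier survives Monday morning.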
This is why a LoadTest is key to success. A StressTest would have prevented the problems described in the article, or at least you would have known upfront (and could have informed the management before the managers started complaining).
I wrote a two-part article about LoadTest/StressTest best practices here: http://www.ingmarverheij.com/2011/05/loadtesting-best-practices-part-1/.
LoadTesting is relatively easy and cheap if you know what your objectives are. If you have the chance, and I bet there is, make sure you validate your design (and, as a bonus, you'll be able to see the impact of changes in the infrastructure or applications).
I have a solution
Fixing this is easy, but I'm not sure it would take off. If everyone had their own computer, with maybe a hard disk, ram and a processor..
You know there was a reason why we moved away from dumb terminals in the 80s. I have a cricket bat to smack around the head of any co-worker who ever suggests using desktop virtualisation.
It's a pity
All this technology spent on parallel processing and all we get is the same problem we've had since the dawn of time. Man simply can't sneeze and hiccup at the same time, not that I'm sure one would want to.
timewarp? nothing new under the Sun (or IBM)
Well, first there was the vax+console cycle of the 1970s (or was it the '60s?)
... rebound to the standalone console.
Then there was the mainframe+terminal cycle of the 1980s
... rebound to the PC.
Then there was the Plan9+thinclient cycle of the 1990s
... rebound to the LAN server and desktop PCs.
Then there was the server+browser cycle of the 2000s
... rebound to the home SAN server and PCs.
Then there was the cloud+VM cycle of the 2010s
... where to next?
Seems like every generation suddenly gets this brilliant new idea about how to break out of their parents' "old-fashioned" technology.
Nobody's as dumb as those who just don't want to know
They never learn, do they? More surprising is that people have known the difference between M/M/1 and M/M/infinity queues for just over 40 years now.
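For anyone who hasn't met the queueing theory: in an M/M/1 queue the mean time in the system is 1/(μ − λ), which blows up as arrivals approach the service rate, while M/M/∞ (a server per customer) stays flat at 1/μ. A minimal sketch, with made-up login rates:

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean time in system for an M/M/1 queue (one shared server):
    W = 1 / (mu - lambda). Explodes as utilisation approaches 100%."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: the queue grows without bound")
    return 1.0 / (service_rate - arrival_rate)

def mm_infinity_response_time(service_rate):
    """M/M/infinity (a server per customer): time in system is just
    the service time, 1 / mu, whatever the arrival rate."""
    return 1.0 / service_rate

# Capacity of 10 logins/sec: at 9 arrivals/sec the shared server takes
# 1.0s per login; push it to 9.9/sec and it takes ~10s, while the
# infinite-server model stays at 0.1s throughout.
print(mm1_response_time(9, 10), mm1_response_time(9.9, 10),
      mm_infinity_response_time(10))
```

Which is exactly the Monday-morning storm in miniature: the system that was fine at 90% utilisation falls over at 99%.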
Don't blame the charlatans who sell the snake oil, blame the idiots who buy it.