Re: Still plugging something
"that nobody really wants."
Hmmm, I don't think you've read the article properly, nor do I think you understand programmers.
If you look at what's been going on in CPU design over the last 15 years, you can clearly see that the CPU manufacturers have concluded that the vast majority of programmers are not prepared to confront the un-scalability of the software they write.
What Happened When Someone Built a Pure NUMA CPU
For example, the Cell processor in the PS3 is perhaps the ultimate physical expression of the benefits of properly embracing NUMA in your software. The Cell doesn't give you the option - its maths cores (the SPEs) are unable to directly address each other's memory - pure NUMA. This obliges the programmer to write software that is wholly NUMA aware. If you do that, and you know what you're doing, you can get performance that even today Intel's biggest chips are only just challenging.
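To make that concrete, here's a toy sketch of the programming style this forces on you. It is only an illustration, not the Cell SDK: LocalStore, dma_get, dma_put and scale_in_blocks are names I've made up, and a real SPE program would use asynchronous DMA and run the cores in parallel. The point is just that every byte a core works on has to be explicitly moved into its private store first, and explicitly moved back out afterwards.

```python
# Toy model only: LocalStore, dma_get and dma_put are hypothetical names,
# not the real Cell SDK API.

class LocalStore:
    """Private memory owned by exactly one core; no other core may touch it."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []


def dma_get(main_memory, start, count, store):
    """Explicitly copy a block from main memory into a core's local store."""
    if count > store.capacity:
        raise ValueError("block does not fit in the local store")
    store.data = main_memory[start:start + count]


def dma_put(store, main_memory, start):
    """Explicitly copy results from the local store back to main memory."""
    main_memory[start:start + len(store.data)] = store.data


def scale_in_blocks(main_memory, factor, n_cores=4, capacity=4):
    """Multiply every element by factor, staging each block through a local store."""
    stores = [LocalStore(capacity) for _ in range(n_cores)]
    for block, start in enumerate(range(0, len(main_memory), capacity)):
        store = stores[block % n_cores]  # hand blocks out round-robin
        count = min(capacity, len(main_memory) - start)
        dma_get(main_memory, start, count, store)
        # The compute step only ever touches data in the core's own store.
        store.data = [x * factor for x in store.data]
        dma_put(store, main_memory, start)
    return main_memory
```

Note that the block size, the placement and the transfers are all the programmer's problem here. That's exactly the work most people aren't prepared to do.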
Hiding NUMA From the Programmer
Intel, by contrast, when they finally went NUMA, hid it from the programmer by making QPI synthesise an SMP environment. It took a lot of silicon, and their design panders to the 'average' use case of one machine running several different programs and needing good performance for each of them.
Which Design Strategy Sold Best?
Now, any kind of market analysis will show that Intel got it right, and that IBM, Sony and Toshiba got it wrong. Sure, Sony put the Cell in the PS3 and have sold a bundle of those, but the number of programmers who can fully exploit the CPU is really very low. IBM realised that too, which is why they dropped it a few years back, much to my great annoyance, since in the world of high-capacity, massively parallel signal processing such architectures are very familiar and exciting.
So?
So what does that analysis tell us? It tells us that programmers, mostly, cannot / do not / aren't allowed to spend time and effort properly architecting their code for true scalability.
So let's carefully analyse what ScaleMP have actually done with their hypervisor. In effect they've done an Intel. Intel, for multi-socket boxes, have a bunch of cores connected on a network (the QPI) that allows any of them to access any memory anywhere else as if it were a true SMP system. All that ScaleMP have done is write a hypervisor that, if you squint only a little bit, provides a bunch of virtual cores connected on a network (the Infiniband) that allows any of them to access any memory anywhere else, as if it were a true SMP system.
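If it helps, here is the same idea as a toy model. I'm not claiming this is how QPI or ScaleMP's hypervisor actually work internally; Node, FlatAddressSpace and the simple divide-the-address mapping are all my own assumptions for illustration. It just shows the trick both of them play: the program sees one flat address space, and something underneath quietly works out which node really holds the memory and fetches it over the network.

```python
# Toy model only: Node and FlatAddressSpace are hypothetical names, and the
# address mapping is an assumption, not ScaleMP's or Intel's real mechanism.

class Node:
    """One socket (or one box) with its own physically local memory."""

    def __init__(self, size):
        self.memory = [0] * size


class FlatAddressSpace:
    """Presents the memory of several equally sized nodes as one flat array."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.node_size = len(nodes[0].memory)
        self.remote_accesses = 0  # accesses that had to cross the interconnect

    def _locate(self, address):
        # Flat address -> (owning node index, offset within that node).
        return divmod(address, self.node_size)

    def read(self, address, from_node):
        node_index, offset = self._locate(address)
        if node_index != from_node:
            self.remote_accesses += 1  # invisible to the caller, but slower
        return self.nodes[node_index].memory[offset]

    def write(self, address, value, from_node):
        node_index, offset = self._locate(address)
        if node_index != from_node:
            self.remote_accesses += 1
        self.nodes[node_index].memory[offset] = value


# Usage: code running on node 0 touches address 5, which lives on node 1.
space = FlatAddressSpace([Node(4), Node(4)])
space.write(5, 42, from_node=0)
value = space.read(5, from_node=0)
```

The caller never has to know where address 5 lives, which is the whole attraction. The remote_accesses counter is the bit the programmer doesn't see, and the bit that decides how well the thing scales.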
Given that, and the clearly continued success of SMP (synthesised or not) in the modern NUMA world, how can you say "that nobody really wants" it? I think ScaleMP will do quite well once system developers realise what it is.