This year the supercomputing community gathers in Seattle on November 14-17 for SC11, the conference to learn and talk about how they design and build and code the biggest computers in the world. So what can you expect? Who better to ask than the chair of SC11, Scott Lathrop? In a wide-ranging conversation, we explore how the …
No More Vectors
Today's microprocessors have vector instruction sets such as MMX, AltiVec, and SSE, which involve dividing a wide chunk of memory into several numbers. This derives from such computers as the AN/FSQ-32 and the Lincoln Laboratory TX-2.
The other kind of vector instruction is the one used in computers like the Cray-1. There, a vector might consist of a large number of double-precision numbers - 64 of them, instead of 2 - but instead of being fetched all at once, they are processed one number per clock through a dense pipeline.
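To make the contrast concrete, here is a toy model of the two styles. It is only a sketch: the function names, the 2-lane width and the 64-element vector length are illustrative assumptions, and it counts instructions and element clocks rather than modelling real timing (pipeline start-up latency and memory traffic are ignored).

```python
def packed_simd_add(a, b, lanes=2):
    """Toy model of a packed (SSE-style) add: each instruction
    adds `lanes` elements side by side in one step."""
    out = []
    instructions = 0
    for i in range(0, len(a), lanes):
        # One instruction covers one register-width chunk.
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        instructions += 1
    return out, instructions


def classic_vector_add(a, b, max_vl=64):
    """Toy model of a Cray-style vector add: one instruction per
    strip of up to `max_vl` elements, with the elements streamed
    through the pipeline one per clock."""
    out = []
    instructions = 0
    element_clocks = 0
    for i in range(0, len(a), max_vl):
        vl = min(max_vl, len(a) - i)  # vector length for this strip
        for j in range(vl):
            # One element enters the pipeline per clock.
            out.append(a[i + j] + b[i + j])
            element_clocks += 1
        instructions += 1
    return out, instructions, element_clocks


if __name__ == "__main__":
    a = [float(i) for i in range(128)]
    b = [1.0] * 128
    _, simd_instructions = packed_simd_add(a, b)
    _, vec_instructions, clocks = classic_vector_add(a, b)
    print(simd_instructions, vec_instructions, clocks)
```

For 128 doubles, the packed model issues 64 instructions while the classic model issues 2, each of which keeps the pipeline busy for 64 clocks. Both produce the same sums; what differs is how much work a single instruction describes.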
Back around 1994, you could buy a board with the Fujitsu MB92831 on it - it was a single-chip co-processor with vector registers and vector instructions like those of a Cray, used in machines like the Meiko CS-2.
Why we don't have things like that now, but even better, is well-known, so they say: the extreme cost advantages of commodity hardware, and the huge gap between memory bandwidth and computing speed, which reduces the benefits from vector instructions. But I still wonder, and suspect that classic vector instructions would benefit commodity micros for prosaic tasks like computer games.
Isn't that almost exactly what this GPGPU business is all about?
Crikey, you used to post this guff in comp.arch & comp.sys.super donkey's years ago. Just to repeat what you have already been told by guys who designed, built & used PVPs... Vector instructions sucking data out of wide pipes can look great on paper, but they are a PITA when it comes to actually putting them into silicon (or GaAs or whatever else you have). Clock skew is *not* your friend. Cray went to considerable effort to minimise the impact of clock skew, and as clever as they were they still ended up developing the S-MP, CS64xx, Starfire and T3D/E lines.
As for Cray-1s and GPUs, they have had their moments in the sun because they gave folks a factor of 5 boost in MFLOPS/W. Thankfully (for the majority of us poor misguided souls who are forced into parsing JSON & XML for a living) boring readily available CPUs have caught up with them in terms of sustained MFLOPS/W in any case.
If you are serious about wanting big wide register files in your CPUs you should consider throwing a boatload of cash at Itaniums. I am not even being sarcastic here.
My coat is the one with the T805-30 rescued from Thomas the Test Engine in it. ;)
Oh dear, it's official - I'm an old geek. After weeks of watching The Big Bang Theory - which I probably recorded by mistake for totally different purposes, based on the title - I just clicked on El Reg's title about supercomputing. Because I want one!
But of course the ladies would politely say size doesn't matter. But that's probably wrong too. When I'm in bed with a lady, I just call it Dark Matter. (Mind you, the smart arse lady usually replies - that's obviously string theory.)
Why not retrophrenology? You would just need a massively parallel hammer
One of my favourite Discworld footnotes and I was beaten to it.
I need to spend more time in the pork futures warehouse, so I can predict these things in advance!