21,500 Jeffries does not seem like much.
That said, it should get a fair-sized HPC box.
But how much does the software cost to run on it?
The government is to invest £158m in IT infrastructure, including data storage, networks and high performance computing, to support research institutions and industry. Announcing the move, David Willetts, the minister for universities and science, said the largest allocation will pay for a national supercomputer to support …
Many universities do a very bad job of this, with very long overruns in delivery and usability. This is down to poor procurement, often with one group (e.g. Physics) dominating the process, as well as rubbish project management by the main contractors, which is often partly caused by the university demanding x million cores for £8.50.
At least at Daresbury and RAL, they have some people with an idea of how to buy something at a fair price and make sure it works.
"The universe is not required to be in perfect harmony with human ambition" (Carl Sagan)
That might explain what we refer to as accidents - every cloud ....
But now let's make some dosh throwing investment and actions in that general direction while we can; it keeps the old economy going for a while longer, you know. A happy accident chasing after the defense of a sad accident, I say - all man-made and artificial like.
For projects which don't require government-level security, would Amazon cloud farms, billed on actual usage, not be cheaper and quicker to deploy? Heck, if Google will create secure sandboxes for Google Apps for not much, then surely Amazon would at least look at it, if they don't do so already?
Surely if you design the software to take advantage of the architecture, the opposite is true - use queues and workers to pass work units around, store your inputs/results in blob storage, etc.
Then, even if an entire data centre goes down, your app should just be able to spin up, re-process any units that were interrupted by the meteor hit, and continue.
You should also be able to handle the equivalent of cross-thread communication between servers using higher-level comms (although the performance hit _may_ be significant if there's lots of signalling)
The only way I can see cloud making it worse is if the application is trying to do everything in memory in a single run - in which case an individual server is just as likely to be hit by a meteor / suffer catastrophic failure as any one node in the cloud.
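The queue-and-workers pattern described above can be sketched in a few lines. This is a minimal illustration, not any particular cloud provider's API: an in-process `queue.Queue` stands in for a cloud queue service, a dict stands in for blob storage, and the simulated "node lost" failure is hypothetical. The key point it demonstrates is that an interrupted unit is simply re-queued and the job still completes.

```python
import queue
import threading

work_queue = queue.Queue()
blob_store = {}      # stand-in for blob storage: unit id -> result
failed_once = {3}    # simulate unit 3's node being "hit by a meteor" once

def process(unit_id):
    # Fail the first attempt at any flagged unit, to mimic a lost node.
    if unit_id in failed_once:
        failed_once.discard(unit_id)
        raise RuntimeError("node lost mid-unit")
    return unit_id * unit_id

def worker():
    while True:
        unit_id = work_queue.get()
        if unit_id is None:          # shutdown sentinel
            work_queue.task_done()
            return
        try:
            blob_store[unit_id] = process(unit_id)
        except RuntimeError:
            work_queue.put(unit_id)  # interrupted: re-queue and carry on
        finally:
            work_queue.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for unit_id in range(8):
    work_queue.put(unit_id)
work_queue.join()                    # waits for every unit, retries included
for _ in threads:
    work_queue.put(None)
for t in threads:
    t.join()

print(sorted(blob_store.items()))
```

Because the retry goes back through the same queue, `join()` naturally waits for the re-processed unit too; a real cloud queue service achieves the same effect with visibility timeouts rather than explicit re-queueing.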
The network is just not up to it. I have an associate who has benchmarked these things using applications that currently run on thousands of cores of various dedicated HPC resources, and once you get outside one node there is no competition. For instance, one chemistry example runs in 194 seconds on 4 nodes (32 cores) of a university HPC server, but 450 seconds on 4 nodes (32 cores) of the cloud, even though the single-node speeds are about the same (534 s v. 520 s).
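The gap in the quoted numbers is easier to see as parallel efficiency. Assuming the 534 s and 520 s figures are single-node times for the same job (as the post implies), the implied speedup at 4 nodes is:

```python
# Parallel efficiency implied by the quoted chemistry benchmark.
single_node = {"hpc": 534.0, "cloud": 520.0}  # seconds, from the post
four_node   = {"hpc": 194.0, "cloud": 450.0}

for system in ("hpc", "cloud"):
    speedup = single_node[system] / four_node[system]
    efficiency = speedup / 4  # ideal speedup on 4 nodes is 4x
    print(f"{system}: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

That works out to roughly 2.75x (about 69% efficiency) on the dedicated HPC interconnect versus about 1.16x (about 29%) on the cloud: the cloud nodes barely gain anything from scaling out, which is the interconnect showing.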
To be fair, if you don't need good, low-latency comms it's not nearly that bad. Fortunately, not a lot of proper scientific HPC falls into that camp - this keeps me in a job!
Ian