729 teraflops, 71,000-core Super cost just US$5,500 to build

Cycle Computing has helped hard drive giant Western Digital shove a month's worth of simulations into eight hours on Amazon cores. The simulation workload was non-trivial: to check out new hard drive head designs, the company ran a million simulations, each involving a sweep of 22 head design parameters on three types …

  1. Notas Badoff

    I don'know, wha'd you wanna do tonight?

    "729 teraflops ...

    nearly 71,000 AWS cores for an eight-hour run ...

    completed nearly 620,000 compute-hours."

    Trying to figure out what this means for AWS's latent capacity. Am I really reading that AWS has many tens of thousands of cores just lying around - unused - waiting for customers? I can understand a couple of thousand cores just kind of loping along, poking the spare database or serving a web page, waiting for a 'real' question to come along. But towards and beyond 100K cores shooting the shit, waiting for a good stretch of the legs? How much capacity have they got just 'waiting'?

    1. Denarius
      Meh

      Re: I don'know, wha'd you wanna do tonight?

      Might have been a slow day. The NSA might be running short of crypto to crack, or Amazon might at least have broken even that day, but it does suggest a lot of capacity is lying around. It just seems odd, after a career of watching batch operation derided by client-server real-time interactive enthusiasts, that a solid batch-job success story outside a bank gets reported.

    2. Charlie Clark Silver badge

      Re: I don'know, wha'd you wanna do tonight?

      The genesis of AWS was just that: lots of capacity lying around that was only required for a few very busy periods of the year (Thanksgiving, Christmas, …).

      Businesses get to choose between operational and capital expenditure and pass the risk on to suppliers like Amazon. But don't worry: their risks are also limited, as data centres are usually funded by substantial subsidies.

  2. Trevor_Pott Gold badge

    Except... public cloud "doubters" never doubted this particular use case. The software was rewritten specifically to work with the public cloud; it is a definable, burstable workload; it runs as a batch (submit the workload, receive the result - you don't need to be connected to it all day); and it has a definable cost.

    That's completely different from taking a legacy "must be up 24/7" workload and tossing it into the public cloud. Especially one where the developer has no intention of (or can't, because they're out of business, lack the skills, etc.) rewriting the thing for the public cloud.

    The public cloud is not "pay for what you use", it is "pay for what you provision". If you need to provision the workload to be available 24/7, then the public cloud is terrible value for money. If you essentially need to run an HPC batch process, then it'll do you just fine.
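    To put rough numbers on that "pay for what you provision" point, here's a minimal back-of-envelope sketch. The per-core-hour rate is a made-up placeholder and the core count is rounded; only the "month of work squeezed into eight hours" ratio comes from the article, so treat the dollar figures as illustrative, not as AWS pricing.

        # Illustrative only: compares an 8-hour burst with keeping the same
        # capacity provisioned 24/7 for a month at the same (made-up) rate.
        HOURS_PER_MONTH = 30 * 24        # ~720 hours
        BURST_HOURS = 8                  # length of the WD run
        CORES = 70_000                   # roughly the size of the AWS cluster
        RATE_PER_CORE_HOUR = 0.01        # hypothetical rate in USD, not a real price

        burst_cost = CORES * BURST_HOURS * RATE_PER_CORE_HOUR
        always_on_cost = CORES * HOURS_PER_MONTH * RATE_PER_CORE_HOUR

        print(f"8-hour burst:      ${burst_cost:,.0f}")
        print(f"Provisioned 24/7:  ${always_on_cost:,.0f}")
        print(f"Ratio: {always_on_cost / burst_cost:.0f}x")   # ~90x

    Whatever the actual rate, provisioning the same capacity around the clock for a month costs roughly 90 times the eight-hour burst, which is the whole argument in one number.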

    1. Nate Amsden

      Absolutely, nail on the head there.

      This "supercomputer" didn't cost $5,500 to build, it cost $5,500 to rent. Big difference (duh). Obviously 99.9%+ of the workloads out there aren't suited to one-off runs of a few hours, never to be needed again.

      You've been able to "rent" supercomputer time for a long time, no news here.

      This is one of the very, very few good use cases of public cloud computing (IaaS anyway - SaaS is a good model, PaaS I'm not sold on).

      1. Anonymous Coward
        Anonymous Coward

        Agree with the above. I'm not against IaaS, but that "supercomputer" cost a hell of a lot more to build than $5,500.

        And no wonder AWS is losing money, if it has that many cores lying around waiting for Western Digital or someone else to come along for 8 hours.

    2. Anonymous Coward
      Happy

      Fits every big data predictive analytics job I've ever done (1975+) just fine.

    3. dan1980

      @Trevor_Pott

      Two things, mate:

      1. - Nail, head, etc...

      2. - Be nice to Richard.

      'Cloud' is perfect for such tasks, where you have defined - ahead of time - a set of operations, a program to carry them out and a method for running the workload in parallel, and then, at go time, you feed it the input and set it running.

      In a way, it's like outsourcing a big data entry job to a third party. That works well because the workload is one that lends itself to such an arrangement. It's also, generally, a 'burstable' workload in that data entry often comes in big loads at a time, meaning sometimes you need 5 people and sometimes you need 50. Keeping 50 drones on the payroll to cope with peak load is silly but keeping 5 and then 'bursting' the rest to an outsourcer is a good idea.

      You see similar things with call centers, where they have overflow to third parties.

      Once you've got your processes down then outsourcing data entry can make great sense, just like this application does. But this, of course, is not a new concept, and just as you have to assess things on a task-by-task basis to see if outsourcing is worthwhile (or even viable), so too must you do that with 'cloud' . . . stuff.

      1. Trevor_Pott Gold badge

        "Be nice to Richard"

        Always. He's fucking fantastic people. Without qualification, I'd be there for him, brother from another mother style. Doesn't mean we won't disagree about things from time to time.

        1. Ian Bush
          Paris Hilton

          "He's fucking fantastic people"

          Too much information ...

  3. pierce
    Paris Hilton

    Don't think of that as $5,500, think of it as $16,500/day. They just happened to use only a third of a day.
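    (For what it's worth, the figures quoted further up the thread let you extend that arithmetic; a quick sketch using only the numbers from the article - none of this is an official AWS price:)

        # Implied rates from the figures quoted above.
        total_cost = 5_500           # USD for the whole run
        run_hours = 8                # wall-clock length of the run
        compute_hours = 620_000      # "nearly 620,000 compute-hours"

        per_day = total_cost * 24 / run_hours        # ~$16,500/day, as above
        per_core_hour = total_cost / compute_hours   # ~$0.009 per core-hour

        print(f"Equivalent daily rate:      ${per_day:,.0f}")
        print(f"Implied cost per core-hour: ${per_core_hour:.4f}")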

    1. Anonymous Coward
      Anonymous Coward

      So?

      Even if it was $16.5K, it ran almost 100 times faster than their in-house solution.

  4. Denarius
    Trollface

    Wow, at last!

    something that can run M$ Office at a reasonable speed with their new Skype/Lync app. Nuff /bin/sed

  5. Anonymous Coward
    Anonymous Coward

    But what about security?

  6. Shannon Jacobs

    Pretty sure they optimized the scheduling

    Almost certain that they scheduled their job to run during slack periods. You can think of it as separate budgets for peak usage and slack time. Amazon would obviously charge much more when they have customers queued up, but if you're willing to wait for idle time, then the only marginal cost is the electricity. Ergo, getting $5,500 is better than getting nothing.

    1. Anonymous Coward
      Anonymous Coward

      Re: Pretty sure they optimized the scheduling

      Who says mainframe computing is dead? This would be CLASS M TIME=24.00.00 in IBM JCL.

  7. Infury8r

    Maybe the UK's Met Office should use it.

    And so save UK taxpayers £97,000,000.

    1. Ian Bush

      Re: Maybe the UK's Met Office should use it.

      And the Unified Model would run like a dog on it; you (and the Met Office's paying customers) probably wouldn't get tomorrow's prediction until next week at best, and it would cost a huge amount more in recurrent rather than capital expenditure.

      Comparing embarrassingly parallel workloads on loosely coupled, rented hardware with communication-sensitive codes that require tightly coupled, low-latency hardware is at best an exercise in futility.

      1. Jason Ozolins

        Re: Maybe the UK's Met Office should use it.

        Yup. People where I work are looking at how to remove a slowdown of up to 20% in large (>512-core) Unified Model runs that looks to be down to the rate at which the batch management system causes context switches on each node to track memory and disk usage. Never mind that the batch system doesn't use much CPU overall - the communication is so tightly coupled to the computation that a single slow process can make many other processes wait (and waste power/time). This is with all inter-node communication going over 56Gb/s InfiniBand. Not very suitable code to run on loosely coupled cloud nodes.

        (BTW, this is an example of an HPC job where turning on HyperThreading helps, as long as you only use one thread on each core for your compute job; the other hyperthreads get used to run OS/daemon/async comms stuff without causing context switches on the compute job threads. The observed performance hit from batch system accounting and other daemons is much lower with HT enabled.)
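        (For anyone wanting to try that at home: the trick of keeping one hardware thread per core free for OS/daemon work mostly comes down to pinning the compute processes to one logical CPU per core. A minimal Linux-only sketch, assuming the common layout where logical CPUs 0..N-1 are the first thread of each physical core and the siblings come after - check lscpu before trusting that on a given box:)

            # Pin this process to the first hardware thread of each physical core,
            # leaving the sibling hyperthreads free for OS/daemon/async-comms work.
            # Assumes logical CPUs 0..n_cores-1 map to distinct physical cores
            # (common on Linux, not guaranteed) and 2-way SMT. Linux-only API.
            import os

            n_logical = os.cpu_count()       # e.g. 32 logical CPUs with HT on
            n_cores = n_logical // 2         # physical cores, assuming 2-way SMT

            os.sched_setaffinity(0, set(range(n_cores)))   # 0 = current process
            print("Pinned to logical CPUs:", sorted(os.sched_getaffinity(0)))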

        Anyway, this WD workload sounds very much like the sort of thing companies have long farmed out to their engineering workstations overnight using HTCondor:

        http://en.wikipedia.org/wiki/HTCondor

        If it can run well under Condor, it'd run just fine in the cloud...
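        (To illustrate just how embarrassingly parallel that kind of sweep is - the parameter names and ranges below are invented placeholders, the article only says roughly 22 head-design parameters across three media types - generating the task list for Condor or a pile of cloud VMs is about this hard:)

            # Sketch of an embarrassingly parallel parameter sweep: every
            # (media type, parameter combination) pair is an independent job
            # that can run on any idle core. Values are placeholders, not WD's.
            import itertools

            media_types = ["media_a", "media_b", "media_c"]
            param_values = [(0, 1) for _ in range(19)]   # stand-in for the ~22 swept parameters

            tasks = [
                {"media": media, "params": combo}
                for media in media_types
                for combo in itertools.product(*param_values)
            ]

            print(f"{len(tasks):,} independent simulations queued")   # ~1.6 million here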

  8. Turtle

    “Gojira”...

    “Gojira” is the Japanese name for Godzilla, should there be anyone here who does not already know this li'l factlet.

    1. drand
      Headmaster

      Re: “Gojira”...

      Or 'Godzilla' is the English for 'Gojira'...

  9. phil dude
    Boffin

    back of envelope calc...

    OK: 729,000 GFLOPS / 71,000 cores ≈ 10 GFLOPS per core?

    Doesn't seem very efficient...? Intel's dual E5-v3-2687 gets 788 GFLOPS on LINPACK, and that is on 20 cores ≈ 40 GFLOPS/core.

    So this is 71,000 cores of some chip?

    or 925 dual Xeons (E5-v3-2687),

    or 700 nodes of ORNL's Titan.

    P.
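    (Same envelope, run as code - this just reproduces the figures quoted in this thread, nothing new:)

        # Re-running the back-of-envelope numbers above.
        total_gflops = 729_000        # 729 teraflops from the article
        aws_cores = 71_000            # cores in the eight-hour run

        per_aws_core = total_gflops / aws_cores    # ~10.3 GFLOPS per AWS core
        dual_xeon_gflops = 788                     # LINPACK figure quoted above
        per_xeon_core = dual_xeon_gflops / 20      # ~39 GFLOPS per Xeon core

        print(f"{per_aws_core:.1f} GFLOPS/core on AWS vs {per_xeon_core:.1f} on the dual Xeon")
        print(f"Equivalent dual-Xeon boxes: {total_gflops / dual_xeon_gflops:.0f}")   # ~925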

    1. SJG

      Re: back of envelope calc...

      An EC2 Compute Unit (aka virtual core) is, according to Wikipedia, based on a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor. According to Amazon's instance definitions, you would probably need 16 of those to equate to an Intel Xeon E5-2680 v2 (Ivy Bridge) CPU.

  10. crediblywitless

    How much did the Matlab licence cost for that job?
