Silicon Graphics has a need for speed. Or, more precisely, some of its customers do, and those customers also have money to burn. Supercomputer maker Appro International launched its HF1 overclocked server for hedge funds and other financial firms back in October, and now SGI is jumping into the game with a special Rackable half …
"We don't know what we're doing, so we have to do it really fast.", said a Wall St. Spokesman. "Gotta go, time and frequency are money, well, time is money ... frequency is commissions, but don't tell anybody I told you that", he continued, volatility-ly.
Classic Innovation theory example
The guys who buy it need to read "The Structure of Scientific Revolutions" by Thomas Kuhn (or the similar material in earlier talks by Polanyi). What they are trying to do is continue to expand on existing knowledge - what Kuhn calls "normal science". What they need is to apply something truly innovative here - a paradigm shift.
Example - instead of overclocking the server slightly and getting a small diminishing return, build a proper "burn, baby burn" server that cranks itself to 4GHz or so but burns out in half an hour, with a built-in CO2 extinguisher for when it f*** catches fire.
Crank it up to a point where it survives for around 2h on average. Then switch workload from server to server each hour and have ops swap the whole rack at the end of the day. Make the whole rack movable on a forklift to make this possible. The margins on trading are such that this actually makes sense. Using Kuhn's terminology, this will probably fit the definition of a minor paradigm shift, as it replaces the usual "rolling upgrade cycle", where a server is treated as an asset, with a cycle where a server is a consumable. The invention is to look at servers as "cannon shells" which you fire and forget. This means that you have to _BUILD_ them yourself, as no vendor will set up a line for you for servers with a 1-hour warranty. This is an entertaining one to watch, as there are probably no engineers with financial market experience who can set up a DIY "rocket server" build shop; all of them are presently fully vendor-indoctrinated into a Dell, HP, Rackable or some other server sect faith.
A major one, compared to that, would be to actually improve the trading system so it has lower latency in the first place, or to feed the latency as an input into the trading algo so that the algo runs only when it can actually get a margin and shuts off otherwise.
And so on. Doing a "vendor" overclocked server, however, is the worst of all options - it is a minor diminishing return ("normal science" in Kuhn terminology). It is something everyone else is capable of matching, it requires increasing investment as you go, and you have to write off all of this investment when someone paradigm shifts and goes sideways, inventing something that makes these "slightly super servers" obsolete.
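For what it's worth, the "feed the latency into the algo" idea above can be sketched in a few lines. This is only an illustration under loud assumptions: the function names, the basis-point figures and the linear decay of the edge with latency are all made up for the example and do not come from any real trading system.

```python
def expected_edge(gross_edge_bps: float, decay_bps_per_ms: float, latency_ms: float) -> float:
    """Edge left once latency has eaten into it (linear decay is an assumption)."""
    return gross_edge_bps - decay_bps_per_ms * latency_ms


def should_trade(gross_edge_bps: float, decay_bps_per_ms: float,
                 latency_ms: float, cost_bps: float) -> bool:
    """Run the algo only when the remaining edge still covers the trading costs."""
    return expected_edge(gross_edge_bps, decay_bps_per_ms, latency_ms) > cost_bps


if __name__ == "__main__":
    # Hypothetical numbers: 3 bps gross edge, losing 0.5 bps per ms, 1 bps costs.
    for latency_ms in (0.5, 2.0, 5.0):
        print(latency_ms, should_trade(3.0, 0.5, latency_ms, cost_bps=1.0))
```

With these invented numbers the algo keeps trading at 0.5 ms and 2 ms and shuts itself off at 5 ms, which is the whole point: it stops instead of trading at a loss.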
What a load of shit you're talking. They're overclocking a bit and you call it unimaginative. You suggest overclocking it a bit more and call it a paradigm shift? Stop talking bollocks.
Using an overclocked system for "stable" operation is unimaginative.
Using an overclocked system for one-shot operation... Dunno, the jury is out.
As for whether something is innovative or not, the best indicator is "does this disrupt the current market". The answer is that a _REALLY_ overclocked system with its associated cooling will do so.
The current fashion is "proximity hosting". You cannot stick a system that needs a bucket of solid CO2 or liquid nitrogen or god forbid an extinguisher to function into such an environment. No way.
So if these catch on (and that is a BIG IF), people in outfits which provide proximity hosting and low-latency routes are going to have an austerity Xmas next year. After all, why bother with a fraction of a ms on latency if you can improve by several ms on computation?
can someone explain the physics?
Would they not run cooler if they used older chipsets with higher clockrates but lower densities? Or perhaps stick the systems in a freezer?
Forget the physics...
The issue is that these boxes have to be placed in leased/rented space because they have to be close to the exchange where they are trading.
Overclocking the CPU protects their existing investment in their software. Going with GPU/multi-core code would mean a rewrite of the core system, hence a very *expensive*, albeit longer-term, goal. It's actually cheaper for the company to buy a new box every 6 months than to spend money on new code.
The hardware costs are pennies on the dollar.
re: can someone explain the physics?
In my tiny bit of research it seems Java is very popular amongst the hedge funds. Seems to me using C would work better than throwing money at servers.
You do realize that on Wall St., Ethics is a nano-technology ...
Mine's the one with the copy of Adam Smith's "The Theory of Moral Sentiments" in the pocket ...
What in the name of god are you talking about? Run older chipsets? The only chipset that works with these processors is the Tylersburg chipset. Maybe by chipset you mean processor, in which case the answer is no. Older processors, larger fabrication size. They'll not only not run as fast, they'll draw more power and dump more heat doing it. Sticking the systems in a freezer is also a pretty bad idea from a power standpoint.
Assuming that they're using 95W processors to begin with (they may be using 150W processors, who knows?) and they overclock them 20%, they'll probably get 50% more TDP (worst case scenario). Again, assuming the use of 95W CPUs, they'll have to dissipate 142.5W of heat per socket. That's nothing out of the ordinary; Intel sells Xeons with 150W TDPs. It gets a little bit more tricky if they are in fact using those 150W processors, but it's still possible to dissipate up to 250W per socket with massive heatsinks and powerful fans or with water cooling. They don't need any exotic cooling methods to get it done.
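The back-of-envelope numbers above are easy to check. A minimal sketch, assuming the worst-case figure from the post (50% more heat for a 20% overclock) and the roughly 250W per socket cooling ceiling mentioned there; the function and constant names are just for illustration.

```python
COOLING_LIMIT_W = 250.0  # per-socket ceiling quoted in the post, not a measured value


def overclocked_heat_w(tdp_w: float, power_increase: float = 0.5) -> float:
    """Heat to dissipate per socket, assuming TDP grows by `power_increase`."""
    return tdp_w * (1.0 + power_increase)


if __name__ == "__main__":
    for tdp_w in (95.0, 150.0):
        heat_w = overclocked_heat_w(tdp_w)
        print(f"{tdp_w:.0f}W part -> {heat_w:.1f}W, within limit: {heat_w <= COOLING_LIMIT_W}")
```

Under those assumptions the 95W part lands at 142.5W and the 150W part at 225W, both under the 250W ceiling.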
Now that's an idea.
I've worked in enough temperature controlled environments. Given enough frozen meat and veg to protect, it becomes financially viable to make a very large space go down to -30C.
Only problem I can see there is condensation. I've seen it snow indoors in such places, when something gets an accidental knock from a fork lift. Still, the other option is to use refrigerated boxes with built-in dehumidifiers. From what I recall, there are companies that specialise in this sort of thing, specifically for overclockers funnily enough!
It is more complicated
The extent to which you can overclock a chip and exploit this in the context of solving a computing problem depends on a whole raft of factors.
It is a very entertaining part of physics this one. The amount of heat you take out from the chip is governed by a raft of nice partial differential equations (actually not entirely dissimilar to ones used in some of the market models). There are _LOTS_ of parameters that can possibly be tweaked here and as you guessed right - dropping the external temperature is one of them. It is however a parameter that probably gives you the smallest advantage. There are other ways here. Much better ones.
Some of it is actually rocket science (at least it is governed by similar equations), so getting it right for a production environment and aligning it to a particular computational problem like trading is non-trivial. There are lots of ways here to achieve a breakthrough. However, buying a factory overclocked system is definitely not one of them, as it is something everyone can do, so no one will actually get an advantage here.
In any case, it will be fun to watch the industry grapple with this one. It's popcorn hour time :)
Maybe showing some ignorance here, but if the load temp on the CPUs etc is lower they should last longer than an unclocked less well cooled solution?
Thus I would say they are less likely to fail.
Well certainly the watercooled stuff.
So who cares if they fail?
Who cares if they fail?
It is a massive misconception, entrenched all the way down to recruitment droids, that because it is financial services it should somehow be bulletproof and not fail. That misconception goes all the way to vendors, and Rackable is showing it, by the way.
Who cares if it fails at the end of the working cycle and is swapped out, or a "cooler" spare comes online from standby? If during its working cycle it has outcompeted the competition and has allowed the trader to operate at, let's say, 5% lower latency - then, oh well, job done. And as Louis XV used to say: "Apres moi le deluge". Or in this case "Apres moi le CO2 fire extinguisher".
Could they not get them to 4.77 GHz for old times' sake?
I suggest a visit to overclockers.com for them
Show them how to do it properly with pots and ln2.
Oh yes... good'ol days...
....when overclocking was king, and getting your hands on some liquid nitrogen (figuratively... use no gloves and your hands fall off) was all the rage.
What was the record back then? 5 GHz on liquid nitro? Anyways, enjoy some demented people doing it on an oldie machine.
The video actually belongs to Tom's Hardware Guide, repost from some overclocking fans from Brazil. Note that the build is actually in Germany, some refrigerator compressors are used, along with proper insulation. Note too, the entire motherboard is also frozen.
And it ran stable for 1 hour, before all the nitro evaporated.
And this one claims it ran some AMD chip at 6.9GHz... enjoy...
It looks like a steam train.
I was somewhat surprised when I saw EVGA's ginormous SR-2 motherboard in Appro's system. Then again, unless you design it yourself, where else are you going to get a 5520 motherboard that allows you to overclock? When I saw the headline, I figured this beast would be running EVGA's motherboard and I wasn't disappointed.
I seriously hope EVGA does this again for Sandy Bridge-EP processors. It would be awesome to have 16-cores and 32-threads all running at 4.5GHz. Too bad I wouldn't be able to afford it...
For every problem, there's an over-the-top solution
They could have achieved this performance with lower temperatures just by going to PowerPC or SPARC chips. Is there some reason they're stuck with x86, or with SGI?
Nobody got fired for buying... rackable?
Heard they had a rather high DOA rate, sometime before the SGI carcass buy. They've finally found a niche where the customer sees this as to be expected.
Anyway, I recall IBM does some spiffy mainframe CPUs with even higher _standard_ clock rates. Wouldn't it be fun to bolt a couple in a 19" box with some memory and a bit of low-latency networking and, well, that's it?
Of course the management would nevar approve --it takes IBM six months in paperwork just to start shipping an empty box, nevermind an actual product, double nevermind something like this-- but from a tech standpoint it seems obvious to me.
And if you're running Linux, well, _it_ probably runs fine on that CPU, so I hope your codar quants stuck to what they were supposed to be doing instead of sneaking in x86 dependencies. It's slow death.
Can someone please explain
I do know one or the other bit about trading companies both from a business and IT point of view (I used to work in this industry). Yet I cannot really see the connection between doing fast trading and cpu clock speed - other than poorly written code...
Poorly written code.
What more explanation do you need? At the prices they're commanding they can't afford to write good code. I'd expect this literally to be the most atrocious stacks upon stacks of iffy code ("middleware" seems to be the justifying euphemism) and no wherewithal in sight to clean that up a bit. So they'll just go right on thinking ever morerer cleverer ways to manipulate the snot out of each others' trades and throw some of that money on faster hardware.
Bitter, moi? Ah, financials love to use recruiters, so they basically get what they deserve.
They're hung up in a rat arms race and until there is some rule limiting trades to settling once per second or some other humanly measurable interval they're going to go right on spending on fancier kit. The link with overclocking gameheads is easy and apt to make; even less clue, and much, much more money to burn.
There is, of course, an event horizon of sorts where the costs of beating off entropy outmatch the gains from the game. Will they stop in time, or will we become another frogstar?
Re Poorly written code.
The OP was kind of a rhetorical question (no, I don't feel urged to mark it as such). Nevertheless, thank you for your exhilarating answer!
X86 not the answer perhaps?
Doesn't the Power architecture clock much higher these days?
Can't they just overclock it even more?
I mean this clearly is a machine which would save society thousands of Euros for each second it's not working.
What about the high-clockrate IBM POWER processors? Aren't they already running at 5GHz upwards?
And instead of throwing ever faster hardware at it, why not improve the efficiency of the code running on it...
For years, there has been a movement towards increasingly high-level and increasingly inefficient languages, relying on ever faster hardware to compensate for the inefficiency. Now that hardware is topping out, it's about time people started moving back towards efficient development.
"Mannel says that all of the financial services firms that SGI is selling iron use Linux, and CentOS tends to be popular."
Could they donate 0.00001% of trading volume to CentOS? Might help keep the project going...
"Mannel says that all of the financial services firms that SGI is selling iron use Linux, and CentOS tends to be popular."
My first parse of the title led me to interpret 'forges' as 'to fake', rather than 'to make (out of metal)', and thus the article was about SGI being caught faking the speed of their new boxes.
Did they study the Collateral Damage?
I hope they have thoroughly tested what happens to the other PC components when a CPU fails due to excessive prolonged strain; the effects are unlikely to be predictable.
I can't even start to think what could happen when such a CPU decides it can't hold no more. If this happens when the CPU is at full load, the PSU must have a hell of a design to tolerate the current surges and transients resulting either from a short-circuited CPU or the sudden change from say 50 Amperes (600W) to 2 Amperes (rest of the PC) when the CPU open circuits.
If the PSU fails, anything might get damaged beyond repair, from the motherboard to the hard disk(s).
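Taking the figures in this post at face value, 50 Amperes at 600W implies a 12V supply rail, and the load step when the CPU open-circuits is easy to put a number on. A rough sketch only: the 12V rail is inferred from the post's own numbers, and the function names are made up for the example.

```python
def rail_voltage_v(power_w: float, current_a: float) -> float:
    """Rail voltage implied by a power and current figure (P = V * I)."""
    return power_w / current_a


def load_step_a(before_a: float, after_a: float) -> float:
    """Size of the current step the PSU has to absorb."""
    return before_a - after_a


if __name__ == "__main__":
    volts = rail_voltage_v(600.0, 50.0)   # 50 A at 600 W, from the post
    step = load_step_a(50.0, 2.0)         # CPU open-circuits, rest of the PC draws 2 A
    print(f"rail: {volts:.0f} V, step: {step:.0f} A, remaining load: {2.0 * volts:.0f} W")
```

So the PSU would see its load collapse from 600W to about 24W in one go, a 48A step on that rail, which is the transient the post is worried about.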
I don't think they care.
Old server goes boom, new server gets swapped in.
If anything's still working, I'm sure whoever they've picked for salvage duty can enjoy pulling it out of the melted mess.