Back in May, when Nvidia divulged some of the new features of its top-end GK110 graphics chip, the company said that two new features of the GPU, Hyper-Q and Dynamic Parallelism, would help GPUs run more efficiently and without the CPU butting in all the time. Nvidia is now dribbling out some benchmark test results ahead of the …
Pretty charts. However, without Crysis 2 benchmark scores, this article is just a regurgitated press release.
Of course, this recycled press release details the reason why nVidia decided to sell a midrange graphics chip (the GK104) as a high-end GPU. The real high-end chip has been held back for (far more lucrative) sale to supercomputer companies. Thrilling.
The daft thing is if it hadn't been for gamers buying several generations of their products over the past decade and a bit, nVidia wouldn't even have been in a position to do what they're doing now. Yet gamers get sold overpriced slops whilst what would have been a really stunning GPU gets sold for a completely different use.
You need this level of double precision compute for gaming? ...and all that other complexity too?
I'm also not sure it's appropriate to describe the GK104 core as "mid grade" for graphics. The GK110 core was designed for a completely different processing load profile (i.e. compute instead of graphics); it is quite possible, if not likely, that it will perform worse than the GK104 core for gaming.
Anyone interested in scheduling and intercommunication issues in systems with massive thread count should take a look at the August issue of IEEE Computer. Of course, the IEEE has still not convinced itself to fully move that magazine in front of the paywall, but one can easily snarf it from the neighbour or the library.
Pray Tell ..... for Such is Immaculate Treasure Always to Be Best Shared.
Do you know, Destroy All Monsters, of a better live virtual computer operating systems magazine/arsenal of intercommunicating thread systems, in front of or even behind any great idiot firewalls, than El Reg?
In the Beginning v2.0, was there Truth for Light and Further Enlightenment ....
.... with Mentoring and Monitoring Virtual Instruction and Constructions.
And the really simple Cryptome/Wikileaks model of selfless sharing of any and all information for Advanced Intelligence [and for Advanced IntelAIgents too] with ITs inevitable destruction of perverse exclusive use corrupt content management silos is the relentless and unstoppable change that Internetional Pioneers and Fab ARGonauts are delivering to Create with AI, Futures and Derivative Capital Ventures for Virtual Machinery and Global Operating Devices to Share with SMARTR Operating Systems ...... and Mass Administrators with Proxy Governmental Bodies' Enduring Power of Attorney aka Remote Virtual Power Control of Puppet Mastery.
So let me get this straight....
If I'm using Blender with this baby installed and enable the cycles renderer (CUDA enabled) in GPU+CPU mode in real time, it's going to render like the proverbial waste material off a shovel?
Anyone know how much this beast will cost? 10k+?
Re: So let me get this straight....
K10 prices were not announced by Nvidia, but buy.com has one listed for $4,000. Which sounds about right based on the Fermi M2090 prices I have seen at server makers.
It must be tempting for Nvidia to think they can charge $10,000 for the K20. Three times the flops for three times the price is not a price/performance improvement. So I am thinking maybe $6,000 is a reasonable price if it can hit the performance levels El Reg is estimating.
We'll see. And a lot, of course, depends on Xeon Phi prices, which no doubt will be lower per flop than whatever Nvidia charges.
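The price/performance reasoning in this comment can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes the $4,000 buy.com listing is the baseline and normalizes that card's throughput to 1.0, since the comment does not say which card or which precision the "three times the flops" is measured against. The helper name `price_per_flop` is made up for illustration.

```python
# Rough sketch of the price/performance argument above.
# Assumptions (not from Nvidia): the $4,000 listing is the baseline card,
# its throughput is normalized to 1.0, and the new card delivers 3x that.

def price_per_flop(price_usd, relative_flops):
    """Cost in dollars per baseline-unit of floating point throughput."""
    return price_usd / relative_flops

BASELINE_PRICE = 4_000   # buy.com listing quoted in the comment
BASELINE_FLOPS = 1.0     # normalized throughput (assumption)
SPEEDUP = 3.0            # "three times the flops"

baseline = price_per_flop(BASELINE_PRICE, BASELINE_FLOPS)

# Candidate prices: a true 3x price, the speculated $10,000, and the hoped-for $6,000.
for price in (12_000, 10_000, 6_000):
    cost = price_per_flop(price, BASELINE_FLOPS * SPEEDUP)
    change = (cost - baseline) / baseline * 100
    print(f"${price:>6,}: ${cost:,.0f} per unit of flops ({change:+.0f}% vs. baseline)")
```

Under those assumptions, a genuine tripling of the price leaves the cost per flop unchanged, $10,000 (which is really 2.5 times $4,000) trims it by about 17 percent, and $6,000 halves it.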
Re: So let me get this straight....
I think you'll need to wait for the more in-depth reviews to come out closer to launch. I'm not really familiar with Blender, but there are two things to consider: 1) is the compute load profile of Blender a good match for the GK110 cores (heavy on DP) vs. GK104 (more focused on SP), and 2) will Blender ever be optimised to make use of the more advanced features in the K20 (Dynamic Parallelism at least; not sure if Hyper-Q would matter in this scenario)?
Raw horsepower is all well and good, but you need to keep in mind that this solution - like the Phi in many respects - has been built for a specialised processing profile that may not be (and on average likely will not be) a good match for more common workloads (gaming in particular, but the articles I have read point to graphics in general).
The big question for you, I think, is whether Blender is a DP-heavy app. If so, you might want to follow the K20 more closely; if not, the lower-cost options may be a better fit. Of course, Dynamic Parallelism could be a factor to consider too if Blender decides to support it.
Single Precision, Double Precision, and TDP on the K10 vs. K20
"The Nvidia Tesla "K10" compute card is based on two GK104 GPUs that deliver an aggregate peak performance of 4.58TFLOPS single precision, 0.190TFLOPS double precision." (Source) and a "225 watt thermal envelope" (source)
"El Reg expects for the GK110 to deliver just under 2 teraflops of raw DP floating point performance at 1GHz clock speeds on the cores and maybe 3.5 teraflops at single precision." (from the article today) and "[has a TDP of 300 Watts]" (source)
I can't find any stats on memory bandwidth for the K20 - which is another important horsepower type statistic. The K10 has 320GB/s.
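To put the quoted figures side by side, here is a small sketch that works out peak flops per watt for both cards. The K10 numbers are the ones quoted above; the K20 numbers are El Reg's estimates, and rounding "just under 2 teraflops" up to 2.0 is an assumption, so treat its DP result as an upper bound. The helper name `gflops_per_watt` is made up for illustration.

```python
# Peak throughput per watt from the figures quoted in this comment.
# K20 values are estimates from the article, not published specs;
# dp_tflops=2.0 is an assumed upper bound for "just under 2 teraflops".

CARDS = {
    "K10 (2x GK104)":    {"sp_tflops": 4.58, "dp_tflops": 0.190, "tdp_w": 225},
    "K20 (GK110, est.)": {"sp_tflops": 3.5,  "dp_tflops": 2.0,   "tdp_w": 300},
}

def gflops_per_watt(tflops, watts):
    """Convert peak TFLOPS and a thermal envelope in watts to GFLOPS per watt."""
    return tflops * 1000 / watts

for name, card in CARDS.items():
    sp = gflops_per_watt(card["sp_tflops"], card["tdp_w"])
    dp = gflops_per_watt(card["dp_tflops"], card["tdp_w"])
    print(f"{name}: {sp:.1f} SP GFLOPS/W, {dp:.2f} DP GFLOPS/W")
```

On these peak numbers the K20 would deliver roughly eight times the K10's double-precision throughput per watt, while the K10 stays ahead on single precision. Real workloads will also depend on memory bandwidth, which is still unknown for the K20.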