Nvidia has staked a large part of its future on the idea that GPUs and their massively parallel architectures can replace CPUs for a big chunk of computational jobs. But parallel programming on one device is tough, across two incompatible devices is very difficult, and across clusters of hybrid machines can be very tricky indeed …
How about us mere mortals seeing some of this goodness?
So, if there is a sort implementation for GPU that is 100 times faster than qsort, what is the actual problem in making it available to Joe Average mere mortal via a bog-standard libc qsort(3) replacement? Ditto for a few of the other usual suspects.
If not, why?
Replacing libc functions a bad idea
I don't believe it would be a good plan to just replace standard library functions with faster GPU equivalents.
It would mess up the carefully designed balances between CPU and GPU in all the current games for one thing.
Imagine that your favorite zombie shooting game has to sort the list of all objects in the game before deciding which ones to display and which ones will attack you. After replacing the standard sort function with a GPU version, the game suddenly starts halting every few frames while the GPU tries to sort things at the same time it is trying to display the world.
Re: mere mortals
I think we both know the answer to that one. There isn't.
If your data happens to already be in the right place, there probably *is* a way of sorting it really quickly, but the overhead of putting your data in the right place for each wave of the magic wand will be comparable to the savings.
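To put rough numbers on that trade-off, here is a minimal back-of-envelope sketch. Everything in it is an assumption for illustration: the `OffloadModel` struct, its field names and the figures you feed it are hypothetical, and it only counts one copy to the device and one copy back, ignoring launch latency and everything else.

```cpp
#include <cassert>

// Hypothetical cost model; all inputs are illustrative, not measurements.
struct OffloadModel {
    double cpu_sort_seconds;      // time the sort takes on the CPU
    double gpu_speedup;           // how many times faster the GPU sorts
    double bytes;                 // size of the data set in bytes
    double bus_bytes_per_second;  // host <-> device transfer bandwidth
};

// Time spent copying the data to the device and the result back again.
double transfer_seconds(const OffloadModel& m) {
    assert(m.bus_bytes_per_second > 0.0);
    return 2.0 * m.bytes / m.bus_bytes_per_second;
}

// Total wall time if the sort is shipped to the GPU.
double gpu_total_seconds(const OffloadModel& m) {
    assert(m.gpu_speedup > 0.0);
    return transfer_seconds(m) + m.cpu_sort_seconds / m.gpu_speedup;
}

// True only when the round trip plus the GPU sort beats staying on the CPU.
bool offload_pays_off(const OffloadModel& m) {
    return gpu_total_seconds(m) < m.cpu_sort_seconds;
}
```

With these made-up inputs, a sort that takes a millisecond on the CPU loses to its own transfer time even with a 100x faster GPU sort, while a one-second sort comes out well ahead. The crossover depends entirely on the numbers you plug in.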
And of course you'll note that a tool was mentioned in connection with partitioning your problem. In the future, you'll be shipping something semantically equivalent to source code, to be compiled by the OS for however many processors the OS scheduler reckons it can make available right now. That's another overhead, unless you precompile for various possibilities, in which case you'll never be quite as efficient.
These problems are soluble, but it's not quite the free lunch they want you to believe in.
Representing the gaming community jokes...
...yes, these Nvidia graphics cards run Crysis. Oh yes they do it very well.
These cards pack so much punch that I'm glad Nvidia is working on ways to make them more usable for things besides gaming. Some of these Fermi suckers have over 2 billion transistors (or 3? I don't care), and soon will become add-on multi-use cards, instead of being called just "graphics cards", provided they succeed.
All of this comes at the price of a TDP beyond 120W in worst-case scenarios, and operating temperatures beyond 100°C. I bet many server clusters and environments won't be happy about it. Implementation is key. Home users couldn't care less, because they are not planning 100% usage 24/7.
Boilers, valves, pistons, coal, water and steam were not that useful for propulsion until someone cobbled them together in what we call a locomotive.
Right now Nvidia is selling boilers. (Quite literally if you don't cool it right). These tools are the first valves and pistons, I guess...
100,000 active developers?
"Through the end of 2010, the company had more than 700,000 cumulative downloads of CUDA tools and estimates that this represents around 100,000 active developers."
Call me cynical, but I'd be surprised if the correct figure was within an order of magnitude of that.
First you have to discard all those who simply downloaded an SDK to read the documentation -- and then decided it wasn't appropriate for them (or knew beforehand that it wouldn't be, such as journalists and bloggers). That *alone* probably takes you down to the 100,000 mark.
Next you have to discard all those who are still interested, but have a day job that doesn't fit the CUDA hole. Yes, I expect you can contribute in your spare time, but most of us don't have enough spare time to make a real difference. Just look at how many open source projects can't get the developers they need.
Lastly, you have to discard that fraction who call themselves programmers but who struggle with anything harder than simple scripts and macros. No matter how committed they may be, they'll never be "active" in the sense of developing real CUDA applications.
Look at the Thrust library. If you are comfortable with the STL, Thrust will sort your data on the GPU with code that looks just like a stock STL algorithm.
Spent the afternoon playing with the Thrust library for CUDA. Very very nice work.
There is nothing wrong with OpenCL but it has some hoops to jump through. Thrust is very much like simple C++ STL programming. I like Thrust a lot!
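For anyone wondering what "looks just like the STL" means in practice, here is a minimal sketch. The runnable part is plain standard C++ on the CPU; `sorted_copy` is just a hypothetical helper name. The Thrust version is left in comments because it needs nvcc and the CUDA toolkit to build, but the calls shown (`thrust::device_vector`, `thrust::sort`, `thrust::copy`) follow the same begin/end pattern.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Plain STL on the CPU: standard C++, runs anywhere.
std::vector<int> sorted_copy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    assert(std::is_sorted(values.begin(), values.end()));
    return values;
}

// The Thrust counterpart has the same shape (not compiled here, since it
// requires the CUDA toolkit):
//
//   #include <thrust/device_vector.h>
//   #include <thrust/sort.h>
//   #include <thrust/copy.h>
//
//   thrust::device_vector<int> d = values;             // host -> device copy
//   thrust::sort(d.begin(), d.end());                  // sort on the GPU
//   thrust::copy(d.begin(), d.end(), values.begin());  // device -> host copy
```

Note the two copies either side of the sort: that is the "putting your data in the right place" cost mentioned earlier in the thread, and it is why this works best when the data already lives on the device.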
I was surprised not to see a mention of OpenCL in the article. Or did I totally miss the point that if GPU computing is to become widespread, it will have to run on different hardware from different manufacturers, including Intel's future "laughabee" and AMD?