Nvidia reveals CUDA 6, joins CPU-GPU shared memory party

Nvidia has announced the latest version of its GPU programming platform, CUDA 6, which adds a "Unified Memory" capability that, as its name implies, relieves programmers of the trials and tribulations of manually copying data back and forth between separate CPU and GPU memory spaces. …

[CUDA 6 Unified Memory schematic]
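
For a sense of what that means in code, here is a minimal sketch of the managed-memory style, assuming an illustrative scale() kernel and array size of our own invention; cudaMallocManaged() is the managed-allocation call CUDA 6 introduces in place of the explicit copy round trip.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Illustrative kernel -- not from the article, just something for the GPU to do.
    __global__ void scale(float *x, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= 2.0f;
    }

    int main() {
        const int N = 1 << 20;

        // CUDA 6 Unified Memory: one allocation visible to both CPU and GPU.
        // The runtime migrates the data; there is no explicit cudaMemcpy().
        float *data;
        cudaMallocManaged(&data, N * sizeof(float));

        for (int i = 0; i < N; ++i) data[i] = 1.0f;   // CPU writes directly
        scale<<<(N + 255) / 256, 256>>>(data, N);     // GPU uses the same pointer
        cudaDeviceSynchronize();                      // wait before the CPU reads it back

        printf("data[0] = %f\n", data[0]);
        cudaFree(data);
        return 0;
    }

The cudaDeviceSynchronize() before the CPU touches the data again is the one piece of discipline Unified Memory still demands.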

COMMENTS

This topic is closed for new posts.
  1. totaam

    download where?

    Unless I am missing something obvious, I've signed up (something I rarely do), waited to be approved, and now that I am approved there is still nothing for me to download... (just harvesting email addresses?)

    Seems that this unified memory is a response to what AMD has been talking about. (Hard to tell who planned it first, since this is all behind closed doors, but at least AMD was talking about it first.)

    It makes sense because it is a PITA to upload/download explicitly to/from the device.

    1. bazza Silver badge

      Re: Unified Memory

      It is a response to what AMD have done with their APUs, but it's definitely a sticking plaster (band-aid for our transatlantic cousins) for NVidia's problem. Sure, the programmer doesn't have to worry about transferring data between CPU and GPU anymore, but the hardware does and the latency is still there.

      If anything this could make the situation worse for NVidia. Before this, when the programmer had to do their own data transfers, the latency was explicitly there in the source code. It was practically shouting "this is painful and slow, don't do this too often coz it'll be a slow steaming pile of shite". Now that it's all hidden from the programmer it is easy to write working code with little evidence of inefficiency in the source. Laziness will become harder to spot.

      In AMD land where the APUs have properly unified memory at the electronics level everyone wins. There are no inefficient data copies to be done at all, so source code that looks efficient ends up being efficient. That's a very good thing. It's not something that NVidia can compete with unless they start building themselves a serious x86 core.
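
      To illustrate the point, this is roughly what the explicit version looks like today; the kernel and sizes are invented for the sketch, but the two cudaMemcpy() calls are where the PCIe round trip stares you in the face, and that is exactly the evidence a managed-memory version no longer shows.

          #include <cuda_runtime.h>

          // Illustrative kernel and sizes, not NVidia's example.
          __global__ void scale(float *x, int n) {
              int i = blockIdx.x * blockDim.x + threadIdx.x;
              if (i < n) x[i] *= 2.0f;
          }

          int main() {
              const int N = 1 << 20;
              float *h_data = new float[N];
              for (int i = 0; i < N; ++i) h_data[i] = 1.0f;

              float *d_data;
              cudaMalloc(&d_data, N * sizeof(float));

              // The transfer cost is explicit: both copies sit right there in the
              // source, which is the "evidence of inefficiency" that Unified Memory hides.
              cudaMemcpy(d_data, h_data, N * sizeof(float), cudaMemcpyHostToDevice);
              scale<<<(N + 255) / 256, 256>>>(d_data, N);
              cudaMemcpy(h_data, d_data, N * sizeof(float), cudaMemcpyDeviceToHost);

              cudaFree(d_data);
              delete[] h_data;
              return 0;
          }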

      1. Voland's right hand Silver badge

        Re: Unified Memory

        Quote: "unless they start building themselves a serious x86 core". That is "Plan A".

        Nvidia also has a "Plan B" - bury the hatchet in x86's back and start shipping ARM on their chips. Even a 64-bit ARM core will be only a minor addition to the BOM and heat envelope, and it will have synchronized memory and working spinlocks. The underlying x86 will become a mere carrier of ARM blades. One step further and it will disappear altogether in some setups.

        I would not be surprised if we hear from them that they have implemented CUDA 6 properly (not as a steaming pile of hacks and kludges) on such hardware and are shipping it.

      2. Frumious Bandersnatch

        Re: Unified Memory

        "If anything this could make the situation worse for NVidia. Before this, when the programmer had to do their own data transfers, the latency was explicitly there in the source code."

        I got the same feeling on reading the article. The latencies are still there, but now they're just hidden behind a software translation layer. I'll agree that doing explicit DMA or other main memory <-> device memory transfers is annoying, but we already have a technique for hiding DMA latencies(*), namely double (multi) buffering.

        Multi-buffering can, for many problems, not only "hide" the latencies, but effectively eliminate them for all but the first block to be transferred. If this new feature does automatic loop unrolling and transparently adds multi-buffering (or even just double-buffering) when it detects it should be used, then that would be pretty nifty. Unfortunately, judging by the description in the article, this isn't what it's doing, and all we get is blocking, full-latency access to the "shared" memory, with "shared" in quotes because it's only a software abstraction, not a hardware feature. I could be pleasantly surprised, but from the article, it seems like it's only a sop to lazy programmers, and not real shared memory at all.

        (*) I'm not actually up to speed on CUDA, so I'm assuming it uses DMA to do data transfers?
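
        For what it's worth, here is a sketch of the kind of double buffering described above, done by hand with streams; the chunk size, kernel and buffer names are made up, but cudaMemcpyAsync() on pinned memory plus two streams is the usual way to overlap the copy of one block with the compute on the previous one.

            #include <cuda_runtime.h>

            // Illustrative kernel, not from the article.
            __global__ void process(float *x, int n) {
                int i = blockIdx.x * blockDim.x + threadIdx.x;
                if (i < n) x[i] *= 2.0f;
            }

            int main() {
                const int CHUNK = 1 << 20;   // elements per chunk (made-up size)
                const int NCHUNKS = 8;

                // Pinned host buffer so cudaMemcpyAsync can really overlap with compute.
                float *h_buf;
                cudaMallocHost(&h_buf, (size_t)CHUNK * NCHUNKS * sizeof(float));

                float *d_buf[2];
                cudaStream_t stream[2];
                for (int b = 0; b < 2; ++b) {
                    cudaMalloc(&d_buf[b], CHUNK * sizeof(float));
                    cudaStreamCreate(&stream[b]);
                }

                // Double buffering: while chunk k is in flight on one stream, chunk k+1
                // is already being copied in on the other, hiding the DMA latency for
                // everything after the first block -- as the post above describes.
                for (int k = 0; k < NCHUNKS; ++k) {
                    int b = k & 1;
                    float *h_chunk = h_buf + (size_t)k * CHUNK;
                    cudaMemcpyAsync(d_buf[b], h_chunk, CHUNK * sizeof(float),
                                    cudaMemcpyHostToDevice, stream[b]);
                    process<<<(CHUNK + 255) / 256, 256, 0, stream[b]>>>(d_buf[b], CHUNK);
                    cudaMemcpyAsync(h_chunk, d_buf[b], CHUNK * sizeof(float),
                                    cudaMemcpyDeviceToHost, stream[b]);
                }
                cudaDeviceSynchronize();

                for (int b = 0; b < 2; ++b) { cudaFree(d_buf[b]); cudaStreamDestroy(stream[b]); }
                cudaFreeHost(h_buf);
                return 0;
            }

        The first chunk still pays the full copy latency; everything after it rides behind the compute of the previous chunk, which is the whole point of the technique.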

        1. Nick Ryan Silver badge

          Re: Unified Memory

          Agreed. While explicit memory transfers are a PITA unless the development environment provides good tools, having implicit, potentially unknown, memory transfers is just asking for inefficiency. Pretty much the same level of inefficiency you find anywhere near anything remotely .NET where a "string" is involved.

          However, massively parallel programming is a bugger to get your head around when it comes to coordinating many processes that may, or may not, depend on each other. While forward planning, by initiating a fetch of memory blocks that are known to be of interest, is easy at the first level, it very rapidly gets far too complicated. Eventually, for all but a few coders much more sadistic than I am, it will turn out to be more efficient to have a suitably "smart" development environment perform many of the optimisations.

  2. Anonymous Coward
    Anonymous Coward

    gcc

    The Free Software Foundation might be interested to hear that GCC belongs to Mentor Graphics! Do you think we should thrill them?

    1. Anonymous Coward
      Anonymous Coward

      Re: gcc

      Yeah, I read that and thought, WTF exactly? You can take it one way and it can work (maybe they supply extensions to GCC?), but the way it is worded seems to be going the wrong way. I'm assuming the author believes that if you use C then you can only use the GCC compiler? All C is "GCC"? ... Jeez, I wish... or maybe I just wish there were only one compiler.

  3. Anonymous Coward
    Anonymous Coward

    I can't see how this fanciful wrapper can be anything other than a PR move.

    "From the point of view of the developer using CUDA 6, the memory spaces of the CPU and GPU might as well be physically one and the same. "The developer now can just operate on the data," Gupta says."

    Either a very ill-conceived PR move, or they are just calling us idiots to our faces.

    1. bazza Silver badge

      That's a pretty unfair comment, I think. Whilst some of us don't mind thinking in multiple different address spaces all at once (if you think two address spaces is hard, try the 69+ you get with VME...), anything that makes the programmer's job easier is a good thing for NVidia; they'll sell more product because of it.

      At the software level they are copying what AMD have done, which is understandable. They can't copy AMD at the hardware level though, so they may start to struggle to compete in terms of whole system performance.

    2. Destroy All Monsters Silver badge

      As long as you get loud warnings telling you that you are probably doing something that will bite you later in this and that piece of code, I'm for it.

  4. William Boyle

    64-bit memory space

    With 32-bit systems, this was an insurmountable problem. With 64-bit ones, it is simply a matter of mapping the GPU memory into the CPU's virtual memory space. In truth, this is not a difficult problem, and the fact that it hasn't happened until now is not a "cudo" to nVidia! Although, I will admit that the issues are more likely business-process related than anything else, and those are always more difficult to overcome than the merely technical!! :-)
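
    For comparison, the closest existing mechanism appears to be zero-copy mapped pinned memory, which goes the other way round: host memory mapped into the GPU's view, with host and device sharing one 64-bit virtual address space on UVA-capable systems. A minimal sketch, with an invented kernel and names:

        #include <cstdio>
        #include <cuda_runtime.h>

        // Illustrative kernel, not from the article.
        __global__ void scale(float *x, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) x[i] *= 2.0f;
        }

        int main() {
            const int N = 256;
            cudaSetDeviceFlags(cudaDeviceMapHost);        // allow mapped pinned allocations

            // Pinned host memory the GPU can address directly over PCIe.
            float *h_ptr;
            cudaHostAlloc(&h_ptr, N * sizeof(float), cudaHostAllocMapped);
            for (int i = 0; i < N; ++i) h_ptr[i] = 1.0f;

            float *d_ptr;
            cudaHostGetDevicePointer(&d_ptr, h_ptr, 0);   // GPU-side alias of the same memory

            scale<<<1, N>>>(d_ptr, N);
            cudaDeviceSynchronize();

            printf("h_ptr[0] = %f\n", h_ptr[0]);          // CPU sees the GPU's writes
            cudaFreeHost(h_ptr);
            return 0;
        }

    Every GPU access goes over the bus, so this only pays off for data touched once or sparsely; it is a mapping trick, not a bandwidth fix.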

  5. Ken Hagan Gold badge

    Convergence

    Meanwhile, Intel are adding very wide GPU-like operations to their own instruction set, and these, obviously, enjoy the same single view of memory. Sounds like the cycle of reincarnation is nearing completion, with everything ending up on the one [CG]PU.

  6. oldcoder

    Did they fix the security?

    Hope they fixed the security problems...

    The original problem was that there was nothing preventing a user from loading data into the GPU, then downloading it into any location in physical memory. Been there, seen that. The Cray MP line would crash (most often) because of that.

    All they had to do was add an IOMMU to the board. That way the host system would be able to limit the I/O to just one user... and if they chose to crap on themselves, it didn't matter.

    I know the Cray nearly didn't pass evaluation on low-level security audits because of it. Only the nodes without the GPU passed.

