
back to article Nvidia welds together ARM-Kepler ceepie-geepie system for the impatient

Graphics chip maker Nvidia is also an ARM processor maker, and it wants hybrid ARM-GPU chips just as much as you want them. And in the meantime, if you just can't wait, the company is working with Italian electronics manufacturer Seco to kick out another ceepie-geepie card that can be used for software development or to build a …

COMMENTS

This topic is closed for new posts.
Anonymous Coward

If the intent is massive compute clusters....

If the intent is massive computing clusters, and not just Angry Birds 3D, then they need a way for those clusters to move all that data into and out of the chip. I hope they use something like Serial RapidIO (sRIO).

Go

The older system was CARMA, for CUDA on ARM Architecture.

The announced development systems are internally named Kayla, and the CARMA name is being dropped.

The two announcements are both "Kayla" devkits. The first is similar to the original CARMA, which had an MXM 3.0 GPU and a Q7 processor module on a carrier board powered by a single DC power rail. Now, with the GPU updated to a Kepler-class GPU, it's named the 'CUDA on ARM MXM devkit'.

The second system is a new mini-ITX carrier board that supports a Q7 processor module and has a PCIe slot. It uses an ATX power supply, and can run much more power-hungry GPUs. Strictly speaking, though, it's only "Kayla" when it uses the same new GPU as the MXM module version.

The original devkit was developed around an existing Quadro 1000M MXM module with a GF108 Fermi-class GPU. The GPU has 2 SMs, or 96 CUDA cores. The Q1000M has 2GB of local memory. Only a portion of that can be mapped into the ARM's address space at one time.

The new devkit uses a Kepler-class GPU with 2 SMX units ("SM35") for a total of 384 CUDA cores. Right now it's configured with 1GB of GDDR5 memory.
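As a quick sanity check on those core counts, here is a minimal Python sketch. The figure of 192 cores per SMX is not stated above; it is simply what 384 cores across 2 SMX units implies, and the function name is only illustrative.

```python
def cuda_cores(units: int, cores_per_unit: int) -> int:
    """Total CUDA cores for a GPU with `units` SM/SMX blocks."""
    return units * cores_per_unit

# Kepler devkit GPU from the post: 2 SMX units, 384 cores in total.
KEPLER_SMX_UNITS = 2
KEPLER_CORES_PER_SMX = 384 // KEPLER_SMX_UNITS  # implied per-SMX count

kepler_total = cuda_cores(KEPLER_SMX_UNITS, KEPLER_CORES_PER_SMX)
fermi_total = 96  # original Quadro 1000M devkit, as stated in the post

# Ratio of new to old core counts (says nothing about clocks or bandwidth).
core_ratio = kepler_total / fermi_total
print(kepler_total, core_ratio)
```

The ratio only compares raw core counts, so it shouldn't be read as a performance estimate.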

For both, the CPU module remains based on the Tegra 3. Neither of the newly announced Tegra 4 products (Tegra 4 and Tegra 4i) has a PCIe interface. That's why this is a "close development model" rather than exactly the same as Logan.

BTW, the CPU module has 2GB of low-power DDR2, and the GPU has 2GB of local memory. While the total is 4GB, only about 3GB is directly addressable. 2GB is pretty much the maximum main memory configuration of ARMv7, due to some sparse utilization of the memory map. Plus you have about 1GB of address space into which you can map PCI devices.
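The address-space budget can be laid out as a rough Python sketch. The sizes are the approximate figures from this post; treating the whole PCI window as available to the GPU is an assumption made for illustration, so the mappable fraction is an upper bound, and the variable names are hypothetical.

```python
GIB = 1 << 30

address_space = 1 << 32   # 32-bit ARMv7 address space: 4 GiB
main_memory = 2 * GIB     # low-power DDR2 on the CPU module
pci_window = 1 * GIB      # approximate space for mapping PCI devices
gpu_local = 2 * GIB       # GPU local memory

# What the CPU can reach directly: main memory plus the PCI window.
directly_addressable = main_memory + pci_window

# The rest of the map is taken up by everything else (sparsely used).
remainder = address_space - directly_addressable

# Assumption: the entire PCI window is handed to the GPU, so this is
# the most of the GPU's memory that could be mapped at any one time.
gpu_fraction_mappable = min(pci_window, gpu_local) / gpu_local

print(directly_addressable // GIB, remainder // GIB, gpu_fraction_mappable)
```

In practice other PCI devices and alignment constraints would also claim part of the window, so the real mappable share is likely smaller.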

The A15 has a PAE feature (LPAE) that adds a few address bits, but it's new, not really used, and doesn't help most ARM use cases. The real fix for the cramped address space is ARMv8.
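To put numbers on "a few address bits", here is a small Python sketch. The 40-bit figure for the A15's extension comes from ARM's LPAE documentation rather than from this post, so treat it as an assumption here; the helper name is illustrative.

```python
def addressable_gib(bits: int) -> int:
    """Size in GiB of an address space that is `bits` wide."""
    return (1 << bits) // (1 << 30)

plain_armv7 = addressable_gib(32)  # classic 32-bit physical addressing
with_lpae = addressable_gib(40)    # assumption: 40-bit physical addresses

print(plain_armv7, with_lpae)
```

Note that the extension widens physical addresses only; each process still sees a 32-bit virtual address space, which is part of why it doesn't help most use cases.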
