Nvidia unveils Titan Z: An 8TFLOPS off-the-shelf supercomputer disguised as a gfx card

Nvidia has released its most powerful – and, in our memory at least, its most expensive – GeForce graphics card ever, the GTX Titan Z, which you can have for a hefty $2,999.

Nvidia GeForce GTX Titan Z: Five thousand, seven hundred and sixty CUDA cores of 'cool and quiet' GPU power

That's not such a high price, said Nvidia CEO …

COMMENTS

This topic is closed for new posts.

JDX
Gold badge

Yowser

12Gb on a graphics card.

Can someone convert this thing's likely real-world performance to units I can understand, like approximate number of i7 CPUs for general computation or something?

0
1
Bronze badge

Re: Yowser

IIRC a Core i7 reaches around a hundred GFLOPS or so. So we are talking about a factor of roughly 80.

Imagine the power supply and cooling needed to run 80 Core i7 at top speed!

That said, not all tasks are suitable for GPU processing. There is also a good reason why they talk only about single-precision performance: most GPUs sold on graphics cards have significantly lower performance for double-precision calculations. One of the reasons AMD cards were more popular for GPU computing tasks is that they were less restricted regarding double-precision calculation. I'm not sure if this is still true, however.
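The "factor of roughly 80" can be checked with a few lines of arithmetic. This is only a sketch using the two figures quoted in this thread (the headline's 8 TFLOPS and the recalled ~100 GFLOPS for a Core i7); both are peak single-precision numbers, and the constant and function names are made up for illustration.

```python
# Figures taken from the thread, not from any datasheet.
TITAN_Z_SP_FLOPS = 8e12   # "8TFLOPS" single precision, from the article headline
CORE_I7_FLOPS = 100e9     # "around a hundred GFLOPS or so", as recalled above


def speed_ratio(gpu_flops: float, cpu_flops: float) -> float:
    """Return how many CPUs' worth of peak throughput the GPU represents."""
    return gpu_flops / cpu_flops


ratio = speed_ratio(TITAN_Z_SP_FLOPS, CORE_I7_FLOPS)
print(f"Peak single-precision ratio: about {ratio:.0f}x")
```

This compares peak numbers only; real workloads will usually see a much smaller ratio.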

4
0

Re: Yowser

If you get a 10x speedup I'd be amazed, and you'd need a really, really good parallel programmer to do that.
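One way to see why a 10x overall speedup is hard even on hardware that is 80x faster at peak is Amdahl's law. Nobody in this thread named it, so treat this as an assumed model rather than anyone's stated argument: only a fraction of the program's runtime can be offloaded, and the rest runs at the old speed. The function name and the sample fractions are illustrative.

```python
def amdahl_speedup(parallel_fraction: float, accel_factor: float) -> float:
    """Overall speedup when `parallel_fraction` of the runtime is accelerated
    by `accel_factor` and the remainder is left unchanged (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if accel_factor <= 0.0:
        raise ValueError("accel_factor must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / accel_factor)


# 80x is the peak ratio discussed earlier in the thread.
for p in (0.50, 0.90, 0.95, 0.99):
    print(f"{p:.0%} offloaded -> {amdahl_speedup(p, 80.0):.1f}x overall")
```

Under this model, offloading 90% of the runtime to an 80x-faster device gives just under 9x overall, so a 10x result already implies a very well parallelised program.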

1
1
Anonymous Coward

Re: Yowser

We've already had cards with 12Gb, for example the NVIDIA GeForce GTX 580.

(Assuming you mean really 12Gib, 12 Gibibits, as RAM is never measured in Gigabits or Gigabytes, even though that's used in slang terms).

1.5GiB.
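For anyone who wants the bit/byte pedantry spelled out, here is a small sketch of the conversion. The helper names are invented for illustration; the only facts used are 8 bits per byte and the binary prefix gibi = 2^30.

```python
BITS_PER_BYTE = 8
GIBI = 2**30  # binary prefix: 1 Gi = 1,073,741,824


def gibibits_to_gibibytes(gibibits: float) -> float:
    """Convert a size in gibibits (Gib) to gibibytes (GiB)."""
    return gibibits / BITS_PER_BYTE


def gibibytes_to_bytes(gibibytes: int) -> int:
    """Convert a size in gibibytes (GiB) to plain bytes."""
    return gibibytes * GIBI


print(gibibits_to_gibibytes(12))  # 12 Gib read literally
print(gibibytes_to_bytes(12))     # 12 GiB, what the card actually carries
```

Read literally, 12 Gib is 1.5 GiB, which is the joke; 12 GiB is 12,884,901,888 bytes.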

2
2
Silver badge

Re: Yowser

The closest I can get is

'A shitload of em'

1
0
Bronze badge

Re: Yowser

"12Gb on a graphics card."

If I understand this correctly, this is 2 Titans in SLI on a single card, therefore 6GB memory per graphics processor (same as a normal Titan.) Memory can't be shared between processors unless Nvidia have done something very funky, so it's effectively a 6GB card as far as frame buffer etc. goes.

12GB sure does sound impressive though!

2
0

Re: Yowser

@Naughtyhorse - I believe the accepted unit is the Shedload or the Metric ShitTonne.

0
0
MrT
Bronze badge

Or maybe...

... a boatload, Ma'am?

All hands to bathe? Aye-aye!

0
0
Silver badge

Re: Yowser

quote: "(Assuming you mean really 12Gib, 12 Gibibits, as RAM is never measured in Gigabits or Gigabytes, even though that's used in slang terms)."

*ibi as a differentiator has only recently been adopted though, and back in the 386 / 486 era all storage (RAM or non-volatile) used SI prefixes (Kilo-, Mega-) for size. I don't think I actually encountered a deliberate use of the *ibi prefixes until this century, and probably less than 10 years ago to boot. I have never seen (or purchased) SIMMs advertised in KibiBytes or MebiBytes, only KiloBytes or MegaBytes.

I remember buying a matched pair of 16MB SIMMs for a previous gaming rig, for the princely sum of £300. That'll be showing my age there, I think... ^^;

3
0
Silver badge

Re: Yowser (info needed)

> IIRC a Core i7 reaches around a hundred GFLOPS or so

I've long lost track of CPU architectures (they're 'fast enough' for me to now not care) so I'm struggling to see how the above is possible. At 3GHz that's about 30 flop/cycle. How?

I suppose if you mean per processor rather than per core, and you have, say, 6 cores, then that's ~5 flop/cycle per core. If that was a multiply-accumulate (fmac) heavily pipelined over 2 vectors of effectively infinite length then that's 2 flop/cycle; say you have 2 such units, that's 4...

I'm struggling to see how even a peak of 100 gflops can be reached, never mind in practice - anyone shed any light? TIA
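The back-of-envelope numbers in this question can be reproduced as follows. It is a sketch using only the figures from the post (100 GFLOPS, 3 GHz, a hypothetical 6 cores); the function name is made up for illustration.

```python
TARGET_FLOPS = 100e9  # the "hundred GFLOPS or so" being questioned
CLOCK_HZ = 3e9        # the 3 GHz clock assumed in the post


def flops_per_cycle(target_flops: float, clock_hz: float, cores: int = 1) -> float:
    """Floating-point operations each core must retire per clock cycle
    to reach `target_flops` across `cores` cores."""
    return target_flops / (clock_hz * cores)


per_chip = flops_per_cycle(TARGET_FLOPS, CLOCK_HZ)            # whole chip treated as one core
per_core_6 = flops_per_cycle(TARGET_FLOPS, CLOCK_HZ, cores=6)  # the 6-core guess from the post

print(f"single core: {per_chip:.1f} flop/cycle")
print(f"six cores:   {per_core_6:.1f} flop/cycle per core")
```

That matches the "about 30" and "~5" estimates in the post: roughly 33 flop/cycle for the whole chip, or between 5 and 6 per core with six cores.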

0
0

Re: Yowser (info needed)

> At 3GHz that's about 30 flop/cycle. How? If that was a multiply-accumulate (fmac) heavily pipelined over 2 vectors of effectively infinite length then that's 2 flop/cycle, say you have 2 such units that's 4...

Dual issue vec4 SSE = 8 per core per clock, add in a couple of non-SSE fp32 instructions = 10 per core per clock. 4 cores = 40 per clock.
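Multiplying that out with the 3 GHz clock from the question gives the peak figure. This is a sketch under the post's own assumptions (10 flop per core per clock, 4 cores); the function name is illustrative and the result is a theoretical peak, not a measured number.

```python
def peak_flops(cores: int, flops_per_core_per_cycle: float, clock_hz: float) -> float:
    """Theoretical peak throughput: cores x flop per core per cycle x clock rate."""
    return cores * flops_per_core_per_cycle * clock_hz


# 8 from dual-issue vec4 SSE plus 2 scalar fp32 = 10 per core per clock, 4 cores, 3 GHz.
peak = peak_flops(cores=4, flops_per_core_per_cycle=10, clock_hz=3e9)
print(f"Peak: {peak / 1e9:.0f} GFLOPS")
```

With these inputs the peak comes to 120 GFLOPS, which is consistent with the "hundred GFLOPS or so" recalled earlier.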

1
0
Silver badge

Re: Yowser @ToddR

I think the difference is that a lot of the datasets that researchers work with are structured in an inherently parallel fashion — the expensive stuff is purely functional individual results from each of millions or billions of data points. It's the classic CPU versus GPU thing: CPUs are good when you want to do any of a very large number of variable things to a small number of objects, GPUs are good when you want to do a few very exact things to a large number of objects.

So OpenCL or CUDA just naturally get radically better performance than a traditional CPU for a certain segment of users, and those users are the niche at which this card is targeted.
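The "purely functional individual results" point can be illustrated with a toy example. This is plain Python rather than OpenCL or CUDA, and `per_point` is a made-up stand-in for the expensive per-data-point computation; the sketch only shows that independent results can be computed by any number of workers, not that Python threads are fast.

```python
from concurrent.futures import ThreadPoolExecutor


def per_point(x: float) -> float:
    """Pure function: the result depends only on its own input,
    so every data point can be processed independently."""
    return x * x + 1.0


data = [float(i) for i in range(1000)]

# Serial evaluation: one worker walks the whole data set.
serial = [per_point(x) for x in data]

# Parallel evaluation: the same function is handed to several workers.
# pool.map returns results in input order, so the two lists are comparable.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(per_point, data))

print(serial == parallel)
```

Because no data point depends on another, the work can be split across 4 threads here or thousands of GPU cores without changing the answer, which is the property that makes such data sets a good fit for this card.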

1
0
Silver badge
Pirate

@Numptyscrub. You were lucky...

I paid £120 quid each for four 4MB SIMMs for my 486.

Ouch.

I've got a VHS sized case full of the things in my bits box now. Know anybody who might want them?

Icon? I was robbed.

0
0
Bronze badge

Re: @Numptyscrub. You were lucky...

Hah! Old enough here to remember gutting a 128mb machine and getting £1400 for the two 32mb SIMMs I 'liberated'.

Now THAT's old!

0
0
Silver badge

Re: @Numptyscrub. You were lucky...

I'll see your 32mb SIMMs and raise you a 256kb DIP RAM chip for a Trident video card (so it can have the blistering total of 512kb video RAM)... now get off my lawn, whippersnapper! ;)

1
0
Silver badge

Re: @Numptyscrub. You were lucky...

quote: "I'll see your 32mb SIMMs and raise you a 256kb DIP RAM chip for a Trident video card"

I got into PCs pretty late, although I did buy a 3rd party 512kB RAM upgrade "card" (some soldering required) for my original Atari ST, that was so well designed it caused a case bulge when mounted as per the instructions ^^;

0
0

Re: Yowser

It's able to render an area the size of Wales, and costs about as much as 3 football pitches.

0
0
Coat

Re: @Numptyscrub. You were lucky...

Vaguely remember having a total of 6MB of SIP RAM in an 80386SX (20MHz) running (don't laugh) Coherent.

0
0

Well, at least you will be able to play Crysis at full detail in 1080.

4
0
Silver badge
Boffin

"Well, at least you will be able to play Crysis at full detail in 1080"

Not at 60 FPS, you won't.

1
0

Re: "Well, at least you will be able to play Crysis at full detail in 1080"

Huh? I get 45-55 fps in Crysis 3 with a GTX 760 AMP! and an overclocked Q6600, with everything maxed except AA.

1
0
MrT
Bronze badge

Whoosh...

...and whoosh...

5
1
Silver badge

Just a few more years and that desktop PC will be something really special.

1
0

BitCoin or passwords?

Which will this monster be better fit to churn/crack?

2
0

Re: BitCoin or passwords?

See https://en.bitcoin.it/wiki/Mining_hardware_comparison

It's a bit out of date now, but anyway, nobody's really mining btc with graphics cards any more. ASIC hardware is several orders of magnitude faster per $ spent. So yes, the more economically viable approach would be to use the hardware to brute force someone's password and steal their btc :)

1
0

Re: BitCoin or passwords?

Bitcoin mining should work very well

0
0
Bronze badge

Re: BitCoin or passwords?

BTC mining on anything other than ASICs is dead as of last year. Scrypt currency mining on GPUs is just about viable but the leccy consumption will eat a lot of your coins. I mined about £100 of various things (when converted to BTC and then GBP) in 6 weeks on a pair of GTX670s before giving up.

Scrypt ASICs are going to appear this year so GPU mining will be dead before Pascal hits the shelves.

0
0

They should make it a bit cheaper and remove the graphics output connectors. They're very expensive.

3
0
Silver badge

Wow!

That's 1.5 times my current system ram!

1
0

But...

... will it play Doom OK?

1
0
Joke

Re: But...

All 5760 instances should run just fine :D

Now you just need someone to rewrite it for CUDA...

2
0
Silver badge
Joke

Re: But...

No but I hear Wolfenstein 3D is just about playable

3
0
Anonymous Coward

I'm assuming that

Half-Life 2 would be OK on it?

1
0
Bronze badge

most expensive

El Reg never heard of Quadro K6000 or FirePro S10000 12GB, I guess?

On the other hand, 5760 cores is worth mentioning. FWIW, the almost 18-month-old S10000 is also dual-GPU.

1
1

Surely this sounds like an application for the Tesla range of GPGPU cards rather than a graphics card?

0
0

Bronze badge

So how many of these in SLI do I need to finally be able to run Crysis?

1
0
Silver badge

Would love to have one for certain HPC loads, not for others

It is amazing what can be done with "cheap" HPC systems. I do note that this kind of architecture is not that good at the type of compute load I tend to work on: heavily data-driven processing order. GPUs still prefer SIMD-like problems of the "lots of the same" type seen in many physics problems. Alternatively, I need to rethink my algorithms, but at the moment the fastest kit we use for our problems is a 2U rack server with 64 AMD cores and 512 GB RAM.

As you might guess from the above, 12GB is also too small for most of my data sets, and we still haven't quite worked out how to do our work on distributed-memory machines. It would be great if we could find a way to harness these beasts for our kind of work, however.

0
1

How about a game of Global Thermonuclear War?

5
0

Wouldn't you prefer a nice game of chess?

0
0
Anonymous Coward

Use for ?

I think many people need to consider what this can be used for besides gaming, so that we can decide whether to buy and use it.

0
0
Bronze badge
Linux

comparisons...

From my point of view it makes the Intel MIC look expensive, so perhaps the pricing of these adapters (AMD, NVIDIA, Intel) will start to converge on a sane model.

I must say, however, that the CUDA infrastructure gives a "lightbulb" moment when working out how many molecules we can get into this card....

P.

0
0
Bronze badge

One Problem

If I wanted a supercomputer, I'd be more concerned about its 64-bit floating-point performance. Nvidia does make special parts aimed at the supercomputer market instead of gamers wanting high performance video, but this doesn't sound like one of them. The Tesla K40 is roughly comparable to the original Titan, but it costs $5,300, not $999.

2
0

Showing my age...

I wonder how many VAX MIPS that is.

0
0
Silver badge

Re: Showing my age...

You can't directly compare MIPS and FLOPS because throughput depends on the architecture.

But for HPC, it's (nominally) slightly faster than a Cray X1 from ten years ago.

0
0
Silver badge
Childcatcher

Sure will get you closer to the WIRED

"Special order for Miss Iwakura Lain"

1
0
Bronze badge

Re: Sure will get you closer to the WIRED

I ♥ U lol no I don't but, I do love the Lain reference though!

0
0
