Nvidia unveils Titan Z: An 8TFLOPS off-the-shelf supercomputer disguised as a gfx card

Nvidia has released its most powerful – and, in our memory at least, its most expensive – GeForce graphics card ever, the GTX Titan Z, which you can have for a hefty $2,999.

[Image: Nvidia GeForce GTX Titan Z – five thousand, seven hundred and sixty CUDA cores of 'cool and quiet' GPU power]

That's not such a high price, said Nvidia …

COMMENTS

This topic is closed for new posts.
  1. JDX Gold badge

    Yowser

    12Gb on a graphics card.

    Can someone convert this thing's likely real-world performance to units I can understand, like approximate number of i7 CPUs for general computation or something?

    1. Chairo

      Re: Yowser

      IIRC a Core i7 reaches around a hundred GFLOPS or so. So we are talking about a factor of roughly 80.

      Imagine the power supply and cooling needed to run 80 Core i7s at top speed!

      That said, not all tasks are suitable for GPU processing. Also, there is a good reason why they talk only about single-precision performance: most GPUs sold on graphics cards have significantly lower performance for double-precision calculations. One of the reasons AMD cards were more popular for GPU computing tasks is that they were less restricted regarding double-precision calculation. I'm not sure if this is still true, however.
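
      As a quick sanity check on that factor-of-80 estimate (a minimal sketch, assuming the article's ~8 TFLOPS single-precision figure and the ~100 GFLOPS Core i7 figure above):

        // rough_speedup.cpp - back-of-the-envelope check on the "factor of roughly 80" claim
        #include <cstdio>

        int main()
        {
            const double titan_z_sp_tflops = 8.0;   // single-precision figure quoted for the Titan Z
            const double core_i7_tflops    = 0.1;   // ~100 GFLOPS, as assumed above

            // 8 / 0.1 = ~80x, ignoring double precision and real-world efficiency
            printf("rough speed-up: %.0fx\n", titan_z_sp_tflops / core_i7_tflops);
            return 0;
        }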

      1. ToddR

        Re: Yowser

        If you get a 10X speedup I'd be amazed, and you'd need a really, really good parallel programmer to do that.

        1. ThomH

          Re: Yowser @ToddR

          I think the difference is that a lot of the datasets that researchers work with are structured in an inherently parallel fashion: the expensive stuff is computing purely functional, individual results from each of millions or billions of data points. It's the classic CPU versus GPU thing: CPUs are good when you want to do any of a very large number of variable things to a small number of objects; GPUs are good when you want to do a few very specific things to a large number of objects.

          So OpenCL or CUDA code just naturally gets radically better performance than a traditional CPU for a certain segment of users, and those users are the niche at which this card is targeted.
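
          A minimal CUDA sketch of that "same few things to a large number of objects" case (the kernel and names here are purely illustrative):

            // scale_offset.cu - one thread per data point: the classic embarrassingly parallel map
            #include <cstdio>
            #include <cuda_runtime.h>

            __global__ void scale_and_offset(const float *in, float *out, float a, float b, int n)
            {
                int i = blockIdx.x * blockDim.x + threadIdx.x;
                if (i < n)
                    out[i] = a * in[i] + b;   // identical arithmetic applied to every element
            }

            int main()
            {
                const int n = 1 << 20;                 // ~1 million data points
                float *in, *out;
                cudaMallocManaged(&in,  n * sizeof(float));
                cudaMallocManaged(&out, n * sizeof(float));
                for (int i = 0; i < n; ++i) in[i] = float(i);

                int threads = 256;
                int blocks  = (n + threads - 1) / threads;
                scale_and_offset<<<blocks, threads>>>(in, out, 2.0f, 1.0f, n);
                cudaDeviceSynchronize();

                printf("out[42] = %.1f\n", out[42]);   // expect 85.0 (2 * 42 + 1)
                cudaFree(in);
                cudaFree(out);
                return 0;
            }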

      2. BlueGreen

        Re: Yowser (info needed)

        > IIRC a Core i7 reaches around a hundred GFLOPS or so

        I've long lost track of CPU architectures (they're 'fast enough' for me not to care any more), so I'm struggling to see how the above is possible. At 3GHz that's about 30 FLOPs/cycle. How?

        I suppose if you mean per processor rather than per core, and you have say 6 cores, then that's ~5 FLOPs/cycle. If that was a multiply-accumulate (FMAC) heavily pipelined over 2 vectors of effectively infinite length then that's 2 FLOPs/cycle; say you have 2 such units, that's 4...

        I'm struggling to see how even a peak of 100 GFLOPS can be reached, never mind in practice - can anyone shed any light? TIA

        1. Anonymous Coward
          Anonymous Coward

          Re: Yowser (info needed)

          > At 3GHz that's about 30 FLOPs/cycle. How? If that was a multiply-accumulate (FMAC) heavily pipelined over 2 vectors of effectively infinite length then that's 2 FLOPs/cycle; say you have 2 such units, that's 4...

          Dual-issue vec4 SSE = 8 FLOPs per core per clock; add in a couple of non-SSE fp32 instructions = 10 per core per clock. 4 cores = 40 per clock, which at 3GHz works out to around 120 GFLOPS.
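
          Putting that arithmetic into code (a minimal sketch; the per-core figures are the assumptions above, not measured numbers):

            // peak_flops.cpp - theoretical peak from the figures quoted above
            #include <cstdio>

            int main()
            {
                const double clock_ghz      = 3.0;  // assumed core clock
                const double flops_per_core = 10.0; // dual-issue vec4 SSE (8) + 2 scalar fp32
                const int    cores          = 4;

                // 3 GHz x 10 FLOPs/core/clock x 4 cores = 120 GFLOPS theoretical peak
                printf("theoretical peak: %.0f GFLOPS\n", clock_ghz * flops_per_core * cores);
                return 0;
            }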

    2. Anonymous Coward
      Anonymous Coward

      Re: Yowser

      We've already had cards with 12Gb - for example, the Nvidia GeForce GTX 580.

      (Assuming you mean really 12Gib, 12 Gibibits, as RAM is never measured in Gigabits or Gigabytes, even though that's used in slang terms).

      1.5GiB.

      1. NumptyScrub

        Re: Yowser

        quote: "(Assuming you mean really 12Gib, 12 Gibibits, as RAM is never measured in Gigabits or Gigabytes, even though that's used in slang terms)."

        *ibi as a differentiator has only recently been adopted though, and back in the 386 / 486 era all storage (RAM or non-volatile) used SI prefixes (Kilo-, Mega-) for size. I don't think I actually encountered a deliberate use of the *ibi prefixes until this century, and probably less than 10 years ago to boot. I have never seen (or purchased) SIMMs advertised in KibiBytes or MebiBytes, only KiloBytes or MegaBytes.

        I remember buying a matched pair of 16MB SIMMs for a previous gaming rig, for the princely sum of £300. That'll be showing my age there, I think... ^^;

        1. Anomalous Cowturd
          Pirate

          @Numptyscrub. You were lucky...

          I paid £120 each for four 4MB SIMMs for my 486.

          Ouch.

          I've got a VHS sized case full of the things in my bits box now. Know anybody who might want them?

          Icon? I was robbed.

          1. DiViDeD

            Re: @Numptyscrub. You were lucky...

            Hah! Old enough here to remember gutting a 128mb machine and getting £1400 for the two 32mb SIMMs I 'liberated'.

            Now THAT's old!

            1. DropBear

              Re: @Numptyscrub. You were lucky...

              I'll see your 32mb SIMMs and raise you a 256kb DIP RAM chip for a Trident video card (so it can have the blistering total of 512kb video RAM)... now get off my lawn, whippersnapper! ;)

              1. NumptyScrub

                Re: @Numptyscrub. You were lucky...

                quote: "I'll see your 32mb SIMMs and raise you a 256kb DIP RAM chip for a Trident video card"

                I got into PCs pretty late, although I did buy a 3rd-party 512kB RAM upgrade "card" (some soldering required) for my original Atari ST, which was so well designed that it caused a case bulge when mounted as per the instructions ^^;

              2. Paul 77
                Coat

                Re: @Numptyscrub. You were lucky...

                Vaguely remember having a total of 6MB of SIP RAM in an 80386sx (20MHz) running (don't laugh) Coherent.

    3. Naughtyhorse

      Re: Yowser

      The closest I can get is

      'A shitload of em'

      1. Admiral Grace Hopper

        Re: Yowser

        @Naughtyhorse - I believe the accepted unit is the Shedload or the Metric ShitTonne.

        1. MrT

          Or maybe...

          ... a boatload, Ma'am?

          All hands to bathe? Aye-aye!

    4. TitterYeNot

      Re: Yowser

      "12Gb on a graphics card."

      If I understand this correctly, this is 2 Titans in SLI on a single card, therefore 6GB memory per graphics processor (same as a normal Titan.) Memory can't be shared between processors unless Nvidia have done something very funky, so it's effectively a 6GB card as far as frame buffer etc. goes.

      12GB sure does sound impressive though!

    5. Justin Stringfellow

      Re: Yowser

      It's able to render an area the size of Wales, and costs about as much as 3 football pitches.

  2. Mussie (Ed)

    well at least you will be able to play Crysis at full detail in 1080

    1. Jedit Silver badge
      Boffin

      "well at least you will be able to paly crysis at full detail in 1080"

      Not at 60 FPS, you won't.

      1. larokus

        Re: "well at least you will be able to paly crysis at full detail in 1080"

        Huh? I get 45-55 fps in Crysis 3 with a GTX 760 AMP! and an overclocked Q6600, with everything maxed except AA

        1. MrT

          Whoosh...

          ...and whoosh...

  3. Mikel

    Just a few more years and that desktop PC will be something really special.

  4. Notas Badoff

    BitCoin or passwords?

    Which will this monster be better suited to churn/crack?

    1. Justin Stringfellow

      Re: BitCoin or passwords?

      See https://en.bitcoin.it/wiki/Mining_hardware_comparison

      It's a bit out of date now, but anyway, nobody's really mining BTC with graphics cards any more. ASIC hardware is several orders of magnitude faster per $ spent. So yes, the more economically viable approach would be to use the hardware to brute-force someone's password and steal their BTC :)

    2. ToddR

      Re: BitCoin or passwords?

      Bitcoin mining should work very well

    3. squigbobble

      Re: BitCoin or passwords?

      BTC mining on anything other than ASICs is dead as of last year. Scrypt currency mining on GPUs is just about viable but the leccy consumption will eat a lot of your coins. I mined about £100 of various things (when converted to BTC and then GBP) in 6 weeks on a pair of GTX670s before giving up.

      Scrypt ASICs are going to appear this year so GPU mining will be dead before Pascal hits the shelves.

  5. The Axe

    They should make it a bit cheaper and remove the graphics output connectors. They're very expensive.

  6. LaeMing

    Wow!

    That's 1.5 times my current system RAM!

  7. akeane

    But...

    ... will it play Doom OK?

    1. quartzie
      Joke

      Re: But...

      All 5760 instances should run just fine :D

      Now you just need someone to rewrite it for CUDA...

    2. Anonymous Coward
      Joke

      Re: But...

      No but I hear Wolfenstein 3D is just about playable

  8. Anonymous Coward
    Anonymous Coward

    I'm assuming that

    Half-Life 2 would be OK on it?

  9. Bronek Kozicki

    most expensive

    El Reg never heard of Quadro K6000 or FirePro S10000 12GB, I guess?

    On the other hand, 5,760 cores is worth mentioning. FWIW, the almost-18-month-old S10000 is also dual-GPU.

  10. Callam McMillan

    Surely this sounds like an application for the Tesla range of GPGPU cards rather than a graphics card?

  11. This post has been deleted by its author

  12. Michael Habel

    So how many of these in SLI do I need to finally be able to run Crysis?

  13. Michael H.F. Wilkinson Silver badge

    Would love to have one for certain HPC loads, not for others

    It is amazing what can be done with "cheap" HPC systems. I do note that this kind of architecture is not that good at the type of compute load I tend to work on, with its heavily data-driven processing order. GPUs still prefer SIMD-like problems of the "lots of the same" type seen in many physics problems. Alternatively, I need to rethink my algorithms, but at the moment the fastest kit we use for our problems is a 2U rack server with 64 AMD cores and 512 GB RAM.

    As you might guess from the above, 12GB is also too small for most of my data sets, and we still haven't quite worked out how to do our work on distributed-memory machines. It would be great if we could find a way to harness these beasts for our kind of work, however.
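
    A toy sketch of what "heavily data-driven processing order" can look like (a hypothetical example, not the poster's actual algorithm): the next element to process depends on values discovered so far, so there is no large batch of identical, independent work to hand to a GPU.

      // data_driven.cpp - priority-driven traversal: processing order depends on the data itself
      #include <cstdio>
      #include <queue>
      #include <utility>
      #include <vector>

      int main()
      {
          // Tiny graph: which node is visited next is decided by the values found so far
          std::vector<std::vector<int>> neighbours = {{1, 2}, {0, 3}, {0, 3}, {1, 2}};
          std::vector<int>  value = {5, 1, 7, 3};
          std::vector<bool> visited(4, false);

          std::priority_queue<std::pair<int, int>> next;   // (value, node), largest value first
          next.push(std::make_pair(value[0], 0));

          while (!next.empty()) {
              std::pair<int, int> top = next.top();
              next.pop();
              int n = top.second;
              if (visited[n]) continue;
              visited[n] = true;
              printf("processing node %d (value %d)\n", n, top.first);
              for (size_t j = 0; j < neighbours[n].size(); ++j) {
                  int m = neighbours[n][j];
                  if (!visited[m]) next.push(std::make_pair(value[m], m));
              }
          }
          return 0;
      }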

  14. Bigbird3141

    How about a game of Global Thermonuclear War?

    1. nevski11

      Wouldn't you prefer a nice game of chess?

  16. utomo

    Use for ?

    I think many people need to think about what this could be used for besides gaming, so we can then decide whether to buy and use it.

  17. phil dude
    Linux

    comparisons...

    From my point of view it makes the Intel MIC look expensive, so perhaps the pricing of these adapters (AMD, NVIDIA, Intel) will start to converge on a sane model.

    I must say, however, that the CUDA infrastructure gives a "lightbulb" moment when working out how many molecules we can get into this card...

    P.

  18. John Savard

    One Problem

    If I wanted a supercomputer, I'd be more concerned about its 64-bit floating-point performance. Nvidia does make special parts aimed at the supercomputer market instead of gamers wanting high performance video, but this doesn't sound like one of them. The Tesla K40 is roughly comparable to the original Titan, but it costs $5,300, not $999.

  19. This post has been deleted by its author

  20. Catweazle666

    Showing my age...

    I wonder how many VAX MIPS that is.

    1. TheOtherHobbes

      Re: Showing my age...

      You can't directly compare MIPS and FLOPS because throughput depends on the architecture.

      But for HPC, it's (nominally) slightly faster than a Cray X1 from ten years ago.

  21. Destroy All Monsters Silver badge
    Childcatcher

    Sure will get you closer to the WIRED

    "Special order for Miss Iwakura Lain"

    1. Michael Habel

      Re: Sure will get you closer to the WIRED

      I ♥ U. Lol, no I don't, but I do love the Lain reference!

  22. Anonymous Coward
    Anonymous Coward

    "Titan Z doubles the Titan Black's 2,800-core GK110s to a total of 5,760 CUDA cores, doubles the memory to 12GB, and doubles – as you might imagine – the memory bus width to two 384-bit channels."

    ...and then you can use it to play games that are still designed for computers with 512mb shared between the CPU *and* GPU and less CPU stomp than your phone. Hell, even the PS4 - which is presumably going to defacto limit the state of the art in gaming for another ten freaking years - only has 8gb of *total* memory. Think about it: Ten years from now, games are going to be gimped to a spec that was around 1/16th the capacity of PCs when it was *new*. Is it too late to start drone strikes against console manufacturers?

This topic is closed for new posts.
