Fujitsu picks 64-bit ARM for Japan's monster 1,000-PFLOPS super

Fujitsu has signaled it will use 64-bit ARMv8 cores in the whopping exascale supercomputer it's building for Japan's boffins. Back in 2014, the Japanese IT giant was hired by the RIKEN Advanced Institute for Computational Science to construct the Flagship 2020 machine – dubbed the Post-K super because it will succeed Japan's K …

  1. Bronek Kozicki
    Happy

    well, at this moment I'm happy to be ARM shareholder.

  2. vir

    "The Flagship 2020 machine will be used to work on 'scientific and social issues'"

    No amount of processing power could divine the reasoning behind Mayonnaise Kitchen:

    http://www.mayokichi.com/

    1. Fungus Bob

      It is a Zen Koan, grasshopper.

  3. energystar
    Headmaster

    Well, on the omnipresent lack of real, solid info...

    No doubt that a lot of Strategic Decisions get to be slightly touched by mass media mythology.

    1. Destroy All Monsters Silver badge

      Re: Well, on the omnipresent lack of real, solid info...

      Yes. This Machine of the Rising Sun will be called the "Armato"!

  4. energystar
    Linux

    "...and is due to go live in 2020."

    Wishing the best to the teams. On the pro side, the effort will bring the possibility of reaching never-before-touched niches.

  5. Wade Burchette

    This is why AMD and NVidia are making ARM chips

    Every river starts as a trickle. It will be a long, long time until ARM overtakes x86, because compatibility is important for people and businesses. But it can happen. If Intel loses the servers, they lose their high-markup cash cow. AMD's first-generation Opteron ARM CPUs aren't that great, but with them now using FinFET the next generation will probably be very respectable. NVidia should be able to crack the server ARM CPU market too.

    The only things static in the computer world are retired standards. The Wintel dominance can be broken because Intel dug their heels in too deep in the x86 game after the disaster that was Itanium. And because Microsoft kept thinking that buzzwords and know-nothing know-it-all analysts knew the future better than the actual people who use their products and who quite plainly tell Microsoft what they want.

    I don't know the future. But I do know ARM will continue to encroach on Intel's territory. Competition is always a good thing. What was once unthinkable is now very much possible. Vulkan can rival DirectX 12. If more games eschew DirectX 12 for Vulkan, it is not too much of a stretch to envision an ARM gaming machine running Linux. There are still a lot of things that must happen for the Wintel dominance to be broken. It is just that it is no longer unthinkable.

    1. phil dude
      Coat

      Re: This is why AMD and NVidia are making ARM chips

      If it runs LINPACK efficiently, we'll use it.

      There is very little legacy code in scientific computing, as we write it from scratch on libraries... e.g. solvers etc...

      What I (and many physicists) want is faster 3D FFTs....

      P.
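The 3D FFTs wished for above can be sketched in miniature with Python and numpy (an illustration only; production HPC codes use distributed FFT libraries rather than a single-node call like this):

```python
import numpy as np

# A small 3D grid standing in for a simulation volume.
rng = np.random.default_rng(0)
grid = rng.standard_normal((32, 32, 32))

# Forward 3D FFT over all three axes, then the inverse transform.
spectrum = np.fft.fftn(grid)
recovered = np.fft.ifftn(spectrum).real

# The roundtrip should recover the original grid to machine precision.
print(np.allclose(recovered, grid))
```

At scale, the all-to-all transposes of a distributed 3D FFT are what stress the interconnect, which is why FFT performance is so sensitive to the machine's network.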

      1. Anonymous Coward
        Anonymous Coward

        Legacy code in scientific computing

        There is very little legacy code in scientific computing, as we write it from scratch on libraries... e.g. solvers etc...

        There may be very little legacy code in your branch of scientific computing. However, this is not the case in many others. For example, your average quantum chemistry code contains many million LOC, some of those going all the way into the mists of time.

        However, this means little as far as the choice of CPU architecture goes: exactly because it survived for a long time, this code tends to be portable. Given a decent compiler, it will (and does) run on anything you want.

        Of course, if the architecture du jour does not have a decent optimizing compiler for the languages in which these codes are written, it will not be used - or it will be used at best as an offload accelerator.

        1. Hi Wreck

          Re: Legacy code in scientific computing

          Code generation for a RISC machine is easier too, since you don't have to worry about all of the funny non-orthogonal instructions.

          1. Roland6 Silver badge

            Re: Legacy code in scientific computing

            Code generation for a RISC machine is easier too, since you don't have to worry about all of the funny non-orthogonal instructions.

            This was one part of the RISC v. CISC argument I never really got. Having written an x86 code generator for a C compiler back in the mid-'80s, there weren't that many non-orthogonal instructions you had to worry about. However, if you were generating code for an ICL 1900, VAX etc., i.e. the previous generation of computers where you did stuff to squeeze the maximum performance out of the system and minimise wasteful use of memory, then the situation is a little different.

            1. GrumpenKraut
              Mushroom

              Re: Legacy code in scientific computing

              > ...there weren't that many non-orthogonal instructions you had to worry about.

              I've observed that quite a few people who complain loudly about the x86 instruction set cannot give a single pertinent example to support their claim.

              Certainly it will never win a beauty contest. But from the perspective of a user, the bang for the buck is good enough for me to stop complaining.

              The x86's SIMD extensions, however,... Oh. My. Fucking. God. -------------->

            2. Marcelo Rodrigues
              Pint

              Re: Legacy code in scientific computing

              "Code generation for a RISC machine is easier too, since you don't have to worry about all of the funny non-orthogonal instructions.

              This was one part of the RISC v. CISC argument I never really got..."

              Today's CISC CPUs work (loosely) by first fetching the instruction, then decoding it, then executing it. They are, roughly speaking, CISC outside and RISC inside. So, in order to use SSE3, your compiler must understand it and create code with this instruction set.

              RISC CPUs work the other way around. They fetch the RISC instructions - all very simple ones, in a long chain. Then they assemble (inside the CPU) these many instructions into bigger, more complex ones, in a shorter chain. Then they do the processing.

              So, when a new set of instructions is created for RISC (let's say an SSE4 for RISC), we don't have to change the code: the assembler inside the CPU will construct the instructions with the optimized code.

              Well, all this is an oversimplification. But you get the idea.

              1. Roland6 Silver badge

                Re: Legacy code in scientific computing

                @Marcelo Rodrigues

                Thanks for the thumbnail sketch; basically the difference between x86 and ARM is in the microarchitecture and microcode. However, it would seem x86 has become RISC on the inside while ARM has added CISC features...

                I suspect from your comment, you may already be aware of this article and the papers it refers to:

                "RISC vs CISC: What's the Difference?" http://www.eetimes.com/author.asp?doc_id=1327016

            3. David Halko
              Thumb Up

              Re: Legacy code in scientific computing

              Programming VAX Macro was a dream...

      2. Destroy All Monsters Silver badge

        Re: This is why AMD and NVidia are making ARM chips

        But aren't the libraries all legacy code, in effect?

        1. phil dude
          IT Angle

          Re: This is why AMD and NVidia are making ARM chips

          If you're lucky ($DISCIPLINE) the libraries will take the weight of whateveryouwanttodofast...

          That's why LINPACK is useful as it at least solves an important problem that is common across the sciences.

          The biggest limitation with *all* of these technologies is that the parallel infrastructure has huge latency built in.

          PCIe (v4?) and NVLink have enormous bandwidth, but latency in the 1-4 µs range just to get off the node.

          I could find a lot to do with a 1 EF machine, but for tightly coupled problems the limit is the densest component.

          P.
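For context, the problem LINPACK times is a dense solve of Ax = b. A toy version with numpy (which calls LAPACK underneath) looks like this; the benchmark does the same thing at vastly larger sizes with tuned BLAS:

```python
import numpy as np

# Build a small dense system with a known solution.
rng = np.random.default_rng(1)
n = 200
A = rng.standard_normal((n, n))
x_true = rng.standard_normal(n)
b = A @ x_true

# LU-factorise and solve, as HPL/LINPACK does at scale.
x = np.linalg.solve(A, b)

# The recovered solution should match the one we constructed.
print(np.allclose(x, x_true))
```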

    2. bazza Silver badge

      Re: This is why AMD and NVidia are making ARM chips

      Intel are a bit stuck with x86.

      Itanium should have been successful, but some bunch of upstarts made an alternative product that appealed strongly to Intel's lazy customers who didn't want to go through a painful cycle of recompiling / testing their software. That was AMD with AMD64, which Intel ended up copying (oh the irony). And if Intel try again to renounce x86/64, AMD will still be there, and will probably be able to do an incrementally improved x64 design, and will again pinch a load of custom.

      So Intel won't be able to ditch x86/64 until that architecture has been stretched to its absolute limit and cannot be improved upon any further by anyone.

      The thing is that the ARM core, at only 48,000-ish transistors (ignoring the caches, etc.), isn't actually very performant. It's simply a very efficient way of marshalling the operation of a bunch of specialised coprocessors (targeted at media encoding/decoding, GPU, network I/O, etc.) that do the actual heavy compute, whilst the core still has enough general-purpose compute to make an OS run well. I don't for one moment imagine that Fujitsu are contemplating using ARM cores for the actual compute effort; they'll be slapping some monster maths core down on the same piece of silicon.

      BTW, MS are in better shape to adapt than a lot of people give them credit for. They, just like Linux, can shift the entire Windows stack from Intel to, say, ARM; Windows has a hardware abstraction layer and can relatively easily be ported. They have in the past shown off Windows 7 running Office on an ARM board, which involved little more than a fresh HAL followed by recompiling everything.

      Personally I think that their mobile ambitions were an unnecessary (and, as it turns out, unsuccessful) distraction from what should be their real task: building up a Windows/ARM server/desktop stack. The biggest barrier to that is the fragmented nature of the ARM market, making it hard to push out universal binaries. MS have previously been involved in defining hardware architectures (anyone remember PC'97, PC2000, etc.?). With MS themselves holding an ARM foundry licence they could have done the same thing in the ARM server world, removed the fragmentation and made themselves immune to the changes going on in the hardware world.

      Had they done that it would have done us all a favour. Even Linux would have benefited from this - ARM fragmentation has made it hard for the Linux guys to support all ARMs too.

      1. your handle is already taken

        Re: This is why AMD and NVidia are making ARM chips

        Windows/Intel -> Wintel

        Windows/ARM -> Warm (or Winarm?)

        Warm, Winarm. Doesn't have the same ring to it as Wintel so I say let's resist the change with all our might.

        1. David Halko
          Happy

          Re: This is why AMD and NVidia are making ARM chips

          Windows / ARM = WiRM

          Rhymes with Squirm....

      2. Steve Todd

        Re: This is why AMD and NVidia are making ARM chips

        @bazza - you are joking about Itanium, right? It was always going to be a technical disaster. It relied on static compile-time organisation of the code rather than dynamic run-time out-of-order execution, so it couldn't adapt to changes. It emulated the x86 (badly), and took a bunch of silicon to do that. It was very late, and never got the critical mass required to make a modern CPU profitable. All in all, Intel had holes in their heads when they came up with the idea.

      3. oldcoder

        Re: This is why AMD and NVidia are making ARM chips

        No, Microsoft can't shift to ARM. They tried that - just look at the abortion of Windows RT.

        Too many of Microsoft's products have built-in dependencies on the x86 architecture, too many code fragments buried in files that won't port or can't be ported...

        1. Andy Nugent

          Re: This is why AMD and NVidia are making ARM chips

          "Too many of Microsoft products have builtin dependencies on X86 architecture" - is that the case? I thought Win RT failed in part due to the fact that people expected to be able to run the Windows software that they've always used, that was x86 and that Microsoft had no control over.

        2. GrumpenKraut

          Re: This is why AMD and NVidia are making ARM chips

          > ...too many code fragments buried in files that won't port or can't be ported...

          No doubt about it. Remember that "long" is 32 bits under Windows on 64-bit systems; you have one guess why that is.
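The "long" point can be checked from Python via ctypes: on 64-bit Windows (the LLP64 data model) c_long is 4 bytes, while on 64-bit Linux or macOS (LP64) it is 8:

```python
import ctypes
import platform

# sizeof(long) depends on the platform's data model, not the CPU width:
# LLP64 (64-bit Windows) keeps long at 4 bytes; LP64 (Linux/macOS) uses 8.
print(platform.system(), ctypes.sizeof(ctypes.c_long))

# Pointers, by contrast, match the architecture width under both models.
print(ctypes.sizeof(ctypes.c_void_p))
```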

      4. Stoneshop

        Re: This is why AMD and NVidia are making ARM chips

        Itanium should have been successful,

        You're quite funny, you know.

      5. kryptylomese

        Re: This is why AMD and NVidia are making ARM chips

        "BTW, MS are in better shape to adapt than a lot of people give them credit for. They, just like Linux, can shift the entire Windows stack from Intel to, say, ARM; Windows has a hardware abstraction layer and can relatively easily be ported. They have in the past shown off Windows 7 running Office on an ARM board, which involved little more than a fresh HAL followed by recompiling everything."

        LOL - just about all Windows software is x86(x64) compiled and closed source so good luck with converting that!

        1. Anonymous Coward
          Anonymous Coward

          Re: This is why AMD and NVidia are making ARM chips

          "LOL - just about all Windows software is x86(x64) compiled and closed source so good luck with converting that!"

          Microsoft already did it on the Xbox One - that can run Xbox 360 software without recompiling at close to native performance...

    3. jzl

      Re: This is why AMD and NVidia are making ARM chips

      An awful lot of business code is written on .NET and Java, and so is bytecode-portable. For the rest, there are emulation and recompilation technologies, or software teams can change the target platform and rebuild, in many cases without a crazy amount of extra work.

      The real dependencies for many businesses are at the OS level. When Microsoft releases a fully functional Windows running on Arm64, that's when Intel will start to panic. Particularly if it includes something similar to Apple's old Rosetta.

    4. oldcoder

      Re: This is why AMD and NVidia are making ARM chips

      It wasn't Intel that dug in. Intel designed the Itanium. Had MS followed through with support (I believe MS quit after three years), the Itanium would be using the foundation of the RISC core currently used in Intel's fastest x86 line.

      And be twice as fast.

      1. Mage Silver badge

        Re: This is why AMD and NVidia are making ARM chips

        I thought HP had a big input to Itanium?

        It was doomed from the start. Essentially a server only, high power consumption, flawed design concept.

      2. GrumpenKraut

        Re: This is why AMD and NVidia are making ARM chips

        Have you ever used an Itanium? I have: not good enough. Bang for the price asked: quite bad. It's dead for a reason.

    5. Mikel

      Re: This is why AMD and NVidia are making ARM chips

      There were two companies working on ARM servers a few years ago. HP and AMD were designated to take them out behind the barn and do the unpleasant needful.

      I said at the time that it wouldn't work. Progress can be delayed but not prevented.

    6. Charlie Clark Silver badge

      Re: This is why AMD and NVidia are making ARM chips

      I don't know the future. But I do know ARM will continue to encroach on Intel's territory.

      While I agree with this in general, I don't think this announcement has much to do with that. It's more like a nail in the coffin of SPARC. Of course, depending on how well the build goes, we may well see other HPC setups trying ARM out. But then again, cost per core, where ARM has an undoubted advantage, is much less relevant here than in the data centre.

      1. chasil

        Re: This is why AMD and NVidia are making ARM chips

        The 64-bit ARM instruction set is relatively new, and it dispensed with a number of problems from the 32-bit set.

        From a programmer's perspective, 32-bit ARM was quite good compared to MIPS and SPARC. For a taste of using those architectures from the perspective of machine language, read this post:

        http://blog.jwhitham.org/2016/02/risc-instruction-sets-i-have-known-and.html

        A few highlights:

        MIPS... You can read from a register before that register is ready.

        SPARC also has a crazy feature all of its own, the "rotating register file", which makes code incredibly hard to understand.

        Both SPARC and MIPS share another horrid feature - delayed branches.

        ...on PowerPC, r0 has special properties. Usually, r0 means general-purpose register (GPR) 0. But for some instructions it means a literal zero.

        ARM-32-bit: Design errors, like having r15 as the program counter or making every instruction conditional, are problems for CPU architects rather than programmers, and it's no surprise that they disappeared in the 64-bit version of the ARM architecture. They must have made it awfully hard to implement superscalar, out-of-order execution.

        Fujitsu likely sees 64-bit ARM as an opportunity to retire a steaming pile of SPARC cruft.

        1. Destroy All Monsters Silver badge
          Holmes

          Re: This is why AMD and NVidia are making ARM chips

          Design errors [of ARM 32], like ... making every instruction conditional

          I am totally not au fait with CPU instruction sets as I have abandoned that particular specialization after writing floating point arithmetic operations for NS32032 at uni, but are these instructions for "predicated execution" as used in IA-64 "Merced" and the Zuse Z-3, meant to reduce (or eliminate) branches?

          See: Konrad Zuse deserves even more credit

          The IEEE Computer article referenced in the above is actually 'Challenges and Trends in Processor Design', Janet Wilson, IEEE Computer Magazine, January 1998, with the item "Introduction to Predicated Execution" by Wen-mei Hwu, University of Illinois, Urbana-Champaign, where we read:

          The story of Merced, Intel's first processor based on its next-generation 64-bit architecture, will continue to unfold in 1998; Intel expects this product of its collaboration with Hewlett-Packard to reach volume production in 1999. To date, however, the two companies have released few details about Intel Architecture 64 (IA-64). One significant change they did admit to at the October 1997 Microprocessor Forum was the switch to full predicated execution, a technique that no other commercial general-purpose processor employs.

          [IEEE Computer] wanted to give its readers advance notice of this promising technique. We invited Wen-mei Hwu, a prominent researcher in this area, to explain predication, a topic you may be hearing more about in 1998. -- Janet Wilson

          Predicated execution is a mechanism that supports the conditional execution of individual operations. Compared to a conventional instruction set, an operation in a predicated-execution architecture has an additional input operand - a predicate - that can assume a value of true or false. During runtime, a predicated-execution processor fetches operations regardless of their predicate value. The processor executes operations with true predicates normally; it nullifies operations with false predicates and prevents them from modifying the processor state. Using predication inherently changes the representation of a program's control flow. A conventional instruction set requires all control flow to be explicitly represented in the form of branches, the only mechanism available to conditionally execute operations. An instruction set with predicated execution, however, can support conditional execution via either conventional branches or predicated operations.

          ....

          Providing compiler support for predicated execution is challenging. Current optimizing compilers rely on control flow representation as the foundation of analysis and optimization. Because predicated code changes the control flow representation, effectively handling it requires an extensive modification of the compiler infrastructure, particularly in the areas of classical and ILP optimizations, code scheduling, and register allocation. An effective compiler must balance the control flow and the use of predication. If resources become oversubscribed or dependence heights (the lengths of the chains of dependent operations) become unbalanced among paths, predicated execution can degrade performance.

          Predicated execution started as a software approach to avoiding conditional branches in early supercomputers. Vector architectures such as the Cray 1 and array-processing architectures such as Illiac IV adopted predication in the form of mask registers to allow effective vectorization of loops with conditional branches. During the era of mini-supercomputers, the Cydrome Cydra 5 became the first machine to support generalized predication. Parallel to the Cydra 5, the Multiflow Trace machine adopted partial predication by introducing a single instruction with a predicate input, a select instruction. Contemporary processors, such as the DEC Alpha and the Sparc V9, have adopted the partial-predication approach so they can maintain a 32-bit instruction encoding.
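The nullification behaviour described above has a convenient everyday analogue in numpy, where a boolean mask plays the role of the mask/predicate registers mentioned for the Cray 1 (an analogy only, not real predicated hardware):

```python
import numpy as np

x = np.array([3.0, -1.0, 4.0, -2.0])

# Predicate: one boolean per element, like a mask register.
pred = x > 0

# Both operand expressions are evaluated everywhere; np.where then
# "nullifies" results where the predicate is false, with no branches.
result = np.where(pred, np.sqrt(np.abs(x)), 0.0)

print(result)
```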

          1. Mike 16

            Re: This is why AMD and NVidia are making ARM chips

            -- are these instructions for "predicated execution" --

            Yes

            The issue some (otherwise) ARM fans had with them was the bits dedicated to them in a fixed-width instruction were thus not available for other things (like, larger displacements for addressing). Also as noted, some (most?) compilers were pretty bad at compiling for machines with predicated execution.

            I also recall when Fujitsu was a fan (and licensee) of the IBM 360 and successors. While purists may quibble, that series had many hallmarks of what would today be considered RISC. SPARC is often considered one of the canonical RISC machines (though I preferred MIPS and Alpha), so it's not like this is some wild departure for Fujitsu.

        2. GrumpenKraut

          Re: This is why AMD and NVidia are making ARM chips

          > http://blog.jwhitham.org/2016/02/risc-instruction-sets-i-have-known-and.html

          That's a worthwhile read, thanks!

  6. J. R. Hartley

    Acorn

    Chris Curry must be proud. It's a real shame Acorn aren't around any more.

  7. John Smith 19 Gold badge
    Go

    So it's the ARMv8 instruction set that Fujitsu are using.

    Which they reckon is pretty good.

    Because it is?

  8. EvadingGrid

    Wintel penal colony

    Innovation is encouraged in ARM world, unlike Wintel Penal Colony.

  9. Groaning Ninny

    Tofu?

    Love the idea of a tofu interconnect

  10. kmac499

    Hmm ARM cores.. What's the Japanese for Raspberry Pi....

  11. GraemeKilshaw

    Fujitsu should consider investing in the Friendship Cube Group fibre optic cables, the wireless superswitches I engineered, and the FCG computer I designed and manufactured.

    1. detritus

      What is this bilge - are you a nutter or a troll?

      Or both?

      1. Anonymous Coward
        Anonymous Coward

        I think it's a lame attempt at being as random as AManFromMars

  12. MT Field
    Trollface

    The trouble with Intel chips

    They start out all shiny and fast, but after a while they start to slow a bit. Then they get all bogged down and before you know it they are virtually useless and you need to go and buy a fresh box of them.

    1. Manu T

      Re: The trouble with Intel chips

      ".. but after a while they start to slow a bit..."

      No they don't! It's just that the world goes faster over time, and then your computer "appears" to slow down. I'm sure that my home Intel Q6600 PC is just as fast as when I assembled it years ago. But back then there was no UHD, 4K YouTube, 3D TVs or VR. There weren't even electrical cars on the roads back then.

      But now...

      My world has changed. Now, I need a faster home PC. A faster Intel and if they can't deliver then I'll choose something else. Just like anyone else who sees computers as a tool.

      And before you ask: the last time I was "brand loyal" was in the '80s, but after Acorn's demise I couldn't care less. Yes, I really wanted the Phoebe 2100 and was really disappointed when it got axed.

      1. Mage Silver badge

        Re: The trouble with Intel chips

        "There weren't even electrical cars on the roads back then."

        Really?

        There were Victorian Electric cars.

        Electric vehicles just weren't very pretty, or good on power-to-weight, until lead-acid was replaced with lithium-ion.

  13. Mahhn

    I'd like to borrow the system for an hour, to crunch out a few coins :)

    1. Anonymous Coward
      Anonymous Coward

      ARM super

      Hmmm... it's probably still too big to fit under one's desk. Despite being "energy efficient".

  14. truemore

    Human level computing?

    Just a quick note. If they 100x their current computing power, that would mean 1 exaflop, which is at the top end of estimates for the computing power of the human mind (the estimates range from 38 petaflops to 1 exaflop). Is it just me, or is that a bit scary? I say that because, if computing power is growing at that rate, how long before we have desk-sized computers with more computing power than the user - 2030? While I understand we definitely don't have human-level software yet, it is still a rather chilling realization that we have created a device with at least as much raw computing power as the best human.

    1. Anonymous Coward
      Anonymous Coward

      Re: Human level computing?

      Skynet?

    2. Destroy All Monsters Silver badge

      Re: Human level computing?

      > it is still a rather chilling realization that we have created a device with at least as much raw computing power as the best human.

      Not really. We have motors that develop more power, materials that are stronger, vehicles that are faster, optical instruments that can see further ...

      It's just another step in INGENUITY!

  15. Jim84

    Can't believe The Reg didn't make a Crysis joke.

  16. earl grey
    Trollface

    scientific and social issues

    So they can come up with an answer to lying politicians?

    Have at it and more power to you.

  17. Anonymous Coward
    Anonymous Coward

    Fujitsu SPARC64 very much alive and continuing

    http://www.fujitsu.com/global/products/computing/servers/unix/sparc/20160623.html

  18. SeanC4S

    I really don't like the ARM 64-bit instruction set. There are a ton of pointless SIMD instructions that no compiler (or human) is ever going to use. They just waste area on the chip and make the number of bytes for each instruction far larger than needed. You could probably fit 10 cores minus the SIMD part in the area of one core plus the SIMD part. That would seem a better deal to me. Unfortunately, the way ARM have set the two things up, you can't separate them.

  19. EvadingGrid

    SIMD instructions

    When they kick in, they really make a big difference - that's how tablets and phones can do what they do so well.
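That "kick in" effect is visible even from Python: numpy's vectorised operations run SIMD-friendly compiled loops, while a plain Python loop does one scalar multiply-add at a time (timings are illustrative and machine-dependent):

```python
import time

import numpy as np

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# Scalar loop: one multiply-add per iteration.
t0 = time.perf_counter()
s = 0.0
for i in range(len(a)):
    s += a[i] * b[i]
t_loop = time.perf_counter() - t0

# Vectorised dot product: the compiled kernel can use SIMD.
t0 = time.perf_counter()
s_vec = a @ b
t_vec = time.perf_counter() - t0

print(f"loop: {t_loop:.3f}s  vectorised: {t_vec:.4f}s")
print(np.isclose(s, s_vec))  # same answer, different accumulation order
```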
