Here's what an Intel Broadwell Xeon with a built-in FPGA looks like

At the OCP Summit last week in San Jose, California, Intel quickly mentioned it will later this year ship Xeon processors with built-in FPGAs. Chipzilla will also release open-source software libraries allowing people to program these customizable gate arrays to take workloads off the CPUs and perform them in hardware. Intel …

  1. Justin Clift

    This is actually pretty cool

    Personally, I've been interested in doing some FPGA stuff combined with InfiniBand for very fast data processing/extraction. Was thinking perhaps Spartan FPGAs could be used to make it happen. This Broadwell with FPGA built in could make it much more achievable instead, depending on its specs/capabilities of course.

  2. Sgt_Oddball

    timing seems interesting....

    How many of these FPGAs can run on Windows boxes?

    If the answer's next to none, or none, then I wonder if that's part of the move to Linux that MS is looking for with MSSQL on Linux.

    I'd be fascinated to see how dedicated silicon and the recent strides in memory efficiency with MSSQL will play out on the pure speed side of things.

    1. the spectacularly refined chap

      Re: timing seems interesting....

      How many of these FPGAs can run on Windows boxes?

      If the answer's next to none, or none, then I wonder if that's part of the move to Linux that MS is looking for with MSSQL on Linux.

      Yup. None. I can tell you that without even looking at it. Coincidentally that's the same number Linux supports. Something like this is going to be outside the reach of general application code since it is essentially system-wide - you just wouldn't reprogram an FPGA at each context switch - so it's going to have to have explicit OS support as gatekeeper. I don't see adding that as some huge showstopper (essentially it's just another device to be managed) so speculating that it will be some great issue for Windows while Linux magically supports it from the get-go would be wide of the mark. Yes, I'm aware of existing systems with FPGA integration, but this is on chip and the details of interfacing are inevitably going to differ, so any support you have doesn't carry over unmodified.

      On the other hand I'd like to see how it works in a virtualised environment. My guess is simply that it doesn't and it won't be supported regardless of either host or guest OS.

      1. This post has been deleted by its author

      2. DropBear
        Devil

        Re: timing seems interesting....

        "you just wouldn't reprogram an FPGA at each context switch"

        Awww... I just came to say that hard disk swap thrashing will end up being remembered fondly once the SSL lib and the x264 encoder lib start fighting over the FPGA many, many times a second...

        1. Alan Brown Silver badge

          Re: timing seems interesting....

          "once the SSL lib and the x264 encoder lib start fighting over the FPGA"

          In a DP system you'd have 2 FPGAs to fight over.

          In short order, you'll see multi-FPGA computing to rival multi-GPU.

      3. Ken Hagan Gold badge

        Re: timing seems interesting....

        "On the other hand I'd like to see how it works in a virtualised environment. My guess is simply that it doesn't and it won't be supported regardless of either host or guest OS."

        It won't, to begin with. However, unlike GPUs (the nearest comparison) the on-chip FPGA is arriving after virtualisation has become mainstream and is clearly targeting server acceleration just as much as workstations. I expect once Linux and Windows settle on their respective APIs, we will find that those APIs have been designed with virtualisation in mind and the various virtualisation providers now support them.

        Since this *is* the direction of Intel's designs, it would be pretty suicidal for anyone not to follow this roadmap.

        1. Bronek Kozicki

          Re: timing seems interesting....

          Well, this is obviously aimed at machines which are dedicated to running a single application. Something where one would normally use core isolation to set cores aside for the application. If this FPGA has the same access to cache synchronization as the cores, I can imagine several uses in multithreaded programming, where synchronization cost would drop to a small number of nanoseconds for operations which are currently too difficult to implement lock-free.

      4. Anonymous Coward

        Re: timing seems interesting....

        > you just wouldn't reprogram an FPGA at each context switch

        Typically with devices like this, an application requests exclusive use of the device. So one application at a time and no need for context switches. If you then wanted to share the configuration, a driver application would manage that.

        In the same way a printer serves one application at a time (well, think back to when they were plugged into the PC rather than the network).

        As for Windows, it'll probably run on these devices without modification; you'll just need a driver to access the additional hardware, which is typically available for Windows first for IP protection reasons rather than technical ones. This will be too interesting for the HPC crowd for there not to be Linux drivers though, and of course Intel are actually quite committed to developing Linux drivers for their hardware.

      5. Nigel 11

        Re: timing seems interesting....

        you just wouldn't reprogram an FPGA at each context switch

        I was wondering about that. Conventionally, reprogramming FPGAs is rather slow because they are using some sort of non-volatile PROM to store the programming in the absence of power. But I cannot think of a reason it has to be that way. If the programming is RAM-fast, it becomes a matter of how many kilobytes of programming state have to be saved and how often. Every context switch would be excessive, but maybe rescheduling of the FPGA resource ten or even a hundred times per second might be OK. Most processes will not be using the FPGA so scheduling them won't involve saving the FPGA state - the OS will just not let them access it. The ones that do request the FPGA will be in FPGA-specific scheduler queues.

        Some work to be done on schedulers and resource queues but I cannot see any fundamental difficulty, either for Linux or for Windows.
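To put rough numbers on the rescheduling argument above, here is a back-of-envelope sketch. Every figure in it is an assumption chosen for illustration (the size of the configuration state, the speed of the configuration path, and the idea that each switch costs one save plus one restore); none of them come from Intel. The function name `reconfig_overhead` is likewise made up for this example.

```python
def reconfig_overhead(state_bytes, bytes_per_second, switches_per_second,
                      transfers_per_switch=2):
    """Fraction of wall-clock time spent moving FPGA configuration state.

    transfers_per_switch=2 assumes one save of the outgoing state plus
    one load of the incoming state for every switch.
    """
    seconds_per_switch = transfers_per_switch * state_bytes / bytes_per_second
    # The fabric cannot spend more than 100% of its time reconfiguring.
    return min(1.0, seconds_per_switch * switches_per_second)


# Illustrative assumptions only, not published specifications.
STATE_BYTES = 512 * 1024   # assumed size of the configuration state
BANDWIDTH = 1e9            # assumed 1 GB/s configuration path

for rate in (10, 100, 1000):
    share = reconfig_overhead(STATE_BYTES, BANDWIDTH, rate)
    print(f"{rate:>5} switches/s -> {share:.1%} of time spent reconfiguring")
```

Under these assumed figures, ten switches a second costs about 1% of the fabric's time and a hundred costs about 10%, which is at least consistent with the guess that tens of reschedules per second might be tolerable while one per context switch would not be.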

    2. Ken Hagan Gold badge

      Re: timing seems interesting....

      "How many of these FPGAs can run on Windows boxes?"

      With a suitable driver, all of them.

      It's no different from putting a new GPU on the chip, which Intel appear to do with each iteration. Someone has to write the bare metal Kaby Lake support, but once it is done the applications can just use DirectCompute or whatever you prefer.

      There will be a DirectFPGA. MS will write the driver, based on early information from Intel who have no reason not to tell them. You will write the applications that use it.

      1. picturethis
        Thumb Down

        Re: timing seems interesting....

        "There will be a DirectFPGA. MS will write the driver, based on early information from Intel who have no reason not to tell them. You will write the applications that use it."

        Care to make a guess at which versions of Windows will be supported? (It ain't gonna be anything but version 10+)..

        1. Ken Hagan Gold badge

          Re: timing seems interesting....

          Well, duh! But if you aren't switching to Win10 in the next year or two then you are presumably switching away from Windows altogether, because the older versions will go the way of XP in a few years anyway. (8.0 is already dead. 7 and 8.1 are on life support and the ongoing saga of Windows Updates makes it perfectly clear to me that MS would switch them off tomorrow if they thought they could do it without being sued.)

          1. Bronek Kozicki

            Re: timing seems interesting....

            This is only meant for HPC, and who would bother with Windows for serious HPC in the first place? I'm ready to bet that Intel will release a GPL driver for Linux for this FPGA (just like they did with NVMe) and it is not going to lack in features or robustness compared to the Windows one (assuming they bother with Windows at all).

  3. I sound like Peter Griffin!!
    Meh

    I once had a Sony C1-VFK with a Transmeta 'Crusoe' FPGA-type processor - ran Win2K ok at a time when netbooks were all the rage; I miss the little thing (stolen)..

    1. david1024

      Miss using mine too

      I miss my old Crusoe PictureBook too. I used it while at the uni.... it was fun running benchmarks and searches in my Algorithms class... it'd hiccup a little... make a custom instruction... and finish first every time. Usually in 4 or fewer cycles. I had to build a Celeron system out of a spare SBC floating around the research dept so that my homework would have different runtimes. :) Oh the bad old days... BTW, it wouldn't run MATLAB 6! Not x86-compatible enough.

  4. Brian Miller

    Wanna play? Get it with ARM today.

    This has been available on ARM chips for some time. The Parallella board uses the Zynq SoC, which is dual-core plus FPGA. I remember years back that someone came up with an FPGA for Opteron socket boards.

    Something like this is for specialized applications. Yes, Windows could use this, but it's not like it's a general-purpose thing. You load your FPGA binary, and fire up the application that uses it.

    1. Ammendiable to persuasion..

      Re: Wanna play? Get it with ARM today.

      Not only does the Parallella board come with an FPGA but it comes with a parallel fabric array of 16 or 64 processors which can in theory be scaled up to neighbors.

    2. Grikath

      Re: Wanna play? Get it with ARM today.

      Specialised...definitely..

      Very yummie though..

  5. bazza Silver badge

    Hmm, developing for FPGAs is pretty hard, and especially so if programming and starting the FPGA part means a power cycle of the whole computer. Any word on whether they've made that easy?

    And unless every Intel chip from now on comes with one of these then it is going to remain very niche indeed. No mass market hardware sales, no mass market dev effort. And I can't see some killer application suddenly materialising out of thin air...

    One thing I don't understand is, why? Everyone else from ARM to Oracle is busily doing specialised accelerators for functions relevant to the target market. An FPGA is the ultimate do-anything DIY accelerator but they normally don't clock that fast; they're not as good as dedicated silicon. So unless Intel has improved the clock rate, it'll be not as good at doing the same things that everyone else is laying out gates for.

    Nice experiment though, not so far from the FPGAs-in-an-AMD-socket that were doing the rounds a few years ago.

    1. Anonymous Coward

      "[...] if programming and starting the FPGA part means a power cycle of the whole computer."

      Why should that be necessary? The whole point of "soft" FPGAs is the fact that you can load - and reload - them with a bit stream from any source. A power-on ROM device is a limited way to use an FPGA's versatility.

      Back in the 1980s my PC ISA prototype cards were populated with early Xilinx FPGAs. The cards interfaced as a peripheral address for both loading and passing data. The DOS program sent the download as a bit stream. The application reprogrammed the FPGAs' functions as needed - by generating its own bit maps on the fly in hundreds of permutations.

      FPGAs nowadays apparently allow you to do partial reloads of a running chip.

      It's a surprise that it has taken 30 years for the Holy Grail of FPGA program assists to achieve their potential in this way.

      I really should donate my prototype card to a museum. Clocking at 100ns it was about the limit for hand wire-wrapping. Worked first time too.

    2. Paul Crawford Silver badge

      Re: One thing I don't understand is, why?

      As someone else pointed out, for things like software-defined radio where you need lots of small integer-like operations performed essentially in parallel to process the signal as it is shifted in frequency and sample-rate. Those steps can be implemented in dedicated chips, but there are only few of them off the shelf and often not quite what you wanted. So being able to push the "simple but massively parallel" tasks to FPGA and keep the "complex but slow" stuff on the CPU makes sense.

      Except that programming tools for FPGAs suck donkey balls big-time. Really, you think that developing for C is a pain, just try VHDL with tools that lack any sort of usable context-sensitive help for the vast number of uber-pedantic problems you will encounter. And weep....
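The "shifted in frequency and sample-rate" steps mentioned above can be sketched in a few lines. This is a plain floating-point illustration of the arithmetic only, not an FPGA design: on real fabric the per-sample multiplies and sums would typically be fixed-point and run in parallel. The function names `mix_down` and `decimate`, the 48 kHz sample rate, and the 12 kHz tone are all assumptions made up for the example.

```python
import cmath
import math


def mix_down(samples, shift_hz, sample_rate_hz):
    """Shift the spectrum down by shift_hz using a complex local oscillator."""
    return [s * cmath.exp(-2j * math.pi * shift_hz * n / sample_rate_hz)
            for n, s in enumerate(samples)]


def decimate(samples, factor):
    """Average each block of `factor` samples (a crude low-pass filter)
    and emit one output per block, cutting the sample rate by `factor`."""
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, len(samples) - factor + 1, factor)]


fs = 48_000        # assumed input sample rate in Hz
tone_hz = 12_000   # assumed carrier we want to bring down to 0 Hz

# A pure complex tone at tone_hz, 480 samples long.
signal = [cmath.exp(2j * math.pi * tone_hz * n / fs) for n in range(480)]

# After mixing, the tone sits at 0 Hz; decimating by 8 leaves 60 samples.
baseband = decimate(mix_down(signal, tone_hz, fs), 8)
print(len(baseband), abs(baseband[0]))
```

Each output sample depends only on a small, fixed window of inputs, which is why this kind of work maps well onto many simple units running side by side while the CPU handles the protocol logic.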

      1. Ken Hagan Gold badge

        Re: One thing I don't understand is, why?

        "Except that programming tools for FPGAs suck donkey balls big-time."

        That's partly because embedded hardware designers have no clue whatsoever about programming languages. (Half of them are still pushing home-grown C compilers, FFS.) I expect that will change once Linux and Windows define an API to let anyone write new tools.

        Yes, there *will* be Visual Basic for FPGAs, but there may also be rather nice innovations.

        1. Paul Crawford Silver badge

          Re: One thing I don't understand is, why?

          "That's partly because embedded hardware designers have no clue whatsoever about programming languages."

          I don't think so. It seems to be down to (usually) having only one choice of tool, that blessed by the FPGA supplier, and they have little incentive to do any better. I really hope you are right and programmable hardware accelerators become popular enough to have multiple vendors competing to supply the tools, but I doubt it will come soon.

        2. 8Ace

          Someone doesn't understand programmable logic at all !

          "That's partly because software developers have no clue whatsoever about hardware."

          FTFY

          FPGAs are not "programmed", there is no "software". VHDL (like Verilog) is a hardware description language; what's loaded into the FPGA is not a set of instructions to execute, it's a description of how the hardware is to be implemented - internal "connections", clock domains, hardware behaviour and timing.

          Apart from VHDL process statements, the closest analogy to software would be to think of code where in theory all lines of code have the potential to be executed concurrently with no sequential flow or order whatsoever.
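The "all lines at once" analogy can be imitated in software with a two-phase update: every rule reads the old state, and all results are committed together at a simulated clock edge. This is only a toy model of the idea, not how a real HDL simulator or synthesis tool works, and the names `clock_tick` and `rules` are invented for the example.

```python
def clock_tick(state, rules):
    """Evaluate every rule against the OLD state, then commit all at once."""
    next_values = {name: rule(state) for name, rule in rules.items()}
    return {**state, **next_values}


# Three "concurrent" assignments; their order in this dict does not matter.
rules = {
    "a": lambda s: s["b"],            # a takes the old value of b
    "b": lambda s: s["a"],            # b takes the old value of a
    "count": lambda s: s["count"] + 1,
}

state = {"a": 1, "b": 2, "count": 0}
state = clock_tick(state, rules)
print(state)  # {'a': 2, 'b': 1, 'count': 1}

# The same two assignments executed sequentially do NOT swap the values.
a, b = 1, 2
a = b
b = a
print(a, b)  # 2 2
```

The concurrent version swaps `a` and `b` because neither rule sees the other's result until the tick completes; the sequential version loses the original value of `a` on the first line.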

          1. Ken Hagan Gold badge

            Re: Someone doesn't understand programmable logic at all !

            "... the closest analogy to software would be to think of code where in theory all lines of code have the potential to be executed concurrently with no sequential flow or order whatsoever."

            As it happens, I'm fine with that. Like many software developers, I *started* from a mentality where a=a+1 is Just Plain Wrong and I've spent the last decade or two looking for easy ways to make my code less sequential.

          2. This post has been deleted by its author

          3. Anonymous Coward

            Re: Someone doesn't understand programmable logic at all !

            "FPGAs are not "programmed", there is no "software"."

            A "program" defines a set of states and their time relationships - whether the subsequent behaviour appears concurrent or serial.

            The FPGA is "soft" in the same way that early computers were "soft" compared to relays and diodes. In both "soft" cases logic is programmed by settings that are not hard-wired.

            That was the breakthrough that Xilinx made in the 1980s. Many people just saw it as a gate array that could be re-used and reprogrammed in the field - hence "Field Programmable Gate Array". It took some imagination to realise that the configuration was soft enough to be changed in flight. The classic early example was to reload the configuration between send and receive modes.

  6. JeffyPoooh
    Pint

    Good for Software* Defined Radios

    (* with a bit of FPGA Firmware on the side)

    A Software Defined Radio (SDR) is where you promise everyone that "this radio will be upgradeable to future waveforms", then discover only 24 months later that your Company standard 50% reserve of CPU cycles and memory is insufficient for even the very next generation of waveform. Now, with an on-chip FPGA, you'll be able to add "not enough gates" to your list of reasons why the expensive "but futureproof" SDR is obsolete already. It's good to have excuses.

  7. Anonymous Coward
    Boffin

    Useful for automatic trading?

    Banksters could put their high-frequency trading algorithms in the FPGA to exploit 'dark pools' where a nanosecond advantage in exploiting pricing arbitrage means $M in bonuses.

    1. oliversalmon
      Go

      Re: Useful for automatic trading?

      They've been doing it for years in HFT and Stat Arb. Plenty of examples of FIX gateways built on FPGA chips.

  8. To Mars in Man Bras!
    Facepalm

    = Field Programmable Gate Array

    Don't mention it!

    [Oh. You didn't]

  9. Matt Bucknall

    OpenCL

    I think it is pretty likely that the development tools will involve OpenCL. Altera already provides an OpenCL SDK for its FPGAs.

    I don't think time-slicing an FPGA is a particularly efficient way of sharing its capabilities across processes - Way too much state to save/restore during context switches. Given that FPGA fabric tends to be highly homogeneous, I don't think it is far-fetched to imagine a scheme where processes could request blocks of gates from the OS, not dissimilar to the way they request RAM (i.e. sbrk/mmap) albeit through some sort of driver interface.
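As a thought experiment only, the "request blocks of gates like RAM" scheme could look something like the toy allocator below. It treats the fabric as a row of identical blocks and hands out contiguous runs on a first-fit basis. The class `FabricAllocator`, its methods, and the eight-block fabric are all hypothetical; no real driver interface is being described, and a real scheme would also have to deal with routing, I/O placement and timing.

```python
class FabricAllocator:
    """Toy first-fit allocator over a row of identical, hypothetical fabric blocks."""

    def __init__(self, total_blocks):
        self.owner = [None] * total_blocks  # None marks a free block

    def request(self, pid, count):
        """Claim `count` contiguous free blocks for `pid`.

        Returns the index of the first block, or None if no run is big enough.
        """
        if count <= 0:
            raise ValueError("count must be positive")
        run = 0
        for i, who in enumerate(self.owner):
            run = run + 1 if who is None else 0
            if run == count:
                start = i - count + 1
                self.owner[start:i + 1] = [pid] * count
                return start
        return None

    def release(self, pid):
        """Free every block currently held by `pid`."""
        self.owner = [None if who == pid else who for who in self.owner]


fabric = FabricAllocator(8)
print(fabric.request("ssl", 3))     # 0
print(fabric.request("x264", 4))    # 3
print(fabric.request("other", 2))   # None: only one block is left
fabric.release("ssl")
print(fabric.request("other", 2))   # 0: reuses the space freed by "ssl"
```

Even this toy shows the familiar memory-management problem coming along for the ride: after a few requests and releases the free blocks fragment, and unlike RAM pages, fabric regions cannot simply be remapped behind the process's back.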

  10. Anonymous Coward

    This would be great for Encryption

    Just think, you could program this with a very complex encryption algorithm and make it wipe the data after more than 10 tries at the password...... Humm someone is knocking at my front door......

  11. fpx
    Boffin

    Nothing Revolutionary

    FPGAs with embedded CPUs (hard-core or soft-core) have been available for many years. Having a CPU with an embedded FPGA is little different in principle. Certainly a Xeon is a much mightier beast than what you commonly get embedded in an FPGA.

  12. david1024

    Haven't we done this before?

    This is code-morphing the 'hard' way... (see what I did there?)

    Unlike the VLIW auto-code morph... now we can really have a go!

  13. phil 27

    Excellent, look forward to this being generally available for tinkeration.

    Looking at an FPGA implementation of a ZX Spectrum running on an Altera Cyclone IV on the desk near me currently and trying to program a CPLD into a sewing machine stitch regulator in another window.

    A man's got to have a hobby after all...
