Need speed? Then PCIe it is – server power without the politics

How many PCI Express (PCIe) lanes does your computer have? How many of those are directly provided by the CPU? With only a few exceptions, nobody except high-end gamers and the High Performance Computing (HPC) crowd cares. Regular punters (maybe) care about how many PCIe slots they have. So why is it that gamers and HPC types …

  1. phil dude
    Boffin

    PCIe vs Hypertransport...

    I did some research on this a few years ago, looking into the limitations of MD scaling.

    Hypertransport has less latency than PCIe. More importantly, it permits multiple streams to exist (rather than waiting for a transaction to finish). More lanes help to move data, but don't help the first byte. Useful for GPUs, which I read have been too fast for PCIe for a while.

    This is part of my reason for messing with Xeon Phis, to see if there are close-tolerance latencies possible within the CPUs.

    The use of the 512-bit instructions is looking interesting though...

    P.

  2. Justicesays

    Going to be slow regardless

    Modern-day CPUs are so fast that the roughly half light speed at which electrical signals travel is pretty slow over any significant distance, which is why

    a) on-die and integrated caches are getting larger and larger and contribute significant performance increases

    b) three-dimensional chip design and die stacking is a major focus of interest

    1 Gigahertz means 1 CPU cycle happens while light travels 30 cm.

    Chips run at over 5GHz, so 6cm.

    If it turns out the data you needed was on a RAM chip 2 meters away, 66 CPU cycles (at least) will pass while you are waiting for it to turn up, regardless of what bus, technology or whatever you are using for an interconnect.

    1. bazza Silver badge

      Re: Going to be slow regardless

      @Justicesays,

      "1 Gigahertz means 1 cpu cycle happens while light travels 30 cm.

      Chips run at up over 5 Ghz, so 6cm.

      If it turns out the data you needed was on a RAM chip 2 meters away, 66 cpu cycles (at least) will pass while you are waiting for it to turn up, regardless of what bus, technology or whatever you are using for an interconnect."

      It's actually worse than that in practice. Light in a vacuum travels at 30cm per nanosecond, but down a fibre or through a wire light/electricity goes at about 20cm per nanosecond. That's a whole lot worse!
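      To put numbers on it, here's a minimal sketch of the same arithmetic in Python (the ~20cm per nanosecond figure is for copper or fibre; the 5GHz clock and 2 metre distance are just the example values from the post above):

      # Round-trip wait, in CPU cycles, for data sitting some distance away.
      clock_hz = 5e9            # ~5GHz core, per the example above
      signal_m_per_s = 0.20e9   # ~20cm per nanosecond through copper or fibre
      distance_m = 2.0          # the RAM chip "2 meters away"
      round_trip_s = 2 * distance_m / signal_m_per_s
      print(f"~{round_trip_s * clock_hz:.0f} cycles spent waiting")   # ~100 cycles

      At 20cm per nanosecond the round trip costs roughly 100 cycles rather than the ~66 you'd get at vacuum speed, which is the point.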

    2. theblackhand

      Re: Going to be slow regardless

      On a similar note:

      Latency Numbers Every Programmer Should Know

      http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html

      Some of the details are a little out (e.g. main memory access is now lower than 100ns - mid-70s ns looks more accurate) and memory access across QPI links is around 300ns (±20ns).

      The main issue with QPI scaling (and any switch-based scaling) is that latency quickly jumps as you chain switches to add more cores/PCIe lanes.

      1. Anonymous Coward
        Anonymous Coward

        Re: Going to be slow regardless

        This slightly different presentation of those numbers drives the point home I think: https://twitter.com/PieCalculus/status/459485747842523136/photo/1

    3. IanDs

      Re: Going to be slow regardless

      You think you've got problems -- the ADCs and DACs I'm working with are clocked at just under 100GHz, which means the analogue samples travelling down a PCB track would be about 2mm apart if you could see them...
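      For what it's worth, that spacing follows from the same propagation numbers quoted above; a quick sketch in Python (the ~20cm per nanosecond on a PCB trace is an assumption):

      # Spacing between successive samples = propagation speed / sample rate.
      prop_m_per_s = 0.20e9     # ~20cm per nanosecond on FR4 (rough figure)
      sample_rate_hz = 100e9    # "just under 100GHz"
      print(f"{prop_m_per_s / sample_rate_hz * 1e3:.1f} mm between samples")  # ~2.0 mm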

  3. bazza Silver badge

    PCIe? Yeurk!

    This is already a solved problem. Take a look at the K computer's Tofu Interconnect (more detail here (pdf)).

    OK, it's a bit specialised, but it is a very high-speed, wide-area CPU-to-CPU interconnect without any intervening nuisances like PCIe, etc. The benefit of that interconnect shows up in its Rmax/Rpeak ratio, which is not far off 1.0. That means the interconnect works very well; there's not much latency in the system. A lot of the faster supercomputers have a much worse ratio.
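    For anyone unfamiliar with the ratio: Rmax is the sustained Linpack score and Rpeak the theoretical peak, so the efficiency is a single division. A quick Python check, using approximate figures from the K computer's public Top500 listing:

    # Rmax/Rpeak: fraction of theoretical peak the machine sustains on Linpack.
    rmax_pflops = 10.51    # approximate Top500 figures for the K computer
    rpeak_pflops = 11.28
    print(f"efficiency ~ {rmax_pflops / rpeak_pflops:.2f}")   # roughly 0.93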

    On Tofu each CPU gets 100GByte/sec to other CPUs, which is very good (Xeon manages about 50GByte/sec to memory), and it's effectively a 4+ year old piece of technology.

    Other supercomputer guys like Cray are also pretty good at this kind of thing.

    PCIe by comparison is child's play. Rather than trying to bend and stretch PCIe they should go and have a good chat with the supercomputer folk, especially Fujitsu.

    1. Trevor_Pott Gold badge

      Re: PCIe? Yeurk!

      Patents.

      Standards.

      Widespread adoption.

      Those are the barriers. Hypertransport is faster than PCI-E as well. It hasn't won because of...

      1. bazza Silver badge

        Re: PCIe? Yeurk!

        @Trevor Pott,

        Patents, standards, yes they all cost money. But they're nothing like as expensive as reinventing it all over again.

        Hypertransport doesn't work outside the box because it was never designed to do so. Going a few centimetres across a motherboard is, from an electrical and protocol point of view, very different to going between chassis. PCIe is a bit better than Hypertransport in that respect, but it's slower and wasn't designed for inter-chassis connections either.

        As Justicesays points out in the post above, you cannot ignore the speed of transmission over a longer distance. Taking account of it means changing the protocol and changing how you utilise it in an application. You are more or less forced into an OpenMPI-style approach to application design. You cannot ever hope to have a single memory address space such as an SMP architecture gives you; its performance would be terrible. Protocols like Hypertransport, QPI and PCIe (which are all about SMP really) are not very appropriate.
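        To make the OpenMPI-style approach concrete: each node owns its own memory and data only moves via explicit messages, so the latency is visible in the application rather than hidden behind a load instruction. A minimal sketch, assuming Python with the mpi4py bindings installed (run under mpirun with two ranks):

        from mpi4py import MPI   # assumes an MPI implementation plus mpi4py

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        if rank == 0:
            # Rank 0 owns this data; rank 1 cannot simply dereference a pointer to it.
            comm.send({"payload": list(range(1000))}, dest=1, tag=0)
        elif rank == 1:
            data = comm.recv(source=0, tag=0)   # the wire latency is paid right here
            print(f"rank 1 received {len(data['payload'])} items")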

        1. Trevor_Pott Gold badge

          Re: PCIe? Yeurk!

          Don't be so sure that paying for the patents isn't cheaper than inventing it all over again. If your assertions were correct, we wouldn't have companies reinventing interconnects over and over. Sorry mate, but while you are correct that proprietary interconnects are technologically and technically superior, that does not mean they'll win.

          I know that's hard for the tried and true nerds to grok, but it's true. The technologically superior option only wins when it is as easy and cheap to consume as an inferior option. Which is sort of the point of the article.

          PCI-E will become the mainstream intersystem interconnect because of its ubiquity. The ultra-high-end stuff, where it's taxpayers' money being spent, will continue to be proprietary.

          1. bazza Silver badge

            Re: PCIe? Yeurk!

            @Trevor Pott,

            "Don't be so sure that paying the patents isn't cheaper than inventing it all over again. If your assertions were correct, we wouldn't have companies reinventing interconnects over and over. Sorry mate, but which you are correct that proprietary interconnects are technologically and technically superior, that does not mean they'll win."

            Yes, you are completely correct! The wheel keeps getting re-invented because someone somewhere thinks they can do it better / cheaper. Sometimes they're right, sometimes they're wrong. Personally I think that trying to create an inter-chassis interconnect around PCIe would take a long time and wouldn't be as good as Tofu, which would be a technological pity. However, that won't necessarily stop it turning into a commercial success.

            When I read this article the first thought I had was: had they (whoever 'they' are) even heard of the K Computer / Tofu, and if so had they ever thought to even ask Fujitsu about doing a deal? Supercomputers are fairly obscure, so there's a high probability that the answer to the first question is no. However, if someone somewhere really, really wanted to bring this sort of thing to market quickly then Tofu is there. In a sense it already is on the market; Fujitsu will sell you a mini K computer to have all to yourself :-) I want one but haven't got one :-(

            Even Intel are considering going down the Tofu-esque route (i.e. putting the interconnect on the CPU). El Reg covered this back in 2012. However, it won't be high up Intel's list of things to do; the full benefit of such things can be realised only if significant changes to OSes and software are made. The OSes and software that everyone has today assume an SMP environment. As that article says, SMP doesn't work well over a wide area. It doesn't sound like there's a lot of profit to be made, yet.

          2. Mellon

            Re: PCIe? Yeurk!

            PCI-E may be ubiquitous as a board-level interconnect, but as bazza says, that doesn't mean it's suitable for connecting boxes. Electrically it's optimized for short-distance signaling - 20in over FR4, dropping to 14in in Gen4. I know people are trying to take it all the way to the top of the rack, but as far as I know they are using either more capable physical layers like 12G SAS or dropping back to Gen2 speed with a Gen3 PHY. Ethernet is far more ubiquitous for connecting boxes, it has good reach, and the RDMA variants have improved latency and made it easier to overlay dual submission/completion queue protocols like NVMe etc.

    2. Anonymous Coward
      Anonymous Coward

      Re: PCIe? Yeurk!

      Holy crap, that is awesome. Funny how things in the computer industry can seem to come out of nowhere, but it's just long-running projects formed around a very cool concept that eventually become mainstream.

      Current interconnect technology is pretty much brain-dead, when you consider that there is no link between software architectures and the hardware they run on - as a mind-experiment, think of the order-of-magnitude differences in wire distance, cycle count, energy consumption, and wall-clock time between:

      1+1 in PHP

      1+1 in Java

      1+1 in C

      digital increment

      By the time you scale these inefficiencies out to the cloud, you could be talking millions or billions of times less efficient than if the software could just correctly connect to the hardware to get what it needs.
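      A rough way to see part of that gap on one machine - a sketch that only measures interpreter overhead for the addition itself, nothing about wire distance or energy, and the 3GHz clock is an assumption:

      import timeit

      # Time a trivial add through the CPython interpreter, then compare with
      # the roughly one cycle the same add costs when executed natively.
      n = 10_000_000
      per_add_s = timeit.timeit("a + b", setup="a, b = 1, 1", number=n) / n
      native_s = 1 / 3e9   # assume a ~3GHz core doing one integer add per cycle
      print(f"interpreted add ~ {per_add_s / native_s:.0f}x a native add")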

      Thanks for posting that, I'm going to eat some TOFU!

      1. bazza Silver badge

        Re: PCIe? Yeurk!

        @AC,

        "Thanks for posting that, I'm going to eat some TOFU!"

        No worries. It's a cool architecture, you can even buy one if you want to!

        "Current interconnect technology is pretty much brain-dead, when you consider that there is no link between software architectures and the hardware they run on"

        It's not that bad. Intel and AMD have both done very well at making an old-fashioned programming model (SMP) work in a general sense by emulating it in NUMA architectures (that's what QPI and Hypertransport do); we've all got better performance without having to redevelop software or OSes. However, if you want all-out performance then you have to do something different, e.g. Tofu.

        Interconnects and their switches are becoming a bit of a problem. They're now costing $billions to develop, and the markets that can support that development cost are few. Ethernet will be the only medium/long-range interconnect worth having in a few years. We're already seeing HDDs with Ethernet instead of SATA. Ethernet switches will get developed regardless, in which case would it ever be worth doing, say, a competitive PCIe switch chip? I think it highly likely that computer architectures will coalesce around Ethernet and DDRx eventually.

  4. Anonymous Coward
    Anonymous Coward

    Simple fix for southbridge bandwidth limitation

    Get rid of it. There's no reason the southbridge shouldn't be integrated into the CPU, making it an SoC. There are no longer any high-power parallel I/Os, so there's no reason it should be a separate chip. Intel already revs the southbridge at the same pace they rev the CPUs; time to make the x86 an SoC.

    1. Trevor_Pott Gold badge

      Re: Simple fix for southbridge bandwidth limitation

      Intel is already moving there. This is why they are soldering CPUs onto motherboards for everything but high-end workstations/gaming rigs and servers.

    2. Malcolm Weir Silver badge

      Re: Simple fix for southbridge bandwidth limitation

      " time to make the x86 an SoC.

      Yeah... they could call it something exotic,like Atom Z2460 or....

      Oh, wait. That was from 2012.

      Latest x86 SoC is the Xeon-D. CPU + 10GigE, what's not to like?

      1. Trevor_Pott Gold badge

        Re: Simple fix for southbridge bandwidth limitation

        The limited amount of RAM you can connect to that SoC. That's what's not to like.

        1. Anonymous Coward
          Anonymous Coward

          Re: Simple fix for southbridge bandwidth limitation

          Why would an SoC be any more limited in RAM than a socketed CPU? If you use the same number of I/Os for DRAM, the max capacity will be identical. I'm not talking about an SoC that includes RAM on-chip; that's not practical except for embedded use.

          1. Trevor_Pott Gold badge

            Re: Simple fix for southbridge bandwidth limitation

            Because the company that ships the SoC decides to artificially limit the amount of RAM you can attach to their lower end (SoC) CPUs in order to make you pay for the much (much) more expensive ones if you want a usable amount of RAM.

            As for the "why" of that, well...greed.

  5. Ian Michael Gumby
    Boffin

    PCIe won't work well outside the box...

    I'm not saying that expanding PCIe as an interconnect isn't a good idea, but that trying to extend it outside the box may not yield the desired effect.

    However, that doesn't mean you can't redesign the box. Imagine creating smaller cards that have CPU, memory and RRAM storage. These cards could then be inserted into a chassis with a larger bus that contains an interconnect and power. Then you would need to create a daughter card for external communications, including video and input, for those cards.

    You could then take the same card and put it in a smaller frame so that it could operate as a single PC.

    Note that if the RRAM proves to be roughly the same speed as regular RAM, you just need the CPU and the RRAM.

    That would be the future.

    1. Trevor_Pott Gold badge

      Re: PCIe won't work well outside the box...

      Except it would be a future that belonged to a single vendor, who owned the RRAM patents. You're describing HP's lock-in fetishist utopia. No thanks.

      1. Ian Michael Gumby
        Boffin

        @Trevor Re: PCIe won't work well outside the box...

        Actually Trevor, HP doesn't hold all of the cards.

        There is a company called Crossbar which holds some patents and interesting technologies.

        Their goal is 1+ TB on a 1cm^2 (postage stamp) die.

        And there's a group down in Austin TX.

        Clearly you have a closed mind and haven't thought about this long enough. ;-)

        1. Trevor_Pott Gold badge

          Re: @Trevor PCIe won't work well outside the box...

          Not at all. There are many potential successor technologies to flash and/or RAM. None of the likely candidates would appear to be the sort of thing that will be affordable by the mass market. (Oh, and I am entirely aware of Crossbar.)

          The argument is no different than that of Nutanix versus VMware. Where should the power in the relationship rest: with the customer, or the vendor?

          If you're happy to hand your genitals over to ViceCo Inc then, by all means, go buy something for which there is only one vendor. Maybe the commercial benefit you see from using that technology will be greater than the cost of licensing and implementing it. I doubt it, however.

          Unless the proprietary technology is dramatically superior to the more pedestrian alternatives it won't get adopted by the mass market. Lock-in is a bitch, and value for dollar matters. This is why, despite all the problems with existing standards entities, standards (and FRAND) still matter.

          1. Ian Michael Gumby

            Re: @Trevor PCIe won't work well outside the box...

            So you don't mind Intel having a hold on the market? ;-)

            1. Trevor_Pott Gold badge

              Re: @Trevor PCIe won't work well outside the box...

              I mind Intel owning the market a lot. Unfortunately, we've collectively lost that battle already. Intel succeeded in killing AMD in the face with a jeep, and AMD is not likely to recover. ARM is a joke for server workloads.

              So, okay, we lost that. Do we need more monopolists in our datacenter?

          2. Anonymous Coward
            Anonymous Coward

            Re: @Trevor PCIe won't work well outside the box... FRAND

            FRAND... unless the IP holder is not part of a group, in which case FRAND has no meaning unless you are willing to go to the beak. The holder does the usual: NDA on the charges as part of the buy-in, so they can tighten the vise just as far as they think they can for the new supplicant, regardless of what the previous bloke paid.

            Ever seen the clever ploy -- it's FREE FREE FREE (fine print: only for the first three years), then the price has to be re-negotiated. Right. After spending to build in the IP, the IP holder can charge whatever and you have to pay or titsup the product. What was that about avoiding single-supplier chokeholds?

  6. GrumpenKraut
    Happy

    Just thanks for the fine article.

    Learned a few things, thanks!

    1. Trevor_Pott Gold badge
      Pint

      Re: Just thanks for the fine article.

      (beer)

      1. Ian Michael Gumby

        Re: Just thanks for the fine article.

        Hey!

        While I don't agree with your opinion, I do appreciate the fact that you do respond in the comment section....

        1. Trevor_Pott Gold badge
          Pint

          Re: Just thanks for the fine article.

          (Additional beer)

          1. Toidi
            Pint

            Re: Just thanks for the fine article.

            I think I owe Trevor a beer too... So, next round on me.

  7. Flocke Kroes Silver badge

    USB Ports/Interfaces

    Port: something you can plug a cable into.

    Interface: some silicon that the multiplexer connects to one or more ports.

    Type lsusb (or whatever the MS equivalent is) and the result starts something like this:

    Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

    Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

    Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

    Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

    Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

    So, this computer has one USB2 interface and four USB1 interfaces. When you connect USB1 devices to ports, each one gets assigned a USB1 interface by the port multiplexer until there are none left, then any further USB1 devices share the USB2 interface, trashing its bandwidth. Likewise, USB2 devices get assigned their own USB2 interface until there are none left, and then they have to share the same interface.

    USB3 is a bit more troublesome. Computers come with separate USB2 and USB3 ports. I do not know if the multiplexer can assign a USB2 interface to a USB3 port. USB3 devices understand USB2, and will work slowly on USB2 ports. USB3 ports can speak USB2 or USB1 to slow devices. A modern machine may have a few USB2 interfaces, but only one USB3 interface - even if it has two or even three USB3 ports. USB3 eats up to 10Gb/s per interface (5Gb/s x full duplex), regardless of the number of ports. If the southbridge is limited to 20Gb/s, I can see why people are not rushing to release chips with two or more USB3 interfaces.
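    If you want to do that tally programmatically, something like this works against the lsusb output above (a rough sketch for Linux; the per-interface ceilings are the approximate figures discussed in this thread):

    #!/usr/bin/env python3
    # Count USB root hubs reported by lsusb and estimate the worst-case
    # aggregate bandwidth they could ask of the southbridge.
    import re
    import subprocess

    # Approximate ceilings in Gb/s: USB1.1 = 12Mb/s, USB2 = 480Mb/s,
    # USB3 = 5Gb/s each way, i.e. 10Gb/s full duplex as noted above.
    GBPS = {"1.1": 0.012, "2.0": 0.48, "3.0": 10.0}

    out = subprocess.run(["lsusb"], capture_output=True, text=True).stdout
    counts = {v: 0 for v in GBPS}
    for line in out.splitlines():
        m = re.search(r"Linux Foundation (\d\.\d) root hub", line)
        if m and m.group(1) in counts:
            counts[m.group(1)] += 1

    total = sum(counts[v] * GBPS[v] for v in counts)
    print(counts, f"worst-case aggregate ~{total:.1f} Gb/s")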

  8. Justin Clift

    IB tech...?

    "You see, in the HPC world, applications just don't fit into the RAM you can cram into a single node. Many HPC setups are hundreds, if not thousands of nodes lashed together into a single supercomputer, with each node being able to address the memory of each remote node as though it were local.

    Our existing networks – for example Ethernet, Infiniband and so on – simply weren't designed for this. Believe it or not, this is not a new problem."

    Um, isn't this exactly what Infiniband *was* designed for? Expressly for interfacing CPUs and peripherals in lots of computers together (HPC audience). Back then, it wasn't using PCI-E, and was more intended to interface with things directly.

    Intel's purchase of QLogic's IB tech a while back is interesting, as they don't seem to have put the people onto developing new IB versions. Instead they seem to be integrating the IB concepts into other parts of their tech and going in a different direction (Omni-Path). It sort of seems like they'll try to integrate it directly onto Xeon CPUs, and not have the PCI-E bus be in the way (unsure though).

    1. Trevor_Pott Gold badge

      Re: IB tech...?

      Yes and no. Infiniband was never designed to handle the kind of load that modern supercomputers are putting on it. It was also not designed to lash together as many nodes as seems to be the requirement these days. While it is way better than Ethernet for the task, Infiniband was designed for an earlier era of supercomputer and there are some pretty big changes it would have to go through to stay relevant today.

      1. Anonymous Coward
        Anonymous Coward

        Aries Interconnect

        It's been three years since this:

        http://www.theregister.co.uk/2012/04/25/intel_cray_interconnect_followup/

        Has Intel announced/done anything with the Aries interconnect technology they acquired from Cray, aside from things like the Aurora supercomputer?

  9. roger stillick
    Linux

    IBM Power 8 or better internal busses...

    address this quite nicely... Sorry Intel, my Haswell-chipped Chinese laptop has the port... nothing fast to hook to it... so it's a workstation - nothing more... Apache server?? Sure, just like my long-gone IBM PC.

    IMHO: hubris heard here does not equal a server solution... RS

  10. Uncle Ron

    The "Printer" Icon

    Boy, I sure miss the little printer icon that used to be at the top of Reg stories, especially on one like this with 5 pages. I know they all want page clicks and looks or whatever they call it, but this is a great article and I'd like to print it for future reference. I don't look at El Reg nearly as much as I used to because of this... Sad.

    1. Trevor_Pott Gold badge

      Re: The "Printer" Icon

      http://m.theregister.co.uk/2015/04/14/pcie_breaks_out_server_power/
