new pcie is nice
It balances all the processor, storage and RAM speedups we have been getting. But eventually the data has to leave the datacenter. We are going to need a very fat pipe.
The near-ubiquitous PCI Express interconnect – aka PCIe – is finding its way into mobile devices, working its way into cabling, and is on schedule to double its throughput in 2015 to a jaw-dropping 64 gigabytes per second in 16-lane configurations. So said Ramin Neshati, the Marketing Workgroup Chair of the PCI-SIG, when The …
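For a rough sanity check of the 64 gigabytes per second figure, here is a minimal sketch of the arithmetic. It assumes the doubled generation is the 16 GT/s one (PCIe 4.0) with 128b/130b line encoding, and that the headline number counts both directions of a x16 link; the function and table names are made up for illustration.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency per generation.
# Assumption: the "doubled" generation in the article is the 16 GT/s one.
GENERATIONS = {
    "1.x": (2.5, 8 / 10),      # 8b/10b encoding
    "2.x": (5.0, 8 / 10),      # 8b/10b encoding
    "3.x": (8.0, 128 / 130),   # 128b/130b encoding
    "4.0": (16.0, 128 / 130),  # 128b/130b encoding
}


def pcie_bandwidth_gbytes(gen, lanes, bidirectional=False):
    """Approximate usable link bandwidth in GB/s, ignoring protocol overhead."""
    rate_gt, efficiency = GENERATIONS[gen]
    per_direction = rate_gt * efficiency * lanes / 8  # bits -> bytes
    return per_direction * (2 if bidirectional else 1)


for gen in GENERATIONS:
    one_way = pcie_bandwidth_gbytes(gen, 16)
    both_ways = pcie_bandwidth_gbytes(gen, 16, bidirectional=True)
    print(f"PCIe {gen} x16: {one_way:.1f} GB/s per direction, {both_ways:.1f} GB/s total")
```

Under these assumptions a x16 link comes out at about 31.5 GB/s each way, so the 64 GB/s headline is the rounded sum of both directions, before packet and flow-control overhead.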
Q: PCIe technology is in every server, workstation and laptop PC. Why is PCIe over M-PHY a suitable I/O technology for tablet and smartphone devices?
A: As a broadly adopted technology standard, PCIe benefits from several decades of innovations with universal support in all major Operating Systems, a robust device discovery and configuration mechanism, and comprehensive power management capabilities that very few, if any, of the other I/O technologies can match. PCIe technology has a flexible, layered protocol that enables innovations to occur at each layer of the architecture independent of the other layers. In this way, power-efficient PHY technologies, such as MIPI M-PHY, can be integrated with the familiar and highly functional PCIe protocol stack to deliver best-in-class and highly scalable I/O performance in tablet and smartphone devices.
-taken from http://www.pcisig.com/news_room/faqs/FAQ_pci_express_and_m-phy/
One important aspect is that PCIe supports enumeration. So your operating system can find out what hardware there is, and you don't need an individual image for every platform. Eventually this could lead to a free choice of operating systems on mobile devices, and hence the same speed of innovation you are used to from the PC world after Linux became popular.
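To make the enumeration point concrete: on Linux, the result of the kernel's PCI bus scan is exposed under /sys/bus/pci/devices, one directory per function. A minimal sketch that reads it back (the function name is mine, and it simply returns an empty list on a system without that directory):

```python
from pathlib import Path


def list_pci_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return (address, vendor_id, device_id, class_code) per enumerated PCI function."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []  # not Linux, or no PCI bus exposed
    devices = []
    for dev in sorted(root.iterdir()):
        try:
            vendor = (dev / "vendor").read_text().strip()
            device = (dev / "device").read_text().strip()
            class_code = (dev / "class").read_text().strip()
        except OSError:
            continue  # entry vanished (hot-unplug) or is unreadable
        devices.append((dev.name, vendor, device, class_code))
    return devices


if __name__ == "__main__":
    for address, vendor, device, class_code in list_pci_devices():
        print(address, vendor, device, class_code)
```

Nothing here is board-specific: the same code works on any machine whose buses can be discovered, which is the whole argument for not needing one image per platform.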
PCI-e over external cabling has been on the market for a couple of years - for external interconnect to DAS RAID boxes and for additional PCIe slots via external expansion boxes, the latter sometimes combined with PCI-e IOV. Besides IOV, there are also simple rackmount switches to connect multiple external expansion boxes to a single "external PCIe socket" on a host computer. Ever fancied an industrial PC with some 30 PCI-e slots? Well it's been available for a while... As for PCI-e generations, in the external DAS RAID boxes I've seen PCI-e 1.0 and 2.0 (Areca). Even the connectors seem to be somewhat standard (no idea about their name). The interface between a motherboard (PCI-e slot) and the cable is in the form of a tiny PCI-e expansion board - interestingly it doesn't carry a switch, it's just a dumb repeater, or a "re-driver" as Pericom calls the chips used on the board. Apparently the chips provide a signal boost / preemphasis / equalization for the relatively longer external metallic cabling.
As for HPC: apart from storage and outward networking, HPC typically requires low-latency memory-to-memory copy among the cluster nodes. The one thing that to me still seems to be missing, for bare PCI-e to successfully compete against IB in HPC, is some PCI-e switching silicon that would provide any-to-any (matrix style, non-blocking) host-to-host memory-to-memory DMA, combined with a greater number of host ports. IMO it wouldn't require a modification to the PCI-e standard: it would take some proprietary configurable switching matrix implemented in silicon, providing multiple MMIO windows with mailboxing and IRQ to each participating host, combined with OS drivers and management software that would interface to the HPC libraries, take care of addressing among the nodes, and maybe provide some user-friendly management of the cluster interconnect at the top end.
The switches currently on the market can do maybe 4 to 8 hosts of up to 16 lanes each, and the primary purpose is PCIe IOV (sharing of network and storage adapters), rather than direct host-to-host DMA. Check with PLX or Pericom. Perhaps it would be possible, with current silicon, to do the sort of a matrix DMA interconnect in a single chip, to cater for about 8 hosts of PCI-e x8 or x16. That's not too many nodes for an HPC cluster. For greater clusters, it would have to be cascadable. Oh wait - that probably wouldn't scale very well.
As for PCI-e huddling with the compute cores: the PCI-e actually has an interface called a "PCI-e root complex" or a "host bridge" to the host CPU's native "front side bus" or whatever it has. The PCI-e is CPU architecture agnostic - and has some traditional downsides, such as the single-root-bridge logical topology. No way for a native multi-root topology on PCI-e - that's why we need to invent some clumsy DMA matrix thing in the first place. And guess what: there's a bus that's closer to the CPU cores than the PCI-e. On AMD systems, this is called HyperTransport - AMD actually got that from Alpha machines, but that was probably before PCI-e even existed. Intel later introduced a work-alike called QPI. The internal interconnects between the cores in a CPU package (such as the SandyBridge ring) are possibly not native HT/QPI, but these cannot be tapped, so they don't really count. So we have HT/QPI to play with: these are the busses that handle the talks between CPU sockets on a multi-socket server motherboard. Think of a cache-coherent NUMA machine on a single motherboard. And guess what, the HyperTransport can be used to link multiple motherboards together, to build an even bigger NUMA machine. There are practical products on the market for that: a company called NumaScale sells what they call a "NumaConnect" adaptor card, which plugs into an HTX slot (Hypertransport in a connector) on compatible motherboards. Interestingly, there is no switch, but the NumaConnect card has 6 outside ports, that can be used to create a hypercube or multi-dimensional torus topology of a desired dimension.
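Six outside ports is exactly what a switchless three-dimensional torus needs: two links per axis. The sketch below assumes that port mapping (plus and minus along x, y and z), which is my reading and not taken from NumaScale documentation; the function name is hypothetical.

```python
def torus_neighbours(node, dims):
    """Return the directly linked nodes of `node` in a torus of shape `dims`.

    Each axis contributes two links (one step down, one step up), wrapping
    around at the edges, so a 3D torus uses six ports per node.
    """
    neighbours = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % size  # wrap-around link
            neighbours.append(tuple(coord))
    return neighbours


# Corner node of a 4 x 4 x 4 torus (64 nodes, no switch anywhere).
print(torus_neighbours((0, 0, 0), (4, 4, 4)))
```

Note that with only two nodes along an axis, both links on that axis land on the same neighbour, so small clusters would not use all six ports for distinct peers.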
The solution marketed by NumaScale uses HyperTransport to build a ccNuma machine = it keeps cache coherence across the NUMA nodes. There's a somewhat similar solution called HyperShare that seems to use a cache-incoherent approach... Either way it seems that memory-to-memory access between the nodes is an inherent essential feature.
I've never heard of Intel making its QPI available in a slot. PCI is originally an Intel child, if memory serves. Maybe that's a clue...
Makes me wonder how much sense all of this makes in terms of operational reliability and stability. Are the NumaScale and HyperShare clusters tolerant to outages? Can nodes be hot-swapped in and out at runtime? One part of the problem sure is support for CPU and memory hotplugging and fault isolation in the OS (Linux or other) - another problem may be at the bus level: how does the HT hypercube cope when a node or link goes out to lunch? Makes me wonder how my theoretical PCI-e "matrix DMA" solution would cope with that (perhaps each peer would appear as a hot-pluggable PCI device to a particular host, with surprise removal gracefully handled). Ethernet sure doesn't have a problem with that. Not so sure about IB.
Thanks for your response - I don't have hands-on experience with IB so I didn't know there. I did have a feeling that with IB being so omnipresent in HPC, "node hot swap" would probably work well.
PCI-e is also inherently hot-swap capable and so is the Windows driver framework handling it - just my theoretical matrix crossconnect thing makes node hot-swap a bit more interesting :-)
And yes I'd really love to know how a "homebrew" ccNuma machine would cope with a node outage. If this can be handled, what OS is production-capable of that etc. Except I guess I'm off topic here, WRT the original article...
...for simple peripheral links, it's bob-on - mice, keyboards, etc.
Anything data intensive though, and its lack of DMA (I think?) makes it CPU intensive and a bit scabby - large USB drives etc.
However, could you really engineer a 'simple' version of a PCI-e interconnect to deal with that sort of thing? Particularly at the time when USB came around, I think not.
These days? Well, maybe....
Be interesting to see if this takes off with consumers, or if it just stays in the enterprise space.
It is, for sure. Computing is just now starting to get really interesting. We thought it was all very exciting to go from 4 MB of memory to 32 MB for only $1,000 per 32 MB stick, and from 33 MHz CPUs to 90 MHz, but this kind of transfer tech that does 64 GB per second: amazing. It's hard to keep up with the changes, they're coming so fast.
That's right, Intel is already shipping Oak Trail sans PCIe; however, the Federal Trade Commission has ORDERED Intel to continue PCIe support to at least 2016. You can read this in the Intel vs AMD settlement agreement.
Intel reasoned to FTC that mobile devices have no need for high speed interconnects. FTC "allowed" Oak Trail to be released but future silicon must support PCIe.
Without PCIe, nVidia discrete GPUs do not find their way onto Intel motherboards. Intel wants to go it alone without nVidia discrete cards. They believe that by 2016 on-die CPU/GPU SoCs will make the discrete GPU obsolete for all but the most demanding rendering. And I'm sure that Intel graphics MIGHT be up to speed by then.
PCIe in mobile devices? Why bother if it amounts to unnecessary processing overhead? True, Linus has commented that ARM SoC designers should make all the busses enumerable (PnP fashion), which combined with low pin count points to PCIe, rather than PCI... but Linus has his specific background and aspect. He's not exactly a mobile phone hardware developer.
Even SoCs for tablets are pretty much single-purpose.
Generic support for peripherals is needed in the industrial/embedded segments.
As for desktop / full-fledged notebook machines... if Intel thinks that its own GPU is strong enough, why not let it skip those 16 lanes of PCIe straight from the CPU?
As for servers, a beefy PCI-e x8 is certainly useful.
I'm sure Intel knows better than to shoot itself in the leg. They'll keep PCI-e around where applicable and useful: multiple x1 channels from the south bridge and maybe a couple lanes straight from the CPU socket in servers and high-performance desktops. I haven't yet noticed any future PCI-e replacement for x86 peripheral expansion.