Virtensys has delivered pre-production units of its server I/O virtualising VMX5000 switch to potential server and storage OEM partners. Its use enables the sharing of I/O adapters and storage among connected servers using PCIe connections. The idea is to use PCIe cabling and extend a server's PCIe bus outside the server to a …
I just love this PCIe cloud idea.
PCIe 3 is due to be ratified later this year, which means each 1x lane will handle about 1GB/s in each direction, so with a 32x link that's some serious bandwidth available compared to FC & iSCSI.
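For anyone who wants to check the arithmetic, here is a rough sketch in Python. The signalling rates and line encodings are the published per-generation figures; the function names are just my own for illustration, and the result is raw link throughput per direction, before any packet or protocol overhead.

```python
# Raw PCIe throughput per direction, from signalling rate and line encoding.
# Values are (transfer rate in GT/s, payload bits, total bits on the wire).
GENERATIONS = {
    "PCIe 1.x": (2.5, 8, 10),     # 8b/10b encoding
    "PCIe 2.0": (5.0, 8, 10),     # 8b/10b encoding
    "PCIe 3.0": (8.0, 128, 130),  # 128b/130b encoding
}


def lane_gbytes_per_s(gen):
    """Usable GB/s for a single lane, one direction (illustrative helper)."""
    rate_gt, payload_bits, total_bits = GENERATIONS[gen]
    return rate_gt * payload_bits / total_bits / 8


def link_gbytes_per_s(gen, lanes):
    """Usable GB/s for a link of the given width, one direction."""
    return lane_gbytes_per_s(gen) * lanes


if __name__ == "__main__":
    for gen in GENERATIONS:
        for lanes in (1, 16, 32):
            print(f"{gen} x{lanes}: {link_gbytes_per_s(gen, lanes):.2f} GB/s")
```

So "1GB/s per lane" is a fair round number for PCIe 3, and a 32x link comes out at a little over 31GB/s each way before overhead.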
Hot swap is going to be a problem - but I guess the switches might be able to handle this.
All they need to do now is standardise on 2 x 16x backplane chassis slots... Now we might actually start getting some interoperability in the datacentre. Racks of NICs, racks of graphics cards... do the same with a HyperTransport backplane & now we can have racks of CPUs as well. At last proper hardware virtualisation!
PCI-e IOV - multiple root complexes can't talk to each other
Ahh well. So you can put your NIC in an external expansion box, rather than into the server itself. But unless it's a very special NIC that supports IOV, you can only use it from one root complex (= from only one host computer). The IOV standard simply says that the external multi-root PCI-e switch can carry traffic for multiple PCI-e bus trees that don't know about each other (like VLANs). Each bus tree is "owned" by a particular "OS partition" running on the host computer. At least the part of the bus tree carried by the external switch runs virtualized on the switch, though I guess IO virtualization at PC server chipset level is already in the works too.
Any peripheral board that is to be shared by multiple PCI-e trees must have special "virtualization" support, to be able to keep track of several simultaneous PCI-e conversations with multiple root complexes. Not so very sexy...
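To make the sharing rule concrete, here is a toy model in Python of how I read it. This is only a sketch of the ownership logic, not anything from the IOV spec: the class names, `mr_aware`, `max_hierarchies` and the host names are all made up for illustration.

```python
class Device:
    """A peripheral board behind the external switch (hypothetical model)."""

    def __init__(self, name, mr_aware=False, max_hierarchies=1):
        self.name = name
        # A plain device can only ever belong to one PCI-e tree.
        self.max_hierarchies = max_hierarchies if mr_aware else 1


class MultiRootSwitch:
    """Carries several isolated bus trees, one per root complex, like VLANs."""

    def __init__(self):
        self.bindings = {}  # device name -> set of root complexes that own it

    def attach(self, root, device):
        owners = self.bindings.setdefault(device.name, set())
        if root in owners:
            return
        if len(owners) >= device.max_hierarchies:
            raise PermissionError(
                f"{device.name} cannot be shared with {root}: "
                f"already used by {sorted(owners)}"
            )
        owners.add(root)

    def visible_devices(self, root):
        """Each root complex only sees the devices in its own tree."""
        return sorted(n for n, owners in self.bindings.items() if root in owners)


if __name__ == "__main__":
    switch = MultiRootSwitch()
    plain_nic = Device("plain-nic")
    iov_nic = Device("iov-nic", mr_aware=True, max_hierarchies=2)

    switch.attach("hostA", plain_nic)
    switch.attach("hostA", iov_nic)
    switch.attach("hostB", iov_nic)      # fine, the board tracks two trees
    try:
        switch.attach("hostB", plain_nic)  # refused, single owner only
    except PermissionError as err:
        print(err)
    print(switch.visible_devices("hostB"))
```

The point of the model is that the switch only does the isolation; whether a board can appear in more than one tree depends entirely on the board.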
I bet the HPC folks would appreciate it much more if the external PCI-e switch could cater for direct root-to-root data transfers - for HPC networking purposes. Imagine DMA from one root complex to another root complex (memory to memory). This doesn't necessarily mean that a standard IP networking stack would be involved - perhaps as a slow fallback solution or for management purposes. Rather, I could fancy some dedicated low-latency mailboxing framework. It would really get the multi-root PCI-e monster fairly close to a NUMA, except that we're still speaking of distinct "OS partitions" in the IOV lingo. The way I understand PCI-e IOV, such direct transfers are impossible. Maybe via an intermediate "virtual NIC" or some other peripheral along those lines (call it a DMA engine with some dedicated memory of its own) implemented within the external PCI-e switch.
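Just to illustrate what I mean by a mailbox sitting in the switch, here is a very rough Python sketch. It is purely hypothetical: no real switch exposes this interface as far as I know, and `SwitchMailbox`, `post`, `fetch` and the slot count are invented names. The idea is that the hosts never touch each other's memory; each one only talks to a buffer owned by the intermediate device.

```python
from collections import deque


class SwitchMailbox:
    """Hypothetical DMA engine with its own buffer memory inside the switch."""

    def __init__(self, hosts, slots=4):
        self.slots = slots                            # buffer slots per inbox
        self.inboxes = {h: deque() for h in hosts}    # one inbox per host

    def post(self, src, dst, payload):
        """Host `src` copies a message into the switch-resident inbox of `dst`."""
        if src not in self.inboxes or dst not in self.inboxes:
            raise KeyError("unknown host")
        inbox = self.inboxes[dst]
        if len(inbox) >= self.slots:
            return False                              # inbox full, sender retries
        inbox.append((src, bytes(payload)))
        return True

    def fetch(self, host):
        """Host `host` pulls its oldest pending message, or None if empty."""
        inbox = self.inboxes[host]
        return inbox.popleft() if inbox else None


if __name__ == "__main__":
    mbox = SwitchMailbox(["hostA", "hostB"], slots=2)
    mbox.post("hostA", "hostB", b"halo exchange, step 1")
    print(mbox.fetch("hostB"))
```

A real design would obviously need doorbells or interrupts rather than polling, but the two-copy structure is what keeps the separate bus trees isolated from one another.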
The sort of bandwidth available from PCI-e at least makes very good sense for direct RAID storage attachment. Perhaps not via an additional intermediate storage bus technology (that could be useful as a slow lane for some wider-area SAN interconnects).
From IDT and PLX - some PCIe switch chips with a prospect of multi-root architectures. Maybe pre-IOV, but the docs attached give a thorough technical background and answer many of my basic questions.
Interestingly for me, Pericom doesn't have much to offer in that vein... Unsurprisingly, neither does Intel. That LSI paper on IOV mentioned before makes me wonder what LSI has up its sleeve.