Virtensys has the capability of being profoundly disruptive three times over; to network interface vendors, to switch vendors and even to server vendors. Who would have thought extending the internal PCIe bus outside servers could be like opening holes in three dikes simultaneously? The technology is an I/O-aggregating box, the …
As described, this sounds like several single points of failure combined to make one big Single Point Of Failure. Or is there some fail-over scheme that the sales droids and hacks either forgot about or didn't understand?
Can anyone say...
...Single. Point. of. Failure.
So the NIC fails and not 1, but 16, servers can be down at the same time...
This extended/external bus doodad fails and your whole datacenter can be offline!
2 or 3 of them
You could easily add redundancy and still come out on top. I'd be more interested in what the downsides are in terms of latency etc. There must be an overhead in servicing 16 servers.
Latency to ship PCIe info across a presumably medium-length wire ought to be fairly high (compared to on-the-board channels). Slicing up the hardware interfaces does make some sense though, considering 10G Ethernet is hard to saturate as it is, let alone 100G in the near future. Shuffling RAM off the board and stuffing that in a nearby box seems a touch more daft to me. Serious OCers tweak timings to get the most out of the "laggy" link between CPU and RAM as it is. Pushing this over a connector, across a wire, into a processing unit of some sort, and back just sounds slow. Perhaps this could be supplemented a bit by a mega L3 cache or somesuch.
As for Single Point of FAILure, if this was used to host VMs running in some form of cluster environment, with a NAS-based setup, you could essentially afford the risk of an entire blade cluster going down, due to the "Live VMotion"-esque capabilities of VM systems. With all the cost and power savings (not to mention supporting hardware such as switches, rackspace, etc), one could afford to practically double their hardware to provide hot failover systems without incurring increased cost over a traditional solution. More computing power, with the same amount of wattage and (arguably) less money, with the potential for BETTER redundancy/failover. No wonder the slant in this story is toward extreme knife-in-the-back hype.
Ammaross Danan is correct. But I would go further.
This single 10G Ethernet port won't replace sixteen 10G NICs. It'll replace sixteen 1G NICs. If a server has been upgraded to a 10G NIC in the first place, chances are it *needs* that kind of bandwidth. So this isn't nearly as painful to the NIC/switch vendors as the article suggests.
Putting DRAM on the far side of a PCIe link is totally boneheaded. CPU vendors are now putting memory interfaces on-die whenever performance is required. Even the new Atom has an on-socket DRAM interface, though this is for power savings (eliminating the power-hungry 945 northbridge) rather than performance. DRAM should be considered effectively part of the CPU in system design.
The only part of this that makes any sense is the shared disk storage. I wonder how wide the PCIe link to each server is? I hope it's at least x4, since only then will it compete effectively with 10G Ethernet and iSCSI. This assumes that there are enough disks and bandwidth demand to use that kind of performance, and that the application can make effective use of a shared storage pool. And that the storage management of this new box is up to the standards set by existing storage vendors - storage reliability and disaster recovery are critical.
Can it be retrofitted to existing servers? If not, I would dismiss it out of hand unless I was building something from scratch, and then I would want a thorough demonstration including disaster scenarios.
Great article and excellent comments from everyone. Let me shed some more light on some of the points mentioned …
In reference to redundancy (all of you touched on this subject)
In current deployments, a single Ethernet or FC top-of-rack switch is a single point of failure, which is why multiple Ethernet and FC switches are installed to provide redundancy and independent network connectivity paths. Virtensys fully supports these redundant deployment models, as the VIO-4000 switches replace the traditional Ethernet and FC switches. Two or more VIO-4000 switches can be deployed in either active-active or active-passive configurations to provide servers with a fully redundant path and eliminate any single point of failure. Furthermore, VIO-4000 switches also provide independent network and storage connectivity paths. Each adapter within a VIO-4000 switch is dual ported and each of the two ports can be placed on an independent network path. Consequently, a single server can be given multiple, dual-port virtual I/O adapters that can be configured for greater Ethernet or Fibre Channel resiliency.
In reference to Gordon’s question regarding latency and overhead
The VIO-4000 switches perform the virtualization of the I/O adapter in hardware using switching and virtualization silicon designed by Virtensys and architected to support 16 servers simultaneously accessing the I/O adapters. The silicon switching is non-blocking and supports wire speed transfers. It has more than 1.5x the bandwidth required to support the 16 servers. The latency is also extremely low – lower than an Ethernet or IB switch. Doing the virtualization in hardware also enables the VIO-4000 switches to sustain the line rate of the I/O adapters with negligible overhead.
In reference to Ammaross and Jonathan’s point regarding the external memory
As you mentioned, the external memory is initially intended to be used as an extra / L3 cache memory for the CPU and servers. The density supported can be quite large.
In reference to Jonathan’s points
Even when servers are directly populated with 10G NICs, the bandwidth to the corporate networks is limited by the top-of-rack Ethernet switch’s uplink capability, which is usually 2 or 4 10GE links. That gives an uplink bandwidth of 20-40 Gbps shared among all the servers, so servers don't really get 10G bandwidth to the network. The Virtensys VIO-4000 switches populated with 2 dual-port 10GE adapters will provide the same uplink bandwidth at a fraction of the cost. The switches will also provide 20 Gbps PCIe links to each server (double the bandwidth that the 10GE links provide).
The PCIe links between each server and the VIO-4000 switches are x4 lanes and support both Gen 1 (2.5 Gbps/lane) and Gen 2 (5 Gbps/lane) PCIe speeds, providing the servers with up to 20 Gbps of bandwidth to the I/O Virtualization switch.
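To put rough numbers on the uplink-sharing argument, here is a minimal Python sketch. It assumes 16 servers behind the switch (the figure quoted for the VIO-4000, not a stated top-of-rack port count) and that every server is pushing traffic at once; the function name is purely illustrative.

```python
def per_server_uplink_gbps(uplink_links, link_gbps=10.0, servers=16):
    """Average uplink bandwidth per server when all servers are active.

    Assumption: the uplink is shared evenly; 16 servers is illustrative.
    """
    return uplink_links * link_gbps / servers

# The two uplink configurations mentioned in the comment: 2 or 4 x 10GE.
for links in (2, 4):
    share = per_server_uplink_gbps(links)
    print(f"{links} x 10GE uplinks, 16 servers: {share:.2f} Gbps per server")
```

Under those assumptions each server averages well under 10 Gbps to the corporate network, however fast its own NIC is.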
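The link arithmetic can be sketched in a few lines of Python. The per-lane rates are the ones quoted above; the 8b/10b line-encoding factor is not mentioned in the comment but applies to PCIe Gen 1 and Gen 2, so the usable data rate is lower than the raw signalling rate. Protocol overhead (packet headers, flow control) is ignored here, and the function name is hypothetical.

```python
# Raw per-lane signalling rates quoted in the comment, in Gbps.
RAW_GBPS_PER_LANE = {1: 2.5, 2: 5.0}

def pcie_link_gbps(gen, lanes=4):
    """Return (raw, after_encoding) bandwidth in Gbps, per direction.

    Gen 1 and Gen 2 use 8b/10b encoding: 8 data bits per 10 bits on the wire.
    Assumption: packet and flow-control overhead is not modelled.
    """
    raw = RAW_GBPS_PER_LANE[gen] * lanes
    return raw, raw * 8 / 10

for gen in (1, 2):
    raw, usable = pcie_link_gbps(gen)
    print(f"Gen {gen} x4: {raw:.0f} Gbps raw, {usable:.0f} Gbps after 8b/10b")
```

So the "up to 20 Gbps" figure is the raw Gen 2 x4 rate; after encoding it is about 16 Gbps per direction, which is still more than a single 10GE link.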
The VIO-4008 switches use the LSI MegaRAID HBAs and use the Storage Management standards and capabilities established by LSI without any modifications.
The VIO-4000 series works with existing servers without requiring any modification to the server OS, I/O device driver or application. A “dummy” PCIe adapter is needed inside the server to convert the PCIe edge-card connector to a PCIe cable connector. The same applies when using the VIO-4000 switches with new server deployments.