The InfiniBand Trade Association, the champion of the InfiniBand protocol, has announced that after a year and a half of development, it's releasing the spec for its technological crown jewels - for use in its most notable rival, Ethernet. There are a number of things that keep InfiniBand networks relevant in a world …
With all the known unresolved security problems of networks, we are now supposed to happily open the front door, back door and all the windows (sic).
So I wonder if I could build a Linux kernel with iWARP or RoCE, and use this to turn a compute cluster into a "poor man's" single system image computer (with some extra kernel support or daemons). There used to be openMosix, but that project was halted a while ago; a more or less standard RDMA stack could perhaps make it easier to implement.
re: ssi system
You're probably better off just using Infiniband directly though, as it's significantly cheaper than 10Gb ethernet and much higher performing. As you mention you're using Linux (like us), supported Infiniband drivers come with most distributions (RHEL, SLES, etc) and include RDMA, i.e. NFS over RDMA, iSCSI over RDMA, and so on. You can get the "latest and greatest" drivers and software directly from the openfabrics.org download site too:
For ball park pricing of adapters/switches, check out Colfax Online:
For second hand stuff, useful for home systems, check eBay. Quite a bit on there, adapters start at roughly US$70.
Hope that helps. :)
Unless Cisco signs up for this, the solution is dead. Without the 800lb gorilla of networking, this is a 'what we might do' pipe dream.
Additionally, I believe the converged 10GbE one-connection-to-rule-them-all to be a pipe dream. Once it's all connected, simple, and it just works, great.
Now I just need to figure out how to afford ports and cards that cost more than the server I'm connecting them to.
I thought Rocky was Cher's ugly kid.
The gorilla here is Intel, whose NetEffect 10GE NICs already point in this direction.
And I for one will look hard at something which is n% as good as Infiniband for m% of the price.
Ethernet, even with RDMA, will still not scale as well as IB. Can I get 40Gb ether today for my clusters? Think not.
Already have 40Gb in place using IB. No contest.
IB and RoCE are not intended to compete
Alien Anthropologist is quite correct. IB has traditionally led Enet by several generations in bandwidth and latency and has always offered the lowest price per unit of performance.
That said, there are environments where it is simply not possible to transition to a new IB infrastructure. For example, many commercial enterprise data centers could benefit from IB's value propositions other than its raw bandwidths and industry leading latencies. RoCE can bring major decreases in CPU utilization and reductions in memory bandwidth demand to an existing Ethernet infrastructure, thus significantly reducing the number of servers required and reducing power consumption and footprint. These are the types of markets at which RoCE is aimed. RoCE does not compete with IB in terms of pure horsepower or price/performance. The fact that RoCE delivers 'IB-like' latencies is a very nice advantage.
Chief Scientist, System Fabric Works, Inc.
Chair, IBTA RoCE Technical Working Group
> Unless Cisco signs up for this, the solution is dead
Perhaps. Perhaps not. Cisco dropped IB (and its IB customers) like hot potatoes - refusing to even resolve existing support cases with their broken (and very expensive) IB switches with FC gateways.
Despite this, IB has grown significantly in market share, courtesy of other vendors. Cisco may be an 800lb gorilla. But it could very well be fed peanuts if it refuses to learn new tricks.
iWARP & 10GbE Clarifications
1) iWARP does not "use TCP/IP drivers". TCP/IP is implemented in the network controller and presents an RDMA interface to the host, just like IB. In fact, same APIs (e.g. OFED).
2) 10GbE switching is better than quoted ("4.5 microseconds"). There are multiple vendors who sell 10GbE SFP+ switches today that deliver sub-microsecond port-to-port latency.
iWARP and TCP
What Tom says is true. iWARP is usually implemented in an RNIC which effectively offloads TCP into hardware. This is how iWARP is able to deliver its latency numbers. The drawback is that you cannot differentiate iWARP traffic from conventional Ethernet traffic until the packet has been cracked at the TCP layer. At that point, if the packet is conventional Enet, and not iWARP, it is effectively forked off to another process running a standard network stack. I would guess that this would make it challenging to run both conventional Ethernet (for networking and FCoE) and iWARP for IPC over the same Enet fabric although I believe the iWARP vendors are able to do so.
RoCE, on the other hand, is identified by a unique identifier in the Ethertype field. This means that Ethernet traffic is sorted and handled at the very bottom of the network stack, before any processing has been executed on it. Conventional TCP/IP packets are sent to a TCP/IP stack and RoCE traffic is sent through the RDMA stack.
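To make the Ethertype point concrete, here is a rough Python sketch of how a receiver could sort frames by looking only at the Ethernet header. It is an illustration, not how any particular adapter or driver does it: the function names classify_frame and make_frame are made up for this example, the "rdma" / "tcpip" / "other" labels are placeholders, and the value 0x8915 is the Ethertype registered for RoCE. A single 802.1Q VLAN tag is skipped if present.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # conventional IP traffic
ETHERTYPE_VLAN = 0x8100  # 802.1Q tag; the real Ethertype follows the tag
ETHERTYPE_ROCE = 0x8915  # Ethertype registered for RoCE


def classify_frame(frame: bytes) -> str:
    """Decide which stack a raw Ethernet frame belongs to (hypothetical helper).

    Only the Ethernet header is inspected; no IP or TCP parsing is needed.
    """
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 0-5: destination MAC, 6-11: source MAC, 12-13: Ethertype.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_VLAN:
        if len(frame) < 18:
            raise ValueError("truncated VLAN-tagged frame")
        # Skip the 2-byte tag control field; inner Ethertype is at bytes 16-17.
        (ethertype,) = struct.unpack_from("!H", frame, 16)
    if ethertype == ETHERTYPE_ROCE:
        return "rdma"
    if ethertype == ETHERTYPE_IPV4:
        return "tcpip"
    return "other"


def make_frame(ethertype: int, payload: bytes = b"") -> bytes:
    """Build a dummy frame with zeroed MAC addresses, for demonstration only."""
    return bytes(6) + bytes(6) + struct.pack("!H", ethertype) + payload


if __name__ == "__main__":
    print(classify_frame(make_frame(ETHERTYPE_ROCE)))
    print(classify_frame(make_frame(ETHERTYPE_IPV4)))
```

By contrast, an iWARP receiver cannot make this decision from the Ethernet header alone, since iWARP frames carry the ordinary IP Ethertype and only become distinguishable once the TCP layer has been processed.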