Scale Out is Necessary for All-Flash Arrays
A couple of challenges with All-Flash Arrays drive scale-out. The first is controller performance. Consider the typical modular SAN array controller, such as the EMC VNX2 and similar products from other storage vendors: these controllers are based on dual-processor Intel Xeon system boards. The highest-end version of the VNX2 controller can support 1,500 hard drives, or seven cabinets of disks. But even with 15,000 RPM drives, at roughly 200 IOPS per drive, that amounts to back-end disk IOPS of only about 300,000. At 20,000 IOPS per enterprise SSD, just fifteen SSDs provide the same 300,000 back-end IOPS.
So when you look at All-Flash Arrays which use a similar approach of dual, fail-over, high availability controllers based on two-socket Intel Xeon system boards, you can see the controller is the performance bottleneck.
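The arithmetic behind that comparison is simple enough to sketch in a few lines of Python. The drive count and per-SSD figure are the ones quoted above; the 200 IOPS per 15,000 RPM drive is my assumption, derived by dividing 300,000 by 1,500, and the function names are only illustrative.

```python
# Back-of-the-envelope figures. Only SSD_IOPS and MAX_HDDS come straight from
# the text; HDD_IOPS_15K is an assumption (300,000 IOPS / 1,500 drives).
HDD_IOPS_15K = 200     # assumed IOPS for one 15,000 RPM drive
SSD_IOPS = 20_000      # IOPS for one enterprise SSD, as quoted in the text
MAX_HDDS = 1_500       # largest drive count cited for the VNX2 controller


def backend_iops(drive_count: int, iops_per_drive: int) -> int:
    """Aggregate back-end IOPS for a set of identical drives."""
    return drive_count * iops_per_drive


def drives_needed(target_iops: int, iops_per_drive: int) -> int:
    """Smallest number of drives whose combined IOPS reaches the target."""
    return -(-target_iops // iops_per_drive)  # ceiling division


hdd_total = backend_iops(MAX_HDDS, HDD_IOPS_15K)
ssd_count = drives_needed(hdd_total, SSD_IOPS)
print(f"{MAX_HDDS} HDDs -> {hdd_total:,} IOPS; matched by {ssd_count} SSDs")
```

Under those assumptions, a controller pair sized for seven cabinets of spinning disk is matched on the back end by less than one tray of SSDs.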
Add inline efficiency features such as deduplication and compression, and the picture gets worse. Those features increase effective capacity and reduce SSD wear, but they also consume controller resources, which further limits the performance scalability of the dual-controller array.
Finally, inline deduplication requires keeping deduplication metadata in memory, where it is either logged to NVRAM (most AFA vendors) or protected by UPS (in the case of XtremIO). This inline deduplication metadata database limits capacity scaling: more capacity means more metadata, which means a bigger database, which means more RAM and more NVRAM. This is why many AFAs with inline deduplication have a fixed capacity for a controller pair.
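To make the metadata argument concrete, here is a rough sketch of how the in-memory database grows with capacity. The 4 KiB block size and 32 bytes per metadata entry are purely hypothetical numbers I picked for illustration, not any vendor's actual layout, and the model simplifies to one entry per physical block.

```python
KIB, GIB, TIB = 2**10, 2**30, 2**40

# Both values below are hypothetical, chosen only to illustrate the scaling.
BLOCK_SIZE = 4 * KIB   # assumed deduplication block size
ENTRY_BYTES = 32       # assumed metadata bytes per tracked block


def metadata_bytes(capacity_bytes: int,
                   block_size: int = BLOCK_SIZE,
                   entry_bytes: int = ENTRY_BYTES) -> int:
    """RAM needed if every physical block has one metadata entry."""
    blocks = -(-capacity_bytes // block_size)  # ceiling division
    return blocks * entry_bytes


def max_capacity_bytes(ram_bytes: int,
                       block_size: int = BLOCK_SIZE,
                       entry_bytes: int = ENTRY_BYTES) -> int:
    """Largest capacity whose metadata still fits in the given RAM budget."""
    return (ram_bytes // entry_bytes) * block_size


for tib in (10, 20, 40):
    needed = metadata_bytes(tib * TIB) / GIB
    print(f"{tib} TiB of flash -> {needed:.0f} GiB of metadata")
```

With these made-up parameters the metadata grows linearly (80, 160, and 320 GiB), and a controller pair with a fixed 256 GiB RAM budget would top out at 32 TiB. The real numbers will differ by vendor, but the shape of the constraint is the same: the RAM in the controller pair puts a ceiling on capacity.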
This capacity limitation means the only way to scale capacity is through horizontal scale-out. If a contiguous data storage space is desired, scale-out becomes more complicated. It is simpler without that contiguous space, but then an application that needs a lot of space will require host-based volume management to concatenate the discrete flash array capacities.
And because, as noted above, the controllers are the performance bottleneck, All-Flash Arrays have to scale out in order to scale controller IOPS along with back-end capacity.
There is a happy medium. If a dual-socket Intel Xeon based controller can support, say, 500,000 IOPS, and that is all a customer needs, but the customer needs more capacity than one 25-SSD disk tray provides, the array can scale to multiple disk trays while the overall IOPS remains the same. Some scale-out All-Flash Arrays, such as EMC's XtremIO, do not support more than one disk tray per controller pair, while others, such as Kaminario's latest, allow capacity to scale to two disk trays per controller pair.
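The trade-off between adding trays and adding controller pairs can be sketched as a toy model. The 500,000 IOPS controller limit, 25 SSDs per tray, and 20,000 IOPS per SSD are the figures used in this post; the class itself and the assumption that scale-out is perfectly linear are mine, and real arrays will fall short of that ideal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayConfig:
    """Hypothetical model of an all-flash array built from controller pairs."""
    controller_pairs: int
    trays_per_pair: int
    ssds_per_tray: int = 25              # tray size used in the text
    ssd_iops: int = 20_000               # per-SSD figure used in the text
    controller_pair_iops: int = 500_000  # "say 500,000 IOPS" from the text

    @property
    def ssd_count(self) -> int:
        return self.controller_pairs * self.trays_per_pair * self.ssds_per_tray

    @property
    def deliverable_iops(self) -> int:
        # Each pair delivers the lesser of its own limit and its back end.
        # Assumes idealized linear scaling across controller pairs.
        backend = self.trays_per_pair * self.ssds_per_tray * self.ssd_iops
        return self.controller_pairs * min(self.controller_pair_iops, backend)


one_tray = ArrayConfig(controller_pairs=1, trays_per_pair=1)
two_trays = ArrayConfig(controller_pairs=1, trays_per_pair=2)
scale_out = ArrayConfig(controller_pairs=2, trays_per_pair=1)

for name, cfg in [("1 pair, 1 tray", one_tray),
                  ("1 pair, 2 trays", two_trays),
                  ("2 pairs, 1 tray each", scale_out)]:
    print(f"{name}: {cfg.ssd_count} SSDs, {cfg.deliverable_iops:,} IOPS")
```

In this model a second tray doubles the SSD count but leaves the array at 500,000 IOPS, while a second controller pair doubles both. That is the happy medium in a nutshell: extra trays buy capacity only.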
SSD capacities are increasing dramatically, faster than Intel Xeon CPU performance is increasing. All-Flash Array vendors will choose the SSD capacities that offer the best $/TB and build arrays around them, so the Intel Xeon CPU will continue to be a bottleneck. Because of this, I think the archetype of one disk tray per dual-controller pair, combined with scale-out, will be the norm for All-Flash Arrays for some time. The alternative is SolidFire's design, which, because of its aggressive CPU to SSD ratio, seems to assume very high density SSDs are coming. The greatest risk is to Pure Storage's design, which is very well-suited to the "happy medium" described above but will come under pressure to offer a scale-out solution as SSD densities increase. Alternatively, Pure could build bigger storage controllers based on four-socket Intel E7 CPUs.