* Posts by markkulacz

13 publicly visible posts • joined 15 Apr 2014

Go and whistle, IDC. The storage world's going to hell in a handbasket

markkulacz

It's all about the cloud

Most industrial data storage growth is being driven by the hyperscalers. Revenue for traditional on-prem IT is flat to falling, while cloud/SaaS is growing around 40% year over year.

It's not that the world needs less storage. Storage is just purchased differently.

Why the USS NetApp is a doomed ship

markkulacz

Re: Baby, I'm just gonna shake, shake, shake, shake, shake I shake it off, I shake it off

Hey Trevor - Apologies, somehow my brain blended up several "NetApp is doomed" posts yesterday and I erroneously quoted you on Isilon. Embarrassing, and good for you for pointing that out.

I have a well-functioning 30 year old Port-A-Matic nail gun, by the way. Great tool, but many fastening applications still use other tools. Higher-end wood building construction is moving back to pre-drilling and coated screws, which I've always been a fan of. Not sure if nail guns changed building construction "warfare" forever, but they are definitely part of the picture. I also still use my grandfather's hammers, built by Estwing in Bridgeport, CT over 80 years ago.

Everything else - I don't have time to respond. The weekend is too nice.

markkulacz

Baby, I'm just gonna shake, shake, shake, shake, shake I shake it off, I shake it off

"Monolithic" has been revived. Once left for dead after the open systems movement, the word now has new life getting slapped as a label on the open systems that (I thought) replaced the "evil empire of monolith" systems. Open systems shared storage is the new monolith.

Ugh... I try not to get hung up on these "tech X is dead" articles, a sort of techno-geek version of a rap mic contest. Maybe I'm just not in the right social circles. "Tech X is dead... tech Z is the future" is what people love to talk about at Star Wars conventions (speaking of the evil empire), which is really odd.

To the author: why is Isilon, a company nearly 15 years old, still described by you (and others) as "the future"... a span of time approaching twice the active life of the Enterprise CV-6?

But if 10 years is somehow the "that tech is obsolete" moment, then explain why the Lockheed C-130 has been in continuous operation for over 50 years (https://en.wikipedia.org/wiki/Lockheed_C-130_Hercules).

It's all about the right tool for each job: the one that makes the best use of the technologies available at the moment, with abstractions that applications and people can use effectively.

Oh yeah... I'm with NetApp. You know the drill. Drills, remember them? Homo sapiens learned the benefits of rotary tools around 35,000 BCE. Oddly, still useful.

Want to super scale-out? You'll be hungry for flash

markkulacz

Scale Out w/ Erasure Coding

Scale-out with erasure coding typically (always?) requires re-striping after node addition. Administrators are rarely well informed of this: more nodes means a different layout, which means fully reading in all impacted data, recalculating parity, and re-writing everything. As the cluster scales, statistically sustaining the same protection level also requires additional parity. Some of this may be handled automatically with pooling, but some is not.

SolidFire (which uses 2x protection, if I'm not mistaken), or traditional HDFS with 3x protection on an all-flash box, are the ways around this. Or using Isilon with mirroring protection (not N+M). Scale-out requires mirroring, not erasure coding, to support non-disruptive scaling while also avoiding re-striping (which is not the type of activity we want to expose SSDs to).
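To put a rough number on that re-striping cost versus a fixed-layout expansion, here is a back-of-envelope sketch. The capacity, node count, and the assumption that a rebalance-only approach moves roughly 1/(N+1) of the data are mine, purely for illustration:

```python
# Back-of-envelope sketch (a toy model of mine, not any vendor's actual algorithm):
# compare the IO generated by a full re-stripe (stripe layout follows node count)
# against a rebalance-only expansion (layout fixed, roughly 1/(N+1) of data migrates).

def full_restripe_io_tb(usable_tb: float) -> float:
    """Every stripe is read back, parity is recomputed, and everything is rewritten."""
    return 2 * usable_tb                      # read it all + write it all


def rebalance_only_io_tb(usable_tb: float, old_node_count: int) -> float:
    """Only the slice that migrates to the new node gets read and rewritten."""
    migrated = usable_tb / (old_node_count + 1)
    return 2 * migrated


if __name__ == "__main__":
    usable, nodes = 400.0, 8                  # illustrative: 400 TB usable on 8 nodes
    print(f"full re-stripe : ~{full_restripe_io_tb(usable):.0f} TB of cluster IO")
    print(f"rebalance only : ~{rebalance_only_io_tb(usable, nodes):.0f} TB of cluster IO")
```

On those made-up numbers the full re-stripe generates roughly nine times the cluster IO, and every byte of it lands on the SSDs.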

Deduplication and compression have a big impact in making all-SSD scale-out economically viable. The problem is that compression probably uses a packed-segment approach (common in LFS schemes), which increases metadata activity and snapshot overheads, and can quietly lead to a lot of segment packing and unpacking. Then we are right back to additional SSD wear simply to offset the compression... which is there to reduce SSD wear in the first place.
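To make that wear trade-off concrete, here is a toy model. The 512KB segment size and the overwrite sizes are assumptions of mine, not figures from any shipping product:

```python
# Toy model of the packed-segment penalty (my own simplification, not any product's
# actual on-disk format): when compressed blocks are packed into fixed-size segments,
# a small logical overwrite can force the whole segment to be cleaned and rewritten,
# so the SSDs see far more writes than the host ever issued.

def worst_case_write_amp(segment_kb: int, host_write_kb: int) -> float:
    """Write amplification if one small overwrite triggers a full segment rewrite."""
    return segment_kb / host_write_kb


if __name__ == "__main__":
    for host_kb in (4, 8, 32):
        wa = worst_case_write_amp(segment_kb=512, host_write_kb=host_kb)
        print(f"{host_kb:>2} KB host overwrite -> up to ~{wa:.0f}x writes hitting flash")
```

Real systems amortize this with cleaning heuristics, but the direction is the point: the packing that buys the capacity also multiplies the writes.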

There is no magic in this world.

I am an employee of NetApp, and my comments do not reflect the position of my employer.

No biggie: EMC's XtremIO firmware upgrade 'will wipe data'

markkulacz

Reminds me of that shelf expansion issue they just fixed with XtremIO, where you have to back up all of the data, remove it, flatten the cluster, and reinstall the OS... that is just one more example.

The problem is that XtremIO is still a new product, with no code written until 2009 and no real customers until 2012. The frequency of disruptions experienced on XtremIO should be no surprise.

With an architecture that introduces significantly more coupling between the controllers due to the in-memory design, upgrade complexity is going to be an enduring challenge with XtremIO. Symmetrix/VMAX avoids that by using a simpler slice-and-dice layout; cDOT reduces it by using a more sustainable virtualized, aggregated scale-out model. An enterprise product doesn't get coded up over the course of a couple of years, and EMC is just insulting an entire industry (even many of its own employees) by suggesting otherwise.

If you need enterprise reliability and extreme performance, get a cDOT FAS solution with all flash. It is the only six-nines, flash-optimized (naturally, being an LFS variant), proven scalable storage product in the world, with more features than anyone can grasp.

I am an employee of NetApp. My comments do not reflect the opinion of my employer.

Welcome to the HYPER-converged bubble of HYPE! Enjoy it while it's here, storage folk

markkulacz

Consider that most of the time in many hyperconverged solutions, the data being accessed by a VM (or compute thread) isn't on the same physical server as the VM. Most IO will ultimately go over the interconnect. There is no relationship between which physical server a VM runs on and which servers hold all (or even some) of its data within the VSAN datastore. Isilon will attempt to co-locate, but since the object is distributed across 2-20 nodes (maybe even more), the odds that any specific IO to the "datastore" actually hits a disk on the local node are LOW.

The SSD read cache is an L2 cache (on VSAN and Isilon... coincidentally...), and it sits in the disk group "under" the interconnect layer. Not that this is a bad thing, but technically speaking, once the cluster grows large enough the compute is still not on the same physical node as the storage, even when the data is SSD read cached. The only way to resolve that is to fully mirror data objects (without letting a mirrored copy span more than one node) and then limit the compute threads that access that data to running on those nodes.
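To put a rough number on those odds, here is a quick sketch; uniform chunk placement and evenly spread access are simplifying assumptions of mine:

```python
# Rough locality estimate (my own back-of-envelope model, not a vendor formula):
# if a VM's data object is spread evenly across k of a cluster's N nodes, chosen at
# random, the chance a given IO is served by the local node is roughly
# P(local node holds a chunk) * P(this IO targets a local chunk) = (k/N) * (1/k) = 1/N.

def p_local_io(cluster_nodes: int, nodes_per_object: int) -> float:
    """Probability that a single IO lands on the node the VM is running on."""
    if not 1 <= nodes_per_object <= cluster_nodes:
        raise ValueError("object must span between 1 node and the whole cluster")
    return (nodes_per_object / cluster_nodes) * (1.0 / nodes_per_object)


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(f"{n:>2}-node cluster, object on 4 nodes: ~{p_local_io(n, 4):.1%} of IO is local")
```

The k terms cancel, so locality falls off as roughly 1/N no matter how widely the object is striped, which is exactly why "local" stops meaning much at scale.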

The interconnect is critical to get right. We have seen that 1Gb is not enough, and now 10Gb is not enough... so up to 40Gb? Isilon requires InfiniBand, as it offers reliable transport. VSAN hacked in an "RDT" driver (don't worry... you won't see it, because it is hidden within the hypervisor-converged storage stack) to simulate reliable transport over commodity Ethernet. My bet is that InfiniBand support (maybe even preference) is in the future.

I have nothing against hyperconvergence. Massive scale-out needs highly scalable ways to disperse computation across many nodes, even allowing for elasticity that stretches into the cloud. But anyone driving toward hyperconvergence should be realistic about what it is and is not, and adopt the architecture when it makes sense. The "co-location of storage and compute" really is a myth.

FIRST LOOK: Gartner gurus present all-flash prognostications

markkulacz

Re: Flash and Hybrid?

Hello Ashminder. What is that... at least 12 10Gbit links at 100% utilization? Somebody check my "up for 26 hours, no sleep" math. What kind of all-SSD storage system was this replaced with? Regarding the need for SSD, my napkin math suggests SATA would have been a far more economical choice. A SATA HDD can handle about 100 512KB writes per second. Call it 226 4TB SATA drives (let's just say 300, accounting for some type of parity), or 600 disks if the array does smaller disk transactions. But the real bottleneck would probably have been the controller compute in this case, and the PCIe/SATA (or SAS) disk connections.
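For anyone who wants to check that math when better rested, here it is in code with the same assumed per-drive figures:

```python
# Re-running the napkin math in code, using the same assumed figures: a SATA HDD
# absorbing ~100 writes per second at 512 KB each, i.e. roughly 50 MB/s of large writes.

def aggregate_gbit_s(drive_count: int, iops_per_drive: int = 100, io_kb: int = 512) -> float:
    """Aggregate large-block write bandwidth of a pool of SATA HDDs, in Gbit/s."""
    per_drive_mb_s = iops_per_drive * io_kb / 1024        # ~50 MB/s per drive
    return drive_count * per_drive_mb_s * 8 / 1000        # MB/s -> Gbit/s


if __name__ == "__main__":
    print(f"226 drives -> ~{aggregate_gbit_s(226):.0f} Gbit/s of large writes")
    print(f"300 drives -> ~{aggregate_gbit_s(300):.0f} Gbit/s (leaving headroom for parity)")
```

So ~300 large-capacity SATA spindles land right around a dozen saturated 10GbE links' worth of large writes, which is the ballpark I was gesturing at.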

I am an employee of NetApp Corp. Thoughts and comments are my own, and do not represent the opinion or position of my employer.

VSANs choking on VMware's recommended components

markkulacz

The RDT driver is no replacement for InfiniBand

Note - I am an employee of NetApp, but my thoughts here are my own and do not represent those of NetApp.

VSAN would be better off, with respect to the cluster interconnect, if it used a true InfiniBand physical cluster interconnect, like Isilon does. Using a layered "RDT" driver in the IO stack of the interconnect to abstract 1Gb Ethernet into something that provides the reliable data transport of InfiniBand is the root of many of the problems the HCL is trying to fix. Even VMware has commented in the past that Ethernet over InfiniBand is superior to native Ethernet. The 10GbE interconnect will help, but InfiniBand (or FC, or RapidIO) is the correct way to go for a tightly-coupled, scalable storage cluster providing synchronous block IO transactions on the interconnect. 10GbE is an exceptionally capable cluster interconnect for more loosely coupled storage clusters (clustered ONTAP), or storage clusters that tend to involve less small transactional IO (such as append-and-read-only Hadoop).
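To illustrate why transport latency matters so much in the tightly-coupled case, here is a small sketch; the latency figures are round numbers I am assuming for illustration, not benchmark results:

```python
# Illustrative latency budget (all figures are round numbers I am assuming, not
# measurements): in a tightly-coupled cluster doing synchronous block IO, every write
# waits on at least one interconnect round trip, so transport latency directly caps
# what a single synchronous stream can achieve, however fast the flash underneath is.

def max_sync_iops(media_latency_us: float, interconnect_rtt_us: float) -> float:
    """Upper bound on ops/sec for one synchronous stream with one IO outstanding."""
    return 1_000_000 / (media_latency_us + interconnect_rtt_us)


if __name__ == "__main__":
    flash_write_us = 100.0                                # assumed flash write service time
    links = [("InfiniBand-class RDMA", 5.0), ("well-tuned 10GbE", 40.0), ("1GbE", 200.0)]
    for name, rtt_us in links:
        print(f"{name:22s}: ~{max_sync_iops(flash_write_us, rtt_us):,.0f} IOPS per sync stream")
```

The flash service time is the same in every row; the interconnect alone decides whether a synchronous stream sees thousands of IOPS or a third of that.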

VMware versus Nutanix: With Dell charging in, it's time to end the war

markkulacz

What have I been missing?

[Employee of NetApp, thoughts are my own]

What have I missed? I have not seen rocks being thrown between VMware and Nutanix. If anything, they seem quite open and transparent that they are each on a mission to promote server-side storage... and the success of one could validate the other.

Nutanix vs VCE: Battle of the converged upstarts rages on

markkulacz

Publishing actual info on actual HW not possible

Unredeemed said - "While you're at it, use actual hardware in a repeatable benchmark and not a whitepaper."

Unfortunately, licensing agreements prevent the publishing of any performance or availability information on a product without written approval from the manufacturer. Most IT storage vendors have such policies in place these days, with EMC Corporation being the first to defend its right to enforce this policy.

[I am an employee of NetApp, but my opinions and statements are my own and not my employer's]

Dropbox defends fantastically badly timed Condoleezza Rice appointment

markkulacz

Data Condoleezzation?

Cisco reps flog Whiptail's Invicta arrays against EMC and Pure

markkulacz

Whose Domain is it anyway?

Mark Kulacz from NetApp here. My thoughts and comments do not reflect NetApp's - they are my own.

Data Domain - interesting product. Is there another product on the market that only does RAID 6? Maybe one that puts everything into a 4.5MB container, regardless of RAID group size? A limit of one RAID group per shelf? Maybe one that strives for data locality? One where the metadata discourages a true dual-controller architecture? One that doesn't use per-sector 520-byte checksums (i.e. knowing the RAID stripe checksum is bad doesn't help you know which sector went bad)? Is there another product that relies heavily on NVRAM to sort inbound data, and that is limited in stream count, or maybe even LUNs? Maybe a system where performance is limited by the CPU, and which must do all compression before data is committed to disk? Maybe a system that is essentially a true log-structured file system and demands enough compute to keep up with container cleaning? Gee, I just don't know. Probably not. Yeah, definitely not. Ya know, if you put some SSDs on a Data Domain, what would it look like? Never mind, I'm just rambling - just back from the gym and my energy drink is fading, and really wishing the contestants on The Voice would stop trying to sing Journey songs.