Are SPEC file benchmarks broken?

Recent EMC SPEC filer benchmarks have annihilated competing systems. EMC has used virtually all-flash systems to dramatically up its benchmark game, leaving disk-based rivals in the dust. Are the SPEC filer benchmarks now so unrealistic as to be worthless? If you think the question extreme just look at this pair of SPECsfs …


This topic is closed for new posts.
Anonymous Coward

??? Bizarre

I don't see how making a faster product means you are gaming the system.

Or why you need a separate benchmark for slower products. Just because they are more expensive? That's not a good enough reason.

The benchmark is exactly that. It lets you compare speeds across different implementations.

What a bizarre story...


Just proves ...

There are several kinds of lies in the computer industry:

- Lies

- Damn lies

- Marketing lies

- Benchmarks.

Anonymous Coward

It's all clear now!

Going by the recent surge in EMC advertising - such as clean spray-painting their logo and hiring advertising vans to park near their competitors' buildings - I can read this SPEC benchmark stuff with my eyes closed; you see similar behaviour all the time on the internets.

EMC is trolling.

EMC is trolling *hard*... and it's working, going by the fanfare.

What better way to ruin a game you didn't want to play in the first place?


What is the point of the SPEC, exactly?

Maybe the SPEC benchmarks should be broken down into categories.

SPEC unlimited: as it says, the benchmark as it is now - latest technology with no limits - which is a useful test. This shows us what we can reach in the future. Moore's law and all, it will become much cheaper over time.

Then the SPEC $100,000 and/or SPEC $/op test: what performance businesses can reasonably expect in the price range they'll be paying.

Then we'll have two useful benchmarks.
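The proposed $/op metric is just total configuration price divided by benchmark throughput. A minimal sketch of how the two categories would rank systems differently - all figures below are made up for the sake of the example, not taken from any published result:

```python
# Hypothetical SPEC $/op metric: total list price of the tested
# configuration divided by its peak SPECsfs ops/sec.
# All numbers here are invented for illustration.

def dollars_per_op(total_price_usd: float, peak_ops_per_sec: float) -> float:
    """Price/performance: lower is better."""
    return total_price_usd / peak_ops_per_sec

# A multi-million-dollar flash monster vs. a modest disk-based filer.
halo_box = dollars_per_op(6_000_000, 1_000_000)  # 6.0 $/op
value_box = dollars_per_op(100_000, 40_000)      # 2.5 $/op

# The record holder wins the "unlimited" category on raw ops,
# but the small box wins the $/op category.
```

The point of the second category is exactly this inversion: a headline-grabbing configuration can lose badly once price enters the metric.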


Ummmmm... Nah.

Not really... the price/performance benchmark doesn't really cut it either.

The problem is that all kinds of weird lease and support contracts sneak into this.

Look, for example, at the cost of the Oracle Database on all TPC benchmarks. It's a lease for 3 years only, because that is all the cost picture has to cover on that benchmark.

The truth is that you have to know what the F-word you are talking about when you look at benchmarks and try to interpret the benchmark values.

The big problem is that pointy-haired managers don't really understand sh*t unless it's reduced to... *Oh*, this number is bigger than that number... *Oh oh*.

// jesper

Anonymous Coward

Pricing format may not be perfect, but it sure helps..

The pricing info is still very helpful for comparison purposes. The discounts and pricing basis must be clearly stated. Some info is *far* better than no info. There is clear incentive to keep costs under control in a benchmark configuration when publishing benchmarks that require price/perf metrics.

Real customers do not have unlimited budgets. (If you are a real customer with an unlimited budget, I really want to get to know you...)

Thumb Down

Flash or drives, it's non-volatile fast storage

The major point is that it's non-volatile storage, not how exactly this is achieved. This is what makes it different from a huge memory pool, say. Of course there should be a price element, as well as an operational element (electricity used, cooling, wear and tear etc.) in order to make a buying decision, but running the benchmark is in itself clearly fair. It's as if the people selling punch cards said that disk drives shouldn't be allowed to compete against them because they're more expensive (although they're faster).

Gold badge

I vote for cost

I vote for the spec to have a cost attached to it that anyone could reasonably purchase the tested system for. It could make for much more interesting benchmarks.

If I were to ignore all the systems costing multi-millions, then vendors would also have to list benchmarks for lower-cost systems, or I would just ignore all of their benchmarks.


Nothing wrong

There's nothing wrong with the benchmark, it's just people's interpretation of that benchmark.

Just because company A built a system that holds the benchmark record doesn't mean that any system bought from company A will be better than one from company B.

The value of benchmarks where you're not just buying off the shelf is pretty insignificant. Unless you're benchmarking the actual systems you're intending to buy then these results are meaningless comparisons between systems with (I suspect) vastly different costs.


Wrong emphasis

Typically storage benchmarks focus far too much on throughput and IOPS and not enough on latency. It's very common to have major systems where the limiting factor on application throughput, batch runtime or transactional response time comes down simply to how long it takes a storage array to perform a random read (good enterprise arrays being capable of caching all writes and optimising read-aheads). Flash in an array can reduce SAN random read latency by an order of magnitude, even given the protocol overheads.

So what we need are benchmarks capable of demonstrating the impact of I/O latency on typical applications, not simply mega-large numbers based on I/O patterns that are unachievable in real workloads. As ever more complex transactions are mashed up together, it's the latency issue that comes increasingly to the fore. That's not what most storage benchmarks are aimed at.

nb. that's not to say that gross throughput isn't an issue at times - storage arrays often fell woefully short of the theoretical capabilities of all the disks installed, due to internal bottlenecks, even before flash appeared on the scene. However, it isn't always the relevant issue.
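The latency argument above can be made concrete with Little's law: at a fixed level of concurrency, achievable throughput is bounded by queue depth divided by per-request latency. A minimal sketch, with illustrative latency figures (not measurements from the source):

```python
# Little's law sketch: throughput = concurrency / latency.
# At a fixed queue depth, random-read latency caps achievable IOPS,
# regardless of the array's headline throughput number.
# The latency values below are illustrative, not measured.

def max_iops(queue_depth: int, read_latency_s: float) -> float:
    """Upper bound on IOPS for a given concurrency and per-read latency."""
    return queue_depth / read_latency_s

disk_iops = max_iops(32, 0.005)    # ~5 ms random read  -> ~6,400 IOPS
flash_iops = max_iops(32, 0.0005)  # ~0.5 ms random read -> ~64,000 IOPS
```

An order-of-magnitude cut in latency yields an order-of-magnitude gain in IOPS at the same concurrency, which is the effect the poster attributes to flash - and it is invisible in a benchmark that only reports peak throughput.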

Thumb Up

Brings back memories...

A nice time to revisit:

The issue is not flash vs. disk. Systems with flash (and mixes of flash/disk) are becoming commonplace, and will top the charts until the next great thing comes along. Folks will just have to read the fine print in the configs to understand what is under the …

I agree 100% with the previous poster who mentioned a serious shortage of tests that look at latency at application levels.


Simply another useful analysis input.

I come from the 'more information is better' school and welcome ALL benchmarks that include published details.

Yes, it would be useful to include pricing metrics - but honestly, these change quite rapidly from the date of publication, and differ internationally, with local pricing affected by shipping, exchange rates and taxes.

I have used SPEC benchmarks (since LADDIS) since they began, and they have ALWAYS provided good insight into comparative performance and sizing. They also give a flavour, within a vendor, of their range, and highlight what is needed to get the best results.

So the SPEC NFS benchmark was not broken when NetApp published a million IOPS in May 2006, and it is certainly not broken now that EMC has published half that, 5 years later, on a newer version of the benchmark (yes, I know they are not comparable - but for marketing, a million trumps half a million).

The item above highlights the problem: the marketing use, which is purely aimed at producing a winner, and which is only relevant to the single biggest, baddest configuration that has been lashed together.

Properly using the benchmarks provides access to information on entry-level boxes, comparisons of midrange systems, and comparisons of the use of flash, fast and slow disk, and differing caching approaches, including 'super cache extensions' using flash.

Despite the fuss about the 'pure flash' entry, it is simply another documented entry to compare - and gives a good reality check on what you will get from those expensive drives.

Thumb Up

EFD is the future... AND... customers can now see who's set up for it and who's not

Customers tell me this benchmark is useful because it lets them see clearly which vendors offer storage architectures designed with EFD in mind, versus those that are clearly resisting the inevitable transition taking place away from an ancient technology (disk drives) to EFD.

Remember, an EFD was 40X the cost of an FC drive (measured per GB) less than 3 years ago, when EMC brought it to the enterprise storage array market. Today EFD has dropped to merely 6-8X the price. Over the next 3 years (and a new array is expected to last at least that long), EFD may be only 2X the price - or, who knows, even less. The question is simple: which storage systems are set up for the future, and which will fight it tooth and nail because they only have an affinity for spinning rust?
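Taking the poster's own figures at face value, the premium fell from 40X to roughly 7X (the midpoint of 6-8X is an assumption) over three years; extrapolating the same decline rate for another three years is a crude way to sanity-check the "2X or even less" guess. A back-of-the-envelope sketch:

```python
# Back-of-the-envelope extrapolation of the EFD-vs-FC price premium,
# using the figures quoted above: 40X about three years ago, ~7X today
# (7X is an assumed midpoint of the 6-8X range). Purely illustrative.

past_premium = 40.0    # EFD $/GB vs FC drive, ~3 years ago
today_premium = 7.0    # assumed midpoint of the quoted 6-8X

shrink_per_period = today_premium / past_premium  # 0.175x per 3 years
projected_premium = today_premium * shrink_per_period  # in 3 more years
```

At that rate the premium lands near 1.2X, so the poster's "2X or who knows even less" is, if anything, conservative under this naive constant-rate assumption.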

NetApp is whining about this benchmark because they simply can't make EFD as a tier perform OR scale like this in their FAS systems. The max number of EFDs in a NetApp box is only 96 100GB EFDs, with no support for 200GB or 400GB EFDs. Perhaps their new LSI arrays will help them better prepare for the next wave? If what I'm asserting about WAFL not playing nice with EFDs as a tier is wrong, why not clear the air and just fill their systems to the max with 96 EFDs and run the same benchmark?


Vive le Debate

The REAL value of these benchmarks?

The debate.

Many will remember that I have repeatedly challenged the value of benchmarks over my entire career, often making many of the same arguments being made here. Each time, I would be chastised that I had no justification for my position, since EMC did not participate in the practice of benchmarketing.

Well, now they do, with a vengeance.

And still my arguments against benchmarks remain the same - I have not changed my position. Just now, I have a bona-fide soapbox from which to proclaim them.

Glad to see there are so many fewer Deaf Ears in the anti-benchmarking community.

Welcome aboard!


