Ingres descendant Actian says its Vectorwise analytics database tech doesn't need to rely on a flash memory boost: it uses multicore x86 features so well it's more than twice as fast as Oracle and SQL Server, and uses server, storage and networking hardware up to 40 times cheaper - or so we're told. Actian is actually Ingres, a …
interviewed with them - nice bunch
I interviewed with them, as a developer, and was offered a place (I ended up going to Sweden instead, different offer).
It's a small team, but they're bloody good.
I thought the line of research they were taking was viable and modern.
I have no need right now to replace anything which already exists, but if I were starting a new project, I'd try this. I think becoming familiar with it, so it can easily be used in the future, would be well worth it; it's good to have in the toolbox.
A benchmark you missed
Since PostgreSQL also shares common root code with Ingres it would have been interesting if they'd also benchmarked PostgreSQL 9.1 on the same kit as Actian.
Re: A benchmark you missed
Every DBMS (exaggeration) borrowed from INGRES of one vintage or another. Not always actual code.
Postgres is quite nice but this is a big shift and there has already been divergence over many years.
This article is exciting to me because large caches have traditionally been LESS than useless for at least some database cases I care about. I can believe these techniques might improve things significantly.
In my opinion INGRES suffers from the same-name effect. "I know all about INGRES, I used it in 2002" or "Postgres and INGRES shared a lot of ideas and code in 1990 so it must be similar in performance".
Probably why they did the Vectorwise/Actian name change, which I really didn't understand at the time.
Can't believe this without solid evidence
and that graph isn't it.
The biggest speedup for complex queries is likely to be through the optimiser, like wot rewrites queries to make them more efficient and decides what resources (like indexes) are used, and how.
I wouldn't be surprised if little attention was paid to cache in dbs because I can't see how it matters typically; if the db is not disk constrained then it is memory constrained, and "may result in the average computation against a single data value taking less than a single CPU cycle" hardly gets round that.
There's also contention due to locking etc.
Also the tpc does a lot of writes, which is very much a bottom line. You can't (I think) get round that except with faster hardware.
So it doesn't add up. It's possible they use a data structure which is better for tpc but less good in other areas (not a btree), or some other dodge. This is all too dubious.
(I know nothing about tpc benchmarking, fyi)
Strength of evidence and missing benchmarks
Well, TPC-H is more about warehousing type of queries, it is no TPC-C, and I believe the article tells only one half of the story, that about CPU use optimisation. The other half is columnwise data storage, and that helps optimise I/O.
Standard Postgres does not have column storage, so it would fare no better than Oracle, but its derivatives Netezza and Greenplum just might. Then again, Sybase IQ is on the list, with rather unimpressive results.
Also, it would be interesting to see how the CWI's own column store database MonetDB performs.
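To make the column storage point a bit more concrete, here is a rough sketch of why a column layout cuts I/O for warehouse-style aggregates. Everything in it is an assumption for illustration: the four-field table, the field widths and the function names are made up, and it says nothing about how Vectorwise or any other product actually lays out data on disk. It only counts how many bytes a scan has to touch to sum one field.

```python
import struct

# Hypothetical order table: (order_id, customer_id, quantity, price).
# All values are invented for illustration.
rows = [(i, i % 100, i % 7 + 1, 9.99 + i) for i in range(1000)]

ROW_FORMAT = "<iiid"                       # three 4-byte ints + one 8-byte float, no padding
COLUMN_FORMATS = ("<i", "<i", "<i", "<d")  # same fields, one format per column

# Row store: all fields of a row sit next to each other.
row_store = b"".join(struct.pack(ROW_FORMAT, *r) for r in rows)

# Column store: each column is packed into its own contiguous buffer.
column_store = [
    b"".join(struct.pack(fmt, r[c]) for r in rows)
    for c, fmt in enumerate(COLUMN_FORMATS)
]


def sum_quantity_row_store():
    """Sum the quantity field from the row layout; returns (total, bytes scanned)."""
    row_size = struct.calcsize(ROW_FORMAT)
    total = 0
    for offset in range(0, len(row_store), row_size):
        # Every whole row has to be read to reach one field.
        total += struct.unpack_from(ROW_FORMAT, row_store, offset)[2]
    return total, len(row_store)


def sum_quantity_column_store():
    """Sum the quantity field from the column layout; returns (total, bytes scanned)."""
    quantities = column_store[2]
    total = sum(value for (value,) in struct.iter_unpack("<i", quantities))
    return total, len(quantities)


row_total, row_bytes = sum_quantity_row_store()
col_total, col_bytes = sum_quantity_column_store()
print(f"row store:    total={row_total}, bytes scanned={row_bytes}")
print(f"column store: total={col_total}, bytes scanned={col_bytes}")
```

Both scans give the same answer, but in this toy layout the column version touches only the 4-byte quantity buffer rather than every 20-byte row. Real engines add compression and block-level I/O on top, so the actual ratio will differ.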
"cheapo database kit" is extremely misleading
Their software license costs MUCH more than the hardware: for example, for a 1TB database the 3-year (not permanent!) license costs $225,000 plus $67,500 maintenance (http://tpc.org/tpch/results/tpch_result_detail.asp?id=112060401). For SQL Server it's more like $16,000 (for both SQL Server and Windows).
And since they've never published a 3TB benchmark, it's unclear how fast (if at all) it'll run at bigger data sizes.
So while performance at 1TB is excellent and hardware requirements very modest, Actian's greed severely limits widespread use [and no, I don't work for any of their competitors either directly or indirectly]
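For anyone who wants the gap spelled out, a quick back-of-the-envelope calculation using only the figures quoted in this comment. The $16,000 is the rough "more like" number given above rather than a published price, and the variable names are just for illustration.

```python
# Figures as quoted in the comment (USD, 1TB TPC-H result, 3-year term).
actian_license = 225_000
actian_maintenance = 67_500
sql_server_and_windows = 16_000  # the commenter's rough figure, not a list price

actian_total = actian_license + actian_maintenance
ratio = actian_total / sql_server_and_windows

print(f"Actian software over 3 years: ${actian_total:,}")
print(f"Roughly {ratio:.1f}x the quoted SQL Server + Windows figure")
```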
Note that the benchmark says "non-clustered". If you look at the TPC-H benchmarks website you'll see that Exasol beats the pants off Actian (and everybody else who bothers to benchmark - most don't) but it's clustered so that doesn't count?! In any case, these sorts of benchmarks have little real-life value.
Two rather well-known database servers were very conspicuous by their absence from those benchmarks.
Anyway, I bet the licensing ends up costing you more than simply buying faster hardware to run PostgreSQL or MySQL (with whatever compile-time optimisations you might choose to apply) on. Particularly as said hardware will still be running just as fast when a licence comes up for renewal.