IBM and Oracle agree about little these days, and they are coming at it from different angles, but both IT giants believe that some companies don't want general-purpose machines; they want machines tuned to run a specific stack of software for a particular kind of workload. IBM calls them "workload optimised systems" and Oracle …
Exadata: 2 Grids, 2 sets of roles.
>The Exadata storage nodes compress database files using a hybrid columnar algorithm so they take up less space and can be searched more quickly. They also run a chunk of the Oracle 11g code, pre-processing SQL queries on this compressed data before passing it off to the full-on 11g database nodes.
Exadata cells do not compress data. Data compression is done at load time (in the direct path) and compression (all varieties, not just HCC) is code executed only on the RAC grid CPUs. Exadata users get no CPU help from the 168 cores in the storage grid when it comes to compressing data.
Exadata cells can, however, decompress HCC data (but not the other types of compressed data). I wrote "can" because cells monitor how busy they are and are constantly notified by the RAC servers about their respective CPU utilization. Since decompressing HCC data is murderously CPU-intensive the cells easily go processor-bound. At that time cells switch to "pass-through" mode shipping up to 40% of the HCC blocks to the RAC grid in compressed form. Unfortunately there are more CPUs in the storage grid than the RAC grid. There is a lot of writing on this matter on my blog and in the Expert Oracle Exadata book (Apress).
Also, while there are indeed 40GB DDR InfiniBand paths to/from the RAC grid and the storage grid, there is only 3.2GB/s usable bandwidth for application payload between these grids. Therefore, the aggregate maximum data flow between the RAC grid and the cells is 25.6GB/s (3.2x8). There are 8 IB HCAs in either X2 model as well so the figure sticks for both. In the HP Oracle Database Machine days that figure was 12.8GB/s.
With a maximum of 25.6 GB/s for application payload (Oracle's iDB protocol as it is called) one has to quickly do the math to see the mandatory data reduction rate in storage. That is, if only 25.6 GB/s fits through the network between these two grids yet a full rack can scan combined HDD+FLASH at 75 GB/s then you have to write SQL that throws away at least 66% of the data that comes off disk. Now, I'll be the first to point out that 66% payload reduction from cells is common. Indeed, the cells filter (WHERE predicate) and project columns (only the cited and join columns need to be shipped). However, compression changes all of that.
If scanning HCC data on a full rack Exadata configuration, and that data is compressed at the commonly cited compression ratio of 10:1, then the "effective" scan rate is 750GB/s. Now use the same predicates and cite the same columns and you'll get 66% reduced payload--or 255GB/s that needs to flow over iDB. That's about 10x over-subscription of the available 25.6 GB/s iDB bandwidth. When this occurs, I/O is throttled. That is, if the filtered/projected data produced by the cells is greater than 25.6GB/s then I/O wanes. Don't expect 10x query speedup because the product only has to perform 10% of the I/O it would in the non-compressed case (given an HCC compression ratio of 10:1).
That is how the product works. So long as your service levels are met, fine. Just don't expect to see 75GB/s of HCC storage throughput with complex queries because this asymmetrical MPP architecture (Exadata) cannot scale that way (for more info see: http://bit.ly/tFauDA )
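The bandwidth arithmetic in the comment above can be checked with a short sketch. All figures are the ones quoted in the comment (3.2GB/s per HCA, 8 HCAs, 75GB/s scan rate, 10:1 HCC ratio, 66% payload reduction); nothing here is measured, and the function names are made up for illustration.

```python
# Figures as quoted in the comment for a full-rack Exadata X2; not measured here.
IDB_PER_HCA_GBPS = 3.2   # usable application payload per InfiniBand HCA
HCA_COUNT = 8            # HCAs between the RAC grid and the storage grid
SCAN_RATE_GBPS = 75.0    # combined HDD + flash scan rate
HCC_RATIO = 10.0         # commonly cited HCC compression ratio
REDUCTION = 0.66         # share of data removed by filtering and projection


def idb_bandwidth(per_hca=IDB_PER_HCA_GBPS, hcas=HCA_COUNT):
    """Aggregate iDB payload bandwidth between the two grids, in GB/s."""
    return per_hca * hcas


def required_reduction(scan_rate, link_rate):
    """Fraction of scanned data the cells must discard to fit the link."""
    return max(0.0, 1.0 - link_rate / scan_rate)


def oversubscription(scan_rate, compression_ratio, reduction, link_rate):
    """How many times the post-filter payload exceeds the link bandwidth."""
    effective_scan = scan_rate * compression_ratio   # uncompressed equivalent
    payload = effective_scan * (1.0 - reduction)     # what must cross iDB
    return payload / link_rate


link = idb_bandwidth()
print(f"iDB bandwidth: {link:.1f} GB/s")
print(f"required reduction, uncompressed: {required_reduction(SCAN_RATE_GBPS, link):.0%}")
print(f"over-subscription with HCC: {oversubscription(SCAN_RATE_GBPS, HCC_RATIO, REDUCTION, link):.1f}x")
```

With the quoted inputs the last line comes out just under 10x, which is the over-subscription the comment describes.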
nobody runs “reports” that perform 5 TB of disk I/O
Are you sure? We don't ATM but that's just because it will take forever (although some of our reports are allowed to run for _days_, 5TB is not yet within sight). And if we used Google tech and stored CDRs (Call Detail Records) on 50000 cheap servers, then we could, and the said 5TB report could finish in 1sec BTW (100MB of CDR data on each server can be processed in 1 second, if the weather and moon phase are right :)).
IBM and Oracle, but where is HP?
You can argue who is better but there is one thing there is no argument about and that is HP is missing.
Itanium is contracted for only two more chips which Intel is not investing to make competitive.
Project Odyssey is the 2014 project to get x86 into the Superdome chassis to replace Itanium and HP-UX with Linux.
And Meg is doing a firesale of TouchPads on eBay and bragging about Project Moonshot which is not even their technology.
pureScale also on Linux
>> DB2 PureScale.. is available only on Power Systems at the moment – and only running AIX.
This is not correct. It is also available on Linux, but only officially supported with very specific hardware and software stacks.
The hardware in question is all currently IBM xSeries -
System x3650 M3
System x3690 X5
System x3850 X5
IBM BladeCenter HS22
The only Linux officially supported is SuSE Linux Enterprise Server.
I know of at least one company which will be running production pureScale on Linux before year end (if they are not already doing so).
Interestingly, DB2 Workgroup Server Edition licenses give you pureScale licenses included (for a modest number of CPUs: just enough to run a two-node cluster), whereas it is an extra-cost option for Enterprise Server licenses. Obtaining some WSE licenses would be a very cost-effective way to get your feet wet with what I expect will become an increasingly important technology.
purescale on linux...
"There was talk in October 2009, when PureScale was announced as Oracle was Sunning up its Exadata clusters, that PureScale would be ported to Windows and Linux systems, but this has not happened."
Or it did happen? It's good to check before posting the article. DB2 pureScale on System x servers:
Damning with faint praise
"And it did a pretty good job at differential diagnosis, even if it is still not quite as convincing as House".
Considering House and his high-powered team rarely get the right diagnosis until the fourth or fifth wild guess - and that their patients often have near-death experiences en route - that's not as big a compliment as you might think.
Re-distributions and head nodes
How does Exadata execute a large fact:fact table join with non co-located data?
Either fact data must be re-distributed at query time between the storage nodes, so that matching rows exist in the same location, or the data is shipped from the storage nodes to a single head/master node for collation...which defeats the object of a parallel system, surely?
Can any system that ships data to a head node for collation/sort be truly regarded as MPP???
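The first option raised in the question, query-time redistribution, can be illustrated with a generic hash repartition join. This is only a sketch of the general technique and does not describe how Exadata executes joins; the node count, the row layout and all function and variable names are hypothetical.

```python
def redistribute(partitions, key_index, node_count):
    """Send every row to the node chosen by hashing its join key."""
    shuffled = [[] for _ in range(node_count)]
    for rows in partitions:
        for row in rows:
            shuffled[hash(row[key_index]) % node_count].append(row)
    return shuffled


def local_hash_join(left_rows, right_rows, left_key, right_key):
    """Join the rows that ended up on one node, using an in-memory hash table."""
    index = {}
    for row in left_rows:
        index.setdefault(row[left_key], []).append(row)
    joined = []
    for row in right_rows:
        for match in index.get(row[right_key], ()):
            joined.append(match + row)
    return joined


def parallel_join(left_parts, right_parts, left_key, right_key, node_count):
    """Repartition both inputs on the join key, then join node by node."""
    left = redistribute(left_parts, left_key, node_count)
    right = redistribute(right_parts, right_key, node_count)
    joined = []
    for node in range(node_count):
        # Each node could run this step independently; the results are
        # gathered into one list here only so the example can print them.
        joined.extend(local_hash_join(left[node], right[node], left_key, right_key))
    return joined


# Two hypothetical fact tables spread over two nodes with no co-location.
calls = [[(1, "call-a"), (2, "call-b")], [(3, "call-c"), (1, "call-d")]]
bills = [[(3, "bill-x")], [(1, "bill-y"), (2, "bill-z")]]
result = sorted(parallel_join(calls, bills, 0, 0, node_count=2))
print(result)
```

After the shuffle, rows with the same key sit on the same node, so no single head node has to collate the whole join; the cost moves to the network traffic of the redistribution step.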
join of non co-located data
I suspect it's impossible to do efficiently with the MapReduce family of algorithms at all. For such kinds of tasks there are plenty of multimillion "superdomes" and the like on the market...
IBM/SAP HANA APPLIANCE
One of the most interesting IBM appliances was not mentioned - the HANA in-memory database developed with SAP. Since IBM uses GPFS in its version, it ultimately will be able to do multiple engines like DB2 pureScale. It also runs on Linux. It is not available for general purpose use yet, but a version that sits under SAP BW will soon be available, to be followed later by versions that run the entire SAP stack.
PureScale is available on Linux platforms...
...and has been for quite some time. If I am wrong then my blog has a pretty silly name... https://www.ibm.com/developerworks/mydeveloperworks/blogs/pureScaleOnLinux/?lang=en
Couple of IBM clarifications
Good article, Tim. A couple of clarifications given recent updates to the IBM portfolio.
- DB2 pureScale is supported on Linux on several System x models, including BladeCenter (as others have commented)
- We recently updated the 9600 model of Smart Analytics System to 9700 (on z196) and added a new 9710 model (on z114)
- A new DB2 Analytics Accelerator is also available for these or other System z environments - it integrates a Netezza appliance transparently to the DB2 applications or users to deliver top analytic query performance while maintaining the unmatched security and confidence of System z data management
- It is important to note that the x/Linux based Smart Analytics System 5710 includes Cognos Reporting and InfoSphere Warehouse software already on the server, with storage - all starting at around US$50K. Others in the market are offering products called "appliances" that do not include any software for the starting price quoted.
I will do a better job of reaching out to you directly to brief you on our next updates when they are ready.
"Oracle has sold 1,000 of these Exadata machines to date and says it will sell another 3,000 before its fiscal 2012 ends in May."
No, no, no.... If you listen to Oracle, they (as always) choose their words very carefully. They have "installed" 1,000 Exadatas, not "sold" 1,000 Exadatas. Oracle is giving away Exadata to large customers and then counting them as "installed." Often times they are wrapped into a software true-up deal. Often times they are demos.
Second, the 3,000 number is, again, an "install" number and that is based on sales pipeline reports. Oracle can make the pipeline look however it wants to make it look. They just tell their sales force that they need to have at least one Exadata opportunity in the CRM.
The number of customers that have paid anything close to full price for an Exadata is extremely small.
re: Another Clarification
You make your statements as if they are facts. They sound like sour-grapes FUD to me. Ask any Oracle customer how much they have ever gotten for free from Oracle. I'm sure you'll get a very hearty laugh.
Oracle won't even put a free unit on site for you to try out and maybe buy. They will however let you buy a unit and then try it out. Does any other vendor really do "Buy and Try" like Oracle?