In this series we're looking at the myths and legends of the database world; some turn out to be true, others false. This myth is about why we use OLAP. If you follow the Inmon model, you use a relational data warehouse for flexibility and OLAP cubes in the data marts for the speed. On the other hand, if you follow Kimball, you …
Stating the Obvious
I can't help but feel that this article was a very long-winded way of stating the obvious!
Where does the myth come from in the first place? I've never heard of it after years of working with OLAP.
The only myth I've heard is the exact opposite.
You sometimes get that special breed of ignorant Relational DBA expressing shock about how slow the batch cycle time of an OLAP product can sometimes be (conveniently ignoring the fact that the same can be true of a Relational DB).
You then have to give the standard 'advantages/disadvantages of multidimensionality lecture'.
Interesting that Cobb was at Hyperion for a bit way back - didn't know that.
Obvious to some, certainly.
If you fully understand OLAP then, by definition, you must understand that this myth is invalid. No question.
However not everyone does. My experience is that people who are currently learning OLAP often believe that it is just about speed; and it is easy to see why.
OLAP has two main advantages: speed and the ability to present data in a multi-dimensional way. If you have been brought up on relational databases then the first advantage makes immediate sense. You know that users always want increased performance, so this advantage sticks out like a sore thumb. The bit about multi-dimensional access to the data... well, until you actually get your brain around what this means (and that often takes considerable time), it is difficult to see this as an advantage. Indeed, if you are familiar with relational tables and are happy using SQL to query them, then a whole new data structure, with a whole new language (MDX), can seem like a positive disadvantage.
I guess it all depends on the people that you meet. I certainly meet people with this view in the training work I do but, even in my consultancy work, I still meet people who subscribe to this myth.
One acid test is to search the groups for a phrase like “OLAP is about”. You’ll certainly find people there who believe (and are telling others) that OLAP is about speed and aggregations.
OLAP is not just MOLAP!
Like very many articles on the subject, you use the acronym OLAP when you are actually talking about MOLAP i.e. multi-dimensional OLAP. There are other forms of OLAP such as relational OLAP (ROLAP) and hybrid OLAP (HOLAP) out there as well...but you knew that :-)
In addition to presenting a conceptually multi-dimensional view of the data to the business user, the MOLAP approach differs from the relational approach in two key areas - data aggregation and the (normally partial) pre-calculation of query results.
It is as a result of aggregation and results pre-calculation that we get the speed-up when compared to a relational system. As you point out, this speed-up is traded against the query flexibility offered by the 'go anywhere' relational schema.
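To make that trade-off concrete, here is a minimal Python sketch, not how any particular MOLAP engine works. The fact rows, dimension names and function names are all made up for illustration. It compares scanning the detail rows at query time with looking up a total that was calculated in advance for every combination of dimensions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, year, sales)
FACTS = [
    ("North", "Widgets", 2005, 120.0),
    ("North", "Gadgets", 2005, 80.0),
    ("South", "Widgets", 2005, 200.0),
    ("South", "Widgets", 2006, 50.0),
]
DIMENSIONS = ("region", "product", "year")


def scan_total(facts, **filters):
    """Relational-style answer: scan every detail row at query time."""
    total = 0.0
    for row in facts:
        record = dict(zip(DIMENSIONS, row[:3]))
        if all(record[dim] == value for dim, value in filters.items()):
            total += row[3]
    return total


def build_aggregates(facts):
    """Pre-calculate total sales for every combination of dimension values."""
    aggregates = defaultdict(float)
    for row in facts:
        record = dict(zip(DIMENSIONS, row[:3]))
        for size in range(len(DIMENSIONS) + 1):
            # combinations() keeps DIMENSIONS order, so keys are canonical
            for dims in combinations(DIMENSIONS, size):
                key = tuple((dim, record[dim]) for dim in dims)
                aggregates[key] += row[3]
    return aggregates


def lookup_total(aggregates, **filters):
    """Cube-style answer: a single dictionary lookup, no scan of detail rows."""
    key = tuple((dim, filters[dim]) for dim in DIMENSIONS if dim in filters)
    return aggregates.get(key, 0.0)


aggregates = build_aggregates(FACTS)
print(scan_total(FACTS, region="South"))         # computed by scanning
print(lookup_total(aggregates, region="South"))  # read from the pre-built totals
```

The sketch pre-calculates everything, which is only feasible for a toy data set. The comment above describes the pre-calculation as normally partial, and the lookup can only answer the questions that were anticipated when the aggregates were built, which is the flexibility being traded away.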
The majority of queries are from 'farmers' rather than 'explorers', and this group can normally be provided with access to a series of MOLAP cubes that support the predictable KPI style reports that they run repeatedly. Explorers are more typically provided with access to detail level data in the relational schema.
Interestingly, it is OLAP users themselves who cite consistently fast queries as far and away the most important benefit of deploying a MOLAP solution. In fact, it is cited as more important than all other benefits combined, if the OLAP Report is to be believed, and I have no reason not to!
Re: OLAP is not just MOLAP.
> but you knew that :-)
Yes I did and, without wishing to be contentious, I don’t think I was using OLAP to mean MOLAP. All three flavours of OLAP are inherently multi-dimensional, at least in terms of the way in which users can visualise their data. The proof of that is easy. If you use a front-end tool (such as ProClarity) then the user experience is absolutely identical (dimensions and measures) no matter which flavour of OLAP is used. Where MOLAP, ROLAP and HOLAP do differ is in terms of physical implementation; but that is typically completely transparent to the user.
Part of the difficulty in providing even multidimensional data is that most consumers of reports are not quite accustomed to navigating multidimensionally. The real power of multidimensional analysis is that it can lead analysts not only to answers but also to answers about the master data itself.
For example, if you have a customer dimension that breaks out your customers by industry, some multidimensional analysis might eventually show that your most expensive customers are in health care and transportation. The real power of OLAP is that you can now suggest breaking out health care and transportation into smaller sectors than currently exist in your transactional data. That is to say, you should evolve your dimensions as conditions change.
A proper OLAP system allows for a rapid evolution of master data based upon the results of prior analysis.
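As a rough illustration of that feedback loop, the following Python sketch rolls costs up by industry and then again after the customer dimension has been refined. Every customer name, sub-sector and figure here is invented for the example, and real OLAP tools manage dimensions quite differently.

```python
from collections import defaultdict

# Hypothetical customer dimension: customer -> industry
CUSTOMER_INDUSTRY = {
    "Acme Clinics": "Health Care",
    "MediSupply": "Health Care",
    "FastFreight": "Transportation",
    "CityBank": "Finance",
}

# Hypothetical facts: (customer, cost to serve)
COSTS = [
    ("Acme Clinics", 900.0),
    ("MediSupply", 700.0),
    ("FastFreight", 800.0),
    ("CityBank", 300.0),
]

# Finer sectors suggested by an analyst after seeing the first roll-up
SUB_SECTORS = {
    "Acme Clinics": "Health Care / Providers",
    "MediSupply": "Health Care / Suppliers",
    "FastFreight": "Transportation / Road Freight",
}


def roll_up(costs, dimension):
    """Total the costs by whatever grouping the dimension assigns."""
    totals = defaultdict(float)
    for customer, cost in costs:
        totals[dimension[customer]] += cost
    return dict(totals)


def refine(dimension, overrides):
    """Return a new dimension with some members moved to finer groups."""
    refined = dict(dimension)
    refined.update(overrides)
    return refined


print(roll_up(COSTS, CUSTOMER_INDUSTRY))
print(roll_up(COSTS, refine(CUSTOMER_INDUSTRY, SUB_SECTORS)))
```

The facts never change; only the dimension does, so the same costs can be re-examined under the new breakdown straight away.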
Very few implementations get so far as to manage the master data question from the results of OLAP. It is more likely that companies will throw all attributes willy-nilly into a huge ODS and then 'boil the ocean' via data mining. This can yield good results, but nothing really compares to human analysis and feedback. This is the real strength of OLAP and why I think it will continue to be useful in the future.
Calculations and slices
It's not just about speed; it also allows for calculations and calculated columns (members), plus arranging reports by dragging and dropping dimensions.
It's very interactive.
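A small Python sketch of those two ideas, using made-up cells and function names rather than any real OLAP API: a calculated member derived from stored measures, and the same data rearranged by swapping which dimension runs down the rows.

```python
from collections import defaultdict

# Hypothetical cube cells: (region, year) -> stored measures
CELLS = {
    ("North", 2005): {"sales": 200.0, "cost": 150.0},
    ("North", 2006): {"sales": 300.0, "cost": 180.0},
    ("South", 2005): {"sales": 100.0, "cost": 90.0},
}


def with_margin(cells):
    """Add a calculated member, margin = sales - cost, to every cell."""
    return {
        key: {**measures, "margin": measures["sales"] - measures["cost"]}
        for key, measures in cells.items()
    }


def pivot(cells, measure, rows_axis):
    """Lay out one measure as a grid; rows_axis 0 = region, 1 = year."""
    grid = defaultdict(dict)
    for key, measures in cells.items():
        row, column = key[rows_axis], key[1 - rows_axis]
        grid[row][column] = measures[measure]
    return dict(grid)


cube = with_margin(CELLS)
print(pivot(cube, "margin", rows_axis=0))  # regions down the side
print(pivot(cube, "margin", rows_axis=1))  # years down the side
```

Changing `rows_axis` is the code equivalent of dragging a dimension from the columns to the rows in a front-end tool.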