IBM gets handle on unstructured data

It is perhaps easy to assume that the notion of BI (business intelligence) for the masses - or 'DIYBI', as espoused here - is most likely to involve a sawn-off version of an existing BI tool, probably a mature one where the development costs have already been recovered. In practice, this is somewhat less than likely, if only …


acronyms galore without expansion

A truly wonderful article which launches straight into deep technical terms without explaining anything. My eyes glazed over faster than Steve Ballmer breaking a chair over Steve Jobs's head.

Not really. But it is a fluff piece.

I think the main acronym is DIYBI, or "Do It Yourself" BI (where BI is the industry term "Business Intelligence").

Nelson is actually one of the better guys in the lab, but I do agree that this article was a boring fluff piece.

I think the point is that IBM is extending their BI vision and that since more "unstructured" enterprise data is being captured, there needs to be a way to drill down and find meaning in that data.

I think that IBM is on the right track. However, a lot of the "unstructured" data is industry specific, if not enterprise specific, and trying to create a "standardized toolkit" is about as far as you can go. Really it would be more of a toolkit for recognizing the patterns in the "unstructured" data.

Using the Googling of web pages to find information as an example, the toolkit could consist of some knowledge of HTML structure and an indexing scheme. It is this form of "intelligence" which is needed.
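To make that concrete, here is a rough sketch of what "HTML structure knowledge plus an indexing scheme" might look like. It is only an illustration under my own assumptions, not anything IBM has described: the tag weights, the `WeightedTextParser` class and the `build_index` and `search` helpers are all made-up names, and the sketch uses nothing beyond Python's standard library.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Assumed weights: words inside these tags count more than plain body text.
TAG_WEIGHTS = {"title": 3, "h1": 2, "h2": 2}
# Text inside these tags is not content, so it is not indexed.
SKIP_TAGS = {"script", "style"}


class WeightedTextParser(HTMLParser):
    """Collects (word, weight) pairs, using the enclosing tags to set the weight."""

    def __init__(self):
        super().__init__()
        self.stack = []  # tags that are currently open
        self.terms = []  # (word, weight) pairs found so far

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags like <br>.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if any(tag in SKIP_TAGS for tag in self.stack):
            return
        weight = max((TAG_WEIGHTS.get(tag, 1) for tag in self.stack), default=1)
        for word in re.findall(r"[a-z0-9]+", data.lower()):
            self.terms.append((word, weight))


def build_index(pages):
    """Build an inverted index: word -> {url: score} from a {url: html} mapping."""
    index = defaultdict(lambda: defaultdict(int))
    for url, html in pages.items():
        parser = WeightedTextParser()
        parser.feed(html)
        parser.close()
        for word, weight in parser.terms:
            index[word][url] += weight
    return index


def search(index, word):
    """Return the URLs containing the word, highest score first."""
    hits = index.get(word.lower(), {})
    return sorted(hits, key=lambda url: (-hits[url], url))
```

The "intelligence" here is nothing more than the weight table: a word in a title outranks the same word buried in a paragraph. An industry-specific toolkit would presumably swap in its own structural rules in place of `TAG_WEIGHTS`.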

Of course, IBM would need to rethink their extensibility beyond the limited capabilities found in DB2's extenders and apply this DIYBI to IDS first, which has a robust enough engine to decrease the time to market and time to value....

But hey! What do I know? I'm just Gumby. ;-)


Nice thoughts, but already implemented in InfoCodex.

Thanks for the interesting article. Once again IBM is giving us a great vision about the future and how unstructured information can be searched.

InfoCodex already does all this today with the help of a linguistic database and synonym and/or similarity search across 5 languages (German, French, Italian, English and Spanish). With InfoCodex you can search for a block of text in one language and it will find all the similar documents in the other languages as well. All of this is done without a single minute of training, because the linguistic database contains 2.9 million words and terms (e.g. "European Court of Justice" or "The President of the United States" are terms and are recognized as such).
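For readers wondering how a lexicon can stand in for training, the general idea can be sketched as follows. This is a toy illustration only and makes no claim about how InfoCodex works internally: the `LEXICON` entries, the concept labels and the `concepts` and `similarity` functions are all invented for the example. Words and multi-word terms from different languages map to a shared concept, and documents are compared by the concepts they contain.

```python
import re
from collections import Counter
from math import sqrt

# Invented toy lexicon: surface forms in several languages -> shared concept id.
LEXICON = {
    "court": "COURT", "gericht": "COURT", "tribunal": "COURT",
    "european court of justice": "ECJ",
    "europäischer gerichtshof": "ECJ",
    "ruling": "RULING", "urteil": "RULING",
    "weather": "WEATHER", "wetter": "WEATHER",
}
MAX_TERM_WORDS = max(len(term.split()) for term in LEXICON)


def concepts(text):
    """Count the concepts in a text, preferring the longest matching term."""
    words = re.findall(r"\w+", text.lower())
    found = Counter()
    i = 0
    while i < len(words):
        # Try the longest candidate first so multi-word terms win over single words.
        for n in range(min(MAX_TERM_WORDS, len(words) - i), 0, -1):
            term = " ".join(words[i:i + n])
            if term in LEXICON:
                found[LEXICON[term]] += 1
                i += n
                break
        else:
            i += 1  # no lexicon entry starts at this word
    return found


def similarity(text_a, text_b):
    """Cosine similarity between the concept counts of two texts."""
    ca, cb = concepts(text_a), concepts(text_b)
    dot = sum(ca[c] * cb[c] for c in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0
```

With this lexicon, an English sentence about a ruling of the European Court of Justice and a German one mentioning "Urteil" and "Europäischer Gerichtshof" end up with the same concepts, while a sentence about the weather shares none. A real system would also have to cope with inflected forms and ambiguous words, which this sketch ignores.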

