Welcome to the Petabyte Club

Hype alert, hype alert: Big Data is coming our way. A new volcano has blasted its way above the surface of the marketing sea, spewing out "big data" messages in enormous flows of thought leader bullshit. What the heck is this big data thing? EMC says it's to do with handling data at the petabyte scale, where things like …

COMMENTS

This topic is closed for new posts.

If Theodore Sturgeon was right...

...when he said 90% of sci-fi was crap because 90% of everything is crap, then 90% of the petabytes stored in the "cloud" are crap too.


I think Theodore was on the right lines...

As 90% of Reg articles are crap as well.

Come on kids, THEIR / THERE ... really?


Indeed...

All those SMS messages and Twitter feeds from people making posts about their pets have to be kept somewhere while the Gov sifts them looking for evidence of terrorist activity!


Depends.

Is it deduplicated? 5 million people posting the same damned lolcat is a hell of a dedupe ratio...
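The arithmetic behind the quip can be sketched with a toy content-addressed store (a purely illustrative example, not any particular product): identical blobs hash to the same key, so 5 million logical posts occupy one physical copy.

```python
import hashlib

# Hypothetical illustration of content-addressed dedupe: every identical
# upload hashes to the same key, so the store holds exactly one copy.
copies = 5_000_000                      # logical posts of the same lolcat
lolcat = b"I CAN HAS CHEEZBURGER?"      # stand-in for the image bytes

key = hashlib.sha256(lolcat).hexdigest()
physical_store = {key: lolcat}          # one entry, no matter how many posts

ratio = copies / len(physical_store)    # logical copies per physical copy
print(f"dedupe ratio: {ratio:,.0f}:1")  # 5,000,000:1
```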


De-dupe

Not sure on the value of de-dupe in this space. I could understand wanting to de-duplicate data in the ETL layer, although a lot of products there are DB-based rather than file-based. However, since the aim is a 'single source of truth' via third normal form, where's the value of de-dupe in the DW? Assuming the reporting tier is ROLAP (as part of that single source of truth that everyone's striving for), there's very little data there apart from cube dimensions.

I suppose you might want limited MOLAP for performance reasons, then de-dupe that, but that ought to happen at the DB level, surely?


Big (file) data is very real

Chris, come spend a day with Isilon and you'll see that 'big data' is very real. It's something we talk to customers about every day. We can talk about the members of our 10PB club too!


There's big data ..

and there's the LHC - in a different league

The Large Hadron Collider will produce roughly 15 petabytes (15 million gigabytes) of data annually


RainStor's goal is to de-dupe Big Data

Great article as usual Chris and thanks for the mention.

As you pointed out, RainStor de-dupes structured data without sacrificing the original form. We preserve the immutable structure of the data while magically de-duplicating the values so that the footprint physically shrinks 40:1 or more.

We are all about taking petabytes of data and reducing it to terabytes, thereby allowing limitless amounts of data to be stored at the lowest possible cost.
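The general technique being described, value-level dedupe that leaves rows reconstructible, can be sketched as dictionary encoding (a minimal illustration of the idea, not RainStor's actual implementation or API): each distinct field value is stored once, and rows become tuples of integer IDs, so repeated values shrink the footprint while every row can be rebuilt exactly.

```python
# Minimal sketch of value-level dedupe for structured rows.
# Class and method names here are illustrative, not a real product's API.
class ValueDedupeTable:
    def __init__(self):
        self.values = []   # each distinct field value, stored once
        self.ids = {}      # value -> integer id
        self.rows = []     # rows encoded as tuples of ids

    def insert(self, row):
        encoded = []
        for v in row:
            if v not in self.ids:           # first sighting: store the value
                self.ids[v] = len(self.values)
                self.values.append(v)
            encoded.append(self.ids[v])     # repeats cost one small int
        self.rows.append(tuple(encoded))

    def fetch(self, i):
        """Rebuild row i exactly as inserted -- structure is preserved."""
        return tuple(self.values[j] for j in self.rows[i])

t = ValueDedupeTable()
t.insert(("alice", "UK", "2011-05"))
t.insert(("bob",   "UK", "2011-05"))
t.insert(("carol", "UK", "2011-05"))
print(len(t.values))   # 5 distinct values backing 9 logical fields
print(t.fetch(1))      # ('bob', 'UK', '2011-05')
```

The repeated "UK" and "2011-05" values are stored once each; the more repetition across rows, the closer real stores get to ratios like the 40:1 figure above.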

Many thanks again for the article and the reference to RainStor.

Ramon Chen

VP Product Management

www.rainstor.com

