Data is big business. These days they've even started calling it “Big Data”, just in case its potential for unbridled magnitude had escaped anyone. Of course, if you have Big Data you need somewhere to put it. Hence storage is also big business. On the one hand this is a good thing, but that's just because several of my …

COMMENTS

This topic is closed for new posts.

Thursday 21st March 2013 08:13 GMT Anonymous Coward

Stop crying.

Stop crying about the cost of big data.

I don't know how many times I hear the crying at work about how a 2 terabyte server costs too much money. Hell, I have 2.5 terabytes on my server at home, and if I can afford it, I don't think some multi-million-dollar corporation that makes money off that data should be complaining.

0 2
1. Thursday 21st March 2013 09:02 GMT tony
  
  Re: Stop crying.
  
  "I have 2.5 terabytes on my server at home, and if I can afford it"
  
  Out of interest, is that available space or capacity?
  
  1 0
2. Thursday 21st March 2013 09:28 GMT JimC
  
  But which is more expensive
  
  The hardware for extra storage, or the staff time required to reduce the hardware requirement?
  
  No arguments when it comes to 275 almost identical Windows Server images and the like, but when it comes to the other stuff...
  
  I, for instance, keep all non-trivial email, and it's saved an awful lot of effort on numerous occasions when I've been able to go back and find out the what, why or who of decisions made previously... If those who will not learn from history are condemned to repeat it, then it makes sense to have that history available.
  
  1 1
3. Thursday 21st March 2013 09:31 GMT Anonymous Coward
  
  Re: Stop crying.
  
  And what is the IOPS of that setup, and is that RAID10 or just RAID5?
  
  0 0
4. Thursday 21st March 2013 13:07 GMT Anonymous Coward
  
  Re: Stop crying.
  
  Yeah, and I have about 9TB at home, but it's commodity SATA. I certainly couldn't afford 9TB of storage at the costs that a proper disk array such as a VMAX would incur, even the SATA disks in a VMAX.
  
  That said, the fact that you specify the server as having disk suggests that you don't do SANs. You've probably not considered port costs, switch costs, ISL costs, redundancy, duplication of the whole SAN fabric, RAID/Mirroring, etc. etc.
  
  Big data is big money, and multi-million-dollar operations become multi-million-dollar operations not by chucking money all over the place on a whim, but by questioning their costs.
  
  0 0
  1. Thursday 21st March 2013 15:32 GMT Drummer Boy
    
    Re: Stop crying.
    
    Exactly right - show me a properly costed business case every time.
    
    No compelling business case, no money for big data!!
    
    0 0
Thursday 21st March 2013 08:59 GMT Anonymous Coward

Big Data Article?

Not.

Epic fail.

1 0
This post has been deleted by its author
Thursday 21st March 2013 09:28 GMT DaLo

Big Data

I would hope that an article on The Register talking about 'Big Data' would be jumping all over the industry for its recent hype machine and latest fad. 'Cloud' is so last year; 2013 has obviously been designated 'Big Data' year.

It's amazing that such a breakthrough happened in Big Data late last year...oh it didn't.

Well maybe no-one has had massive amounts of data before...oh they have.

Well maybe there wasn't a way to store and manipulate it before...oh there was.

Data warehousing has been around for as long as I can remember. If a company has only just realised it has large quantities of data because of a flashing Intel advert, then it must have been hiding out somewhere dark.

'Big Data': the worst of the buzzwords so far...

2 2
1. Thursday 21st March 2013 14:05 GMT gmdata
  
  Re: Big Data
  
  Big Data has very little to do with data warehousing; it's about the processing of unstructured data (as opposed to nicely defined tabular data).
  
  You try storing billions of ad hoc images, raw web log files and signals from remote devices in your relational database and you might start to understand what Big Data is about.
  
  0 2
Thursday 21st March 2013 09:33 GMT DaLo

IOPS Requirement

"What's interesting is that even today it's rare to see a software product's data sheet cite the IOPS (per-second storage operation capacity) requirement of the product"

How would this really be possible for most products? There are so many variables that the figure would be largely meaningless.

0 2
Thursday 21st March 2013 12:10 GMT Platypus

O RLY?

If you think deduplication is a no-brainer, you've just never tried to implement it. Like Fat Data itself, it's a tool that can be used well (reducing storage cost) or poorly (killing system performance), and users deserve to be educated about the difference.
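
To make the trade-off concrete, here is a minimal sketch of fixed-block deduplication. It is only an illustration under stated assumptions, not how any particular product does it: the 4 KiB block size, the SHA-256 hashing and the function names `dedup_store` and `rebuild` are all hypothetical choices made for the example.

```python
import hashlib

def dedup_store(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep each unique block once.

    Hypothetical helper for illustration; block_size is an assumed value.
    """
    store = {}    # digest -> block contents (one copy per unique block)
    recipe = []   # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        # Every write pays for a hash plus an index lookup, whether or
        # not the block turns out to be a duplicate.
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe

def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the block index."""
    return b"".join(store[digest] for digest in recipe)

# Repetitive data (think near-identical server images) collapses well.
repetitive = b"A" * 4096 * 100
rep_store, rep_recipe = dedup_store(repetitive)

# Data whose blocks are all different saves nothing, yet still pays
# the hashing and lookup cost on every block.
distinct = b"".join(i.to_bytes(4, "big") * 1024 for i in range(100))
dis_store, dis_recipe = dedup_store(distinct)

print(len(rep_store), "unique blocks for the repetitive data")
print(len(dis_store), "unique blocks for the distinct data")
```

The second case is the "used poorly" one: the same work per write, with no storage saved in return.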

1 0
Thursday 21st March 2013 13:11 GMT Anonymous Coward

Hmm...

What I've found interesting as a solution is to create large thin provisioning pools, use a system such as EMC's FAST to move hot tracks up and down disk tiers, from flash drives at the top to SATA at the bottom, and provision 1:1. This spreads the data across a large number of spindles, which gives you speed and IOPS. It works quite well, and you also don't need to worry about running out of space in your thin provisioning pool, which is always a worry for me.

0 0
Thursday 21st March 2013 21:18 GMT Jeff 11

IOPS

...is a term that makes me think of trendy teenagers verbally masturbating over SSD benchmarks in their gaming PCs. It doesn't mean anything real; the performance you get out of a SAN is going to depend more on the workload you give it, on top of variables like how you partition it, which filesystem you use, the size of the controller caches, the underlying network media and so on. The big name vendors all have their own proprietary technologies that dictate, on top of these variables, how well they map on to the underlying technology. Raw hardware capabilities mean very little in real-world environments and that's why vendors are reluctant to harp on about them. EMC, Dell/EqualLogic and NetApp might quote similar figures if pushed, but the experience you'll get with each platform will be markedly different in a fair comparison.

0 0
Friday 22nd March 2013 13:13 GMT JLH

HSM

Don't simply blindly expand the capacity of disk on your SAN.

If you are dealing with big datasets (and I do) you should look at a Hierarchical Storage Management system.

You can select a secondary tier of cheaper SATA disks, or a tier of MAID (disks which automatically idle down when not used) and a tier of tape in an automated library.

Less often used data will be pushed to tape automatically.
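
As a rough sketch of the kind of policy an HSM applies, the idea boils down to picking a tier from how recently the data was touched. Everything specific here is an assumption for illustration: the 30-day and 365-day thresholds, the tier names and the `FileRecord`, `choose_tier` and `plan_migration` names are hypothetical, and a real HSM product will have its own policy engine.

```python
from dataclasses import dataclass

# Hypothetical thresholds in days since last access; real policies are
# site-specific and usually consider more than access age.
SECONDARY_AFTER_DAYS = 30
TAPE_AFTER_DAYS = 365

@dataclass
class FileRecord:
    path: str
    days_since_access: int

def choose_tier(record: FileRecord) -> str:
    """Pick a storage tier from how long the file has gone untouched."""
    if record.days_since_access >= TAPE_AFTER_DAYS:
        return "tape"
    if record.days_since_access >= SECONDARY_AFTER_DAYS:
        return "secondary"  # cheaper SATA or MAID
    return "primary"        # fast SAN disk

def plan_migration(records):
    """Group file paths by the tier the policy would place them on."""
    plan = {"primary": [], "secondary": [], "tape": []}
    for record in records:
        plan[choose_tier(record)].append(record.path)
    return plan

# Made-up example files, purely to exercise the policy.
files = [
    FileRecord("/data/run_current.dat", 1),
    FileRecord("/data/run_last_quarter.dat", 90),
    FileRecord("/data/run_2009.dat", 1200),
]
plan = plan_migration(files)
for tier, paths in plan.items():
    print(tier, paths)
```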

0 0

Time to put 'Big Data' on a forced diet
