Boffin finds formula for four-year-five-nines disk arrays

Forty-five disk drives, ten parity drives, and 33 spare disks: that's the optimum array size to protect data for four years with no service visits, according to a study published at Arxiv. The problem the study addresses is that the world's rush towards hyperscale data centres puts an awful lot of disks in one place, and the …

  1. Anonymous Coward

    The concept of spares needs to go

    All those spindles provide additional performance. Use them. Arrays are virtual these days anyway, so as drives fail and rebuilds occur you lose spindles (performance declines) and you lose available capacity (not a big deal until it gets too close to what you're actually using).

    It may not be cost-effective to send out a guy to replace a single drive, but surely it is still cost-effective to have them replace dozens of drives if you need a mid-life performance/capacity kicker until the array has fully depreciated.

    1. theOtherJT Silver badge

      Re: The concept of spares needs to go

      I completely agree. Distributed scale-out storage across many nodes is clearly the way we're going. Let the software handle where any given data block is written / read from and just keep feeding it disks and CPU cycles as necessary.

      Breaks down a bit if you need to really slam a _lot_ of data down on the disks very fast because you end up IO bound by the speed of the network interface(s), but come on - in that case you're probably using some sort of flash storage on the local box anyway.

      1. Anonymous Coward

        Re: The concept of spares needs to go

        Four downvotes and not one comment as to why they think I'm wrong? Did the fanboys take a wrong turn on the way to the article about Apple's record quarter?

    2. Rebecca M

      Re: The concept of spares needs to go

      "All those spindles provide additional performance. Use them. Arrays are virtual these days anyway, so as drives fail and rebuilds occur you lose spindles (performance declines) and you lose available capacity (not a big deal until it gets too close to what you're actually using)."

      How do you use that performance? The kind of medium-scale array that is studied here will have no problem saturating a couple of 10GbE links even with relatively slow drives and dumb controllers. If you make the reasonable assumption that a mid-range array is tied to a mid-range network, where is that performance going to go?
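
      To put rough numbers on that - a back-of-the-envelope sketch in Python, where the per-drive and per-link figures are my own assumptions rather than anything from the study:

      ```python
      # Back-of-the-envelope: aggregate spindle bandwidth vs. a mid-range network.
      # Figures are illustrative assumptions, not measurements from the paper.
      drives = 55                        # 45 data + 10 parity in the working set
      per_drive_mb_s = 150               # assumed sustained throughput per drive
      links = 2                          # assumed front end: 2 x 10GbE
      per_link_mb_s = 10_000 / 8 * 0.9   # ~1250 MB/s raw, ~90% of that usable

      spindle_mb_s = drives * per_drive_mb_s
      network_mb_s = links * per_link_mb_s

      print(f"spindles: ~{spindle_mb_s:,.0f} MB/s, network: ~{network_mb_s:,.0f} MB/s")
      print(f"the spindles outrun the front end by ~{spindle_mb_s / network_mb_s:.1f}x")
      ```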

      I'd sooner have the spares in place and spun down when not in use. Less power, less cooling, less noise and the opportunity to force asymmetric wear on each drive, so that come the end of the array's life you don't get clumps of failures in quick succession according to what batch they were from.

      "Four downvotes and not one comment as to why they think I'm wrong? Did the fanboys take a wrong turn on the way to the article about Apple's record quarter?"

      I can't speak for everyone else, but for me there comes a point where a comment is so far removed from real-world experience that it simply isn't worth commenting on in the first instance.

    3. Solmyr ibn Wali Barad

      Re: The concept of spares needs to go

      Good suggestion - unless it gets taken literally, which is probably the cause of the downvotes. It's not an either/or proposition. There is no actual need to forget about the concept of spares; it is a valid concept with many usage scenarios, and it will remain viable.

      Distributed, massively parallel storage is also a valid approach, with or without dedicated spares. XIV (bought by IBM a few years back) was pretty much founded on this idea, and they did indeed put all spares to work. Zillions of ZFS-based setups can be configured whichever way you please. GPFS (or whatever it's called these days) too.

  2. Anonymous Coward

    Oops...

    "an array has 45 data disks, 10 data disks, and 33 spare disks"

    I'm assuming that was meant to read 10 parity disks, as it did at the beginning of the article.

    1. Richard Chirgwin (Written by Reg staff)

      Re: Oops...

      Oops indeed - thanks, I have fixed this.

      RC

  3. Robert Helpmann??
    Childcatcher

    Not terribly surprising

    Hardware and software usually contribute less to the cost of ownership of a system than the support staff needed to maintain it, at least in my experience. I would have liked to see a broader sample of disks for comparison, though, as altering variables just a little might result in much different outcomes. For example, HDD reliability varies greatly by manufacturer, and SSDs are missing entirely from this study. Also, much of the reliability data made available by drive vendors does not count failed drives that are replaced under warranty, which is perhaps why Backblaze's data was the only set used.

  4. Sampler

    I'm no storage king

    So this is a genuine question, dedicated parity drives?

    My understanding was RAID6 spreads the parity across the storage drives as dedicated parity drives end up dying sooner due to the high usage of the parity drive compared to the storage.

    Or are they talking simply in terms of storage, where the parity will be effectively written across all drives and the figures are just in terms of storage lost to parity (i.e. ten drives' worth from fifty-five disks, leaving forty-five disks' worth of space)?

    1. A Non e-mouse Silver badge

      Re: I'm no storage king

      I think you're confusing RAID 4, 5 & 6.

      RAID 4 has a dedicated parity disc.

      RAID 5 has 1 extra disc for storing parity data, but that parity data is spread across all the discs in the RAID group.

      RAID 6 has 2 extra discs for storing parity data. Again, that parity data is spread across all the discs.
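
      A minimal Python sketch of the single-parity idea (this is the RAID 4/5 principle only; RAID 6 adds a second, differently computed syndrome that plain XOR can't show):

      ```python
      # Single-parity (RAID 4/5 style) protection in miniature: parity is the XOR
      # of the data blocks, so any one lost block can be rebuilt from the rest.
      from functools import reduce

      def xor_blocks(blocks):
          """Byte-wise XOR of equal-length blocks."""
          return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

      stripe = [b"disc0blk", b"disc1blk", b"disc2blk"]   # one stripe, three data discs
      parity = xor_blocks(stripe)                        # the "extra disc's worth"

      # Disc 1 dies: rebuild its block from the survivors plus the parity block.
      rebuilt = xor_blocks([stripe[0], stripe[2], parity])
      assert rebuilt == stripe[1]
      ```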

    2. John Tserkezis

      Re: I'm no storage king

      "My understanding was RAID6 spreads the parity across the storage drives as dedicated parity drives end up dying sooner due to the high usage of the parity drive compared to the storage."

      Not quite. The 45 data drives and the 10 parity drives are actually part of the entire working set (55 drives). They're numbered like that so you can more easily determine how much space you have to work with (45 drives). Your regular data AND the parity data is spread evenly across the 55-drive set. Losing any one drive (or two for RAID6) results in only slightly slower data transfer operations, but otherwise your users may not even notice. Hopefully, the administrators do though...

      The spares, depending on configuration, may be powered up or not, but either way they do nothing until a drive in the working set (whichever drive that is) fails and a spare is called upon to take its place. Running out of spares is the same either way (whether they're all in use due to failures or you didn't have any spares at all), except that you then need manual intervention to swap the faulty drives for new ones.
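
      As a rough feel for how long a pool like that lasts, a toy Monte Carlo sketch - the 5% annual failure rate is an assumption picked for illustration, not the figure the paper uses:

      ```python
      # Toy simulation: 55 working drives draw on a pool of 33 spares as they fail.
      # The AFR below is an assumed illustrative figure, not the paper's input data.
      import random

      WORKING, SPARES, YEARS, AFR, TRIALS = 55, 33, 4, 0.05, 2_000
      MONTHLY_P = AFR / 12

      def pool_exhausted() -> bool:
          spares = SPARES
          for _month in range(YEARS * 12):
              failures = sum(random.random() < MONTHLY_P for _ in range(WORKING))
              spares -= failures          # each failed drive consumes one spare
              if spares < 0:
                  return True             # no spares left: manual intervention needed
          return False

      p = sum(pool_exhausted() for _ in range(TRIALS)) / TRIALS
      print(f"P(33 spares exhausted within {YEARS} years) ~ {p:.4f}")
      ```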

  5. Fazal Majid

    Theory and practice

    Typical academic paper making simplistic and very optimistic assumptions about failure modes. In my experience about one third to one half of storage faults are Byzantine, i.e. the drive doesn't just go down, it is actively attempting to sabotage your array by sending interference down the bus (especially on buses where this is theoretically impossible, like FC-AL) or exhibiting all sorts of other crippling behavior. Something like that will still require physical intervention.

    Here is an excellent introduction to the subject:

    http://dtrace.org/blogs/wesolows/2014/02/20/on-disk-failure/

    And of course John Gall's immortal classics about systems thinking.

    1. theOtherJT Silver badge

      Re: Theory and practice

      Or the controller card lets go.

      Or the power goes out to one of your racks and then the storage array's bios has a shit fit and refuses to come back up.

      Or one of the ram banks in the array is acting up leading to constant re-writes as the checksums fail and now the controller thinks there's something wrong with the disks and starts removing them from service.

      Or ONE of the network interfaces on the box goes down so the replication traffic to the other boxes in the cluster stops, but the outside world still thinks the disks are accessible for a while before STONITH kicks in and kills it, but it's too late by then and now you have a 14 hour rebuild on your hands when you bring the thing back up again.

      Or the firmware on the controller card said "JBOD" but wasn't really JBOD, and was still writing some sort of header to every disk, so when the drive fails and the spare fires up ZFS refuses to accept it as a replacement because there's some data on there already, and you wouldn't want to risk overwriting it, would you?

      You get 5 9's uptime by having a person on site who actually checks for shit like this. HA storage is _weird_

      1. yoganmahew

        Re: Theory and practice

        Absolutely. The six-sigma events three days in a row during the financial crisis seem to have taught modelling boffins very little.

        For me, 5 9s is a result of speed of recovery from failure, more so than of preventing failure itself. Statistically 'impossible' (yes, they're just unlikely) events happen quite often. Not having to wait for an engineer to travel with a spare part is what distinguishes a quick recovery from a slow one...

        1. Destroy All Monsters Silver badge

          Re: Theory and practice

          "The six-sigma events three days in a row during the financial crisis"

          These have as much to do with "reliability" and sigma-whatever (which is something that comes from manufacturing too, AFAIK) as hoping to survive repeated attempts at playing Russian roulette. Just saying.

          "Yes M'lord, we never managed to reach reliability significantly above 3 clicks in this game."

    2. Solmyr ibn Wali Barad

      Re: Theory and practice

      "the drive doesn't just go down, it is actively attempting to sabotage your array by sending interfere down the bus (specially on buses where this is theoretically impossible like FC-AL)"

      Oh yes, a misbehaving drive can cause lots of grief. Even in a modern SAS fabric.

      Original FC-AL, by the way, was very susceptible to bad acts. It's not even a bus, but an arbitrated loop. Which already says a lot about it. Loop devices must play nice with each other - unique loop ID, obligation to forward the traffic to other members, and in cases of conflict they must obey the elected arbiter. Hah. Like that's going to happen. Only when everybody is in a really good mood.

      Hub variants of FC-AL did away with big loops. Physical connections moved a bit towards star topology, whereas logical setup still emulated a loop. Small hubs were built into the drive enclosures, so it became possible to disconnect an offending drive without its consent, and set the slot circuits to bypass (so that loop traffic would be forwarded). Definitely better than original FC-AL, but disturbances are still quite possible.

      Many thanks for the link!

  6. Duncan Macdonald

    On site support

    If you have a large data centre then the cost of one person on site who can swap disks is not going to add much to the costs. (That person could even double as one of the security guards - a high level of ability is not needed to swap disk drives.)

    1. phuzz Silver badge

      Re: On site support

      I wouldn't trust any of the data-center security guards I've met to swap a disk. Don't get me wrong, some of them are bloody good guards, but I've never met one who was interested in what they were guarding.

      1. Anonymous Coward

        Re: On site support

        If they found someone local willing to pay for disks, I bet they'd get swapped quickly.

        (I worked my way through college doing security, I met some dubious characters on the job)

  7. A Non e-mouse Silver badge

    Costs

    "the cost of calling someone to replace a dead drive far outweighs the price of the disk"

    Someone's making the wrong comparison. You need to look at the cost of replacing the disc versus the value of the data on the disc. I suspect the disc is tiny in value, compared to that of the data it holds.

    1. CraPo

      Re: Costs

      " I suspect the disc is tiny in value, compared to that of the data it holds."

      Cat videos?

    2. Patrick R

      Re: Costs

      When the disc gets replaced, it's first unusable, then it's gone, and so is the data on it. Where do you see value?

    3. the spectacularly refined chap

      Re: Costs

      "Someone's making the wrong comparison. You need to look at the cost of replacing the disc versus the value of the data on the disc. I suspect the disc is tiny in value, compared to that of the data it holds."

      No, that is the wrong comparison. If you have data that you can't afford to lose on one device (or even one array), that is your problem - if you have a backup of the data on a drive, the value of the data on the dead one is meaningless.

      However, that still isn't the point they are making. It is being taken as read that the data must be protected and in that sense your point is the very opening premise of the study. They are not arguing over whether data should be protected but the most cost effective way of assuring that.

      Having said that, I'm still not convinced the comparison is valid. I'll admit my experience is at the lower end of the scale, only going up to a few tens of terabytes, but there the cost of the drives is usually around half of even the capital cost of the array. You have semi-fixed costs such as computer smarts and software on top, but the extra costs per unit are not inconsiderable either: physical enclosures, controllers and power supplies, which inevitably scale with the number of drives.

  8. John Robson Silver badge

    Assuming no batch failure modes

    Because that's a good assumption.

    A decade ago I learnt to mix'n'match batches in RAID arrays, and preferably to mix'n'match manufacturers...

    Batch failures are common, even if not due to a fault - as they all see the same lifecycle they all tend to fail together, or at least fail during the rebuild, when, having been brought to within 1% of its lifespan, the disk is then thrashed for dozens of hours to get all the data read as fast as possible.

    1. Anonymous Coward

      Re: Assuming no batch failure modes

      Agreed. We once lost 9 drives, one after another; the others were failing faster than the new ones were coming online.

      Thank <deity> for decent backups.

  9. This post has been deleted by its author

  10. Anonymous Coward

    Something new every day

    Except that my employer has been doing this for years. The drives are protected by RAID-6 and there are a bunch of automatically assigned spares in the box. Maybe not 33 spares for 45 drives, but enough that visits to replenish the spare pool are rare. When the pool of spares is getting low, the box calls home to the service centre and a service guy is dispatched to the site on a non-urgent basis.
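
    The logic amounts to something like the sketch below; the threshold and the notify_service_centre hook are made-up placeholders, not our vendor's actual interface:

    ```python
    # Sketch of the spare-pool call-home check. The threshold and the
    # notify_service_centre hook are hypothetical placeholders, not any
    # vendor's real API.
    SPARE_THRESHOLD = 4                       # assumed replenishment trigger

    def notify_service_centre(severity: str, message: str) -> None:
        print(f"[{severity}] {message}")      # stand-in for the real call-home path

    def check_spare_pool(spares_remaining: int) -> None:
        if spares_remaining <= SPARE_THRESHOLD:
            notify_service_centre(
                severity="non-urgent",
                message=f"only {spares_remaining} spares left, please replenish",
            )

    check_spare_pool(spares_remaining=3)      # would raise a non-urgent ticket
    ```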

    It looks like academia has just caught up to the idea.

    1. Jonathan Richards 1
      Boffin

      Re: Something new every day

      Not so much just caught up to the idea, as actually quantifying it with real-world failure numbers, and working out an optimum with a bit of maths. Just a bit more precise than your "bunch of". See icon!

      1. Tom 38

        Re: Something new every day

        It's real-world failure numbers for a specific type of load.

        If your real world load is not the same as theirs, I'm not sure you can tell too much from this.

        Personally, I think their entire premise is bogus - "How many disks do you need to plug into a server so you can just leave it for 4 years?" is not a question that needs answering, because the opex of providing someone to support your boxes is dwarfed by the cost of specifying an array of that size (in terms of extra initial cost, extra PDUs, extra rack space).

        They haven't even eliminated the person to maintain the server - every server needs an admin or two, even if you don't have to go put disks in it occasionally.

    2. Destroy All Monsters Silver badge
      Thumb Down

      Re: Something new every day

      "Except that my employer has been doing this for years."

      No he hasn't.

      You are confusing "I'm gonna do something along these lines like a rabid monkey with some fast guesses" with optimization.

      1. Anonymous Coward

        Re: Something new every day

        You are confusing "I'm gonna do something along these lines like a rabid monkey with some fast guesses" with optimization.

        Setting your spare level at 73% of your data drives is certainly not optimisation. What I am talking about is enterprise-size storage servers, not some little box of commodity drives stuck under somebody's desk. Failure rates are fairly well known; obviously the author of the original article must have used them in his calculations. You put in enough spares for an optimal service interval: it is a whole lot cheaper to have an entry-level tech visit a site, say, once a year to replace the used spares than it is to spare the box for the whole of its life.

        This is the reality of commercial practice, not some academic theory.

  11. Anonymous Coward

    what about enclosure failure?

    The answer to drive failures (and enclosure failures, which weren't mentioned) is to have a distributed grid with data spread pseudo-randomly over the enclosures, and then have all drives contain data, parity and hot-spare capacity.

    In that instance, the loss of a single drive brings all drives in the system, not just those in a single RAID set, into the rebuild operation, and depending on the RAID level you have used, can enable some extraordinarily quick rebuild times per TB.

    Sounds fanciful?

    IBM's XIV does this and has done so for years. It's a grid architecture and it has the capability to rebuild complete parity following a 4TB drive failure in under 1 hr on a fully configured system (more than 24 times faster than stated in the article).
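
    For anyone wondering how that works in principle, here is a small Python sketch of declustered, pseudo-random placement. It's the general idea only - not XIV's (or anyone else's) actual algorithm - and mirrored chunks stand in for whatever parity scheme is really used:

    ```python
    # Pseudo-random ("declustered") placement: every chunk lands on a hash-chosen
    # set of drives, so one drive's loss spreads rebuild reads over the whole grid.
    import hashlib

    DRIVES, COPIES = 60, 2            # mirrored chunks for simplicity

    def place(chunk_id: int) -> list[int]:
        """Deterministically map a chunk to COPIES distinct drives."""
        chosen, salt = [], 0
        while len(chosen) < COPIES:
            digest = hashlib.sha256(f"{chunk_id}:{salt}".encode()).digest()
            drive = int.from_bytes(digest[:4], "big") % DRIVES
            if drive not in chosen:
                chosen.append(drive)
            salt += 1
        return chosen

    # Drive 7 dies: the surviving copies of its chunks are scattered everywhere,
    # so (nearly) every remaining drive shares the rebuild read load.
    failed, sources = 7, set()
    for chunk in range(100_000):
        drives = place(chunk)
        if failed in drives:
            sources.update(d for d in drives if d != failed)
    print(f"rebuild of drive {failed} reads from {len(sources)} other drives")
    ```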

    1. Jan 0 Silver badge

      Re: what about enclosure failure?

      This is the proposal that DougS made in the first post. I don't understand why it's accumulating downvotes.

    2. theOtherJT Silver badge

      Re: what about enclosure failure?

      This is pretty much how CEPH works too.

      You specify a set of nodes, tell it how much parity to data you want, and let it get on with it. Lose a disk? The data on that disk is recalculated from parity (or just from redundant copies if you're doing what amounts to RAID 10) and written to other disks across the set. Lose a node? Give the other nodes a moment to decide that it's actually gone and isn't coming back, rather than this being some sort of transient network issue, and the same happens but on a larger scale.

      Half a dozen nodes with a dozen disks each and this becomes really very robust and very VERY fast, due to all those spindles being up and spinning all the time. You can also just throw more nodes at it when you want to increase capacity - which is just lovely.

    3. Fuzz

      Re: what about enclosure failure?

      HP's EVA did this as well: spares were just space reserved at the end of each drive. That way none of your spindles was unused, and if your array was at 50% capacity you could lose 50% of your disks (not all at the same time) without ever having to replace a drive.

      I don't understand why all arrays don't work this way.

  12. Christoph

    "the cost of calling someone to replace a dead drive far outweighs the price of the disk"

    So why use an expensive fleshy? It's a standard box in a standard slot with standard connections, arranged in a standard rack in known positions. Just have a juke-box type arm swing in and do the replacement.

    1. Alan Brown Silver badge

      "It's a standard box in a standard slot with standard connections, arranged in a standard rack in known positions."

      As is the location of the hard drive. You may as well just add an interface to that slot and forget about the robot. It's not like tapes where the complexity is in the tape drive and the cartridge is a simple unit.

      Telcos have been doing "periodic maintenance" for decades, it's a known quantity.

      There's an assumption being made that all the storage is in the same enclosure or even in the same datacentre. If it's that critical you don't do things like that.

  13. David Roberts

    Some weird assumptions

    Firstly the apparent assumption that arrays don't carry spares.

    I worked with RAID5 arrays in the '90s and there was always at least one hot spare.

    Secondly (as already pointed out), using the cost of replacing a single disc vs. leaving the array untouched for 4 years, with no apparent consideration of someone popping in once a month to replace failed drives as a bulk process.

    1. Adam 1

      Re: Some weird assumptions

      Plus the assumption that you run a data centre but would have to call a guy in to replace the drive?

    2. Anonymous Coward

      Re: Some weird assumptions

      It doesn't seem to reflect the experience, which I'm sure many here share, that cluster failures do happen and that there are SPOFs for this in the array chassis ('s?) - PSUs, backplanes, controllers.

      The IBM HPFS, or whatever its current label is, from the blurb seems to address most of the problems by spreading data and redundant recovery info amorphously across all the drives, even the "spares", so that the worst aspects of other disk array organisations, i.e. the performance hit and time taken for a reconfigure, are not showstoppers, and neither are multiple failures, even simultaneous ones. I'm sure this comes at ££s though.

      Urgent replacement of a drive rather than routine visits will usually be driven by SLA, and so the meatware time/cost compared to the loss of revenue or potential contracted penalties seems not to have been considered.

      Or the "customer mobility" concerns.

      Funny, that: real-world conditions.

    3. Destroy All Monsters Silver badge

      Re: Some weird assumptions

      "I worked with RAID5 arrays in the '90s and there was always at least one hot spare."

      Yes, and?

      "Secondly (as already pointed out), using the cost of replacing a single disc vs. leaving the array untouched for 4 years, with no apparent consideration of someone popping in once a month to replace failed drives as a bulk process."

      "Chief. About this disk array down in Antarctica? Can you have PFY pass by for a fast repair once a month?"

  14. Anonymous Coward

    Real estate costs

    If you find a suitable cooling solution it's possible to pack disk pods densely with no aisles and reduce your real estate costs. Floor loading might become an issue, though.

    1. Tom 38

      Re: Real estate costs

      A suitable cooling system means a DC that has enough cooling and power per rack to give you what you are asking for. DCs are designed with a specific wattage per rack.

      Since everyone wants more power and cooling, if you want more than the average, your DC provider is going to ream you for it.

  15. Anonymous Coward

    I wonder...

    ...how Google-large storage arrays deal with it.

    Do they have a dude driving a minivan and pushing a trolley filled with spare drives roaming about all datacenters, one per week? By the time he finishes the last datacenter, 4 years will have passed since he visited datacenter one.

    It is like lawn-mowing... or eventually it will come to that.

  16. Otto is a bear.

    Not very green either

    Running 33 hot spares and failed drives for 4 years. I don't know if it's still true, but the longer you leave a drive idle, the less likely it is to start up when you need it.

    There are also extra space costs for having 33 drives doing nothing.

  17. Anonymous Coward

    RAID failure cases are so ugly

    And that's before you even take controller failures into account.

    I really liked Figure 2 in the paper that shows how non-orthogonal the failure cases can be. By that I mean that with RAID you can't just give a fraction of tolerated disk failures, but have to consider clusters of worst-case scenarios. Their figure isn't for standard RAID, but you can still see how their scheme carries over the non-orthogonality of current RAID implementations.

    I dabble in using Rabin's Information Dispersal Algorithm as an alternative to RAID. I've only recently added some introductory information to my repo to show what it is and how it can be better than RAID. In fact, one of the points I made was that failure analysis is a cinch with IDA. Since IDA doesn't distinguish between data and parity, your redundancy level is a simple fraction and if you know the failure rates of individual disks it's very straightforward to calculate the probability of failure of the cluster as a whole. You can obviously go nuts and apply a Poisson arrival model or use prior probabilities to examine reliability over time if you want to, but it's not necessary.
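
    To show what I mean about the failure analysis being a cinch, here's a quick Python sketch of the k-of-n calculation. It assumes independent failures, and the 45-of-55 split and 5% per-disk failure probability are just illustrative numbers:

    ```python
    # P(cluster loss) for a k-of-n scheme: the data is gone only when fewer than
    # k of the n shares survive, i.e. when more than n - k disks fail.
    from math import comb

    def cluster_failure_prob(n: int, k: int, p: float) -> float:
        """Probability that more than n - k of n disks fail (per-disk prob p)."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(n - k + 1, n + 1))

    # e.g. 55 shares, any 45 reconstruct, 5% per-disk failure over the period
    print(f"{cluster_failure_prob(55, 45, 0.05):.3e}")
    ```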

    I'm constantly amazed at the number of researchers that persist with the mentality that XOR-based data redundancy systems (i.e., all RAID systems) are the way to go, and that non-orthogonal failure cases are acceptable. I get that XOR is cheap, but so is any other O(n) algorithm (like IDA) if it's done in hardware. And it's not even the case that XOR-based systems have to have non-orthogonal failure cases. There's a thing called "Online Codes" invented by Petar Maymounkov that follows from previous work on Raptor codes. It uses two layers of XOR and gives asymptotic (probabilistic) guarantees about the recoverability of the data with a given number of erasures. It might not be well-suited to use in storage systems (or it might, if anyone bothered to look into the maths), but it at least shows that orthogonality is possible. (I'm also in the process of implementing this in my repo, since it should be good for multicasting a file across my network/storage array so that individual nodes can then do the IDA bit, along with allowing for other nested/hybrid IDA setups.)

    I'm actually reminded, as I write this, of the whole debate around neural networks back when the "Perceptron" was discovered. Research into the whole field basically stalled for quite a few years because it was proven that the Perceptron couldn't encode an XOR rule. It wasn't until multi-layer neural networks were invented that this particular problem was overcome and progress started to be made again. I wonder if there isn't a similar artificial plateau effect happening these days with RAID systems?

  18. Jim O'Reilly
    Holmes

    Painting the Titanic!

    This would have been an interesting topic 15 years ago, when RAID was our only data integrity option. Today, with erasure codes and replication, the thesis of the article is badly off base (except for the idea that drive replacement is passé) and essentially irrelevant.

    We can get the benefit of no-repair storage arrays using erasure codes. This spreads the drives over a set of appliances. Typical configurations protect against a 6-drive loss, which allows plenty of time for rebuilds, which can happen anywhere in the storage pool (with, say, Ceph or other modern storage software). There is no need for huge numbers of dedicated spares up front, since adding new boxes of drives is the solution for sparing. Replication is not quite as good, typically providing 2-drive failure protection, but again recovery is to spread the data over available space or onto empty drives in a new appliance.
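
    The capacity argument is easy to see with a quick sketch; the 10+6 layout below is an assumed example of a 6-drive-loss configuration, not a quote from any particular product:

    ```python
    # Raw capacity consumed per unit of usable data: erasure coding vs. replication.
    def overhead(data_shards: int, coding_shards: int) -> float:
        return (data_shards + coding_shards) / data_shards

    print(f"EC 10+6 (survives 6 losses):  {overhead(10, 6):.2f}x raw per usable TB")
    print(f"3-way replication (2 losses): {overhead(1, 2):.2f}x raw per usable TB")
    ```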

    Object storage doesn't require disk-level recovery - objects can be spread over existing free space on many drives.

  19. J.G.Harston Silver badge

    All well and good....

    until some PHB says "we're running out of space, just use some of those spare drives"

  20. a_milan

    Finally

    ... a scientific proof that XIV is the right idea.

  21. Anonymous Coward
    Happy

    Of course this all falls down....

    ...when you own the data centre and work just across the corridor.
