I wonder why they had 4 SSD drives for FAST Cache? I think they are RAID 1, so that means they didn't bother with a spare. I know SSD drives are expensive but come on!!
Tieto's five-day outage disaster started with multiple failures of its EMC VNX5700 array's FAST Cache, according to a Finnish source close to the matter. Tieto is a major IT services organisation across Scandinavia and the Nordic region – although it also provides services globally – and pulls in net sales of SEK17bn (£1.59bn …
Hardly surprising with the smell of lawsuits wafting on the breeze.
Best you can hope for, once their PR and legal lads agree on the wording, is a few words that actually say nothing in an astonishingly vague way, padded with some background puff lifted verbatim from the product literature.
As has been pointed out in the article, the service solution appears to have been inadequate to mitigate the risks of running such a workload on one array and failed to consider the recovery times resulting from possible failures.
If the system wasn't replicated then prudence would have suggested that RAID6 was a better protection method for the pool. Even if it was replicated, RAID6 is a good idea given the time required to move 450, 600 or more GBytes around. Poor prudence rarely has anyone's ear.
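To put a rough number on "the time required to move 450, 600 or more GBytes around", here is a back-of-the-envelope sketch. The function name `rebuild_hours` and the 50 MB/s sustained rebuild rate are assumptions picked purely for illustration, not figures from the article or from EMC; real rebuild rates depend on drive type and on how busy the array is.

```python
def rebuild_hours(capacity_gb, rate_mb_per_s):
    """Hours needed to copy capacity_gb at a sustained rate_mb_per_s.

    Uses decimal units (1 GB = 1000 MB). Both inputs are illustrative
    assumptions, not measured values.
    """
    seconds = capacity_gb * 1000 / rate_mb_per_s
    return seconds / 3600


# Hypothetical rebuild rate of 50 MB/s for the drive sizes mentioned above.
for size_gb in (450, 600):
    print(f"{size_gb} GB at 50 MB/s: {rebuild_hours(size_gb, 50):.1f} hours")
```

With single-parity protection, the whole of that window is time in which one more drive failure in the same group loses data. RAID6 tolerates a second failure during that window, which is the point being made above.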
>> What needs to be stressed is that Tieto's DR processes were dreadfully inadequate and obviously untested for the eventuality of such a failure. Lawsuits over data loss and business interruptions at Tieto's affected customers are bound to follow. ®
I suspect that they are probably better at writing disclaimers than they are at developing DR plans.
What's a little concerning about this to me, and some of the other rumors about failed SSD drives in FAST Cache causing big problems, is that FAST Cache is supposed to be a "cache". When hot blocks get promoted into FAST Cache, the EMC folk I spoke with in the past said the data was copied, not moved. That would cover the reads. As far as writes go, new updates to that block were supposed to get flushed down to spinning disk just as regular cache works. Your primary copy of the data wasn't supposed to be living in FAST Cache and susceptible to data loss. Additionally, if a FAST Cache drive failed and there was no hot spare (which appears to be the case here), FAST Cache was immediately supposed to go into read-only mode.
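For anyone who finds that easier to follow as code, here is a toy model of the behaviour described above: promotion copies rather than moves, cached writes get flushed down to disk, and a drive failure with no hot spare drops the cache into read-only mode. This is only a sketch of what was described to me, not EMC's implementation; the class `ToyFastCache` and all of its method names are made up for illustration.

```python
class ToyFastCache:
    """Toy model of the described behaviour. Not EMC's implementation."""

    def __init__(self, hot_spares=0):
        self.disk = {}          # spinning disk: block -> data (primary copy)
        self.cache = {}         # SSD cache: block -> data (a copy only)
        self.dirty = set()      # cached blocks not yet flushed to disk
        self.hot_spares = hot_spares
        self.read_only = False

    def promote(self, block):
        # Promotion copies the block; the disk keeps its own copy.
        self.cache[block] = self.disk[block]

    def read(self, block):
        if block in self.cache:
            return self.cache[block]
        return self.disk[block]

    def write(self, block, data):
        if block in self.cache and not self.read_only:
            # Write is absorbed by the cache and flushed to disk later.
            self.cache[block] = data
            self.dirty.add(block)
        else:
            # Write goes straight to disk; drop any stale cached copy.
            self.disk[block] = data
            self.cache.pop(block, None)

    def flush(self):
        # Push every dirty cached block down to spinning disk.
        for block in self.dirty:
            self.disk[block] = self.cache[block]
        self.dirty.clear()

    def drive_failed(self):
        if self.hot_spares > 0:
            self.hot_spares -= 1    # rebuild onto the spare, stay read/write
            return
        self.flush()                # no spare: flush, then stop caching writes
        self.read_only = True


array = ToyFastCache(hot_spares=0)
array.disk["blk7"] = "v1"
array.promote("blk7")
array.write("blk7", "v2")   # absorbed by the cache, disk still holds "v1"
array.drive_failed()        # no spare: dirty data flushed, cache read-only
array.write("blk7", "v3")   # now goes straight to disk
```

If the real thing behaved like this model, losing cache drives should cost performance, not data, which is why the reported outcome is concerning.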
The title of the chart, "Verklig Prestanda Jämnförelse", is not even correct Swedish. It should be spelled "Verklig Prestandajämförelse". Just to give people an inkling of the competence level of this company when they can't even formulate correct sentences with real words...
Their technical skills are evidently a good match for their spelling abilities.