Since T-Mobile is taking the hit...
...this likely means that it's not actually M$'s fault the failure caused data loss.
The firm I'm with has about 3500 x86/64 servers plus a number of other systems, including about a dozen mainframes of varying ages. In addition to our own data processing business lines, which account for the bulk of the systems, several hundred of those servers and about 100TB of data belong to external customers. You would be SURPRISED how many of them flatly refuse the redundancy services and high availability tier systems we offer. Most agree to basic clustering, many will accept relatively reliable SAN storage (if not mirrored tier 1 RAID10 chassis), but very few will deploy tier 0 or tier 1 availability systems, few pay for redundant networking, and basically none of them have anything more advanced than simple tape backups.
We offer fully redundant systems at about 2.5 times the cost of standard systems (not bad, actually), with standard being defined as at least a tier 3 availability environment (RTO/RPO of not more than 24 hours). Customers with multi-million dollar government contracts contingent upon 4 hour recovery times will often settle for far less just to save a couple hundred grand. It's amazing how many people simply do not comprehend how disastrous a disaster can actually be. They have NO CLUE how long it really takes to restore a 40TB database were it actually to fail. They have NO CLUE how difficult a redeployment would be, and how long it would take, if a catastrophic failure (fire, major power issue, mainframe failure) happened. Their MILLIONS of dollars in contract revenue ride on the decisions to skimp and save a few grand here and a few grand there.
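To put a rough number on that restore time, here is a back-of-the-envelope sketch in Python. The 200 MB/s sustained restore rate is purely an assumed figure for illustration (real rates depend on the media, the network, and the database), and the function names are made up for this example. It only counts raw data transfer, not index rebuilds, log replay, or verification, so treat the result as a lower bound.

```python
# Rough restore-time estimate: raw data transfer only.
# All throughput figures are hypothetical assumptions, not measurements.

def restore_hours(size_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to move size_tb (decimal TB) at a sustained rate in MB/s."""
    size_bytes = size_tb * 1e12
    seconds = size_bytes / (throughput_mb_s * 1e6)
    return seconds / 3600


def required_throughput_mb_s(size_tb: float, rto_hours: float) -> float:
    """Sustained MB/s needed to move size_tb within rto_hours."""
    return (size_tb * 1e12) / (rto_hours * 3600) / 1e6


def meets_rto(size_tb: float, throughput_mb_s: float, rto_hours: float) -> bool:
    """True if the raw transfer alone fits inside the recovery time objective."""
    return restore_hours(size_tb, throughput_mb_s) <= rto_hours


if __name__ == "__main__":
    size_tb = 40          # database size from the post
    assumed_rate = 200    # MB/s, an assumed sustained restore rate
    rto = 4               # hours, the contractual recovery time from the post

    print(f"Restore time: {restore_hours(size_tb, assumed_rate):.1f} hours")
    print(f"Rate needed for a {rto} hour RTO: "
          f"{required_throughput_mb_s(size_tb, rto):.1f} MB/s")
    print(f"Meets RTO: {meets_rto(size_tb, assumed_rate, rto)}")
```

At the assumed rate the transfer alone runs to more than two days, which is why a 4 hour recovery commitment on a database that size generally implies replication rather than restoring from backup.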
I would not doubt at ALL that T-Mobile did not pay M$ the appropriate fees to have 4 hour full data recovery and full system replication across multiple sites. They probably valued the user data very low, figuring some clause in a contract about "back up your own data" covered their asses, and did not account for the PR nightmare a data loss is for a hosting firm.
Bean counters simply don't understand the logistics of server availability and data recovery. They see things like "3-4 times the cost" to move from a tier 2 system to a tier 0 system, and simply figure it's such a small change that it's worth saving the million or two on deployment. Then Microsoft says "hey, here it is in writing where we suggested a more resilient design including full data replication and 14 day journaled writes with live rollback support, and here's your signature refusing that and saying T-Mobile will only pay for tape backups" and follows it up with "and here's the clause absolving us of data loss and SLA requirements because of your refusal to pay, and good luck with your customers..."