Go Team Oracle!
Oh wait... this isn't good promo?
Salesforce.com's protracted outage earlier this week caused data loss. An update on the company's status page dated May 12, 2016 20:00 UTC says data “written to the NA14 instance between 9:53 UTC and 13:29 UTC on May 10, 2016 can not be restored.” There's a tiny ray of sunshine in that announcement, because previous updates …
A 4-hour RPO is generally considered pretty tight in non-transactional environments, and the loss window here (9:53 to 13:29 UTC, about three and a half hours) is comfortably inside that, so it would seem their protections worked.
If you corrupt the data files and the archive logs, you're losing data. It's just a case of when you last put some files in a place that didn't get FUBAR'd. Moving them off every few hours is normal, and 4 hours is a common figure as it's a good balance.
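The arithmetic behind that claim can be sketched in a few lines of Python. This is illustrative only: the 4-hour interval and the timestamps are just the figures quoted in this thread, and nothing here reflects Salesforce's actual tooling.

```python
from datetime import datetime, timedelta

def worst_case_loss_window(ship_interval_hours: float) -> timedelta:
    """If archive logs are copied off-box every ship_interval_hours,
    the worst case is corruption striking just before the next copy."""
    return timedelta(hours=ship_interval_hours)

# Salesforce's stated loss ran 9:53 to 13:29 UTC on May 10, 2016:
loss = datetime(2016, 5, 10, 13, 29) - datetime(2016, 5, 10, 9, 53)
print(loss)                               # 3:36:00
print(loss <= worst_case_loss_window(4))  # True
```

In other words, a 3 hour 36 minute loss sits just inside the window a 4-hour log-shipping cadence would predict, which is why the outcome looks like the protections doing their job rather than failing.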
I'm sure that Salesforce will be crafting more expensive and complex solutions for more stringent objectives as we speak.
Wouldn't want to be a CIO explaining that they lost that amount of data if the business did not "allow" that amount of data loss though.
"Fair call. Maybe I should have said WTF was Salesforce doing allowing new data to be created while its systems were too fragile to handle it."
That has me thinking of a particular system outage which happened just before the start of work one day.
The post mortem revealed that what they should have done was say "Let's cease processing right now and start restoring", but instead they waited for someone senior enough to arrive and make that decision.
Unfortunately by the time that decision was made, things were in a real mess.
Many people are using Salesforce in transactional environments (because if you've got Service Cloud you kind of have to, in order to offer customer service support for your transactional systems). If you've built on force.com this must really hurt.
It appears that Heroku, being an acquisition with a sensible architecture, isn't affected but that's certainly being used for real-time transactions.
I am sure that Salesforce is estimating the costs in direct costs to fix it, potential pay-outs/goodwill gestures, lost brand-value, etc.
I am a bit disappointed. I thought their infrastructure would be tighter on data loss, but it's still a bit early to speak of them as not having taken proper care of things.
People think Salesforce must be running some cutting-edge, Caltech JPL or Johns Hopkins APL-style massive scale-out cluster architecture, but it is just Dell servers running Oracle. Salesforce's infrastructure was set up in the early 2000s, and that was just the way things worked in those days.
> It's not as if anyone would be stupid enough to run their mission critical applications on a cloud, is it?
It's not as if anyone has ever experienced a failure like this on their in-house managed IT systems?
Of course not: all in-house IT systems have their DR plans fully documented, with a test failover performed monthly.
> Of course not: all in-house IT systems have their DR plans fully documented, with a test failover performed monthly.
Actually we did a failover every two months on our in house system.
Yes, in-house IT can be a mess in many cases. But for any organisation with mission-critical parts kept in-house, *you* retain control of the specification and management of your live and redundant resources. You can plan for business continuity and failover under your control, to a budget and resource pool that meets your requirements. That can even stretch to worst-case scenarios like hiring a shipping container data centre or, as in our case, being able to cannibalise other, less critical services for parts while waiting for spares and an engineer to turn up within their 1/2/4/whatever-hour response window.
As an example, we specced a standby system running a single-unit tape backup/restore function (in case our autochanger failed) to have the same hardware as one of our primary servers. One day, the HP SAS caching controller on a live server failed and brought down the ERP system. While we were on the phone arranging a service call-out, another team member replaced the controller with the one from the 'spare' system. Total downtime was 12 minutes - and that was our only service break on that system in about 3 years.
When you rely on DR in a SaaS cloud-based environment, you won't likely have a dedicated engineer thinking about what they can do for you and your service - they will be taking a holistic view across the entire failed infrastructure. Unless it's something simple like a failed switch or router, the solution will most likely take longer to implement, because whole systems/backups (ha - if they exist) will be being restored rather than just *your* systems and backups.
Again with this flawed comparison. Apples and oranges, my good sir.
If your in-house IT fails, it only bothers YOU and YOUR customers.
If the cloud fails, it bothers EVERY SINGLE COMPANY RELYING ON IT.
It's a question of scale: to fuck things up you need a computer, but to really fuck things up you need a cloud.
Yes, in-house IT systems fail just the same as other systems, agreed. But it's not about the inevitable failures, it's about the recovery. In-house we don't wait for cloudy 'admins' to discover the problem, then cock it up further by trying to fix what they didn't think could break.
Funny how people always blame the database first when it goes down, but the reality is: what was it running on, and how well implemented was the DR plan? Clearly the hardware is at fault here, and as others are saying, it's running on Dell. So the question is: is Salesforce running antiquated systems, or systems that are not capable of providing maximum uptime? The problem with Dell or any of the commodity server providers is that they are just a "layer" in a complicated, multi-vendor solution, and so keeping all of the layers in sync, up to date, and highly available is virtually impossible. Probably why converged systems and even Oracle's Engineered Systems are doing so well these days.
You never blame the database first.
Blame goes "It's the Network", then "It's the Storage", then "It's Virtualisation", and finally, if there's really enough push from on high, "It's the sh**y code running the database".
It's infrastructure's fault, it has always been the way...
Seeing the profits Salesforce makes (NOT!) and the fact that these idiots are selling software for sales departments, it baffles me why they are still in business. The Salesforce software is useless: salesforce.com are the experts when it comes to their own software, and they fail to meet targets every year using it.
Now, can you numpty salesforce customers please explain why it is a good idea to buy salesforce services ? Exactly!
Don't worry, our sales guyz use it too ... asked them about it, we had a good laugh, the only explanation I got was "company policy" ... that is like the BS you hear from Windows Cleaner brigades or religious people ...
Me: "Why do you use Windows server as a file server????"
WCSE: "Company policy, we're an all MS shop, here."
Me: "The bible is supposed to convey the words of god, how come the earth is flat according to the bible, he should know, he created it."
Believer: "No, you don't understand, it's a metaphor."
Joined my current organisation three years ago and in my first week the CTO declared we are going all Salesforce for our cloudy data needs and the Salesforce people promised it was all possible and easy (there might be a little customization required but the platform is so fast to develop on honestly).
Three years on...
...Let's all guess how much of this has gone live?
Never, ever, ever believe a Salesforce sales droid
Biting the hand that feeds IT © 1998–2020