Reply to post:

Oh, sugar! Sysadmin accidentally deletes production database while fixing a fault


My team runs our company's incident/problem/change solution. (Yes, the irony burns.) A few years back, we had tables in a QA DB that we no longer needed. DB administration at that level is managed by a dedicated DBA team, not the application team. We sent them a request for the table drops and, knowing our prod and test DBs had nothing to do with one another, thought nothing more of it.

Except, unbeknownst to us, the tool the DBAs use to perform such tasks connected to both test and prod systems alike, and the over-eager person involved issued DROP CASCADE against the tables in question *everywhere*. In the middle of the US morning / EU afternoon.

The only reason this did not completely destroy our production OLTP DB was that there were locks in play because of our level of user concurrency. (Logs later showed that the DBA actually tried the deletes several times when they failed.) Our prod reporting DB instance had no such protection and critical tables were wiped out. Restoring that took a long time because the tables were huge and, at the time, the reporting DB schema was not a 1:1 match with the OLTP system. (You can do that with fancier replication tools.) The reporting instance had to be restored from remote backup, which literally took days. Fortunately, for the duration, we were able to point most of our BAU features that relied on the reporting instance to the OLTP instance instead, accepting the modest risk of OLTP performance impact to keep important things working.

Happily, this event did produce both process and architecture changes in the way the DBA support tools were used and set up. And, probably, at least one staffing change. o_O

Biting the hand that feeds IT © 1998–2019