Only appropriate icon
Have one of these, "Bert" -->
The days are drawing in and the mornings are getting darker – so why not take a dose of Who, Me? to help lighten up your day. In this week’s instalment of El Reg’s weekly column filled with embarrassing reader cock-ups, we meet “Bert”, who was working at a large brewing firm at the time in question. “I was a junior dev as …
We had a file system that held Samna/Ami Pro documents that constituted all the specs and programming documents.
We were running out of space, so I deleted what I thought were some unneeded docs.
Except I ran rm -r on the wrong dir.
I recovered from a 3-month-old backup.
>Except I ran rm -r on the wrong dir
>I recovered from a 3-month-old backup
Sure? There have been a few instances where I had to redo an editing job I was pretty certain I had already completed, and the workflow logs indicated I had done so too. Strangely, when asking IT support, the answer was invariably "you must be mistaken." I am not so sure.
And, equally suspiciously, the last backup they could document was about as old as when the editing was supposed to have taken place.
I remember deleting the one and only tape copy of an old database.
Way back in the late 90s we had tons of physical servers, with a tape drive and backup software installed on five of them. I dutifully inserted the tape for that night's backup into one of the drives, right-clicked, and chose erase.
Nothing. No lights on the tape drive to show a job was in progress.
So I repeated the steps.
I did this at least three more times until I saw a tape drive below the one I was working on light up.
Good ol' Backup Exec allowed anyone to point the software at any tape drive on another server in the network, so long as that server was also running BE.
I must have erased that lone backup for the old DB at least four times. That tape was definitely erased!
Once I fixed the error and erased the correct tape I told my boss what I had done. His reaction?
"Good. That's what they get for messing around with the backups and not telling us."
Others do, Sabroni? Really? Care to show me who explained why Microsoft allowed a file vital to the database to be deleted while the system was running? Or is that reasoning, as I suggested, only known to Microsoft?
But thank you for your input. I'm sure it was intended in the spirit of diversity, inclusion, welcomeness, and openness.
"Why would database software be written such that deleting an ancillary file ( such as a log file of historic steps) cause it to fall over?"
Not familiar with that particular engine but one possibility is that it contains the log of current transactions, i.e. WORK that's STARTed but not yet COMMITted or ROLLed BACK. That puts the engine in a bit of a quandary as to what it's working on. The longer term need for the logs is to roll the database forward after a restore from backup and there ought to be a means of archiving logs for use as and when required instead of letting them grow indefinitely.
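That quandary can be seen in miniature with any transactional engine. Here's a minimal sketch using Python's built-in sqlite3 module, standing in for the engine under discussion purely as an analogy: work that is STARTed but never COMMITted is undone from the journal when the connection dies.

```python
import sqlite3, os, tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn = sqlite3.connect(path, isolation_level=None)  # manage transactions by hand
conn.execute("CREATE TABLE accounts (balance INTEGER)")

conn.execute("BEGIN")                                # WORK STARTed...
conn.execute("INSERT INTO accounts VALUES (100)")
conn.close()                                         # ...but never COMMITted

conn = sqlite3.connect(path)
rows = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(rows)  # 0: the engine rolled the uncommitted work back from its journal
```

Lose the journal mid-transaction and the engine has no way to perform that rollback, which is exactly the quandary.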
>The longer term need for the logs is to roll the database forward after a restore from backup and there ought to be a means of archiving logs for use as and when required instead of letting them grow indefinitely.
Running a full backup will normally consolidate the transaction logs on a MS database (SQL, Exchange, etc) server. If you run incremental or differential backups, it doesn't.
"Running a full backup will normally consolidate the transaction logs on a MS database (SQL, Exchange, etc) server. If you run incremental or differential backups, it doesn't."
I'm more used to Informix where you back up the logs separately. Also the restore of a full backup plus incremental takes you to a fully check-pointed position. You only need the logs since the last incremental was made to roll forward from there. And you were doing your incrementals every night or even more frequently, weren't you?
"Why would database software be written such that deleting an ancillary file ( such as a log file of historic steps) cause it to fall over?"
Because the transaction log of an SQL server is anything BUT an "ancillary file".
SQL Server transaction logs are actually split into virtual log files, which are used in rotation. All transactions are written sequentially, and eventually the log wraps back to the start / first virtual file and starts over, overwriting old transactions.
Besides being where current, uncommitted transactions are serialized (allowing rollback, or recovery after a power loss or unexpected shutdown), the transaction log serves multiple other purposes, and that's where "runaway transaction log files" tend to happen to people. Those are actually an indicator of something not being configured correctly, or of something else being broken.
The issue is that those virtual log files can only be reused once they can be cleared; and a few things control how and when they can be cleared.
One thing is an ongoing, uncommitted transaction. Put simply, if you have 10 GB of uncommitted transactions and 5 GB of transaction log file... well, the file will grow. That's normal; otherwise the transaction can't happen.
But SQL Server also has different recovery models (related to how you backup/restore a database).
A "full recovery" database is designed in such a way that by combining (full / differential) backups and transaction log backups, you can do point in time restores down to an exact transaction.
To do that, however, all transactions must also be backed up! So until a specific virtual file is fully backed up, it can't be freed for reuse. It's typical for fully logged databases to have a very frequent transaction log backup regime (think every 5-15 minutes), because the transaction log must hold all transactions until they're backed up.
If someone sets a database to be fully logged, and then does not get a transaction log backup schedule going, well... their log file will grow indefinitely, because... well, because that's what they asked it to do.
Another reason is replication. The various replication schemes that exist in SQL Server are based on the transaction log: transactions that are committed on the source server are transferred to the transaction log of the target server, and then also committed to the data files of the target server. Any virtual log files that contain data not yet replicated to even just one of the target servers can't be cleared for reuse, so broken replication configurations or failed servers can be a reason for transaction log file bloat.
Oh well, I could keep blabbing about that, but yeah, confusing the transaction log with just "log files" is a classic mistake that bites every SQL dabbler in the ass eventually :)
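The reuse rule described above can be sketched with a toy model (invented names and numbers, nothing like SQL Server's actual internals): a log segment is freed only by a log backup, and never while an open transaction still pins it.

```python
def live_segments(n_writes, seg_size, backup_every=None, tx_open=False):
    """Toy model: count log segments still live after n_writes records."""
    live = 1            # the current segment
    since_backup = 1    # segments written since the last log backup
    for w in range(1, n_writes + 1):
        if w % seg_size == 0:          # current segment is full: move on
            live += 1
            since_backup += 1
        if backup_every and w % backup_every == 0 and not tx_open:
            # a log backup frees every segment except the current one
            live -= since_backup - 1
            since_backup = 1
    return live

print(live_segments(100, 10))                                  # no log backups: grows
print(live_segments(100, 10, backup_every=20))                 # regular backups: stays small
print(live_segments(100, 10, backup_every=20, tx_open=True))   # pinned by an open tx: grows
```

The three cases mirror the thread: no backup schedule, a healthy schedule, and a long-running transaction (or broken replication) pinning segments so backups can't free them.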
Not familiar with that particular engine but one possibility is that it contains the log of current transactions, i.e. WORK that's STARTed but not yet COMMITted or ROLLed BACK.
Pretty much spot on. Depending on your database recovery model, it either contains the data for uncommitted transactions, or a log of everything that has been done in the database since the last full backup (enabling a restore to any point in time since then, and fancy things like database mirroring / log shipping).
The former should mean small(ish) log files, which usually contain a fair amount of empty space, corresponding to the largest transaction in the file's history. That space can be freed up by using the database commands to shrink the file.
The latter means big log files, especially if you're not regularly backing up the database.
On a modern file system, the log file will be locked while the database is online, safe from errant presses of the Delete key in file explorer. I'm guessing that on Win2000, not so much...
I've worked with SQL Server since SQL Server 6.5 and never been able to delete a log file when the SQL Server service is running, so I find the entire story a bit unlikely to be honest.
As you say, a SQL Server log file isn't just an ancillary file of historic steps. It maintains the state of active transactions and is a critical part of the database.
I might have something of an answer. As a programmer, say you have a log file for activity recording purposes. When the server comes up, you check that the log file is there. If it isn't, you create a new one and the server is good to go.
But when the server is up and running, why check for the _existence_ of the log file? You know it was created at start, so it should be there. You also have a tendency to foolishly assume that the people who will be administering the server have a clue and won't be deleting files willy-nilly.
Also, just recreating the log file and continuing operation is not really a good idea. If the log file disappears while the server is running, it means that there is a problem and it is actually better to crash the server to bring attention to it, rather than just trudging on and ignoring such an issue.
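That fail-fast reasoning, as a sketch (all names here are hypothetical, not any real server's code): check for the log at startup, but if it vanishes mid-flight, stop loudly rather than silently recreating it.

```python
import os, tempfile

class ToyServer:
    """Hypothetical server that fails fast if its log file disappears."""
    def __init__(self, log_path):
        self.log_path = log_path
        if not os.path.exists(log_path):       # startup: create if missing
            open(log_path, "w").close()

    def record(self, line):
        # Mid-flight, the log's absence means something is badly wrong:
        # crash and draw attention rather than trudge on without it.
        if not os.path.exists(self.log_path):
            raise RuntimeError("log file disappeared; refusing to continue")
        with open(self.log_path, "a") as f:
            f.write(line + "\n")

path = os.path.join(tempfile.mkdtemp(), "server.log")
srv = ToyServer(path)
srv.record("running fine")
os.remove(path)                                # someone deletes files willy-nilly
try:
    srv.record("still running?")
    crashed = False
except RuntimeError:
    crashed = True
print(crashed)  # True
```

The deliberate crash is the point: a missing log is an administrative problem that should be surfaced, not papered over.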
The transaction log file is NOT an "ancillary" part of the database. It's a critical file, which is just as important as the data files, since the state of the database is represented by the combination of data files AND transaction log file. The transaction log file contains changes to the data that have been committed, but that may not have been applied to the data files yet.
A database log file is very different from an OS or application log, despite the common name. It's not an ancillary record of "historic" steps; it's a live running record of ongoing activities, with all the data required either to make them "done" (committed) or to cancel them by rolling them back while keeping full data coherency. It can also be used to roll forward: replaying the actions to restore the state of the database to a given point in time, from a backup which may not contain the latest data. It is more like the "journal" of a journaled file system.
Usually, any change to the database is registered first in the log file, and only later in the data files. Log writes must be fast (that's why log files are often placed on faster storage) and committed to disk, so that if the instance crashes, the database can be put back into a coherent state by reading the log data and reapplying or rolling back whatever is needed (data blocks are often kept in memory for performance reasons and written to disk later). Log files are usually sequential files with a simpler structure, which are much faster to write than the far more complex, random-access data files.
Remove it, and the database no longer knows where to write that critical data; to avoid corruption it will usually refuse to perform any activity until a log file is available again. There may be ways to recreate an empty log file, but of course anything not yet applied to the database has been lost, and backing the database up *immediately* is needed, because there will be no way to roll forward from the last backup.
There are many reasons to protect log files even more than data files....
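The journal analogy above is easy to see with SQLite (via Python's built-in sqlite3; SQLite's rollback journal plays an analogous role to SQL Server's transaction log, not an identical one): the journal appears on disk for the duration of a write transaction, then vanishes once the work is committed.

```python
import sqlite3, os, tempfile

db = os.path.join(tempfile.mkdtemp(), "journal.db")
conn = sqlite3.connect(db, isolation_level=None)  # manage transactions by hand
conn.execute("CREATE TABLE t (v INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
during = os.path.exists(db + "-journal")  # rollback journal exists mid-transaction
conn.execute("COMMIT")
after = os.path.exists(db + "-journal")   # removed once the work is committed
conn.close()
print(during, after)
```

Delete that journal mid-transaction and the engine loses its only way to restore a coherent state, which is why such files deserve more protection than the data files, not less.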
The reason is that many SQL Server instances are misconfigured and never properly backed up. In other databases, keeping the size of log files under control is easier, not least because you usually wanted to put them on very fast (albeit usually smaller) storage, especially for systems doing a lot of transactions. But SQL Server was, and is, often used as a fire-and-forget database, and issues start to arise later.
Once, years ago, I was called out to a manufacturing plant far away because their SQL Server, where they wrote a lot of telemetry data coming off the manufacturing machines, was slowing to a crawl. They thought it was some arcane issue that would take days to solve. Actually it was just a few bad DB design decisions and a lack of proper maintenance; after a few hours spent putting it into a good and safe configuration it was fast again, and I'm quite sure the local acting DBA was later called in for a word or two....
"I'm quite sure the local acting DBA was later called for a word or two"
The local acting DBA would probably have had a good response if he dared use it: "Why don't you send me on a training course?". That would have put his manglement in a real quandary: if they did that, they might have to pay him the going rate for a trained DBA.
"The local acting DBA would probably have had a good response if he dared use it: "Why don't you send me on a training course?". "
A number of years ago we were supposed to be doing an install at a new client. The usual procedure for SQL Server was to send our database creation script to the DBA for comment, with a note explaining that we would work with them to ensure that the resulting database met all requirements. The database creation script was extensively commented.
The result was an extremely irate phone call from the DBA on Friday evening at 5 (so doubleplus ungood). He hadn't read the bit about "for comment". What did we mean sending him this to install? Who did we think we were? There was no way he would install stuff from such totally incompetent people and his MD would hear all about it on Monday.
He continued on transmit only for quite a while and eventually I managed to find out what he was objecting to. We specified the log file dimensions and parameters (with various "insert your preferred value here" comments, but he hadn't bothered to read those). Our default maximum log file size was 2GB.
His view was that he refused to have log files on databases because they took up room.
So how were we supposed to roll back bad transactions etc.? Nothing to do with him, company policy.
We never did get the job, which was based in an East Anglian city well known for football and insurance. It was eventually leaked back to me that the DBA would be at a loss to display any qualifications on SQL Server. But it did cause me to cancel my car insurance policy and move it to another company.
>The local acting DBA would probably have had a good response if he dared use it: "Why don't you send me on a training course?".
That is a high risk strategy that could backfire badly. Management might get the idea that they have gotten so far with incompetent DBAs that they might just as well fire him and hire in our friend Anonymous DBA whenever the need would arise. And since slimming down the work force could net them a bonus I'd say the DBA would be heading for a role as the second part of a three course meal with an apple in his mouth.
I can do that ... Ever lose control of a handtruck loaded with punchcards containing a post-grad's entire dataset from an overnight run at SLAC? I have. Down a flight of stairs (the elevator/lift was down for maintenance). I was a teenager, landed a position as a summer intern ... I thought for sure I would be out the door.
After giving me shit for about half an hour as we picked up the cards, Don (the post-grad) relented & demonstrated the proper care & feeding of a card sorter, and explained how columns 73-80 were used. I'm pretty sure that's where I started developing my belt-and-suspenders (belt-and-braces to you Brits) approach to data retention. I'm not paranoid about it, but I'm getting there ...
I once worked for a company which used a mainframe in the Midlands to run FEA. One day two engineers set off with a full set of program and data cards in the boot, in two boxes. On the way there, someone drove out in front of them and they had to do an emergency stop.
On arrival on site it proved necessary to turn round and go back again because, of course, the boxes had ruptured and the cards were distributed all around the boot.
After that more care was taken in ensuring data integrity.
Biting the hand that feeds IT © 1998–2019