This is number one on my list of Essential DBA tasks; it’s not the most glamorous item on the list, but I would argue it’s probably the most important for this reason: the DBA is usually responsible for ensuring that backups of corporate database information occur and that those backups are usable in case of a disaster. I’ve seen instances where people have lost their jobs over this item alone. So, no excuses here:
- Make sure that your backups are setup,
- make sure that they happen,
- make sure that the backup files are safe, and
- make sure that the backup files are usable in case they are needed.
It’s not just tornados and terrorists that I’m talking about here; just about anything can cause a disaster, even your own hardware. I was once called in to help a client fix a database; it turned out that they had a hard drive “glitch” which caused a couple of pages in their SQL Server database to become corrupt. Easy, you say, just restore from the last backup before the failure…well, their IT staff never performed backups, and they were stuck. The short story is that I figured out how to BCP out the data, avoiding the corruption, and had to re-create the entire database from the BCP files. A huge amount of time was spent by this client before it got to me (including consulting with other vendors first), and they were down until it was corrected. Moral of the story…don’t assume that this is just for natural disasters or odd events like the Apocalypse; assume that you may need the backup just about any time.
Let’s break down each of these items. First, make sure that your backups are setup. This includes full database and transaction log backups. You need to make sure that there are guidelines in place that govern how often the backups are taken. This should be based on the requirements of those using the data found in the databases. How much data can they afford to lose if something bad happens? How much drive space do you have to keep copies of backups? When should they go to tape/to another site? Once these questions are answered, make sure you setup regular, scheduled backups to keep the data safe in case of disaster. Related to this is also the recovery model of each database; full or simple matters here. Use SQL Server maintenance plans, PowerShell, SQL Agent jobs, third party software, whatever, but make sure it happens for ALL databases. Whatever you setup should also be documented so that someone else can follow your notes and restore if you are not available.
Second, make sure your backups happen. Your job isn’t over once you setup the backups on a schedule. You need to make sure that the backups execute successfully every time. Yes, that means you need to monitor the job(s)…all the time. For one server that’s easy, but for multiple…you probably want to use PowerShell or SSRS or several interns (just kidding about the interns)…there are lots of options out in cyberspace or you can write your own. Years ago (using SQL Server 6.5, so I’m dating myself here) I wrote a VB 6 application that gathered the SQL Agent job information from all of my servers, including backup jobs. All I had to do was open the app each morning and look at the list; anything red was a failure that I needed to check out and correct. It made my life easy for a couple of years on that job. Please, make it easy on yourself and automate this as much as possible, for your own sanity and also so that others can perform this task in your absence. You don’t want to be Mister or Miss I-Can-Do-It-All-Myself-All-The-Time.
Third, make sure that the backup files are safe. This means that someone needs to be responsible to make sure that the files can’t get into the wrong hands. If they are stored on your network then it’s probably someone else’s responsibility to manage the network security, but you should at least ask about it. If your backups roll to tape at some point, are they safe as well? What happens if the files get into the wrong hands, do they need to be encrypted? If so, also make sure that any/all encryption keys are in a safe place and they are also backed up (and included in the documentation!).
Last, make sure that the backup files are usable. Grab one on a regular basis and test out your restore documentation. Is the documentation still accurate? Can someone else do the restore if you are not available? Are the backup files readable? Grab them from all of the different locations where they are stored (local disk, network disk, tape) and make sure…weird things happen to bits and bytes when files are transferred from one storage device to another. If they are encrypted, can you decrypt them successfully? It doesn’t do you or your company any good if the backup files you create can’t be used if needed.