[Editor's Note: This column was largely adapted from Chapter 17 of Managing Microsoft Exchange Server, ISBN 1565925459. The information appears here courtesy of the publisher, O'Reilly & Associates.]
Because this issue of Windows 2000 Magazine highlights disaster recovery, I thought I'd give you a quick rundown of the basic planning procedures to follow before disaster (i.e., a crashed Microsoft Exchange Server system) strikes, as well as the first steps to take after the fall.
Planning Prevents Poor Performance
Failure to plan is the primary cause of permanent Exchange Server data loss. Consider a typical single-site environment with two servers. A fire destroys one server. The server's backup tapes were sitting next to it instead of in an offsite (or fireproof) vault. That server is irrecoverable because its administrator didn't plan adequately, not just because it roasted like a marshmallow. Ask yourself the following questions, then take the appropriate steps based on your answers.
How long can I afford for my server to be down? The less downtime you can tolerate, the more preparation you need. For example, if you can't afford for your mail server to be down for more than 4 hours, you need to think about ways to reduce your recovery time (e.g., using hot spares, clustering) or change your backup strategy to permit faster restores.
Do I have adequate replacement hardware? If your primary server is a large quad-processor box with a 60GB store, what happens when you need to restore all 60GB of data to another machine? The best solution is to keep a clone of your standard-server configuration so that you can use it as a recovery server, but you might need to find creative workarounds if you don't have any spares. Of course, you also need to ensure that you have the right backup hardware and software on your recovery server, or you might not be able to restore your backup.
Do I make regular backups and make them often enough to capture all the changes that occur on my server? Do the backups include system information such as the domain SAM and server Registry? For information about making proper backups, see Getting Started with Exchange, "The Six Deadly Backup Sins," April 2000.
Do I make the right kind of backups? Consider whether your backups adequately capture the data you need. If the answer is no, come up with a new plan.
Do I regularly test my backups to make sure that they work properly? Do I regularly review the backup logs for errors? Do I practice disaster recovery to prevent surprises? The answer to all three of these questions had better be yes, or you're headed for trouble. Change your ways now, while you still can.
Are my backup tapes secure? Ideally, keep multiple backup sets and store some of them in a secure offsite location. And be sure you have at least one spare tape drive that can restore the tapes.
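The first two questions in this list lend themselves to a quick back-of-the-envelope check. The Python sketch below estimates how long a restore takes for a store of a given size and compares the result with the downtime you can tolerate. The function names, the 10GB-per-hour restore throughput, and the 1.5 hours of fixed overhead are illustrative assumptions, not measurements; substitute figures from your own test restores.

```python
def restore_hours(store_gb, throughput_gb_per_hour, overhead_hours=0.0):
    """Estimate the hours needed to restore a store of the given size.

    overhead_hours covers fixed work such as rebuilding the OS or
    locating tapes (an assumed figure, not a measured one).
    """
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return store_gb / throughput_gb_per_hour + overhead_hours


def meets_target(store_gb, throughput_gb_per_hour, max_downtime_hours,
                 overhead_hours=0.0):
    """Return True if the estimated restore fits within the allowed downtime."""
    estimate = restore_hours(store_gb, throughput_gb_per_hour, overhead_hours)
    return estimate <= max_downtime_hours


if __name__ == "__main__":
    # 60GB store, assumed 10GB/hour restore rate, assumed 1.5 hours overhead
    hours = restore_hours(60, 10, overhead_hours=1.5)
    print(f"Estimated restore time: {hours:.1f} hours")
    print("Fits a 4-hour limit:", meets_target(60, 10, 4, overhead_hours=1.5))
```

With those assumed numbers, the 60GB store needs roughly 7.5 hours to come back, well over a 4-hour limit. A result like that is the signal to look at hot spares, clustering, or a backup strategy that permits faster restores.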
Before you can build a truly comprehensive plan, you must understand the mechanics of Exchange disaster recovery. I suggest that you read Microsoft's white paper, "Microsoft Exchange Disaster Recovery" (http://www.microsoft.com/exchange/55/whpprs/backuprestore.htm), as soon as you can. In the meantime, the following basics will give you the knowledge you need to get started on an appropriate plan.
What to Restore?
You can cleanly separate recovery operations into two tasks: recovering the OS and recovering the Exchange Server data. Sometimes you need to perform both tasks, and sometimes you need to perform only one task.
The first task is recovering the OS. If the Exchange Server database isn't damaged and you can restore the OS without touching the Exchange Server installation, a simple OS restoration is all you need. You're more likely to be in this favorable situation if your disk configuration puts the OS installation, the Exchange Server databases, and the transaction logs on separate physical disks and if you use recovery methods (e.g., a parallel Windows NT installation) that help you recover NT quickly. For that reason, I recommend keeping the OS, transaction logs, and Exchange databases on separate physical-disk subsystems whenever possible.
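As a first sanity check on that separation, you can compare the drive letters that the OS, transaction logs, and databases use. The Python sketch below is a minimal illustration: the role names and example paths are hypothetical, and a matching drive letter is only a hint. Two drive letters can still be partitions on the same physical disk, so confirm the result against your actual disk configuration.

```python
import ntpath  # Windows path rules, usable on any platform


def drives_in_use(paths):
    """Map each role (e.g. 'os', 'logs') to the upper-cased drive of its path."""
    return {role: ntpath.splitdrive(path)[0].upper()
            for role, path in paths.items()}


def shared_drives(paths):
    """Return {drive: [roles]} for every drive that hosts more than one role."""
    by_drive = {}
    for role, drive in drives_in_use(paths).items():
        by_drive.setdefault(drive, []).append(role)
    return {drive: sorted(roles)
            for drive, roles in by_drive.items() if len(roles) > 1}


if __name__ == "__main__":
    # Hypothetical layout: logs and databases share drive D:
    layout = {
        "os": r"C:\WINNT",
        "logs": r"D:\exchsrvr\mdbdata",
        "databases": r"d:\exchsrvr\mdbdata",
    }
    for drive, roles in shared_drives(layout).items():
        print(f"{drive} is shared by: {', '.join(roles)}")
```

An empty result means that no two roles share a drive letter, which is necessary but not sufficient for true physical separation.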
Let me digress to point out a disaster-recovery obstacle that affects many sites: using a PDC as your Exchange server. To restore an Exchange server, you must have access to the SAM for the server's domain. If your Exchange server is a member server or BDC, you can probably access the SAM without difficulty (provided a domain controller is available when you do the restore). But if your Exchange server is a PDC, beware. If the PDC fails and you need to reinstall NT to fix it, you'll have difficulty recovering your Exchange Server configuration: reinstalling NT creates a new SAM database, and the old SIDs that Exchange Server needs disappear.
The second task involves recovering the Exchange Server database and log files. To successfully restore an Exchange Server database, you must follow a couple of ironclad rules:
You must have a complete backup: either a full backup alone or a full backup combined with the appropriate incremental or differential backups.
You must have disabled circular logging before the failure, and you must have access to the log files, either on their original disk or from a recent backup. (Circular logging overwrites old log files. You can still restore a server on which circular logging was enabled, but you won't have a complete set of log files available at restore time.)
The first rule is self-evident: Without a good backup, you're toast. The second rule makes sense if you think about the transaction log files' purpose: to capture transactions that the Exchange store process hasn't committed to the store. If you have a store file backup and a complete set of log files, you can play back the log file transactions to restore the recovered database to the prior status quo. Guess what? If you don't have the log files, you lose any uncommitted transactions.
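To make the second rule concrete, here is a deliberately simplified Python model of log playback. It is a toy illustration, not a description of how the Exchange store works internally: the dictionary standing in for the database, the numbered log generations, and the function names are all assumptions made for the example. The model replays logs only as long as the sequence following the backup is unbroken, which is why overwritten log files cost you data.

```python
def keep_latest(logs, count):
    """Mimic circular logging: keep only the newest `count` log generations."""
    if count <= 0:
        raise ValueError("count must be at least 1")
    newest = sorted(logs)[-count:]
    return {generation: logs[generation] for generation in newest}


def replay(backup_state, backup_generation, logs):
    """Replay contiguous log generations that follow the backup.

    logs maps a generation number to a list of (key, value) writes.
    Returns the recovered state and the last generation replayed.
    """
    state = dict(backup_state)
    generation = backup_generation + 1
    while generation in logs:  # stop at the first missing generation
        for key, value in logs[generation]:
            state[key] = value
        generation += 1
    return state, generation - 1


if __name__ == "__main__":
    backup = {"msg1": "hello"}  # store contents at backup time (generation 1)
    logs = {
        2: [("msg2", "a")],
        3: [("msg3", "b")],
        4: [("msg1", "edited")],
    }
    print("All logs kept:   ", replay(backup, 1, logs))
    print("Circular logging:", replay(backup, 1, keep_latest(logs, 1)))
```

With every generation available, the model recovers all three later writes. When only the newest generation survives, the gap after the backup stops playback immediately, and the recovered state is just the backup.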
Restoring the OS
Imagine that you're trying to fix a downed server. Your Exchange Server data is still safe (as far as you can tell), and you still have access to the domain SAM, but you need to reinstall NT to get the server back up. Here's what to do.
You now have a clean installation of NT and Exchange Server. However, don't start the Exchange services yet. If your Exchange Server database and log files are intact, you're in good shape. If not, you'll still need to reload the Exchange Server store data from your backups.
If you have an offline backup, make sure the Exchange Server services on the recovery server are still stopped. Next, copy the database and log files to their proper locations, restart the Directory Service (DS) and System Attendant service, then run isinteg -patch. (For information about running Isinteg, see Getting Started with Exchange, "The Sorcerer's Apprentices," May 2000.) After Isinteg runs, restart the IS.
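Because the order of these steps matters, it can help to write the sequence down as a checklist before you need it. The Python sketch below only builds and prints that checklist; it runs nothing. The service names (MSExchangeSA, MSExchangeDS, MSExchangeIS) are the ones I would expect on an Exchange Server 5.5 system, and starting the System Attendant before the DS is my assumption about service dependencies, so verify both on your own server.

```python
# Assumed service names; confirm them on your own server before relying on them.
SYSTEM_ATTENDANT = "MSExchangeSA"
DIRECTORY_SERVICE = "MSExchangeDS"
INFORMATION_STORE = "MSExchangeIS"


def offline_restore_plan():
    """Return (description, command) pairs; command is None for manual steps."""
    return [
        ("Confirm that the Exchange services are still stopped", None),
        ("Copy the database and log files to their proper locations", None),
        ("Start the System Attendant", ["net", "start", SYSTEM_ATTENDANT]),
        ("Start the Directory Service", ["net", "start", DIRECTORY_SERVICE]),
        ("Patch the information store", ["isinteg", "-patch"]),
        ("Start the Information Store", ["net", "start", INFORMATION_STORE]),
    ]


def format_plan(plan):
    """Render the plan as numbered lines for a printed runbook."""
    lines = []
    for number, (description, command) in enumerate(plan, start=1):
        suffix = "  ->  " + " ".join(command) if command else "  (manual step)"
        lines.append(f"{number}. {description}{suffix}")
    return lines


if __name__ == "__main__":
    for line in format_plan(offline_restore_plan()):
        print(line)
```

Keeping the plan as data rather than as a script that executes commands is intentional: during a real recovery you want to confirm each step by hand before you move to the next one.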
Restoring the Database and Log Files
How hard you need to work to restore an Exchange Server backup depends on how you stored your database and log files (i.e., whether your databases and transaction logs are on separate physical disks and, if so, whether one or both disks failed). Repairing a failed database disk is probably the easiest type of recovery. All you need do is perform the following steps:
If your transaction log disk failed, you'll probably end up losing some data. When you lose the disk that holds the IS logs, perform the following steps:
What if the logs and databases resided on the same physical disk or if both disks failed? Just follow both sets of steps, in order.
Time to Get Busy
I can't summarize in this short column everything about Exchange Server disaster recovery, but I've introduced you to the basics. I encourage you to study this article and read (or reread) Microsoft's disaster-recovery white paper, then write a detailed custom recovery plan and regularly practice recovery. Also, a $200 call to PSS for a helpful walk through the recovery process might be the cheapest insurance you ever buy. (For tips to get the most out of a call to PSS, see Getting Started with Exchange, "7 Steps to Using Tech Support," June 2000.) The help you can get now will be less expensive than the help you'll need after a bungled restoration.