You know that someday disaster could strike your Exchange environment, probably at the worst possible time. Whether your Exchange organization is large or small, losing mail services has a big impact on your business. These three tips will help you design, plan, test, and implement an Exchange-specific disaster recovery plan. For additional tips, see the Web-exclusive sidebar "More Exchange Disaster Recovery Tips," http://www.windowsitpro.com/microsoftexchangeoutlook, InstantDoc ID 49606.

Tip 1: Assess Required Service Levels
Email is a vital function, perhaps never more so than when disaster strikes and mail services aren't available. You need to make sure all email users at all levels of the business agree about the response times and service levels needed. Clearly explain to users how IT will restore email services in different disaster scenarios.

Recovery time will depend largely on how long it takes to recover Active Directory (AD), the Exchange system, and the Exchange databases from backup media. Therefore, to gauge response time, first calculate the total time needed to recover a complete database, such as an Information Store (IS), and a complete server in optimum circumstances. Then build in additional recovery time for more severe disasters, to accommodate dependencies such as faulty or inoperative network infrastructure and other failing components (e.g., SANs, NICs). To shorten recovery time, you might also opt to decrease database sizes, which usually means creating additional databases and storage groups (SGs). A server can have as many as four SGs, and each SG can hold as many as five databases. Because each SG creates its own log files, you'll then want to place each SG's transaction-log set on its own dedicated disk. Spreading the storage load in this way can help you recover the databases more quickly.
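To make the arithmetic concrete, here's a minimal Python sketch of the estimate described above. The function names, the restore throughput, the server rebuild time, and the contingency factor are all hypothetical placeholders; substitute figures you've measured in your own restore tests.

```python
def restore_hours(size_gb, throughput_gb_per_hour):
    """Best-case hours to restore size_gb of data from backup media."""
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return size_gb / throughput_gb_per_hour


def estimate_recovery_hours(database_sizes_gb, throughput_gb_per_hour,
                            server_rebuild_hours=0.0, contingency_factor=1.0):
    """Hours to rebuild a server and restore its databases one after another.

    contingency_factor scales the best-case figure to allow for failing
    dependencies such as network or SAN problems (an assumed multiplier).
    """
    restore = sum(restore_hours(size, throughput_gb_per_hour)
                  for size in database_sizes_gb)
    return (server_rebuild_hours + restore) * contingency_factor


# Hypothetical figures: 80 GB of mail data, 20 GB/hour restore throughput.
one_large_db = restore_hours(80, 20)       # one 80 GB database
one_small_db = restore_hours(20, 20)       # one of four 20 GB databases
full_server = estimate_recovery_hours([20, 20, 20, 20], 20,
                                      server_rebuild_hours=3,
                                      contingency_factor=1.5)
print(one_large_db, one_small_db, full_server)
```

With these assumed numbers, the total data to restore is the same, but a failure that affects only one of the smaller databases takes a quarter of the time to repair.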

Tip 2: Create a Disaster Recovery Information Kit
Create a disaster recovery kit and store it securely offsite on a remote server, on backup tape, or even as hard-copy files in a box. The kit should include detailed information about server names, passwords, installations, patch and driver history, configuration history, and licensing. Also include disk and partition configurations, your Exchange organization name, administrative group and routing group names, system state information, and Microsoft IIS metabase backups. Add recent backups (or printed information about where to find other backup media), installation media, system state backups, and contact information for the IT pros who can and will restore each type of data. If you have a SAN, include contact information for your SAN specialist.

You should also regularly extract AD user information, such as email addresses, by using a utility such as LDIFDE or CSVDE, and add this information to the kit. For example, you'd use the following command to export directory objects, including mail addresses, to a file:

ldifde -f C:\export.ldf -v
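If you'd rather keep a short address list in the kit than the full export, a script can pull the mail attribute out of the LDIF text. The following Python sketch is one possible approach, not part of LDIFDE itself; the function names are hypothetical, and it assumes you've already read the export file into a string with the right encoding.

```python
import base64


def unfold(ldif_text):
    """Join LDIF continuation lines (lines starting with one space)."""
    lines = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def extract_mail_addresses(ldif_text):
    """Map each entry's dn to the list of its 'mail' attribute values."""
    result = {}
    current_dn = None
    for line in unfold(ldif_text):
        if not line.strip():          # a blank line ends the current entry
            current_dn = None
            continue
        if line.startswith("#"):      # LDIF comment
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue
        if value.startswith(":"):     # "attr:: value" means base64-encoded
            value = base64.b64decode(value[1:].strip()).decode("utf-8")
        else:
            value = value.strip()
        key = name.strip().lower()
        if key == "dn":
            current_dn = value
        elif key == "mail" and current_dn is not None:
            result.setdefault(current_dn, []).append(value)
    return result


# Made-up sample entries for illustration only.
sample = """dn: CN=Ann Example,OU=Staff,DC=example,DC=com
changetype: add
mail: ann@example.com

dn: CN=Bob Example,OU=Staff,DC=example,DC=com
changetype: add
mail:: Ym9iQGV4YW1wbGUuY29t
"""

for dn, addresses in extract_mail_addresses(sample).items():
    print(dn, addresses)
```

Entries without a mail attribute are simply left out of the result, so the output lists only mail-enabled objects.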

Tip 3: Back Up the Cluster Quorum Disk
If you're using an Exchange cluster, your disaster recovery plan needs to cover backing up and restoring the cluster quorum disk as well as the shared disks. Without the quorum disk, you won't have vital cluster-configuration data. More important, your cluster will no longer start when disk signatures have changed, which can happen when you replace disks, use storage-management tools to change the disk configuration, or reconfigure the array on a shared bus.

To back up the quorum disk, you'll need to perform a full computer backup or a Windows system state backup. You can use NTBackup's Automated System Recovery (ASR) tool to create an ASR floppy disk that stores the disk signatures. On Windows NT 4.0 and on Windows 2000 before Service Pack 3 (SP3), you can use the Windows 2000 Resource Kit Cluster Tool (clustool.exe) to back up the configuration of the complete cluster, including disk signatures. If you lose the quorum and the signature of the quorum disk has changed, you can use the Win2K resource kit's Dumpcfg utility (dumpcfg.exe) to manually write the signature back to the quorum disk. (The Microsoft article "Recovering from an Event ID 1034 on a server cluster" at http://support.microsoft.com/?kbid=280425 provides detailed instructions for using Dumpcfg.) In Windows Server 2003, you can use the cluster service and the Windows 2003 Resource Kit Cluster Server Recovery Utility (clusterrecovery.exe) to fix a lost quorum disk. Make sure you read, understand, and test the procedures explained in the clusterrecovery.chm Help file.

Prepare Now; Minimize Stress Later
Schedule recovery tests to give you and your colleagues practice in recovering your Exchange server. Use test labs and the Recovery Storage Group (RSG) to check whether database backups were successful. You could, for instance, extract random mailboxes from the RSG by using the Exchange Mailbox Merge (ExMerge) utility to check the data and the Exchange Disaster Recovery Analyzer (ExDRA) tool to check data integrity. By testing your Exchange recovery procedure now, you'll be better prepared to handle a far more stressful, real-world Exchange crash.
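To keep the spot checks honest, you can let a script choose which mailboxes to extract instead of picking familiar names. This Python sketch shows one way to do that; the function name and the mailbox aliases are hypothetical, and you'd supply the list from your own directory export.

```python
import random


def pick_mailboxes_to_verify(mailboxes, sample_size, seed=None):
    """Return a random sample of distinct mailbox names for a restore test.

    Pass a seed to make the selection repeatable, for example so that
    two administrators verify the same set of mailboxes.
    """
    unique = sorted(set(mailboxes))
    rng = random.Random(seed)
    return rng.sample(unique, min(sample_size, len(unique)))


# Made-up mailbox aliases for illustration only.
all_mailboxes = ["ann", "bob", "carol", "dave", "erin", "frank"]
print(pick_mailboxes_to_verify(all_mailboxes, 3))
```

Changing the sample from one test to the next means that, over time, you verify a broader cross-section of the restored database.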