An under-the-hood look at the Exchange ESE backup process

I recently updated my understanding of Exchange Server backups and restores (I'm always discovering new things about Exchange) and learned quite a bit about how Exchange backs up databases. This week, I want to give you an under-the-hood look at the Exchange Extensible Storage Engine (ESE) backup process. Next week, I'll cover the ESE restore operation.

Starting Off
Several important operations occur at the start of Exchange's backup process. When you initialize a full backup operation, ESE begins by flushing all the dirty pages in its cache (i.e., the Information Store, or IS, buffers) to disk and halting the checkpoint. The checkpoint won't advance until the backup operation is complete. (When you run a partial backup such as a differential, incremental, or copy backup, ESE lets the checkpoint advance because the backup operation doesn't touch the databases.) Next, ESE creates a patch file for each database you're backing up. ESE uses patch files in special circumstances during backup operations to ensure database integrity. (In Exchange 2000 Server Service Pack 2, or SP2, Microsoft figured out how to avoid the need to flush dirty pages that cause patching operations, so SP2 doesn't need to use patch files.)

Copying the Databases
The next step in the backup process involves backing up the database files. The backup application uses backup API calls to pass to ESE a list of databases to be backed up. These databases are open files (remember, Exchange backups let you back up the databases online), so the backup application doesn't simply copy the databases to the backup set. Instead, ESE begins to send the backup application 64KB chunks of database pages (sixteen 4KB pages at a time) in sequential order. During this crucial step, ESE performs a checksum on each page; any checksum error causes the backup operation to terminate with the -1018 error. (See "Understanding -1018, -1019, and -1022 Database Errors," http://www.exchangeadmin.com, InstantDoc ID 25236, for a description of this error.) ESE continues this process until it has sent the backup application all the pages for each database.
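To make the chunk-and-verify loop concrete, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not ESE's actual code: the page layout (a 4-byte CRC32 stored in front of the page body) and the names make_page, page_is_valid, stream_database, and PageChecksumError are all hypothetical, and ESE's real checksum algorithm and page header format differ.

```python
import io
import zlib

PAGE_SIZE = 4 * 1024                       # 4KB database page
PAGES_PER_CHUNK = 16                       # sixteen pages per chunk
CHUNK_SIZE = PAGE_SIZE * PAGES_PER_CHUNK   # 64KB handed to the backup application
CHECKSUM_ERROR_CODE = -1018                # error the backup terminates with


class PageChecksumError(Exception):
    """Raised when a page fails verification (hypothetical name)."""

    def __init__(self, page_number):
        super().__init__(
            f"checksum mismatch on page {page_number} "
            f"(error {CHECKSUM_ERROR_CODE})"
        )
        self.page_number = page_number
        self.code = CHECKSUM_ERROR_CODE


def make_page(payload: bytes) -> bytes:
    """Build a toy page: CRC32 of the body, then the zero-padded body.

    ASSUMPTION: this layout is invented for the example only.
    """
    body = payload.ljust(PAGE_SIZE - 4, b"\x00")
    if len(body) != PAGE_SIZE - 4:
        raise ValueError("payload too large for one page")
    return zlib.crc32(body).to_bytes(4, "little") + body


def page_is_valid(page: bytes) -> bool:
    """Recompute the checksum and compare it with the stored value."""
    if len(page) != PAGE_SIZE:
        return False
    stored = int.from_bytes(page[:4], "little")
    return stored == zlib.crc32(page[4:])


def stream_database(db, backup_set) -> int:
    """Copy db to backup_set in sequential 64KB chunks, verifying each page.

    Returns the number of pages copied; raises PageChecksumError and stops
    at the first bad page, before that chunk reaches the backup set.
    """
    page_number = 0
    while True:
        chunk = db.read(CHUNK_SIZE)
        if not chunk:
            return page_number
        for offset in range(0, len(chunk), PAGE_SIZE):
            if not page_is_valid(chunk[offset:offset + PAGE_SIZE]):
                raise PageChecksumError(page_number)
            page_number += 1
        backup_set.write(chunk)


# Usage: a 20-page toy database goes out as one 64KB chunk plus a 16KB tail.
pages = b"".join(make_page(f"record {i}".encode()) for i in range(20))
backup_set = io.BytesIO()
copied = stream_database(io.BytesIO(pages), backup_set)
print(copied)  # 20
```

The point of the sketch is the ordering: every page in a chunk is verified before the chunk is handed over, so a single damaged page stops the backup rather than being copied silently into the backup set.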

Back Up the Transaction Logs
Next, ESE must store the transaction logs to the backup set. As I mentioned earlier, ESE halts the checkpoint at the beginning of the backup. Although the checkpoint is halted, ESE continues to write transactions to the log files and continues to flush dirty pages from the database cache to disk. To back up the log files, the backup application uses the HrESEBackupGetLogAndPatchFiles API call to request a list of log files (and patch files, if applicable) from ESE. When ESE receives this call, it closes the current log file, saves the file as the next log generation in a sequential list, and opens a new E0n.log file (n refers to the storage group, or SG, instance of the log file). In the case of a full backup, ESE then returns a list of log files to the backup application; this list starts with the checkpoint log generation (the generation in which the checkpoint was halted) and ends with the log generation that ESE just closed (i.e., E0n.log minus 1). In the case of an incremental or differential backup, ESE returns a list beginning with the oldest log generation on disk and ending with the most recently closed log generation. Using this list, the backup application can open file handles to the log files and copy them to the backup set. During this operation, ESE ensures that no log generation is missing from the sequence passed to the backup application.
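The selection rules above can be sketched in a few lines of Python. This is only a model of the logic as I've described it, not the ESE implementation: the function name logs_to_back_up is hypothetical, and log generations are represented as plain integers rather than as real log file names.

```python
def logs_to_back_up(closed_generations, backup_type, checkpoint_generation=None):
    """Return the log generations to copy to the backup set.

    closed_generations: generations of the closed log files on disk,
        including the one ESE just closed (the new E0n.log is excluded).
    backup_type: "full", "incremental", or "differential".
    checkpoint_generation: generation in which the checkpoint was halted;
        required for a full backup.
    """
    gens = sorted(closed_generations)
    if not gens:
        raise ValueError("no closed log files on disk")

    if backup_type == "full":
        if checkpoint_generation is None:
            raise ValueError("a full backup needs the checkpoint generation")
        first = checkpoint_generation
    elif backup_type in ("incremental", "differential"):
        first = gens[0]  # oldest log generation on disk
    else:
        raise ValueError(f"unsupported backup type: {backup_type}")

    last = gens[-1]  # the generation that was just closed
    selected = [g for g in gens if first <= g <= last]

    # No generation may be missing from the sequence.
    expected = list(range(first, last + 1))
    if selected != expected:
        missing = sorted(set(expected) - set(selected))
        raise RuntimeError(f"missing log generations: {missing}")
    return selected


# Usage: generations 5 through 9 are on disk; the checkpoint halted in 7.
on_disk = [5, 6, 7, 8, 9]
print(logs_to_back_up(on_disk, "full", checkpoint_generation=7))  # [7, 8, 9]
print(logs_to_back_up(on_disk, "incremental"))                    # [5, 6, 7, 8, 9]
```

The contiguity check at the end mirrors the guarantee that no log generation is missing: a gap anywhere in the range makes the whole list unusable for replay, so the sketch fails loudly instead of returning a partial list.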

Truncate the Logs
After the log files have been stored to the backup set, they aren't needed on disk. During full and incremental backup operations, ESE truncates the log files on disk after it receives the HrESEBackupTruncateLog API call from the backup application. The lower of two values determines which log files ESE truncates: the checkpoint log generation or the log generation listed in the database header for the current full backup. (To view the header, use the Eseutil program with the /mh parameter.) After the log files are truncated, the backup operation is complete and the backup application closes the backup set. At this point, ESE can return to normal database-engine operations and permit the checkpoint to advance.
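Here is a small Python sketch of the truncation rule. It rests on two assumptions that go beyond what I've stated above: that the cutoff is exclusive (only generations strictly older than the lower of the two values are removed), and that backup types other than full and incremental leave the logs alone. The function name logs_to_truncate is hypothetical.

```python
def logs_to_truncate(generations_on_disk, checkpoint_generation,
                     header_backup_generation, backup_type):
    """Return the log generations that can be removed after a backup.

    checkpoint_generation: generation the checkpoint currently points to.
    header_backup_generation: generation recorded in the database header
        for the current full backup (the value Eseutil /mh would show).
    """
    # Only full and incremental backups truncate logs.
    if backup_type not in ("full", "incremental"):
        return []

    # The lower of the two generations sets the cutoff.
    cutoff = min(checkpoint_generation, header_backup_generation)

    # ASSUMPTION: the cutoff is exclusive, so the cutoff generation itself
    # stays on disk and only strictly older generations are removed.
    return sorted(g for g in generations_on_disk if g < cutoff)


# Usage: logs 5 through 12 are on disk; the checkpoint is in generation 10
# and the header records generation 8 for the current full backup.
on_disk = range(5, 13)
print(logs_to_truncate(on_disk, 10, 8, "full"))          # [5, 6, 7]
print(logs_to_truncate(on_disk, 10, 8, "differential"))  # []
```

Taking the lower value is the conservative choice: a log is kept if either the checkpoint or the backup recorded in the header might still need it.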

The backup operation for the Exchange ESE is intricate and can be confusing. However, understanding how this important operation works is crucial. The way you structure your backups and the combination of full, incremental, or differential backups that you employ determine the recoverability of your valuable Exchange data. To better understand this process, consider testing backup operations in a lab environment.