Tools and tips to help you decide whether to rebuild Exchange Server 5.5 databases

Like all databases, Exchange Server's Information Store (IS) grows over time. Messages, documents, and other items flow into and out of the IS's mailboxes and public folders. Some items remain in public folders permanently, but users eventually delete most of the IS's data. These deleted items leave empty space, or white space, in the database. Microsoft attempts to restrict the IS's demand for disk space by reusing the database space that deleted items occupied before users deleted them.

Background Maintenance
Microsoft designed Exchange Server 5.5 to perform online maintenance at least once a day. You can set the interval at which maintenance occurs by selecting a server in the Microsoft Exchange Administrator program, selecting the File menu's Properties option, and clicking the IS Maintenance tab, which Screen 1 shows. Administrators typically schedule Exchange Server database maintenance when user demand is lightest, because of the load that maintenance activities generate. I recommend scheduling maintenance to start at about midnight. (To speed up the maintenance activities, I don't run online backups and maintenance activities at the same time.) The maintenance period that Screen 1 shows lasts from midnight until 3 a.m., which is enough time for database maintenance activities to complete on all but the largest servers.

A set of threads that execute within the IS process (store.exe) performs this background maintenance. The threads carry out two major tasks: rearranging the contents of IS pages to free space within the database (i.e., defragmenting the database) and removing deleted items that have passed their retention date. The threads also perform tasks such as monitoring mailbox quotas and expunging tombstone entries. (Tombstones are entries in the Directory that represent deleted mailboxes or custom recipients.)

Database defragmentation. Internally, Exchange Server organizes its databases in 4KB pages. As transactions occur, Exchange Server writes data into these pages. Exchange Server releases pages when users delete the pages' contents. Some pages store message headers; others hold message content and attachments. When a new message arrives, Exchange Server inserts the message's header into one page and saves the message's text and attachments in one or more other pages. A message's text often fits into one 4KB page, but attachments are typically much larger than 4KB, so Exchange Server stores them across multiple pages.

Obviously, the database is most compact and efficient when it fully utilizes all of its pages, but an Exchange Server database is in this state only when you first create it or after you rebuild it. At all other times, the pages are in a state of flux, because users are interacting with the database.

As users delete messages, the pages in which Exchange Server saves the messages become free for reuse or defragmentation. A page is a candidate for defragmentation when Exchange Server removes some but not all of its content. A message header occupies approximately 400 bytes, so one page in the IS might contain 10 message headers. Pages that hold message headers are the private IS's most obvious candidates for defragmentation, because users partially empty many message-header pages as they delete messages during a workday.

Through defragmentation, Exchange Server moves message headers between pages to fill some pages and release others for reuse. For example, if one page is 30 percent full and its neighbor is 60 percent full, Exchange Server moves the data from the 30 percent-full page into the 60 percent-full page so that one page is 90 percent full and one is empty and available for reuse.
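The page-merging example above can be sketched as a toy model. This is an illustration only: the function name and the percentage-based representation are assumptions made for the sketch, and it does not reproduce Exchange Server's actual defragmentation logic.

```python
def merge_pages(source_fill, target_fill, capacity=100):
    """Move a source page's data into a neighboring target page if it fits.

    Fill levels are percentages of a page. Returns the new
    (source_fill, target_fill) pair; the pages are left unchanged
    when the combined data would overflow one page.
    """
    if source_fill + target_fill > capacity:
        return source_fill, target_fill  # nothing to gain, leave both pages alone
    return 0, source_fill + target_fill  # source page is now empty and reusable


# The 30 percent and 60 percent pages from the example in the text.
print(merge_pages(30, 60))
# Two pages that are too full to combine stay as they are.
print(merge_pages(70, 60))
```

The first call yields one empty page and one page that is 90 percent full; the second leaves both pages untouched because their contents can't fit into a single page.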

Exchange Server performs background defragmentation of the public IS (pub.edb), then the private IS (priv.edb), then the Directory store (dir.edb). Screen 2 shows an event 179 log entry, which notifies you that an online defragmentation pass has begun for a database. In the event-log entry that Screen 2 shows, store.exe is processing the private IS. Because priv.edb holds user mailboxes, it is usually the largest database on an Exchange Server system and takes the longest to process. Unless defragmentation encounters an error, you'll see event 180, which tells you that the defragmentation pass is complete, sometime after you see event 179. On the morning that I captured Screen 2, event 180 for the server's private IS occurred at 12:57 a.m. Screen 2 shows that event 179 registered at 12:16 a.m., so I can calculate that the server took 41 minutes to process a 3GB private IS. This calculation tells me that the system (a dual 200MHz Pentium server with 256MB of memory) took roughly 14 minutes to process each gigabyte. The speed of processing depends on the system's load and power. The speed of a system's disks and controller affects every database operation, so fast disks and controllers can substantially reduce maintenance times.
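The timing arithmetic above is easy to repeat for your own event 179 and event 180 timestamps. The following is a minimal sketch; the function names are made up for the illustration, and it assumes that both events fall on the same calendar day.

```python
from datetime import datetime


def minutes_between(start, end, fmt="%H:%M"):
    """Elapsed minutes between two clock times on the same day."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


def minutes_per_gb(start, end, size_gb):
    """Average defragmentation time per gigabyte of database."""
    return minutes_between(start, end) / size_gb


# Event 179 at 12:16 a.m., event 180 at 12:57 a.m., 3GB private IS.
rate = minutes_per_gb("00:16", "00:57", 3)
print(f"{rate:.1f} minutes per GB")
```

Treat the result as a rough per-server figure rather than a constant, because the rate changes with system load and disk speed.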

If you see event 183 in an Exchange Server's application event log, Exchange terminated maintenance before it finished defragmenting a database because the scheduled maintenance period (such as the period that Screen 1 shows) expired. To solve this problem, extend the maintenance period until you no longer see event 183 in your log files.

Deleted-item retention. Although Exchange Server 5.5 improves the mail server's defragmentation capabilities, it introduces a feature that increases database sizes—deleted-item retention. An Exchange Server 5.5 system soft deletes an item when a user first removes the item from the Deleted Items folder. The server then hard deletes, or permanently removes, the item from the IS database after a set retention period expires. Many companies set deleted-item retention periods between 7 and 14 days so that users can recover an item 1 to 2 weeks after they delete the item.

Deleted-item retention is a fantastic messaging feature for users and administrators. Users who have Microsoft Outlook 8.03 or later can recover deleted items through their messaging client without an administrator's assistance. Even better, administrators don't have to restore entire servers just to recover a document that someone deleted by mistake.

The increase in database size is the downside to deleted-item retention. To discover how much space deleted items occupy in the 3GB private IS on my Exchange Server system, which retains deleted items for 7 days, I looked at one of the application event log's 1207 events. An Exchange Server system records an event 1207 in the application event log when it removes deleted items from the IS. Screen 3 shows that the IS found 8841 soft-deleted items (which occupied 153MB) at the start of a deleted-items cleanup and 7816 soft-deleted items (which occupied 144MB) after Exchange Server removed all the items that had exceeded their retention period. I used this information to determine that the server's use of deleted-items retention expands the size of my database by slightly less than 5 percent (153.6MB is 5 percent of 3GB).

You don't have to rely on event 1207 to discover how much space deleted items are using on your system. You can use Exchange Administrator's Save Window Contents feature to extract data about mailboxes, including deleted items. In Exchange Administrator's left pane, you expand the properties of the server that you want to gather information about. Then, the right pane lists data about each mailbox, including the amount of space that the mailbox's deleted items occupy, as Screen 4 shows. If you don't see the Deleted Items K column, use the Columns option on the View menu to add deleted items to the display. When you can see the information you're looking for, select Save Window Contents from the File menu to write the data in the right pane to a Comma Separated Values (CSV) file. You'll be able to open the resulting CSV file in Microsoft Excel, Microsoft Access, or another data-manipulation program.

I don't think that the Save Window Contents method for calculating deleted items' size is as accurate as 1207 events' data. At 11:00 a.m. on the day of the deleted-items cleanup that the event-log entry in Screen 3 describes, I used Save Window Contents to save my private IS's user-mailbox statistics, then opened the CSV file in Excel. I added up all the entries in the file's Deleted Items K column; they totaled 272MB—considerably more than the 144MB that Screen 3's event 1207 reported. User activity on the Exchange Server system was light between midnight (when the event 1207 log entry appeared) and 11:00 a.m., so I can't blame the difference in data on users' deletion of a large number of items. I suspect that the difference between the two reporting methods is that the event log takes into account the effect of Exchange Server's single-instance storage model. I'm guessing that when Exchange Administrator lists the amount of space that deleted items occupy per mailbox, it includes every deleted item that the mailbox points to, regardless of whether the mailbox shares the message with another mailbox on the server. Or perhaps the discrepancy reveals an Exchange Server bug.
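If you'd rather not add up the column by hand in Excel, a short script can total it. This is a sketch under assumptions: it presumes that the saved CSV file has a header row whose column is labeled exactly Deleted Items K, and the sample rows below are invented for the demonstration.

```python
import csv
import io


def sum_deleted_items_kb(lines, column="Deleted Items K"):
    """Total the deleted-items column (in KB) from CSV text.

    `lines` is any iterable of CSV lines, such as an open file.
    Blank cells are skipped rather than treated as errors.
    """
    total = 0
    for row in csv.DictReader(lines):
        value = (row.get(column) or "").strip()
        if value:
            total += int(value)
    return total


# Invented sample data standing in for a Save Window Contents export.
sample = io.StringIO(
    "Mailbox,Total K,Deleted Items K\n"
    "Mailbox A,10240,512\n"
    "Mailbox B,20480,\n"
    "Mailbox C,5120,1024\n"
)
total_kb = sum_deleted_items_kb(sample)
print(f"{total_kb}KB, or {total_kb / 1024:.1f}MB")
```

To run the function against a real export, pass it a file that you opened with open(path, newline="") instead of the sample text.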

Even if the larger figure I found (272MB) is correct, the private IS on a server that hosts some large mailboxes expands by less than 10 percent as a result of the 7-day deleted-item retention period. I consider deleted-item retention to be well worth the extra disk space. However, as you examine the size of your ISs and consider rebuilding Exchange Server's databases, think about whether you need to change the length of your deleted-item retention period.

Determine Recoverable Space
The set of empty pages in a database makes up the database's white space. When Exchange Server rebuilds a database, it condenses information to fill all the database's pages and releases the hard disk space that the white space consumed so that you can use the space for other purposes. Exchange Server 5.5 with SP1 or later uses event 1221 in the application event log to report how much white space a database has after defragmentation, as Screen 5 shows.

Exchange Server reports separate 1221 events for the private IS, the public IS, and the Directory store. The 1221 event-log entry in Screen 5 shows that my server's private IS has 166MB of white space, so roughly 5.5 percent of the 3GB database is white space. According to Microsoft, the white-space figure that the application event log reports is conservative; this number represents the minimum amount of disk space that rebuilding the database will recover.

For a more accurate estimate of how much space rebuilding an Exchange Server database will recover, run Eseutil with the /ms switch. Eseutil /ms scans your database and calculates the number of pages that are available for reuse. Only the version of Eseutil that ships with Exchange Server 5.5's service packs supports the /ms switch. To make sure that you have the right version of Eseutil, open the executable's Properties page and look for the build number. You need build 2232 or later to use /ms; Exchange Server 5.5 comes with build 1960.

The IS service must stop for Eseutil to run in /ms mode, but the utility doesn't take long to scan even a large database. Eseutil scanned my server's 3GB priv.edb file in 74 seconds. I typed

c:\>\winnt\system32\eseutil /ms priv.edb

at a command prompt, and Eseutil produced a 640KB text file that contained details about every structure in the database. Most of this data would mean nothing to you. Figure 1, page 142, shows the results that I was interested in. The utility discovered that 88,762 of my server's database pages were available for reuse. Each page is 4KB, so I estimated that a full rebuild of the database would recover 355,048KB, or almost 350MB. Note that this figure is roughly twice as large as the 166MB estimate that event 1221 reported after Exchange Server's routine defragmentation.
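The conversion from free pages to recoverable space is simple enough to script. This is a minimal sketch; the function name is hypothetical, and the only inputs are the page count that Eseutil /ms reported and the 4KB page size.

```python
PAGE_SIZE_KB = 4  # Exchange Server 5.5 databases use 4KB pages


def reclaimable_kb(free_pages, page_size_kb=PAGE_SIZE_KB):
    """Space, in KB, that the reported free pages occupy."""
    return free_pages * page_size_kb


kb = reclaimable_kb(88_762)
mb = kb / 1024
share = kb / (3 * 1024 * 1024) * 100  # against a 3GB database

print(f"{kb:,}KB = {mb:.1f}MB, about {share:.0f} percent of 3GB")
```

For this server, the script reports 355,048KB, which is about 347MB, or roughly 11 percent of the 3GB private IS.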

Do I Need to Rebuild?
After you determine how much white space a database has, you can make an intelligent decision about whether you need to run Eseutil to rebuild the database. Shrinking databases is good for a network. Backups and restores finish more quickly when databases are smaller, and you can devote the space you liberate to other applications. However, rebuilding an Exchange Server database requires you to take the server offline, which can cause problems.

Every time you rebuild an Exchange Server database, you must stop the associated Exchange service; you must stop the IS service to rebuild an IS database or stop the Directory Service (DS) to rebuild the Directory. Rebuilding databases isn't a fast process—the most powerful servers process data at about 4GB per hour—so the downtime that rebuilding requires is significant. Some large Exchange Server systems have databases that are more than 100GB. Companies with databases of this size can no longer even consider a rebuild, because rebuilding the database would require the Exchange Server system to be down for more than a day, which is unacceptable except in periods of low demand, such as holidays.
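To put the downtime in perspective, you can estimate rebuild duration from database size. The sketch below assumes that throughput stays at the 4GB-per-hour figure quoted above for the most powerful servers and that time scales linearly with size; slower hardware will take longer, so treat the output as a best case.

```python
def rebuild_hours(db_size_gb, gb_per_hour=4.0):
    """Best-case rebuild time in hours at a constant throughput."""
    return db_size_gb / gb_per_hour


for size_gb in (3, 10, 20, 100):
    print(f"{size_gb:>3}GB database: about {rebuild_hours(size_gb):.1f} hours offline")
```

At this rate, a 100GB database needs about 25 hours, which is why a rebuild of that size means more than a day of downtime.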

Hard disk prices are low and backup hardware is fast, so if your server needs to be available 24 * 7, you might not be able to justify a rebuild. When Eseutil informed me that my 3GB private IS had 350MB of white space, I decided that reducing the database's size by roughly 11 percent didn't justify taking my Exchange Server system offline for an hour. I would probably have made the same decision even if rebuilding could have saved me 10 percent of a larger database. I don't think I could justify 4 hours of downtime to recover 1GB from a 10GB database or 8 hours of downtime to recover 2GB from a 20GB database. In addition, remember that although Eseutil fixes minor inconsistencies in a database, it doesn't cure severe database corruption. If you notice evidence of corruption, such as an event 1018 in the event log, you have a problem that you need to fix fast.

Consider running Eseutil if event 1221 or Eseutil's /ms option reports that the amount of white space in a database exceeds 10 times the amount of daily messaging traffic on the server (which you can estimate by measuring the total size of transaction logs that a server generates every day). Also consider running Eseutil if you remove a large number of mailboxes from a server. The large-number threshold differs among servers; it depends on how much data the server's mailboxes hold. Use 10 percent of either the total number of mailboxes or the total size of mailbox data as a rule of thumb. The Directory store usually doesn't benefit from a periodic rebuild unless you make frequent and large-scale changes to Directory entries, perhaps as the result of directory synchronization with a foreign messaging system.
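The two rules of thumb above can be written down as simple checks. This is a sketch only: the function names and the sample figures are hypothetical, and the checks tell you when a rebuild is worth considering, not when it is required.

```python
def white_space_exceeds_threshold(white_space_mb, daily_log_mb, factor=10):
    """True if white space is more than `factor` times daily log volume."""
    return white_space_mb > factor * daily_log_mb


def removal_exceeds_threshold(removed, total, fraction=0.10):
    """True if removed mailboxes (or mailbox data) reach `fraction` of the total.

    Works for mailbox counts or for mailbox data sizes, as long as
    both arguments use the same unit.
    """
    return removed >= fraction * total


# Hypothetical server: 350MB of white space, 50MB of transaction logs per day.
print(white_space_exceeds_threshold(350, 50))
# Hypothetical cleanup: 40 of 300 mailboxes removed.
print(removal_exceeds_threshold(40, 300))
```

In this example, the white-space check fails (350MB is less than 10 days of logs), but the mailbox check passes because 40 mailboxes is more than 10 percent of 300.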

The most important part of deciding whether to rebuild is knowing what you're doing. Rushing to run Eseutil because a note on an Internet mailing list advises you to do so is a bad idea. If you appeal to a mailing list with a problem, you'll receive all sorts of advice, but the people giving the advice might not understand all your circumstances or really know how to help you.

If you decide to use Eseutil to rebuild a database, make a full online backup before you run the utility and make another full backup after the utility finishes. Rebuilding the database changes internal structures and renders previous transaction logs invalid. You don't want to have to redo all your work if a problem occurs. Also, before you run Eseutil, make sure you have available at least as much free space as the size of the rebuilt database. You don't want to run out of disk space after a couple of hours of solid processing.
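You can check the free-space requirement before you start. The sketch below makes one assumption beyond the text: because you can't know the rebuilt database's size in advance, it uses the current file size as a conservative upper bound. The function name and the sample paths are hypothetical.

```python
import os
import shutil


def has_room_for_rebuild(db_path, work_dir):
    """True if work_dir's volume has at least as much free space as the database file.

    The current file size stands in for the rebuilt database's size,
    which can only be smaller.
    """
    db_bytes = os.path.getsize(db_path)
    free_bytes = shutil.disk_usage(work_dir).free
    return free_bytes >= db_bytes


# Example call with hypothetical paths:
# has_room_for_rebuild(r"d:\exchsrvr\mdbdata\priv.edb", "e:\\")
```

Run the check against the drive where the rebuilt copy of the database will be written, which may differ from the drive that holds the original database.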

Keep a Cautious Eye on System Vital Signs
Experienced administrators monitor their systems carefully. Exchange Server provides you with a massive amount of detail in the event log, Exchange Administrator, and the output from utilities. This detailed information helps you envision how a server is functioning and lets you make intelligent decisions about proactive maintenance, including the decision about when you need to rebuild the IS.