New servers get a lot of attention. But the level of monitoring inevitably decreases over time. As a server begins to age, however, it can become a lot less reliable. As stable as Exchange Server 2003 is, some deployments have been in place nearly four years. And even Exchange Server 2007 has been on the market for more than a year now. If you neglect an aging server and expect it to continue to perform the way it always has, you might be surprised when problems occur. With that in mind, let’s consider what you can do to ensure the ongoing reliability of your Exchange Server.
1. Check Database Integrity
It's no secret that pre-2003 versions of Exchange Server were prone to database corruption. This corruption wasn’t always catastrophic at first, but it could grow fatal over time. I know of one enterprise-class organization that backed up corrupt Exchange Server data for months without knowing it. The corruption grew progressively worse until the database eventually refused to mount.
Exchange 2003 is far more reliable than its predecessors when it comes to database corruption. Even so, corruption can occur within an information store. After a bad experience with previous versions of Exchange, I got in the habit of regularly testing the integrity of my Exchange Server databases. Although Microsoft (to the best of my knowledge) doesn’t make any recommendations as to how often you should test your Exchange databases, I like to test my databases twice a year. I would probably test the databases more often, but testing databases requires that they be dismounted.
The best tool for testing database integrity is ISINTEG, a command-line tool that tests the integrity of the Information Store and can repair problems related to corruption or inconsistencies. Before you begin, verify that you have a good backup and that the volume containing each store has plenty of free disk space. A good guideline is to have enough free space to accommodate a copy of the database plus 20 percent. Then dismount the stores you want to test: Open the Exchange System Manager and navigate through the console tree to Administrative Groups, your administrative group, Servers, your server, and finally the storage group containing the store you want to dismount. Right-click the store and select the Dismount Store command from the shortcut menu, but leave the Information Store service running, because ISINTEG requires it.
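As a rough sketch of the free-space guideline, the following Python snippet checks whether a volume can hold a copy of the database plus 20 percent. It reads the guideline as "free space of at least 1.2 times the database file size," which is an assumption about how to interpret it. The function names and the example path are hypothetical and aren't part of any Exchange tool.

```python
import os
import shutil


def required_free_bytes(db_size_bytes, headroom_percent=20):
    """Bytes needed for one full copy of the database plus headroom.

    Integer arithmetic (rounded up) avoids floating-point surprises.
    """
    extra = (db_size_bytes * headroom_percent + 99) // 100
    return db_size_bytes + extra


def has_enough_free_space(db_size_bytes, free_bytes, headroom_percent=20):
    """True if free_bytes covers a database copy plus the headroom."""
    return free_bytes >= required_free_bytes(db_size_bytes, headroom_percent)


def check_volume(db_path, headroom_percent=20):
    """Apply the guideline to the volume that holds db_path."""
    db_size = os.path.getsize(db_path)
    volume = os.path.dirname(os.path.abspath(db_path))
    free = shutil.disk_usage(volume).free
    return has_enough_free_space(db_size, free, headroom_percent)


if __name__ == "__main__":
    # Hypothetical path; substitute the location of your own store.
    path = r"D:\Exchsrvr\mdbdata\priv1.edb"
    if os.path.exists(path):
        print("Enough free space:", check_volume(path))
```

You'd run a check like this for each store before you dismount it, because each store's volume needs its own headroom.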
Open a Command Prompt window, and navigate to the \Program Files\exchsrvr\bin folder. You’re now ready to use ISINTEG to test the databases and correct any problems it finds. To do so, enter the following command:
ISINTEG -s servername -fix -test alltests
In this command, the -S switch lets you specify the name of the server containing the database that you want to test or repair, the -TEST ALLTESTS option tells ISINTEG to run a full battery of tests against the store, and the -FIX switch tells ISINTEG to try to correct any problems that it finds. Keep in mind that ISINTEG repairs only database inconsistencies. It doesn’t fix physical problems within the database, such as damaged database pages. For that, you need to use ESEUTIL, which Tip 2 covers.
I strongly recommend that you initially execute this ISINTEG command without the -FIX switch. The -FIX switch usually works, but it has been known to destroy Exchange databases. Therefore, make sure you perform a full backup before you run the command with the -FIX switch.
When you run the command, ISINTEG displays a numbered list of the databases on the server. ISINTEG can only test databases that have been taken offline. Simply enter the number of the database you want to test when prompted. ISINTEG will ask you to confirm the database that you want to test, then will begin the testing process. The time required to complete the tests varies depending on the size of the database and on your hardware’s capabilities, but the process can take several hours for a large database.
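The check-first, repair-later advice above can be sketched as a small Python helper that builds the ISINTEG command line with or without the repair switch. The switches are the ones described in this section. The function name, the fix parameter, and the server name MAIL01 are illustrative assumptions. The sketch only prints the commands. ISINTEG prompts you to choose a database, so you'd still run it by hand.

```python
def isinteg_args(server, fix=False):
    """Build the ISINTEG argument list described in the article.

    fix=False gives the read-only test run recommended as a first pass;
    fix=True adds the -fix switch for a repair run after a full backup.
    """
    args = ["isinteg", "-s", server]
    if fix:
        args.append("-fix")
    args += ["-test", "alltests"]
    return args


if __name__ == "__main__":
    # MAIL01 is a placeholder server name.
    print("First pass :", " ".join(isinteg_args("MAIL01")))
    print("Repair pass:", " ".join(isinteg_args("MAIL01", fix=True)))
```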
2. Test Database Integrity
ISINTEG checks the overall integrity of Exchange databases, but it doesn’t test the database in depth. You can perform a detailed analysis by running the ESEUTIL utility. ESEUTIL looks at individual database pages and verifies that the checksum values within them are correct. Although ISINTEG and ESEUTIL perform different tasks, both are important.
Using ESEUTIL to check a database is harmless, but it can be more time-consuming than testing with ISINTEG. If ESEUTIL encounters problems, you can choose to use the /P switch to tell ESEUTIL to rebuild the database and correct the problem. However, use the /P switch only as a last resort because (as with ISINTEG’s -FIX switch) it has the potential to destroy your database. Always make sure you have a good backup before attempting any sort of repair using ESEUTIL. For general testing, though, you can use ESEUTIL without fear. To perform a general database test, ensure that the database is offline, open a Command Prompt window, navigate to the \Program Files\Exchsrvr\Bin folder, and enter the following command:
ESEUTIL /G database name
3. Perform Offline Defragmentation
While I’m on the subject of ESEUTIL, you might consider using ESEUTIL to occasionally perform an offline defragmentation of your Exchange Information Stores. Exchange Server 2003 performs an online defragmentation of the Information Stores nightly. An offline defragmentation does basically the same thing, except that it also compacts the database, which shrinks the database file on disk.
The subject of whether an offline defragmentation is valuable or a waste of time has been hotly debated in Exchange circles. In my opinion, an offline defragmentation is worthwhile if you’ve recently deleted a large number of objects from the database and need to free up a significant amount of disk space.
I also like to perform an offline defragmentation once a year. The defragmentation process essentially rebuilds the database, which helps the database operate at peak efficiency. However, an offline defragmentation requires you to take the database offline and tends to be time-consuming. That’s why most Exchange administrators prefer to perform offline defragmentation rarely, if at all.
If you decide to perform an offline defragmentation, make sure you have a good backup first. As I mentioned, the defragmentation process rebuilds the database, and there’s always a slim possibility that something could go wrong in the process. You can use a variety of ESEUTIL command-line switches as a part of the defragmentation process. But you can initiate the defragmentation process at its simplest by using the following command:
ESEUTIL /D database name
4. Test Backups Regularly
If you remember nothing else after reading this article, I hope you’ll keep in mind that your Exchange 2003 server isn't getting any younger and that server hardware doesn’t live forever (see the Web-exclusive sidebar “4 Tips for Maintaining Your Server Hardware,” InstantDoc ID 98194, for ways to prolong your hardware’s life). Because server failures aren’t always predictable, testing your backups on a regular basis is essential. If a critical failure does occur, you have the peace of mind of knowing that you can restore the lost data.
I could write an entire article about different ways of testing backups. Unfortunately, I don’t have the space here to cover those details. What I will tell you, though, is that I test my backups monthly. To accomplish this, I have lab servers configured identically to my production servers. When it’s time to test my backups, I simply restore the backups of my production servers to their lab counterparts, then verify that the Exchange Server databases mount correctly and that I can use a lab workstation to log in and open my mailbox.
As important as it is to test your backups, it’s equally important to verify that your backups have run on a daily basis. Consistently verifying that your backups have run not only helps prevent data loss if a failure should occur but also helps prevent your transaction logs from running the server out of disk space. Unless you’re using circular logging, Exchange will keep all the transaction logs until the data stored in them has been backed up. If backups aren’t performed regularly, over time, the accumulation of transaction logs can cause the server to run out of disk space.
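One simple way to watch for runaway transaction logs is to total the size of the log files in a storage group’s log folder and compare it against a limit you choose. The Python sketch below does that. The function names, the *.log file pattern, the example path, and the 2GB threshold are all assumptions made for illustration, so adjust them to match your own server.

```python
from pathlib import Path


def log_usage(log_dir, pattern="*.log"):
    """Return (file_count, total_bytes) for log files directly in log_dir."""
    files = [p for p in Path(log_dir).glob(pattern) if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def logs_need_attention(log_dir, max_bytes, pattern="*.log"):
    """True if the logs in log_dir occupy more than max_bytes."""
    _, total = log_usage(log_dir, pattern)
    return total > max_bytes


if __name__ == "__main__":
    # Hypothetical log folder and threshold.
    folder = Path(r"D:\Exchsrvr\mdbdata")
    if folder.is_dir():
        count, total = log_usage(folder)
        print(f"{count} log files, {total} bytes")
        if logs_need_attention(folder, 2 * 1024**3):
            print("Log files exceed the threshold; check that backups ran.")
```

A steadily growing total from one day to the next is a strong hint that a backup was skipped or failed.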
5. Verify that the Nightly Maintenance Cycle Is Running
Exchange Server 2003 is designed to be fairly self-sufficient. Exchange Server 2003 performs the following 11 maintenance tasks automatically:
- Clear the indexes on the mailbox and public folder stores
- Perform tombstone maintenance on mailboxes and public folders
- Remove expired messages from the dumpster for the mailbox and public folder stores
- Remove expired messages from public folders
- Remove deleted public folders that have tombstones more than 180 days old
- Resolve message conflicts within public folders
- Update server version information on public folders
- Check for and remove duplicate site folders on public folder stores
- Remove deleted mailboxes on mailbox stores
- Check the message table for orphaned messages (messages with a reference count of 0)
- Perform an online defragmentation of the store
Performing all 11 tasks against a large information store can take a while, and Exchange must perform the maintenance tasks in a nondisruptive manner. In most cases, this means performing the maintenance tasks late at night. Because there might not be enough time to perform all the tasks every night, Exchange Server prioritizes them according to which tasks have run most recently and which tasks are most important. The first 10 tasks all have equal priority, but the eleventh task (performing an online defragmentation of the store) is considered far more important than the other tasks.
The first time the maintenance cycle runs, Exchange begins with the first task on the list and works through as many of the other tasks as time allows. If the maintenance window gets down to only 15 minutes remaining and Exchange hasn’t completed all the tasks, it must make a decision.
If Exchange has spent the entire maintenance period on a single task and that task has yet to complete, Exchange will spend the rest of the maintenance period on that task. However, if at least one task has completed, Exchange notes the last task to fully complete. At this point, Exchange aborts the current maintenance task and begins an online defragmentation. Because Exchange Server considers an online defragmentation more important than the other maintenance tasks, it lets this task run for up to an hour after the maintenance period expires.
The next time Exchange encounters a scheduled maintenance period, it checks to see which task was the last to be successfully completed (not counting the online defragmentation), then launches the next task on the list. By not starting over at the beginning of the task list each night, Exchange makes sure it runs each maintenance task occasionally.
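To make the rotation easier to follow, here is a simplified Python simulation of the behavior just described. It isn't Exchange’s actual implementation. The task durations are made-up numbers, and the model assumes that an aborted task starts over the next night and that a lone long-running task finishes only if it fits inside the whole window. It also doesn’t model the extra hour that the online defragmentation is allowed to run.

```python
def simulate_night(durations, start, window=240, cutoff=15):
    """Simulate one maintenance window for the 10 regular tasks.

    durations: minutes each regular task needs (hypothetical values)
    start:     index of the task to begin with tonight
    Returns (completed_indexes, next_start, defrag_started).
    """
    n = len(durations)
    budget = window - cutoff  # time before the 15-minute decision point
    elapsed = 0
    completed = []
    idx = start
    for _ in range(n):
        need = durations[idx]
        if elapsed + need <= budget:
            elapsed += need
            completed.append(idx)
            idx = (idx + 1) % n
            continue
        if not completed:
            # One task has used the whole period: stay on it, no defrag.
            if need <= window:
                completed.append(idx)
                idx = (idx + 1) % n
            return completed, idx, False
        # At least one task finished: abort this one, start the defrag.
        return completed, idx, True
    # Every task finished early; the defrag gets the remaining time.
    return completed, idx, True


if __name__ == "__main__":
    durations = [60] * 10  # pretend every task takes an hour
    start = 0
    for night in range(1, 5):
        done, start, defrag = simulate_night(durations, start)
        print(f"Night {night}: tasks {[i + 1 for i in done]}, defrag={defrag}")
```

With hour-long tasks and the default 4-hour window, the simulation completes three tasks per night and picks up at the fourth the following night, which is the occasional-but-complete coverage the rotation is meant to provide.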
It’s extremely important that Exchange performs this maintenance nightly, and you can even control when the maintenance takes place so that it doesn’t interfere with the nightly backup. Before I show you how to adjust the maintenance schedule, though, you need to know two things. First, Exchange performs maintenance at the store level. So if your Exchange Server contains multiple stores, you need to schedule maintenance independently for each store. Second, maintenance tends to be resource-intensive, with store maintenance typically consuming a lot of disk time and CPU time. You need to take this into account when scheduling maintenance.
Exchange’s default maintenance schedule is to perform maintenance on each store from midnight to 4:00 AM nightly. To change the maintenance schedule, open the Exchange System Manager and navigate through the console tree to Administrative Groups, your administrative group, Servers, your server, your storage group, and your store. Right-click the store you want to make the adjustment to, and select the Properties command from the shortcut menu. From the resulting store properties sheet, select the Database tab. As you see in Figure 1, the Database tab contains a Maintenance Interval drop-down list, which you use to select a 4-hour block of time for the store’s daily (or nightly) maintenance period. Alternatively, you can click the Customize button to specify a custom maintenance interval.
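When you choose a maintenance interval, it helps to confirm that it doesn’t collide with your backup window. The following Python sketch checks two daily time windows for overlap, including windows that run past midnight. The helper names and the sample backup times are hypothetical. Only the midnight to 4:00 AM default comes from Exchange.

```python
MINUTES_PER_DAY = 24 * 60


def hm(hour, minute=0):
    """Convert a 24-hour clock time to minutes after midnight."""
    return hour * 60 + minute


def expand(start, end):
    """Split a daily window into ranges that don't cross midnight.

    A window whose end is not later than its start is treated as
    running past midnight into the next day.
    """
    if start < end:
        return [(start, end)]
    return [(start, MINUTES_PER_DAY), (0, end)]


def windows_overlap(a, b):
    """True if daily windows a and b (start, end) share any time."""
    return any(s1 < e2 and s2 < e1
               for s1, e1 in expand(*a)
               for s2, e2 in expand(*b))


if __name__ == "__main__":
    maintenance = (hm(0), hm(4))   # Exchange default: midnight to 4:00 AM
    backup = (hm(23), hm(2))       # hypothetical backup: 11:00 PM to 2:00 AM
    print("Conflict:", windows_overlap(maintenance, backup))
```

Because maintenance is scheduled per store, you’d repeat the comparison for each store’s interval.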
6. Check Event Logs Daily
You’re likely using a management product such as Microsoft Operations Manager (MOM) or System Center Operations Manager 2007 to watch over your Exchange Servers. If not, you need to check your server’s application event logs daily. Sometimes events recorded in the application log point to future problems. Being proactive about checking the logs gives you the chance to fix anomalies before they turn into problems. As a simple example of such a situation, information in the event logs might warn you that the server is running low on disk space. You can then move some things around or install larger hard disks before the situation becomes a problem.
While you’re checking the event logs, look for entries related to the maintenance process I covered earlier. Table 1 lists the maintenance events of particular interest as you monitor for unusual activity.
Fight the Aging Process
As servers age, they often become less reliable. Therefore, an aging server deserves nearly the same level of monitoring as a brand-new, untested server. By performing these regular Exchange Server maintenance tasks, you can prolong the life of your server and discover problems before they sneak up on you.