In a perfect world, Exchange Server would be self-administering. However, people must intervene to fix problems on an Exchange server. To help administrators diagnose, troubleshoot, and fix problems as they crop up, Microsoft offers several monitoring tools, such as Windows NT's Performance Monitor, NT Event Viewer, and NT Diagnostics. (Ian Tweedie explains how to use Performance Monitor to track Exchange Server in "Keep Tabs on Exchange Server" in the March issue.)
However, Exchange Server has built-in monitors, too. Server and link monitors let you automatically monitor services, synchronize clocks, and check messaging links for a server or group of servers. After you set up and start these monitors, they can send alerts to notify you of impending trouble. In addition, server monitors can sometimes heal a server by restarting a downed service. These utilities let you head off problems within the Exchange organization before they get worse.
In this article, I'll discuss how to set up and use server and link monitors. Then, I'll describe how you can use the Exchange Monitor Report Generator (EMRG) to summarize the monitors' activities for management reports.
Server Monitor Overview
Server monitors let you check the condition of NT or Microsoft BackOffice services and the clock synchronization of Exchange servers. You can use a server monitor to manage all servers in an organization from a single seat.
Server monitors use remote procedure calls (RPCs), so if you have low-bandwidth or poor-quality lines, you can't manage Exchange Server centrally. In other words, if your network links can't handle the overhead associated with RPCs, your only alternative is to have each server monitor watch its local server.
By default, server monitors watch only Exchange Server's Directory Service (DS), Information Store (IS), and Message Transfer Agent (MTA), but you can configure server monitors to monitor any installed NT or BackOffice service. You can configure a server monitor to send a notification message or take some action when it encounters a failed service. An example of an action is to restart a service or reset the clock. A monitor can't always restart a failed service, however. For instance, if the hard disk where the MTA resides has less than 10MB of free space, the MTA remains in a downed state.
Configuring Server Monitor
You configure and run server monitors from the Microsoft Exchange Administrator program. You can set up one monitor to watch several servers, or you can run multiple monitors from one instance of Exchange Administrator. Note, however, that if you close Exchange Administrator, you'll automatically close any monitors you've started, too.
To configure a server monitor, open Exchange Administrator and choose File, New Other, Server Monitor from the menu bar. Server Monitor has six tabbed pages: General, Permissions, Notification, Servers, Actions, and Clock. Let's take a look at each tab.
On the General tab, which you see in Screen 1, you define basic information for the monitor, including the server name, an optional log filename, and the polling interval. At a minimum, you must provide a directory name for Exchange Server to use internally and a display name for the monitor. You need a log filename if you choose to write a log of maintenance, repair, and alert status updates. Although creating log files is optional, log files are helpful for troubleshooting problems with services. The polling intervals refer to how frequently the monitor checks the services you're monitoring.
The Normal polling interval refers to how often a server monitor looks at services on a healthy server. The default interval is every 15 minutes. The Critical sites polling interval is how often the monitor looks at a server that is in a warning or alert state. A critical site usually has a shorter polling interval. The default is 5 minutes. You must decide whether the default intervals are suitable for your environment.
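To make the relationship between the two intervals concrete, here is a minimal Python sketch of how a monitor might pick its next polling delay. It only illustrates the behavior described above and is not Exchange code: the function name and the state strings are my own assumptions, and only the 15-minute and 5-minute defaults come from the General tab.

```python
from datetime import timedelta

# Default values from the General tab; the constant names are illustrative.
NORMAL_INTERVAL = timedelta(minutes=15)
CRITICAL_INTERVAL = timedelta(minutes=5)


def next_poll_delay(state: str) -> timedelta:
    """Return how long to wait before checking a server again.

    `state` is a hypothetical label: "warning" or "alert" marks a
    critical site, and anything else is treated as a healthy server.
    """
    if state in ("warning", "alert"):
        return CRITICAL_INTERVAL
    return NORMAL_INTERVAL
```

A server in a warning or alert state is polled three times as often as a healthy one under the defaults, which is worth keeping in mind when many servers share slow links.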
On the Permissions tab, you can add NT users and groups that have the right to modify a monitor. At a minimum, an NT user must have the View Only Admin role to start monitors. To create or change monitors, individuals must have permissions at both the Site and the Configuration container.
You use the Notification tab, which Screen 2 shows, to specify how to notify you when a service is in a warning or alert state. A warning state signals a possible problem (e.g., a minor clock-synchronization problem), and an alert signifies a serious occurrence (e.g., a stopped service, an unresponsive server, or a serious synchronization discrepancy). You can determine whether you want the monitor to notify you only for alerts or for both alerts and warnings.
To enter the parameters, click New on the Notification tab. In the New Notification dialog box, you can choose whether you want the monitor to deliver a notification as a process, an email message, or an NT alert. An example of a process is a third-party pager application that automatically sends a page to an individual. An email message is usually not a practical notification method, because if a key Exchange service (e.g., the IS) is down, the monitor can't deliver the message. Finally, you can use an NT alert as a notification. An NT alert requires that the Alerter and Messenger services be active on both the Exchange server and the NT client workstation receiving the alert. Microsoft has included a Test button on the Escalation Editor window where you configure the notifications, so you can verify that the notification process is working properly.
You can set up notifications in several ways. For example, you can define different notification methods for different servers by creating multiple monitors, each monitoring a specific server. You can set an escalation path that notifies a list of people at different times after a monitor enters an alert or warning state. Or you can configure a notification to fire only for a warning state or only for an alert state.
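An escalation path is easier to reason about when you write it down as data. The following Python sketch models one under stated assumptions: the NotificationRule fields (a recipient, a delay, and an alerts-only flag) and the due_recipients function are hypothetical names I chose for illustration, not part of Exchange Administrator.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NotificationRule:
    """One step in a hypothetical escalation path."""
    recipient: str       # who gets notified
    delay_minutes: int   # how long the state must persist before notifying
    alerts_only: bool    # True: skip warnings and notify for alerts only


def due_recipients(rules: List[NotificationRule], state: str,
                   minutes_in_state: int) -> List[str]:
    """List the recipients who should have been notified by now."""
    if state not in ("warning", "alert"):
        return []  # a healthy server triggers no notifications
    due = [
        rule for rule in rules
        if minutes_in_state >= rule.delay_minutes
        and not (rule.alerts_only and state == "warning")
    ]
    # Earliest steps of the escalation path come first.
    return [rule.recipient for rule in sorted(due, key=lambda r: r.delay_minutes)]


# Example path: the help desk hears about everything right away;
# the on-call administrator hears only about alerts that last 15 minutes.
path = [
    NotificationRule("helpdesk", 0, alerts_only=False),
    NotificationRule("oncall-admin", 15, alerts_only=True),
]
```

With this path, a 20-minute warning reaches only the help desk, whereas a 20-minute alert reaches both recipients.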
On the Servers tab, you specify which Exchange servers to monitor. You can add servers by site by selecting a server in the Site field and then clicking Add. You can choose which services to monitor for each server by selecting a server and clicking Services. You can choose to monitor the three default services (i.e., DS, IS, and MTA) or add other services. For instance, you might want to monitor other key messaging connectors, such as the Internet Mail Service (IMS).
On the Actions tab, you select what action to perform if a service stops. You can choose one of three actions (Take no action, Restart the service, or Restart the computer) for the first, second, and subsequent attempts, as you see in Screen 3. If you choose Restart the computer for the subsequent attempts, you need to set an appropriate polling interval for critical sites on the General tab. For example, if you define a 5-minute polling interval for critical sites and the server takes a full 7 minutes to start all the monitored services, you'll get into an endless loop of rebooting the Exchange server. Therefore, you need to know how long your server takes to reboot and load all services. If the action is to restart a computer, you can configure a restart message that tells anyone at the server that the system is rebooting, and you can specify a restart delay value in the window you see in Screen 3.
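The reboot-loop trap comes down to simple arithmetic, which the following Python sketch spells out. The function names and action labels are assumptions made for illustration; only the three actions, the three attempt slots, and the 5-minute versus 7-minute example come from the discussion above.

```python
# Labels for the three choices on the Actions tab (names are illustrative).
TAKE_NO_ACTION = "take no action"
RESTART_SERVICE = "restart the service"
RESTART_COMPUTER = "restart the computer"


def action_for_attempt(attempt: int, first: str, second: str,
                       subsequent: str) -> str:
    """Pick the configured action for the 1st, 2nd, or any later attempt."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    if attempt == 1:
        return first
    if attempt == 2:
        return second
    return subsequent


def reboot_loop_risk(critical_poll_minutes: float,
                     startup_minutes: float) -> bool:
    """Return True if the monitor could poll again before services are up.

    A poll that lands exactly when startup finishes is treated as risky,
    which is a deliberately conservative assumption.
    """
    return critical_poll_minutes <= startup_minutes
```

For the example above, reboot_loop_risk(5, 7) returns True, so you would either lengthen the critical sites polling interval or choose a gentler action for subsequent attempts.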
On the Clock tab, you can synchronize all clocks between Exchange Server computers. In effect, the server initiating the monitor is the time server that maintains the master clock. All servers listed in the Servers tab will synchronize their clocks with the time server. The monitor adjusts server times to account for time zone differences.
Why do you need to synchronize clocks? First, you need trustworthy timestamps for message tracking. Without synchronized times, you can't construct a time line for messages as they make their way through the system. Second, all event log activity has time/date stamps, and an incorrect time in the event log could lead to false assumptions.
With server monitors, the Site Service Account must have the user right to change the system time on all servers that you are monitoring. If the Site Service Account doesn't have the user right, it can't synchronize all the servers' clocks. You can assign the user right directly to the Site Service Account. Go to User Manager for Domains, and select Policies, User Rights. From this screen, you can grant the right to change the system clock. An alternative is to add the Site Service Account to the Domain Admins global group.
Link Monitor Overview
Link monitors let you test messaging connectivity between Exchange servers and foreign mail servers by periodically sending a message. At a defined polling interval, the monitor delivers a message to every server you've specified in the monitor. Link monitors essentially send out a ping message and wait for the message to bounce back to the originating server. If the message returns, you know the message transport is operational.
Sending a ping message is different from using the TCP/IP Ping utility. Sending a TCP/IP ping message to a remote server shows only network connectivity; it doesn't prove that the MTA is working. A link monitor message, however, confirms that both systems are messaging-capable.
Whereas you use server monitors to monitor Exchange Server inside your organization, you usually use link monitors to test the messaging capability of servers outside your Exchange infrastructure. Link monitors let you test whether the remote system is responding to mail messages.
Configuring Link Monitors
As you do for server monitors, you configure link monitors from Exchange Administrator. Go to File, New Other, Link Monitor. You'll see six tabs: General, Permissions, Notification, Servers, Recipients, and Bounce.
On the General tab, you fill in the names of the monitor and the scheduled polling interval. The default polling interval is 15 minutes, but you must determine the usual message transit time and consider special circumstances. For instance, if a scheduled messaging connector is typically down at night, you don't want to monitor the link when it's not operating. Similarly, you don't want to monitor a server while it's performing a scheduled offline backup. You need to schedule your link monitors to run only when the remote systems are up and running.
Polling intervals are global: If you're monitoring 10 servers, the monitor sends ping messages to all of them at the same time. If simultaneous ping messages aren't suitable for your situation (e.g., if a server is down at a certain time), configure link monitors for individual servers.
As you do for server monitors, you use the Permissions page to denote which NT accounts can modify the specified monitor. You use the Notification tab to configure notification parameters when a ping message fails to return in a specified time.
The Servers tab specifies which servers you want to initiate the monitor (i.e., the servers that will send the ping messages to test links). You can monitor servers in remote sites in your organization by selecting the down arrow under the Site field. You can select only Exchange servers in this window, because foreign systems aren't part of your Exchange Site.
You use the Recipients tab, which Screen 4 shows, to specify the servers you want to ping. You enter a recipient (real or fictitious) for the destination server. You can use the left pane in two ways. First, if you're sending ping messages to Exchange servers, you can leave this window empty, because the monitor sends the ping message to the System Attendant on the remote server by default. This functionality is one reason why the System Attendant has a mailbox under the Mailbox Resources field in Exchange Administrator. Second, you can populate the window with a legitimate recipient from each server (from the Servers tab). The only catch is that the recipient must have a process to automatically reply to the ping message, or it won't bounce back to confirm messaging connectivity. You can configure an automatic reply on the destination client with such tools as the Inbox Assistant or Out-of-Office Assistant in Microsoft Outlook. As an alternative, you can configure a Rules Wizard rule, although this method might not be appropriate, because the Rules Wizard functions only when the Outlook client is online.
You use the right pane of the Recipients tab for non-Exchange mail servers that either don't or can't use AutoReply or for which you have no legitimate email address. Although you can enter a custom recipient who exists on that foreign mail server, you can't be sure that someone has configured an AutoReply for that mailbox. If no one has configured an AutoReply, the message will never bounce back to confirm messaging connectivity. For cases like these, set up a nonexistent custom recipient on the Exchange server with the email address of the foreign mail server. After unsuccessfully attempting to process the message, the foreign system will reject it with a nondelivery report (NDR) and thus automatically send the message back. This action confirms that the messaging link is operational.
Most administrators create a hidden custom recipient solely for sending ping messages. You can hide these recipients by using the Advanced tab of the respective recipient, so that these objects won't display in the address book and your users can't view them.
The Bounce tab lets you set the warning and alert times for a monitor. These time intervals are global for all servers specified. For example, if you set a warning time of 30 minutes and a ping message doesn't bounce back from a foreign system to the monitoring server within that time, the monitor enters a warning state and can notify the user listed under the Notification tab.
The monitor sends a warning when the server is late returning a message; the default is 30 minutes. The monitor sends an alert when the message is extremely late returning; the default is 60 minutes. Of course, a message bounce depends on many factors, such as bandwidth, network topology, and the number of hops through routers. The only way to determine an acceptable bounce time is to benchmark what you consider a usual time interval and set the warning and alert times accordingly. If you are configuring link monitors to multiple servers, you might want to configure a separate link monitor for each server to take into consideration the varied acceptable bounce times.
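The Bounce tab logic amounts to comparing a message's round-trip time against two thresholds. Here is a minimal Python sketch of that comparison, assuming the default 30-minute and 60-minute values; the function name and return labels are hypothetical.

```python
def classify_bounce(elapsed_minutes: float,
                    warning_after: float = 30,
                    alert_after: float = 60) -> str:
    """Classify how late a ping message is, using Bounce tab thresholds.

    Defaults mirror the 30-minute warning and 60-minute alert times.
    """
    if warning_after > alert_after:
        raise ValueError("the warning time must not exceed the alert time")
    if elapsed_minutes >= alert_after:
        return "alert"
    if elapsed_minutes >= warning_after:
        return "warning"
    return "normal"
```

A link that usually bounces a message in 2 minutes and one that usually takes 25 minutes shouldn't share thresholds, so a separate monitor for each server lets you pass different warning_after and alert_after values for each link.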
In most organizations, you monitor from one Exchange Administrator program. If you run the same monitor from different administrator programs, the monitors might conflict. For example, if you run the same server monitor on two systems and configure both monitors to restart the computer, you could get into an infinite loop of servers rebooting.
You can start monitors manually or automatically. To manually start the monitors from Exchange Administrator, select Link Monitor or Server Monitor under the Monitors object, then select Tools, Start Monitor. You can start multiple monitors simultaneously by holding down the Shift key while you select monitors.
To automatically start a monitor when you invoke the Exchange Administrator program, you modify the shortcut property sheet of the Exchange Administrator icon. For example, modifying the target box of a shortcut to Exchange Administrator with
starts Exchange Administrator, a server monitor, and a link monitor on the Cleveland server.
Whether you start monitoring manually or automatically, if you close Exchange Administrator without shutting down the monitors first, the monitors will reappear the next time you run Exchange Administrator. I recommend resizing the Exchange Administrator window and all the monitors' windows, so you're aware of what's running at one time.
Before you take an Exchange server offline (e.g., for maintenance), be sure to inform all servers performing monitoring functions on the server to temporarily stop monitoring. If you don't stop the monitoring on the server you're taking down, you might end up with endless notification messages and failed attempts at restarting a system that you've turned off on purpose. To notify the monitors, run the admin command with the /t switch on the command line of the server you're taking offline. Table 1 shows the options that the /t switch supports.
Before you take a system down for maintenance, be sure you've allowed ample time for the admin /t option switch to ripple through all monitors in the organization. Polling intervals determine when a monitor sees that a server is in maintenance mode. To confirm that the monitor has successfully received the /t switch, double-click the monitor's property sheet on the server you're maintaining. The monitor's status appears on the Maintenance status tab.
To view warnings and alerts, start Server Monitor from Tools, Start Monitor in Exchange Administrator. Warnings appear as red exclamation points, and alerts appear as red inverted triangles. Green triangles signify that the servers are functioning properly; question marks denote that the monitor has not yet measured the service's status. Xs signify a maintenance situation.
Screen 5 shows the results of the monitors. The server monitor shows a normal reading (triangle) on one server and an alert state (inverted triangle) on another server. The X status on Heckle tells me that the server sent out a notification that it was going offline for maintenance. The link monitor shows me that the monitor hasn't checked Paris yet, the cc:Mail post office is operational, and the MS Mail post office had a longer bounce-back than anticipated. You can obtain more detailed information (such as exact services that are running or the total bounce time of a message) by double-clicking any of the servers in these two windows. Screen 6 shows a server's bounce time.
Exchange Monitor Report Generator
EMRG is a tool in the Microsoft BackOffice Resource Kit (BORK), 2nd edition. You'll find EMRG in the Admin\Emrg folder for your OS. You can use EMRG to gather information from a monitor's log file and present it in an organized manner. For example, you can configure the report to summarize a server's downtime, or you can generate a report for the contents of a monitor for a range of dates, for specific log files, or for specific servers.
As you see in Screen 7, you can output reports in Comma Separated Value (CSV) or text format. In addition, you must specify whether the monitor is a server monitor or a link monitor to properly build the report. After you have defined the log files with this utility, click Generate Report Now to create the file. The output this tool generates gives a comprehensive summary of the data, as you see in Figure 1 on page 5. You can use these reports to identify troublesome servers in an organization. Third-party products (e.g., NetIQ's AppManager Suite) provide more sophisticated monitoring tools.
Monitor Your Servers
Monitors provide a way to keep tabs on your servers and prove connectivity to foreign mail systems. Because monitors let you define escalation paths (i.e., the sequence of activities) and recovery attempts when a monitor detects a problem, you have a chance to fix the problem before it becomes a full-blown catastrophe. Mail administrators need all the help they can get, and monitors provide one way of keeping your Exchange server happy and healthy.