"My messages aren't getting through. Is the server down?" It's a common problem mail administrators face: messages that don't get to their destinations as expected. To diagnose and fix this problem, Microsoft Exchange and Windows NT provide several resources for troubleshooting message queues and tracking messages. These resources include message queue management, predefined Performance Monitor (Perfmon) worksheets, and a search engine for tracking individual messages through your enterprise.

Message Queues and You
Messages wait in a queue for Exchange's Message Transfer Agent (MTA) to determine a routing path for them. When sending to Site and X.400 connectors, the MTA sends the messages via a connection to the MTA on the destination server. In other cases, the MTA hands the message off to a connector. Exchange can also use the Internet Mail Service (IMS) to connect sites. The Information Store retrieves the message from the destination post office and notifies recipients of new mail. (For more information about the MTA, see Tony Redmond's "Understanding the Exchange Message Transfer Agent," Windows NT Magazine, January 1998.)

However, the process doesn't always go that smoothly. Sometimes the queue grows too long and delays occur, or a message in the queue gets stuck and prevents all other messages from passing through. How can you troubleshoot undelivered mail?

One of the first places to check for problems is the message queues. You can view these queues from several places in the Exchange Administrator program. For example, you can view the IMS and Microsoft Mail (MS Mail) Connector queues from their respective property sheets. The MTA maintains the X.400, Site, RAS, and Information Store queues.

To view all the mail queues from one window, go to the Exchange Administrator program, select a server, double-click the MTA, and select the Queues tab. (The mail queues don't show messages that the IMS is processing, however, and lots of mail can get stuck in the IMS.) As Screen 1 shows, the drop-down Queue name menu displays the connectors and the number of messages currently awaiting delivery for each connector. If the server is a bridgehead server for a site connector, the computer name will appear as a queue name (e.g., HECKLE, in Screen 1).

Mail is usually lined up in a queue, ready for delivery. When Exchange is working properly, the mail moves continuously through the queue (unless you have a scheduled connector that hasn't initiated a session yet). The MTA processes messages very quickly; therefore, if more than 20 messages are waiting to go to a connector, you might have a problem. (Note that messages appear to be at a standstill in the Queue name window because the window is a snapshot of the queues at one moment. Click Refresh to get a current reading.)

Messages piling up in a queue without moving can signify a problem with a LAN/WAN connection, or an error on the server, a connector, or the MTA. Before you assume that the problem is on your server, however, check the components on the recipient's server. Often, the destination server has a misconfigured receiving service. Or, if the destination post office's X.400 connector is scheduled for only daytime delivery, messages sent to its server during the night must wait until morning for delivery. The best way to verify the configuration of the recipient's server is to talk to the mail administrator of that post office. Insufficient bandwidth can also result in a sluggish queue. If you have a slow WAN connection between servers and the link becomes clogged, you need to rethink your wiring topology.

If the problem is on your server, move the top message to the bottom of the queue to see whether you can get the mail flowing again. Select the first message, and then select Priority, Low Priority to move the message to the bottom of the stack. If the messages start flowing and then stop when the message you moved comes to the top again, delete the corrupt message. Exchange usually marks the deleted message as nondeliverable and returns it to the originator.
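The move-to-bottom test is easier to picture as a small simulation. The following Python sketch is only a model of the procedure, not Exchange code: the queue is a plain deque of message IDs, and deliver is a hypothetical stand-in for the MTA's delivery attempt.

```python
from collections import deque

def find_blocking_message(queue, deliver):
    """Model of the move-to-bottom test described above.

    queue   -- deque of unique message IDs, head of the queue first
    deliver -- hypothetical callback; returns True if a message can be delivered
    Returns the ID of the message that was deleted as corrupt, or None.
    """
    if not queue:
        return None
    suspect = queue[0]
    if deliver(suspect):
        return None                # the head message is fine; nothing to remove
    queue.rotate(-1)               # move the suspect to the bottom (Low Priority)
    # Let the other messages flow until the suspect reaches the top again.
    while queue[0] != suspect:
        if not deliver(queue[0]):
            return None            # another message is stuck too; look elsewhere
        queue.popleft()
    if deliver(suspect):           # retry now that the rest of the queue is clear
        queue.popleft()
        return None
    queue.popleft()                # flow stopped again: delete the corrupt message
    return suspect
```

If the function returns None after the first message behind the suspect also fails, the cause is more likely the link, the connector, or the destination server than a single bad message.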

As Screen 2 shows, the Details option on the MTA Properties sheet provides information about the message originator. Exchange stamps each message with a Message Transfer System ID (MTSID) number and uses the number to track the message through the system. Submit time is the time the message arrived at the MTA. Compare the Submit time in the MTA Properties window with the time the originator sent the message to help track down the source of the delay. The Originator field in the Details display specifies the source of the message.
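The time comparison is simple arithmetic. Here is a minimal Python sketch, assuming you have copied both times by hand, that both clocks are in the same time zone, and that the times are written in the month/day/year hour:minute form shown in the format string (the form is an assumption for illustration, not something Exchange guarantees).

```python
from datetime import datetime

def queue_delay_minutes(sent, submitted, fmt="%m/%d/%y %H:%M"):
    """Minutes between the originator's send time and the MTA Submit time.

    fmt is an assumed timestamp layout; adjust it to match what you copied.
    """
    sent_at = datetime.strptime(sent, fmt)
    submitted_at = datetime.strptime(submitted, fmt)
    return (submitted_at - sent_at).total_seconds() / 60
```

For example, queue_delay_minutes("03/02/98 09:15", "03/02/98 09:47") returns 32.0, which tells you the message spent about half an hour upstream of the MTA.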

The MTA Properties sheet displays basic queue data. For more specific information about the IMS and MS Mail connectors, you must go to these connectors' respective property sheets. Use the same approach for other connectors (e.g., Lotus Notes, cc:Mail, or X.400).

IMS Queues
Exchange Server's IMS messaging connector lets users transfer messages using Simple Mail Transfer Protocol (SMTP). The IMS receives outbound messages from the Information Store and converts them and their attachments into either MIME or UUENCODE format before scheduling delivery. SMTP uses these two formats, and Exchange has the flexibility to define them on a per-domain basis. You can view both inbound and outbound data files in the \EXCHSRVR\IMCDATA\IN and \EXCHSRVR\IMCDATA\OUT folders. These folders function as a temporary storage area for the messages.
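Because these folders are ordinary directories, a short script can tell you how much mail is waiting in them. The Python sketch below is one way to do it, assuming you run it on the server (or against a share) with read access to the folders; the function name is hypothetical.

```python
from pathlib import Path
import time

def folder_backlog(folder):
    """Return (file_count, age_in_seconds_of_oldest_file) for one folder.

    Subfolders such as ARCHIVE are ignored; only files are counted.
    """
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    if not files:
        return 0, 0.0
    oldest = min(p.stat().st_mtime for p in files)
    return len(files), max(0.0, time.time() - oldest)
```

You might call it as folder_backlog(r"C:\EXCHSRVR\IMCDATA\OUT"), where the drive letter is an assumption about your installation. A file count that keeps growing, or an oldest file that is hours old, suggests mail is stuck in the IMS rather than in the MTA queues.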

If the IMS queue is consistently long, consider relieving the congestion by balancing the message load among several IMS connectors. Because each server in the Exchange site can have only one IMS, you can add an IMS on another post office in your site or organization. Or you can configure one IMS to be Inbound Only and the other to be Outbound Only on the IMS Connections tab on the IMS Properties page.

To purge a long mail queue, you can temporarily stop the IMS from processing any new messages so the connector can concentrate on clearing the current queue. On the Connections tab, under Transfer Mode, select None (Flush Queues). You can fine-tune the scheduled retry attempts and message timeouts of a stalled message by changing the default times in the Service Message Queues window on the Connections tab. For instance, you can specify that originators receive a nondelivery notification when their messages exceed the timeout defined for the IMS.

MS Mail Queues
The MS Mail Connector has a queue for each outbound connector to an MS Mail post office. To access the queue, go to the MS Mail Connector Properties sheet, and from the Connections tab, select the connector and click Queue. The message queue window displays the following information for messages awaiting delivery: From, Subject, Message ID, and Date/Time.

If you have a mail message for an MS Mail Server and that message refuses to budge from the queue, you can delete it. The Message ID column in the message queue window lets you track the message. If you are tracking a message within an Exchange organization and message tracking is enabled on the Exchange servers that the message has passed through, you can track the message throughout its course. However, if the message passes out of Exchange through a component such as the MS Mail connector or X.400 connector to a destination post office, you can't track the message to the end-point. The Date/Time column shows the date and time when the connector received the message. You can compare this date and time with the originator's send time to determine whether any delay in delivery occurred.

Troubleshooting Queues
In a large enterprise, message queues are inevitable. But if a queue becomes a bottleneck, delivery times can become unacceptable to the organization. Only by closely monitoring your mail system and benchmarking queue lengths can you determine when this point occurs. NT provides a few tools to help you.

Perfmon. Microsoft provides two preconfigured Perfmon workspaces you can use to track your message queues. From the Exchange post office, go to Start, Programs, Exchange to view Server Queues and IMS Queues. If the MTA is causing a mail backlog, the Work Queue Length counter will be high in the Server Queues chart view. If the Private Information Store is slow to deliver messages to mailboxes, the Send Queue Size will be much higher than its benchmark reading. In both cases, adding faster disk controllers or more disks to the stripe set can improve performance.

You can also use Perfmon to create alerts to notify you if a queue has passed its acceptable message limit. For example, the Queue Length counter in the object MSExchangeMTAConnections tracks the number of outstanding queued messages. If you want to know when a queue contains more than 10 messages, you can configure an alert to notify you when Queue Length reaches that threshold for one connector or any number of MTA queues. In Perfmon, select Edit, Add to Alert to get to the display shown in Screen 3. Next, go to Server Manager and configure the post office to send an administrative alert to a specific computer or username.
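Perfmon evaluates the alert condition for you, but the rule itself is worth seeing in isolation. This Python sketch only illustrates the comparison; the dictionary of readings is hypothetical sample data that you would have to collect yourself, and the function name is not part of any Microsoft tool.

```python
def queues_over_limit(readings, limit=10):
    """Return the names of queues holding more than `limit` messages.

    readings -- dict mapping a queue name to its current Queue Length value
    """
    return sorted(name for name, length in readings.items() if length > limit)
```

With readings of {"HECKLE": 14, "Internet Mail Service": 3, "MS Mail Connector": 10} and the limit of 10 used above, only HECKLE is reported, because a queue that holds exactly 10 messages has not yet gone over the limit.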

Diagnostics Logging. Diagnostics Logging in Exchange lets you determine NT's level of verbosity, or detail, in writing events to the Event Viewer or to text files for later analysis. Events are significant system-related activities that NT writes to a log file. Diagnostics Logging is available for the MTA and connectors.

By default, Exchange sets logging to None. However, even when the logging level is None, Exchange writes critical events and errors to the event logs. If you suspect a problem with a queue, you can raise the verbose logging level to increase the amount of data that Exchange writes to the event logs. Be forewarned that raising the logging level increases the server load. After you have resolved the problem you're logging, set the level back to None.

The MTA Properties sheet has a Diagnostics Logging tab with several complex categories that you can enable by setting the Logging level to Medium or higher. For instance, the Interface and Interoperability objects create a file named AP0.LOG in the MTADATA subdirectory. This file documents the binary contents of the X.400 protocol messages. The Application Protocol Data Unit (APDU) category creates a BF0.LOG file, which can help you troubleshoot MTA communications.

Screen 4 shows how you can use Diagnostics Logging to troubleshoot an IMS queue problem. If you set SMTP Protocol Log and Message Archival to Medium logging, Exchange will write these events to text files in the \IMCDATA\LOG folder and in the \IMCDATA\IN\ARCHIVE and \IMCDATA\OUT\ARCHIVE folders, respectively. The event log will also record that you have enabled these categories.

MTACHECK
MTACHECK is a command-prompt utility that takes defective objects from the queues and places them in directory files for you to inspect. To run MTACHECK, you must first stop the MTA service via the Control Panel applet. In addition, ensure that the \EXCHSRVR\MTADATA\MTACHECK.OUT path is empty so MTACHECK can write errors here. Some of MTACHECK's switches are

/v (verbose)    Displays a summary to the console
/f (file)       Saves a summary to a filename on your hard drive
/rd             Removes directory replication messages
/rp             Removes public folder messages
/rl             Removes any link monitor messages

MTACHECK examines each queue. If MTACHECK finds a corrupt object, it removes the object from the queue and places it in the MTACHECK.OUT folder. The corrupt file information includes queue name, MTSID, and error type. MTACHECK notifies you if some data has been lost. The only downside to this utility is that you cannot run it remotely. For more information about MTACHECK, see "Understanding the Exchange Message Transfer Agent."
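If you run MTACHECK often, you can script the preflight check. The Python sketch below confirms that the MTACHECK.OUT folder is empty and then builds an argument list from the /v and /f switches listed above. It does not stop the MTA service and does not run the utility; the function name and the summary filename are hypothetical.

```python
from pathlib import Path

def mtacheck_command(out_dir, summary_file=None, verbose=True):
    """Build an MTACHECK argument list after checking the output folder.

    out_dir      -- path to the MTACHECK.OUT folder
    summary_file -- optional filename for the /f switch (a name you choose)
    Raises RuntimeError if the output folder still contains files.
    """
    out = Path(out_dir)
    if out.exists() and any(out.iterdir()):
        raise RuntimeError(f"{out} is not empty; clear it before running MTACHECK")
    args = ["MTACHECK"]
    if verbose:
        args.append("/v")
    if summary_file:
        args += ["/f", summary_file]
    return args
```

Calling mtacheck_command with an empty output folder and a summary_file of "MTACHECK.LOG" yields the list MTACHECK, /v, /f, MTACHECK.LOG, which you can then run from a command prompt on the server after you stop the MTA service.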

Final Queue Tips
As I mentioned previously, if you have a long queue, don't assume that the problem resides on your server. For example, before mail can move, the MTAs on both the originating and the destination servers must be running and able to establish an association. A long queue can mean that the destination server's MTA is temporarily down. In that case, your message queue will bloat until both MTAs can establish an association or until the messages time out. Also, don't forget to check the Application event log for clues. The Application event log can save you hours of troubleshooting.

Finally, if you have a less-than-perfect network connection between servers, you can fine-tune your MTA. Mail administrators can make a suspect connection more forgiving by configuring the Request to Send (RTS) values, Connection retry values, and Association parameters on the MTA Site Configuration Properties sheet, as Screen 5 shows.

For example, if you have a slow WAN link, you can decrease the Checkpoint size. If the link fails, Exchange has to transfer the data from only the last checkpoint. Or if you have a dirty line, you can increase the Threshold value. A connector that uses the MTA opens another association in the same session if the message queue reaches the specified threshold. However, multiplexing over this session can saturate the link and cause it to crash. Increasing the size of the threshold from its default of 50 messages prevents multiple associations.
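A little arithmetic shows why a smaller checkpoint helps on an unreliable link. The Python sketch below is a simplified model, not the MTA's actual algorithm: it assumes the Checkpoint size is measured in kilobytes, that a checkpoint is confirmed after every full checkpoint's worth of data, and that a size of 0 means no checkpoints at all.

```python
def kb_to_resend(kb_sent_before_failure, checkpoint_kb):
    """Kilobytes that must be sent again after the link fails.

    Simplified model: the transfer resumes from the last completed checkpoint.
    A checkpoint_kb of 0 is treated as "no checkpoints", so everything is resent.
    """
    if checkpoint_kb <= 0:
        return kb_sent_before_failure
    last_checkpoint = (kb_sent_before_failure // checkpoint_kb) * checkpoint_kb
    return kb_sent_before_failure - last_checkpoint
```

Under these assumptions, if the link drops after 437KB, a 30KB checkpoint means 17KB goes over the wire a second time, a 100KB checkpoint means 37KB, and no checkpoints means all 437KB. The trade-off is that smaller checkpoints add more acknowledgment traffic on a link that is already slow.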

You can also increase performance in a clean connection by changing the MTA configuration values, particularly in the Checkpoint and Window fields. Click Help to get a full description of how you can use these parameters.

Message Tracking
To troubleshoot mail that is causing slow or stopped connections, you can track messages sent between Exchange servers. You can enable tracking on the MTA, the IMS, or the MS Mail connector to monitor messages as they travel through the system. Exchange doesn't enable message tracking by default, so you need to enable tracking on all components that contribute to message traffic. When you enable tracking in this way, the System Attendant manages the service and writes a daily log in the \EXCHSRVR\TRACKING.LOG folder.

After you enable tracking, browse for the message you want to track and trace its path through the mail system of the servers that handled it. You also need to have access to the NT file share where Exchange holds the tracking logs.

To track mail from the Exchange Administrator program, go to Tools, Track Message. You can select a specific server and determine whether you want to select messages Sent To or From and the number of days you want to review. When you've defined these parameters, click Find Now to search through the logs for all messages that meet the criteria. Next, select the message you want to follow, then click OK.
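If you prefer to read the daily tracking logs directly, you first need to know which files cover the days you care about. The Python sketch below assumes the daily logs are named for their date in YYYYMMDD.log form; verify that naming against the files in your own tracking folder before you rely on it. The function name is hypothetical.

```python
from datetime import date, timedelta

def tracking_log_names(days_back, today=None):
    """Return assumed daily log filenames for today and the previous days_back days.

    The YYYYMMDD.log naming is an assumption; check your tracking folder.
    """
    today = today or date.today()
    return [
        (today - timedelta(days=n)).strftime("%Y%m%d") + ".log"
        for n in range(days_back + 1)
    ]
```

For example, tracking_log_names(2, date(1998, 3, 2)) returns 19980302.log, 19980301.log, and 19980228.log, newest first.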

In the Message Tracking Center display, select Track to search the log files, and follow the path of the message. You have successfully tracked a message if Exchange has delivered the message or if the message has passed through a connector. Screen 6 shows three Exchange services processing the message you're tracking. You can see the MTA expanding a distribution list and successfully handling the message on both the Squirt and Heckle servers, the MS Mail connector transferring the message out to BEYONDHELP/DAYTONOH, and Exchange delivering the message to the Public Information Store.

Advanced Search Features
The Message Tracking Center provides three advanced search features for finding messages. You can search by Exchange Server component, by messages transferred in by gateways, and by message ID. When searching by Exchange component, you can select which service you want to monitor. An especially useful tool is the capability to monitor mail messages from users outside your messaging system. To monitor these messages, select the Advanced Search button from the Message Tracking Center, and then select Transferred into this site. Enter the name of an outside recipient, and then choose the type of connector you wish to track. Finally, you can search by message ID. You can type in the MTSID (e.g., the number in the Details window shown in Screen 2) to track a piece of mail as it makes its way through your post office.

Put the Tools Together
I've shown you a few ways to use tools from both NT (Perfmon and Event Viewer) and Exchange (message queues, message tracking, and Diagnostics Logging) to troubleshoot your messages. Most often, you will use these tools together to track down those pesky messages that clog your post office. Learning how to use these tools effectively will help keep your post office operating at peak efficiency.