Don't just guess about your email delivery problems; get a clue

When you suspect that your email system is having message delivery difficulties, you often need to play detective and search for clues to determine the problem. But where do you begin your search for message transport clues, and what tools and techniques are available to help?

The first notification about an email delivery problem often comes from a user who has received a nondelivery report (NDR). If available, an NDR is often the best place to begin your search for clues because it contains information that can help you determine which component or system to investigate.

Consider an Exchange Server 5.5 design that has two sites with two servers at each site. Site 1 has Exchange servers named EX3 and EX4, and Site 2 has Exchange servers named EX5 and EX6. A Site connector connects the two sites, with EX4 and EX5 acting as bridgehead servers; EX4 hosts the Internet Mail Service (IMS). The Site 2 administrator has recently deleted a mailbox for Mosby Jones from EX5. Figure 1, page 2, shows two NDRs that result when someone tries to send a message to Mosby Jones. The first NDR resulted when an external user tried to send an SMTP email message to Mosby Jones, and the second NDR occurred when a user with a mailbox on EX4 tried to send a message to Mosby Jones. The text of both NDRs is similar, but if you look closely, you'll see that the two NDRs not only have different MTS-IDs (also known as Message IDs) but also end with different last lines, which contain clues that will help you answer questions about message delivery.

Exchange 5.5 Clues
As Figure 2, page 2, shows, each Exchange 5.5 NDR contains information that you can use to determine whether a delivery problem exists: By examining the Message ID string, you can see that the message came from a user on EX4. The Component, Site, and Server elements show that the Message Transfer Agent (MTA) on EX4 successfully sent the original message to the EX5 server at Site 2. However, the MTA on EX5 couldn't complete the delivery, so EX5 is where you need to start looking for more clues about what happened.
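
To make the reading of these NDR elements concrete, here is a minimal Python sketch. It assumes that the last line of the NDR has the shape MSEXCH:component:site:server and that the Message ID carries the originating server name at the start of its l= field. Both layouts are assumptions for illustration, because the figures aren't reproduced here, and the sample values are hypothetical.

```python
from typing import NamedTuple


class NdrOrigin(NamedTuple):
    component: str
    site: str
    server: str


def parse_ndr_origin(line: str) -> NdrOrigin:
    """Split an assumed 'MSEXCH:component:site:server' string into its parts."""
    parts = line.strip().split(":")
    if len(parts) != 4 or parts[0] != "MSEXCH":
        raise ValueError(f"unexpected NDR origin format: {line!r}")
    _, component, site, server = parts
    return NdrOrigin(component, site, server)


def originating_server(mts_id: str) -> str:
    """Return the server name from the l= field of an assumed MTS-ID layout."""
    for field in mts_id.split(";"):
        key, _, value = field.partition("=")
        if key.strip().lower() == "l":
            # Assumed layout: <server>-<timestamp>-<sequence>.
            # rsplit keeps any hyphens that belong to the server name.
            return value.rsplit("-", 2)[0]
    raise ValueError("no l= field in MTS-ID")


# Hypothetical sample values modeled on the article's scenario.
origin = parse_ndr_origin("MSEXCH:MSExchangeMTA:Site2:EX5")
sender_server = originating_server("c=US;a= ;p=Example;l=EX4-020101120000Z-42")
print(f"Reported by {origin.component} on {origin.server}; sent from {sender_server}")
```

In this sketch, the server named in the last line (EX5) is where the investigation starts, and the server in the Message ID (EX4) is where the message originated.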

The next step is to open a Microsoft Exchange Administrator session on EX5 and look at the Recipients containers to confirm that the Mosby Jones account doesn't exist. You can easily understand how someone could send an SMTP message to Mosby Jones and get an NDR because the mailbox doesn't exist, but how does a user on an Exchange server that shares a common view of the Global Address List (GAL) enter Mosby Jones into the message's To field when the mailbox isn't listed in the GAL? Most often, the user has replied to a message for which the deleted account was the sender or an addressee. You might also see this happen immediately after an administrator deletes a mailbox, before replication has synchronized the change to another server. To determine whether either of these scenarios has occurred, you need to open the original message from the sender's Sent Items folder and examine the message properties.

First, examine the message's addressee information. Right-click the addressee and select Properties. You can determine immediately whether the sender used an address from an old message: If the message recipient is listed in the GAL, Exchange doesn't store complete information about the recipient in each message; instead, Exchange stores a pointer to that information in the Exchange Directory, along with the display name. When you access the properties of a recipient who's no longer listed in the Directory, you'll see the recipient's distinguished directory name (e.g., /o=/ou=/cn=/cn=) instead of the expected display name, as Figure 3 shows.

Another way to determine whether the sender replied to an old message is to use the Exchange Message Tracking Center, which is available in all versions of Exchange. (For information about using the Exchange 5.5 Message Tracking Center, see Tony Redmond, "How Message Tracking Works," June 1998, InstantDoc ID 4927.) The Exchange 2000 Server interface differs slightly from that of previous versions and includes some additional capabilities, such as the ability to save results, but the general functionality is the same.

The Message Tracking Center can locate a message to track by searching for a specified sender or recipient, or it can begin tracking immediately if you supply a Message ID. (You can copy the Message ID from the delivery report that Figure 2 shows and paste it into the Message Tracking Center.) After you generate a tracking history of a specific message, you can examine the message submission event to see a list of recipients.

To generate a tracking history in Exchange 5.5, start the Exchange Administrator program and select Track Message from the Tools menu. Enter the name of the server on which you want to start tracking. (The best server to use is the one on which the sender's mailbox resides.) After you enter the server name, you'll see a Select Message to Track dialog box that lets you enter sender or recipient information so that you can locate the message you want to track. In the Mosby Jones example at the beginning of this article, you already have the Message ID, so you can close this dialog box because the Message ID uniquely identifies which message to track. Open the Message Tracking Center, click the Advanced Search button, and select By Message ID, as Figure 4 shows. Click OK to reopen the Select Message to Track dialog box, and enter the Message ID that you copied from the NDR. Click OK to close the dialog box. In the Message Tracking Center, click the Track button. The Message Tracking Center will search the tracking logs on the server for entries that contain the Message ID. As the tracking center finds message transfer events, it will open logs on other servers and build a start-to-finish tracking history that details the events and message transfers made during delivery.
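
The core of that search, finding every log entry that contains a given Message ID, can be sketched in a few lines of Python. This is only an illustration of the idea, not the Message Tracking Center's actual logic. The tab-separated sample lines are hypothetical and much simpler than real tracking-log entries.

```python
from typing import Iterable, List


def find_tracking_entries(lines: Iterable[str], message_id: str) -> List[str]:
    """Return log lines whose text contains the given Message ID (case-insensitive)."""
    needle = message_id.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


# Hypothetical, simplified log lines: message ID, event name, server.
sample_log = [
    "l=EX4-020101120000Z-42\tsubmission\tEX4\n",
    "l=EX4-020101120000Z-43\tsubmission\tEX4\n",
    "l=EX4-020101120000Z-42\ttransfer-out\tEX4\n",
]

hits = find_tracking_entries(sample_log, "l=EX4-020101120000Z-42")
for entry in hits:
    print(entry)
```

A plain substring match like this one would also match a longer ID that begins with the same characters, so a more careful version would compare the whole ID field.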

As Figure 5, page 4, shows, the history of the Mosby Jones message begins with Chuck Havens initiating a submission event. If you select this submission event and click Properties, the Message Properties screen opens, showing the recipient information. If you then select Mosby Jones from the Recipients list and click Properties, you see that Exchange can find no information about this addressee in the Exchange Directory. As the error message indicates, you might find this recipient in the sender's Outlook Contacts or Personal Address Book (PAB). Because the recipient information displays an Exchange distinguished name (DN) rather than an SMTP address, the recipient in question was likely deleted.

Exchange 2000 Answers
You now know how to verify a reported problem by using an Exchange 5.5 NDR, and you can apply the same principles to Exchange 2000. However, Exchange 2000's NDRs use a different format from earlier Exchange versions and don't provide the same details. Because the primary protocol for message transport in Exchange 2000 is SMTP, Exchange 2000 uses the Enhanced Status Codes defined in Internet Engineering Task Force (IETF) Request for Comments (RFC) 1893 to provide more Internet-standardized problem-reporting codes. For information about these codes, read the Microsoft articles "XCON: Delivery Status Notifications in Exchange 2000 Server" (http://support.microsoft.com/default.aspx?scid=kb;en-us;q284204) and "XCON: Enhanced Status Codes for Delivery - RFC 1893" (http://support.microsoft.com/default.aspx?scid=kb;en-us;q256321).

Unfortunately, Exchange 2000 NDRs no longer contain the Message ID. As Figure 6, page 4, shows, you can still gather information about the server that generated the report. The server information and status code are displayed at the bottom of the NDR, enclosed by the less-than (<) and greater-than (>) symbols. You can use this information to start looking for problems on a specific server, but you must reopen the sender's original message if you need the Message ID. Although Exchange 2000 NDRs don't provide the Message ID, they offer more information than their predecessors about an unreachable recipient. If you compare the Exchange 5.5 report in Figure 1 with the Exchange 2000 report in Figure 6, you'll notice that instead of providing the display name, the Exchange 2000 report provides the distinguished directory name of the unreachable recipient, then tells you that Exchange couldn't find the account in the Directory.
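
RFC 1893 status codes have three dot-separated parts: class, subject, and detail. The following Python sketch decodes the class and subject by using the category names from the RFC. The sample string, including the server name and the leading # character, is a hypothetical stand-in for the bracketed text at the bottom of an NDR.

```python
import re
from typing import Dict, Optional

# Class and subject names as defined in RFC 1893.
CLASSES = {
    "2": "Success",
    "4": "Persistent Transient Failure",
    "5": "Permanent Failure",
}
SUBJECTS = {
    "0": "Other or Undefined Status",
    "1": "Addressing Status",
    "2": "Mailbox Status",
    "3": "Mail System Status",
    "4": "Network and Routing Status",
    "5": "Mail Delivery Protocol Status",
    "6": "Message Content or Media Status",
    "7": "Security or Policy Status",
}

# class.subject.detail, optionally preceded by '#'.
STATUS_RE = re.compile(r"#?([245])\.(\d{1,3})\.(\d{1,3})")


def decode_status(text: str) -> Optional[Dict[str, str]]:
    """Find the first enhanced status code in text and name its class and subject."""
    match = STATUS_RE.search(text)
    if match is None:
        return None
    cls, subject, detail = match.groups()
    return {
        "code": f"{cls}.{subject}.{detail}",
        "class": CLASSES[cls],
        "subject": SUBJECTS.get(subject, "Unknown subject"),
    }


# Hypothetical NDR footer text.
result = decode_status("<ex5.example.com #5.1.1>")
print(result)
```

For the sample, the class is a permanent failure and the subject is addressing, which matches the deleted-mailbox scenario; RFC 1893 defines X.1.1 as a bad destination mailbox address.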

Another Exchange 2000 behavior that you need to consider when tracking down a problem is that Exchange 2000 retains a copy of a deleted user's mailbox in the Exchange Store in case you need to restore the mailbox. (Thirty days is the default retention period; you can set it to be longer or shorter.) In Exchange 5.5, if a sender replies to an old message that includes a deleted recipient, the sender will receive an NDR shortly after sending the message. In most cases, the sender will realize that the NDR references a user who is no longer with the organization, discard the NDR, and adjust future messages to exclude the deleted recipient. Because Exchange 2000 retains a copy of the mailbox in the Store, if a sender whose account resides in the same Store uses an old message containing a deleted addressee, the system doesn't immediately return an NDR. The system continues to deliver messages to the mailbox during the 30-day countdown period before expiration. After 30 days, the mailbox disappears and the sender who used the deleted address begins to receive NDRs. This delay can sometimes cause confusion because the sender hasn't seen an entry for the deleted user in the GAL for more than a month. Users might blame a system problem when they begin receiving NDRs from a message that they "always use to send their report." These users don't realize that their dated message is causing the problem, now made apparent by the expiration of the mailbox.
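
When a user reports NDRs that started "out of nowhere," working backward from the retention period can help. This small Python sketch computes the date on which NDRs would be expected to begin for a given deletion date. The function name and the sample date are hypothetical, and the calculation assumes the mailbox is purged exactly when the retention period ends.

```python
from datetime import date, timedelta


def first_ndr_date(deleted_on: date, retention_days: int = 30) -> date:
    """Return the date the retained mailbox expires and NDRs would begin."""
    return deleted_on + timedelta(days=retention_days)


# Hypothetical deletion date with the default 30-day retention period.
expected = first_ndr_date(date(2002, 3, 1))
print(expected.isoformat())
```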

Having an NDR as a source for a Message ID is convenient, but often questions or reports of problems come without this detail. For example, users might question why they received a message when they weren't an addressee. Most likely the recipient received the message as spam, as a blind copy, or as a result of an alternate delivery configuration on a user's mailbox profile. As another example, a message might have been sent to an all-user distribution list (DL), but one person claims not to have received it. In all these cases, you can use a message-tracking history to confirm or explain the users' questions, but you must go back to the original message as a source for the Message ID. As I mentioned earlier, whenever someone sends a message, Exchange generates a unique Message ID and stores it with the message. By opening the message in the sender's Sent Items folder or by opening a copy of the message from a recipient's mailbox, you can retrieve the Message ID. Click File, then click Properties from the menu of the open message to access the Message ID. Figure 7 shows the Properties sheet's two tabs, one of which is Message ID.

You can use this tab's Message ID field to track the message and determine all the addressees—even if the sender sent the message as a blind copy. By comparing the recipients of the sent message with the recipients listed in the message-submission event, you can confirm whether a user was blind-copied. Also, you can check the Message Tracking Center for a message-redirection event, as Figure 8 shows, to determine whether an alternate recipient configuration caused the mystery message delivery.
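
The blind-copy check described above amounts to a set difference: anyone who appears in the submission event but not in the message's visible To and Cc lines was blind-copied or added by some other means. Here is a minimal Python sketch of that comparison. The recipient names other than Mosby Jones are hypothetical, and you would copy both lists by hand from the message and from the tracking history.

```python
from typing import Iterable, List


def hidden_recipients(visible: Iterable[str], submitted: Iterable[str]) -> List[str]:
    """Return submitted recipients that don't appear among the visible addressees."""
    seen = {name.strip().lower() for name in visible}
    return [name for name in submitted if name.strip().lower() not in seen]


# Addressees shown on the To and Cc lines of the sent message.
visible = ["Mosby Jones", "Pat Lee"]
# Recipients listed in the message-submission event.
submitted = ["Mosby Jones", "Pat Lee", "Chris Doe"]

extra = hidden_recipients(visible, submitted)
print(extra)
```

An empty result suggests that the mystery delivery came from something other than a blind copy, such as an alternate recipient configuration.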

Other Sources of Information
Another useful source of message transport information is the Application event log. The diagnostic logging-level settings you make on the various system components control what goes into the event log. By default, all diagnostic logging levels are set to none, but this setting doesn't mean that Exchange writes no information to the log by default. Exchange always logs critical and error events, and some Exchange components, such as the MTA and the Exchange 2000 Store (or the Exchange 5.5 Information Store, IS), log some warning and informational events by default. For example, Exchange 5.5 logs NDRs to the event log by default. (You can configure Exchange 2000 to log NDRs by increasing the diagnostic logging level for the MSExchangeTransport component.)

Careful monitoring of the Application event log can help you detect and resolve transport problems. Figure 9 shows the details of an MTA-logged event related to the undeliverable message examined at the start of this article. The event detail contains several pieces of information that can help determine whether you have a problem. The reason code explains why the system generated the NDR. The reason code in Figure 9—unable-to-transfer—is fairly common and usually results from deleted recipients or mangled addresses. Other reason codes, such as transfer-failure, might indicate more severe problems, such as a stopped MTA or failure of the MTA to authenticate at the destination server. Although the particular event in this example is fairly common, the logging of many of these events consecutively or within a short time period could indicate a problem. Often, when you see large numbers of these events, they are the result of one user's actions; but the event details don't indicate the originator, just the invalid recipient. When the event details include a Message ID, you can use a message-tracking history to determine the message's originator. You can then contact the originator to see what might be causing so many NDRs.
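
One way to spot many of these events within a short time period is a sliding-window count over the event timestamps. The Python sketch below assumes that you've already exported the timestamps of the NDR-related events from the Application event log. The 10-minute window, the threshold of five events, and the sample times are arbitrary illustrative choices, not recommended values.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def find_bursts(
    timestamps: Iterable[datetime],
    window: timedelta = timedelta(minutes=10),
    threshold: int = 5,
) -> List[datetime]:
    """Return the start time of each window holding at least `threshold` events."""
    ordered = sorted(timestamps)
    bursts: List[datetime] = []
    start = 0
    for end, current in enumerate(ordered):
        # Advance the window start until it lies within `window` of the current event.
        while current - ordered[start] > window:
            start += 1
        if end - start + 1 >= threshold:
            # Record each burst once, keyed by the first event in its window.
            if not bursts or bursts[-1] != ordered[start]:
                bursts.append(ordered[start])
    return bursts


# Hypothetical event times: six events a minute apart, then one isolated event.
base = datetime(2002, 3, 1, 9, 0)
events = [base + timedelta(minutes=i) for i in range(6)]
events.append(datetime(2002, 3, 1, 14, 0))

bursts = find_bursts(events)
print(bursts)
```

A burst found this way is only a prompt to investigate; as the article notes, you still need the Message ID and a tracking history to identify the originator.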

Problems that affect large numbers of people are often easier to troubleshoot because you'll usually see clues to help you determine what part of the system is broken. When just one or two people report the problem, the task is more difficult because you lack the significant indicators—such as a large queue of pending messages—to point to the problem. A variety of factors (e.g., invalid addresses, stopped MTAs, loss of directory access, failed authentication) can cause transport problems, but don't discount the simple things when looking for a problem's cause. Use tools such as the Exchange Message Tracking Center and event logs and clues such as NDRs and Message IDs to help answer users' "Why did it happen?" questions.