Rescue your users from snafus

Sure, the life of an Exchange Server administrator has its exciting moments, such as waiting to see what snacks are available at the Microsoft Exchange Conference (MEC). But for every pinnacle of excitement, you must wade through 10 user complaints. As a public service, I want to explore ways to quickly pin down and fix client-side problems (and give you more time to enjoy those snacks).

Getting to Know You
You can ask clients a set series of questions when email problems arise. These questions make the troubleshooting process more consistent and efficient.

Has the problematic function ever worked? Ask this deceptively simple question first to immediately determine whether the problem is a sudden breakage. If the function once worked, the fault probably lies in a recent change or event. If the function never worked, you'll need to broaden your search: Ask whether the function has ever worked for anyone on your network. If the answer is no, the problem might have a systemic cause. For example, the Microsoft Outlook for Macintosh client acts like a Microsoft Schedule+ client, so Outlook for Windows and Outlook for Macintosh clients can't share calendar information. If a Mac user complains that he can't share calendar information with a Windows Outlook user, you know the problem isn't with the user's machine or with your server. If two Windows Outlook users have the same problem, though, you need to search for a different cause.

Is the problem reproducible? Can the user make the problem happen at will? Does the problem happen on different machines, or only on one machine? A reproducible fault is a troubleshooter's joy; a problem that you can intentionally duplicate is much easier to find and fix. For example, suppose one of your users can't log on to her mailbox from her desktop. If she can log on from another machine, the problem is most likely with the profile on her desktop machine, not with her mailbox or the server.

Does the problem affect more than one user? Sometimes the answer to this question is elusive. In large networks, you commonly find that many people experience a problem before anyone ever reports it—many users simply assume that someone else has already notified the support staff, sigh in resignation, and go about their work. Of course, you shouldn't expect end users to know whether the problem is localized or widespread; that's why you get the big bucks.

Begin at the Beginning
After you gather the interview information, you're ready to pinpoint the problem. Over time, you'll develop the ability to form a problem guesstimate using only this information. Until then, ask these follow-up questions.

Is the server up? As an administrator, you'll probably already know the answer to this question. But if you and your mail servers are in different locations and your user tells you that the problem isn't localized to one machine, you should make sure the server is functioning properly.

Does the client machine have basic network connectivity to the server? If the problem is confined to one computer, double-check that machine's physical connection to the network. To test the connection, you can use the Ping and Tracert utilities with the server's IP address. If these tests reveal a problem, you'll need to correct it before you can get mail flowing again. (For information about Ping and Tracert, see Michael Otey, Top 10, "Network Diagnostic Commands," April 1999.)
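
If you check connectivity often, you might prefer to script the Ping test. The following Python sketch is one way to do so; it isn't part of any Exchange Server tooling. The function names build_ping_command and can_ping are hypothetical, and the sketch assumes that a ping executable is on the path, that Windows uses -n for the echo count, and that most other systems use -c.

```python
import platform
import subprocess

def build_ping_command(address, count=4, system=None):
    """Build a ping command line for the given platform (assumed flags: -n on Windows, -c elsewhere)."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), address]

def can_ping(address, count=4, timeout=30):
    """Return True if the ping command exits with status 0, False otherwise."""
    try:
        result = subprocess.run(
            build_ping_command(address, count),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ping is missing, could not be started, or took too long
        return False
    return result.returncode == 0
```

Call can_ping with the server's IP address, not its name, so that this test stays separate from name resolution. Treat a True result as a hint only: on some systems, ping can exit with status 0 even when the replies report that the destination is unreachable, so read the command's output yourself when the result surprises you.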

Does name resolution work? If IP packets can reach the server, ping the server by name to see which network address the name resolves to. I recently needed to fix an Exchange 2000 Server setup that was suffering seemingly random client failures. The cause: The Exchange 2000 machine was running RRAS, and the "Register this connection's address in DNS" setting was enabled on the RRAS connection. Every time the RRAS connection came up, the ISDN line's IP address was registered with the DNS server, which then gave that address to clients when they performed DNS queries. Exchange Server, however, wasn't listening on that IP address. A quick name resolution check let me identify the problem and fix it within 10 minutes.
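
A name resolution check like this one is easy to script. The following Python sketch lists every IPv4 address that a server name resolves to and flags any address that you don't expect the server to answer on. The function names and the sample addresses are hypothetical, and the sketch covers only the check itself, not the DNS cleanup that follows.

```python
import socket

def resolve_all(hostname):
    """Return the sorted IPv4 addresses that hostname resolves to, or [] on failure."""
    try:
        _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    except OSError:
        # covers socket.gaierror and socket.herror (lookup failures)
        return []
    return sorted(set(addresses))

def unexpected_addresses(resolved, expected):
    """Return the resolved addresses that are not in the list you expect."""
    expected = set(expected)
    return [address for address in resolved if address not in expected]

# Example with made-up addresses: the second address stands in for a
# dial-up interface that registered itself in DNS.
print(unexpected_addresses(["10.0.0.5", "192.168.50.1"], ["10.0.0.5"]))
```

In practice, you would pass the output of resolve_all for your server's name as the first argument. An empty result means that every address a client might receive is one you know about; anything else deserves a closer look at the DNS records.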

Are the necessary protocols available on the network? If you're trying, for example, to figure out why an Outlook Express user can't send email to an Exchange Server machine, ensure that port 25 (i.e., the SMTP port) is available from one end to the other. To accomplish this task, you can use the Telnet command, which lets you connect to a particular port. If you're using a Messaging API (MAPI) client, the Rpcping tool can be useful, although you must run Rpcping on both the client and server.
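
If Telnet isn't installed on the client, a few lines of Python can perform roughly the same port test. The sketch below only attempts a TCP connection and reports whether the connection succeeded; it doesn't speak SMTP. The function name port_is_reachable and its default values are assumptions made for illustration.

```python
import socket

def port_is_reachable(host, port=25, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, timed out, unreachable, or the name did not resolve
        return False
```

Run the check from the client machine against the server's name or IP address. A False result tells you only that the connection failed, not why, so a firewall, a stopped SMTP service, and a routing problem all look the same at this stage.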

6 Steps to Diagnostic Success
After you conduct your initial interview and ask your follow-up questions, you're ready to begin troubleshooting. In its Official Curriculum courses for Windows NT, Microsoft teaches a six-step troubleshooting method that provides a repeatable, easy-to-understand procedure for attacking any problem. If you analyze skilled troubleshooters' actions, you might be surprised by how closely their innate problem-solving process mirrors these steps (or vice versa). Many fields, including medicine and engineering, use variants of this process.

Step 1: Identify what's wrong. Correctly identifying the problem is a must: If you don't know what's wrong, you have little hope of fixing the problem. The interview will help you complete this step.

Step 2: Find the problem boundaries. Knowing how far the problem extends can help lead you to the right solution in minimum time. The interview questions help with this step, too. On closer examination, you might find that the problem resembles a problem you've solved in the past. You might also want to research the problem's symptoms on TechNet or your favorite knowledge base. You might find that your problem looks just like a documented problem in a Microsoft article. (Of course, quitting and relaunching Outlook—or rebooting the client PC—to see whether that resolves the problem is always worth a try.)

Step 3: Single out potential solutions. List the possible root causes of the problem that you identified in Steps 1 and 2. Also list at least one diagnostic test (i.e., a test to identify whether that cause is truly responsible) and one potential solution (i.e., an action that you think will fix the problem) for each cause. Prioritize this list according to the most likely root causes. This step's goal is simply to identify potential solutions, not to start testing those solutions.
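
If you like to keep such a list in a script instead of on paper, the following Python sketch shows one possible shape for it. The Candidate class, the 1-to-5 likelihood score, and the sample entries are all hypothetical; the score is simply your own estimate, and nothing in Microsoft's method prescribes it.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    cause: str        # suspected root cause
    diagnostic: str   # test that confirms or rules out the cause
    fix: str          # action you think will resolve it
    likelihood: int   # your own estimate, 1 (unlikely) to 5 (very likely)

def prioritize(candidates):
    """Return the candidates with the most likely causes first; ties keep their order."""
    return sorted(candidates, key=lambda c: c.likelihood, reverse=True)

candidates = [
    Candidate("SMTP port blocked between client and server",
              "Telnet to port 25 on the server",
              "Open port 25 on the device that blocks it", 2),
    Candidate("Damaged mail profile on the desktop",
              "Log on to the mailbox from another machine",
              "Re-create the profile", 4),
    Candidate("Server name resolves to the wrong address",
              "Ping the server by name",
              "Correct the DNS record", 3),
]

for rank, c in enumerate(prioritize(candidates), start=1):
    print(f"{rank}. {c.cause} | test: {c.diagnostic} | fix: {c.fix}")
```

Pairing each cause with its test and its fix before you touch anything keeps Step 3 separate from Step 4, and the finished list makes a good first entry in your troubleshooting log.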

Step 4: Start testing solutions. Now that you have a prioritized list of possible causes, start trying the potential solutions to those causes. But be careful; as doctors say, "First, do no harm." Before you begin this step, make sure you have good backups of the machine you're working on.

Step 5: Determine whether the problem is solved. A fine distinction exists between fixing a problem and working around it or disguising its symptoms. Ideally, you want to fix the problem permanently, but sometimes you can't—you might not know how, or (more likely) you might not have the time you need to solve the problem as opposed to temporarily sweeping it under the rug. Now is the time to decide whether the problem was a one-time-only occurrence, whether it could happen again, and what you can do to keep it from recurring.

Step 6: Keep a diary. Keep a log that tells you what the original problem seemed to be, what it really was, and how you fixed it. The diary data forms an invaluable record for your future reference (see Step 2), as well as for anyone who might inherit the messaging system that you now maintain. However, don't keep this information on your Exchange Server machine, or you might not be able to access the data when you really need it. As part of this diary, produce a list of common problems that your users can fix without your assistance, along with instructions for solving these problems. For some examples of common culprits, see the sidebar "The Usual Suspects."

Practice Makes Perfect
The best way to become a good troubleshooter is to practice. Spending time reading through Exchange Server newsgroups (e.g., news:microsoft.public.exchange.*) and various Exchange Server-oriented mailing lists can also be productive and fun. These forums give you a chance to see the kinds of problems other administrators face and whether you can figure out the solutions. And sharing your skills with someone else is a terrific way to improve those skills and to benefit the Exchange Server community.