Dcdiag rescued me from a frustrating replication problem

The problem-frustration-solution-elation cycle is a way of life at the Windows 2000 Magazine Lab. However, some cycles' problem and frustration phases are more persistent than others'. Such was the case with a problem that recently crippled our network.

The Lab uses a mixed-mode Win2K Active Directory (AD) domain with two domain controllers (DCs) and more than 130 computers for benchmark testing. As is the case with Win2K domains, one DC acts as a PDC emulator. During preparations for a test, I rebooted about 50 Windows NT Workstation 4.0 computers and discovered that the DCs denied domain access to some of these computers. Each of those workstations displayed a message that the domain account or password was invalid. I found corresponding failed-authentication messages in the PDC emulator's System log.

AD problem resolution isn't my strength, but I knew I needed to start somewhere and soon; several Lab projects were on hold. I used the Microsoft Windows NT Server 4.0 Resource Kit's Netdom tool to reset the domain accounts. I could then log on from the computers—but the logon problem resurfaced a short time later. After deleting and recreating the computer accounts in the domain failed to fix the problem, I checked both DCs' Directory Services logs and discovered warning messages. On the PDC emulator, event ID 1308 indicated failed replication with the second DC. The second DC's log displayed event ID 1586, which denotes unsuccessful checkpoints with the PDC emulator.

At that point, I called Microsoft Product Support Services (PSS). The support technician advised me to run two Win2K support tools: Netdiag and Dcdiag. (You'll find these tools in the \Support\Tools folder on the Win2K Server CD-ROM.) Netdiag.exe runs a variety of domain connectivity and authentication tests; it showed no errors on either DC.

Dcdiag.exe ran a variety of DC diagnostics. When I ran dcdiag.exe on the PDC emulator, the tool reported a message similar to event ID 1308: failed replication with the second DC. However, when I ran dcdiag.exe on the second DC, the tool's NCSecDesc test gave me the information I needed to fix the problem: The Enterprise Domain Controllers group lacked three rights that replication requires. The tool listed both the rights and the naming contexts (NCs) in which I needed to set them.
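If you need to run the same check against several DCs, a small script can collect the pass/fail lines for you. The following Python sketch is illustrative only and is not what I used in the Lab. It assumes that dcdiag.exe is on the path, that the /s: and /test: switches select the target DC and a single test, and that each result line has the shape "......... SERVER passed test TestName" (or "failed test"). The function names and the server name DC2 are hypothetical.

```python
import re
import subprocess

# Assumed shape of a dcdiag result line (an assumption, not taken from the article):
#   "......................... DC2 failed test NCSecDesc"
RESULT_LINE = re.compile(r"^\s*\.+\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)\s*$")


def build_dcdiag_command(server, test=None):
    """Build the dcdiag command line for one DC, optionally limited to one test."""
    command = ["dcdiag", f"/s:{server}"]
    if test:
        command.append(f"/test:{test}")
    return command


def parse_dcdiag_results(output):
    """Map (target, test name) to True (passed) or False (failed)."""
    results = {}
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            target, verdict, test = match.groups()
            results[(target, test)] = verdict == "passed"
    return results


def run_dcdiag(server, test=None):
    """Run dcdiag against a DC and return the parsed results (Windows only)."""
    completed = subprocess.run(
        build_dcdiag_command(server, test),
        capture_output=True,
        text=True,
        check=False,  # inspect the parsed lines rather than the exit code
    )
    return parse_dcdiag_results(completed.stdout)
```

On a DC or an administrative workstation, calling run_dcdiag("DC2", "NCSecDesc") would return a dictionary that you can scan for False values. The parser only summarizes pass/fail status; you still need to read the full Dcdiag output to see which rights are missing and in which NCs.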

To track down the missing rights, I opened the Microsoft Management Console (MMC) Active Directory Users and Computers snap-in on the second DC and selected View, Advanced Features from the menu bar. I then opened the appropriate NC's Properties dialog box and clicked the Security tab. I changed the Enterprise Domain Controllers group's rights on the second DC to match those on the PDC emulator. To verify that these changes fixed the replication problem, I opened the MMC Active Directory Sites and Services console on each DC and used each DC's Connection object's Replicate Now function. However, the replication fix didn't alleviate the original symptom of invalid computer accounts. (Machine account passwords had changed, but because of the replication problems, only one DC had the working password.) Before I could experience the elation of a problem solved, I needed to use Netdom once more to reset the accounts.

Education typically emerges somewhere in the middle of the problem-frustration-solution-elation cycle. Although I never discovered the cause of the missing rights, I did learn about an effective tool that I'll be able to use in future problem cycles. Kudos to the team that designed Dcdiag.

Give Us Your Feedback: The Lab strives to cover products that address your most common problems. Are we hitting the mark? What categories of products do you want us to cover? What kinds of information do you need to make purchasing decisions?