Eliminate duplicate accounts

The migration from one Exchange Server 5.5 server to Exchange 2000 Server isn't particularly difficult. However, the migration from a large Exchange Server 5.5 environment, with many servers and users, to Exchange 2000 is fraught with challenges. Chief among those challenges is merging directory information from the Windows NT 4.0 SAM database and the Exchange 5.5 Directory Service (DS) to Active Directory (AD).

At some stage of this migration, you'll likely find two entries in AD for one mailbox or account—not a good situation. However, if you end up in this jam, you can turn to the Active Directory Account Cleanup Wizard, or the AD Cleanup utility, which you install by default when you install Exchange 2000. You can run the wizard in Windows 2000 from Start, Administrative Tools, Exchange Program Group.

More Isn't Necessarily Better
Many circumstances can cause duplicate objects to exist in AD. Migrating from Exchange Server 5.5 is the most common cause, and the reasons for duplication are fairly obvious. Let's look at an example.

Assume that you've used the Active Directory Connector (ADC) to create disabled user objects in AD that correspond to your Exchange Server 5.5 mailboxes. You would use the ADC in this way so that an Exchange Server 5.5 mailbox object is represented in AD and accordingly exists in the Global Address List (GAL) for Exchange 2000. This process is classic directory synchronization between two mail environments.

The next step in most environments is to migrate NT 4.0 accounts to Win2K accounts. If you're upgrading an NT 4.0 domain into an already existing Win2K forest, you can't match an ADC-created disabled user object with the NT 4.0 account that you're migrating because the domain upgrade won't recognize the ADC-created object. Similarly, just creating a new account for a user won't match against the ADC-created object either. Some NT 4.0 account migration tools (e.g., FastLane Technologies' DM/Suite, Mission Critical Software's OnePoint Product Suite) can't detect an ADC-created object in AD and, as in the two situations I just described, can't match the NT 4.0 migrated account with the ADC-created object. In such a case, you end up with two user objects in AD: a disabled user object that the ADC creates and a live user object that your migration tool creates. (I describe this process in more detail later.)

Simply deleting one of the objects isn't an option because both objects hold important information. The ADC-created disabled user object contains information about the user from the Exchange Server 5.5 DS (e.g., job title, office location, telephone number, email addresses). Figure 1, page 2, shows a relevant example for the user Richard Bijaoui. Similarly, the user object that the migration tool creates, which Figure 2 shows, doesn't show those attributes but contains important information about the user's Win2K group membership, which Figure 3 shows, and other security attributes, such as the sIDHistory. Although your migration goal is to merge directory information, all you've done here is partition the information for one user into two distinct objects.

When Two Become One
Because two representations of a single user are of little use, you need to merge information from both these objects into one object. For this task, you use the Active Directory Account Cleanup Wizard.

In the example, the migration created the disabled user object in an organizational unit (OU) called Temporary Migrated Users, which Figure 4 shows; the enabled user account has migrated from NT 4.0 into the Users OU in AD, which Figure 5 shows. Accordingly, when you start the AD Cleanup utility, limit the search scope to a well-defined set of containers—in this case, Temporary Migrated Users and Users, as Figure 6 shows. This approach is sensible because it reduces the likelihood of the wizard's selecting inappropriate accounts for merging.

Performance is a factor here, too. Rough performance testing on a dual-Pentium III processor with standard disks and controllers shows that it takes about 40 minutes to search through a forest with 10,000 user objects, so limiting your search to particular areas of the forest provides speedier results. This phase of AD Cleanup only identifies accounts the wizard thinks should be merged; it doesn't proceed to merge the account data.
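To get a feel for what a narrower search scope buys you, here is a back-of-the-envelope estimate based on the rough test figure above. It assumes that search time scales linearly with the number of user objects in scope, which the test doesn't establish, so treat the result as a planning guess rather than a measurement. The function name is invented for illustration.

```python
# Rough figure quoted in the text: about 40 minutes for 10,000 user objects.
MEASURED_MINUTES = 40
MEASURED_OBJECTS = 10_000


def estimated_search_minutes(object_count):
    """Estimate search time, ASSUMING time grows linearly with object count."""
    return object_count * MEASURED_MINUTES / MEASURED_OBJECTS


# Example: two containers that together hold 2,500 user objects.
scoped_estimate = estimated_search_minutes(2_500)
print(f"Estimated search time: {scoped_estimate:.0f} minutes")
```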

The wizard uses different matching rules as it attempts to match different types of objects. To match a disabled user object against an enabled user object—which is the case here—the wizard tries to match the msExchMasterAccountSID attribute of the disabled user in AD against either the objectSID or sIDHistory of the enabled user, also in AD.

Let's see why the wizard tries this match. As Figure 7 shows, when the ADC creates a disabled user object, Win2K assigns the disabled object a new SID because the object is a new security principal (i.e., it can have access controls associated with it) in the Win2K domain. Exchange Server 5.5 associates the existing Exchange Server 5.5 mailbox with an NT 4.0 account (which itself has a SID) and identifies the NT 4.0 account through the Exchange Server 5.5 DS attribute Assoc-NT-Account (the primary NT account). To preserve the linkage back to this NT 4.0 account, the ADC populates the disabled user object's msExchMasterAccountSID attribute with the value of the Assoc-NT-Account attribute.

Subsequently, when you migrate the NT 4.0 account into Win2K, the new enabled user object receives a new SID (because the object is a new security principal), but the SID of the NT 4.0 account is placed in the sIDHistory attribute. So now you have a means to link both the disabled and enabled user objects: The msExchMasterAccountSID and sIDHistory attributes share the same value.
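The matching rule just described can be modeled in a few lines. The sketch below is only an illustration of the rule, not the wizard's code: plain dictionaries stand in for AD objects, the SID strings are made up, and the function name is hypothetical.

```python
# Illustrative model only: dicts stand in for AD user objects.

def find_sid_merge_candidates(disabled_users, enabled_users):
    """Pair each disabled user with every enabled user whose objectSID or
    sIDHistory contains the disabled user's msExchMasterAccountSID."""
    candidates = []
    for disabled in disabled_users:
        master_sid = disabled.get("msExchMasterAccountSID")
        if master_sid is None:
            continue  # nothing to match on
        for enabled in enabled_users:
            known_sids = {enabled["objectSID"], *enabled.get("sIDHistory", [])}
            if master_sid in known_sids:
                candidates.append((disabled["cn"], enabled["cn"]))
    return candidates


# Hypothetical SID of the original NT 4.0 account (relative ID 1234).
NT4_SID = "S-1-5-21-1111-2222-3333-1234"

adc_created = [{
    "cn": "Richard Bijaoui (ADC-created)",
    "objectSID": "S-1-5-21-4444-5555-6666-2001",  # new Win2K SID
    "msExchMasterAccountSID": NT4_SID,            # taken from Assoc-NT-Account
}]

migrated = [
    {"cn": "Richard Bijaoui",
     "objectSID": "S-1-5-21-4444-5555-6666-2002",  # new Win2K SID
     "sIDHistory": [NT4_SID]},                     # NT 4.0 SID preserved
    {"cn": "Some Other User",
     "objectSID": "S-1-5-21-4444-5555-6666-2003",
     "sIDHistory": []},                            # unrelated account
]

pairs = find_sid_merge_candidates(adc_created, migrated)
print(pairs)
```

Only the pair that shares the NT 4.0 SID is reported. An enabled account with an empty sIDHistory and an unrelated objectSID never shows up as a candidate, no matter how similar its name is.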

In this example, the wizard selects both accounts for user Richard Bijaoui and displays them before asking you to confirm (you need to confirm twice) that you want to merge them. Merging accounts is a one-way trip: When you've merged objects, you can't unmerge them. Therefore, you need to be certain that these accounts are valid merge candidates. For any merge pair, you can look at some key attributes of the objects, which Figure 8 shows, before you commit to merging them. Look for giveaway attributes such as matching telephone numbers, office location, or employee badge number. If these attributes are identical, the two objects are likely duplicates that represent the same person. If you have any doubt, get in touch with the individuals before you perform any merges. When the merge is completed, the attributes from the disabled user object are merged into the enabled user object and only one object remains in AD.
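If you review many merge pairs, you could automate the giveaway check in a script of your own. The following is a minimal sketch, assuming you have already exported both objects into dictionaries. The attribute names in the tuple are my choice for illustration, the sample values are invented, and the comparison (case-insensitive, ignoring surrounding spaces) is an assumption rather than anything the wizard does.

```python
# Attributes to compare; this list is an assumption chosen for illustration.
GIVEAWAY_ATTRIBUTES = ("telephoneNumber", "physicalDeliveryOfficeName", "employeeID")


def shared_giveaways(obj_a, obj_b, attributes=GIVEAWAY_ATTRIBUTES):
    """Return the attributes that are set on both objects with equal values."""
    matches = []
    for name in attributes:
        a, b = obj_a.get(name), obj_b.get(name)
        if a and b and a.strip().lower() == b.strip().lower():
            matches.append(name)
    return matches


# Hypothetical exported data for one merge pair.
disabled_user = {"telephoneNumber": "+33 1 55 55 01 00",
                 "physicalDeliveryOfficeName": "Paris",
                 "employeeID": "E1001"}
enabled_user = {"telephoneNumber": "+33 1 55 55 01 00",
                "physicalDeliveryOfficeName": "paris "}

evidence = shared_giveaways(disabled_user, enabled_user)
print(evidence)
```

An empty result doesn't prove that the objects belong to different people; it only means the script found nothing to support the merge, so a manual check is still in order.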

Other Merge Operations
In addition to merging disabled user objects into enabled user objects, you can use the Active Directory Account Cleanup Wizard to merge a contact into an active user object, provided that only one of those objects is mail-enabled (i.e., has an Exchange mail alias so that you can send mail to it). However, matching that involves contacts isn't an exact science. Matching disabled user objects and enabled user objects is more precise because the matching is based on SIDs. Matching based on numbers is more likely to be accurate than matching based on names (e.g., common names—CNs—and usernames). As you can see in Figure 7, the match is defined on the value 1234.

Contacts don't have SID values or SID histories because they're not security principals. If you want the AD Cleanup utility to merge a contact with a user object, the wizard searches by using the CN or display name (displayName) attributes.

For example, let's say that a contact has the displayName Steve Balladelli. The wizard matches this name against an enabled user object and judges it to be a merge candidate if either the user object's displayName or CN matches Steve Balladelli.

In addition to matching on names, AD Cleanup can also match the Exchange Server 5.5 mailbox alias (mailNickname) against the NT 4.0 logon ID (samAccountName). An example of this kind of matching is an Exchange Server 5.5 mailbox that has an alias name with the value SteveB. If the ADC creates a contact in AD based on the Exchange Server 5.5 mailbox, the ADC synchronizes the alias value into the contact's mailNickname attribute. AD Cleanup matches this attribute value against the samAccountName (i.e., SteveB) of the active user object.

The flexibility to match on CN terms, mailbox aliases, and account names is a great benefit for contact-based merge operations. Often, naming information isn't consistent between Exchange Server 5.5 and NT 4.0. In the last example, if you had named the Exchange mailbox Steve Balladelli and the NT 4.0 account Steven Balladelli, a match based on CN and displayName operations wouldn't occur. Using a match based on alias and logon ID provides a useful alternative mechanism when naming standards are inconsistent.
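The contact-matching rules described above can be sketched as follows. This models the rules as the article states them and isn't the wizard's implementation: dictionaries stand in for AD objects, the function and helper names are hypothetical, and the case-insensitive comparison is my assumption.

```python
def _norm(value):
    """Normalize a name for comparison (case-insensitive is an assumption)."""
    return value.strip().lower() if value else None


def contact_match_reasons(contact, user):
    """Return which rules make this contact/user pair a merge candidate:
    'name'  - the contact's displayName or CN equals the user's displayName or CN
    'alias' - the contact's mailNickname equals the user's samAccountName
    """
    reasons = []
    contact_names = {_norm(contact.get("displayName")), _norm(contact.get("cn"))} - {None}
    user_names = {_norm(user.get("displayName")), _norm(user.get("cn"))} - {None}
    if contact_names & user_names:
        reasons.append("name")
    alias, logon = _norm(contact.get("mailNickname")), _norm(user.get("samAccountName"))
    if alias and alias == logon:
        reasons.append("alias")
    return reasons


contact = {"cn": "Steve Balladelli", "displayName": "Steve Balladelli",
           "mailNickname": "SteveB"}

# Names differ (Steve vs. Steven), but the alias equals the logon ID.
steven = {"cn": "Steven Balladelli", "displayName": "Steven Balladelli",
          "samAccountName": "SteveB"}
# Names agree, but the logon ID follows a different convention.
steve = {"cn": "Steve Balladelli", "displayName": "Steve Balladelli",
         "samAccountName": "balladelli"}

print(contact_match_reasons(contact, steven))
print(contact_match_reasons(contact, steve))
```

Either rule alone is enough to flag a candidate, which is why the alias rule rescues the Steve/Steven case.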

Many environments don't have consistent Exchange Server 5.5 alias names and logon account names. Some companies have NT 4.0 account names based on a user's last name (e.g., gates), but the standard Exchange Server 5.5 alias generation uses the first name and first initial from the last name (e.g., BillG). Matching the alias against the logon ID wouldn't work in those settings. The Exchange 2000 release to manufacturing (RTM) ADC lets you create only disabled user or enabled user objects, not contacts; therefore, the problem of matching based on names goes away, for the most part. The sidebar "What the ADC Creates in Active Directory" describes how the latest ADC version works.

Manual Merges
Occasionally, you might need to force a merge operation. Consider an example in which you create a new Win2K account for the user Paul Laahs. Because this account is new and not migrated, it has no SID history associated with it. If the ADC then creates a user object for Paul Laahs's Exchange Server 5.5 mailbox, the ADC can't perform matching on sIDHistory, and thus the ADC creates a duplicate account. Likewise, when AD Cleanup runs, it doesn't detect a match: both duplicates are user objects, and the utility matches user objects by using only SIDs, not names.

This scenario is rare in most settings, but when it occurs, you must manually select the two duplicate objects by clicking Add on the Review Merging Accounts dialog box, which appears after AD Cleanup has finished searching through AD for duplicates. This AD Cleanup feature is powerful, but remember that merging objects is a one-way operation that you can't undo. Make sure the objects represent the same person before you merge them.

Command-Line Operation
In addition to its GUI, AD Cleanup has a useful command-line interface. Table 1 shows the AD Cleanup command-line options. You can obtain full descriptions of these qualifiers in the online Help within the Exchange 2000 System Manager tool.

Using the command line lets you script AD Cleanup operations so that they can run unattended. For example, you might have a migration process that runs at appropriate times to bring NT 4.0 accounts over to Win2K. If you script this task, you can execute a script that performs AD Cleanup operations immediately after the migration to ensure AD integrity.

Running the command-line version of AD Cleanup explicitly splits a merge operation into two phases. The first phase performs a search on AD for duplicate objects and creates a file that contains merge candidates. In the second phase, you explicitly run the AD Cleanup command again to process the merge candidate data file and merge the objects. This two-phase process gives you a chance to review the candidates before anything is merged, and it lets a script generate a report of potential duplicate objects. This report is useful for any organization in the midst of a migration project. As a good management practice, you might consider running such a script every night and analyzing the results the next morning.
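To make the two-phase idea concrete, here is a small self-contained model of it. It doesn't call the AD Cleanup command and doesn't use its real file format: the in-memory directory, the JSON candidate file, the function names, and the sample attribute values are all hypothetical. I also assume that values already present on the enabled object win over values from the disabled object, which the tool may or may not do.

```python
import json
import os
import tempfile


def search_phase(directory, candidates_path):
    """Phase 1: find SID-based merge candidates and write them to a file.
    Nothing in `directory` is modified."""
    pairs = []
    for key, obj in directory.items():
        master = obj.get("msExchMasterAccountSID")
        if obj.get("enabled") or master is None:
            continue
        for other_key, other in directory.items():
            if other.get("enabled") and (
                master == other.get("objectSID")
                or master in other.get("sIDHistory", [])
            ):
                pairs.append({"source": key, "target": other_key})
    with open(candidates_path, "w", encoding="utf-8") as handle:
        json.dump(pairs, handle, indent=2)
    return len(pairs)


def merge_phase(directory, candidates_path):
    """Phase 2: read the (reviewed) candidate file and merge each pair."""
    with open(candidates_path, encoding="utf-8") as handle:
        pairs = json.load(handle)
    merged = 0
    for pair in pairs:
        src, tgt = pair["source"], pair["target"]
        if src not in directory or tgt not in directory:
            continue  # an object changed since the search; skip the pair
        source = directory.pop(src)
        target = directory[tgt]
        for name, value in source.items():
            if name in ("enabled", "objectSID"):
                continue  # identity attributes stay with the target (assumption)
            target.setdefault(name, value)  # existing target values win (assumption)
        merged += 1
    return merged


# Hypothetical directory contents keyed by container/name.
directory = {
    "Temporary Migrated Users/Richard Bijaoui": {
        "enabled": False, "objectSID": "S-1-5-21-9-9-9-2001",
        "msExchMasterAccountSID": "S-1-5-21-1-1-1-1234",
        "title": "Consultant"},
    "Users/Richard Bijaoui": {
        "enabled": True, "objectSID": "S-1-5-21-9-9-9-2002",
        "sIDHistory": ["S-1-5-21-1-1-1-1234"]},
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "candidates.json")
    found = search_phase(directory, path)    # safe to run nightly as a report
    # ...an administrator would review or edit the candidate file here...
    merged = merge_phase(directory, path)    # run only after the review

print(found, merged, sorted(directory))
```

The point of the split is that the first function is read-only, so you can schedule it freely, and the destructive step consumes only what survived the review.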

When You Do and Don't Need to Use AD Cleanup
I've discussed some of the rules for automatic detection of merge candidates. These rules are based on SIDs and naming structures. You can infer two facts from this discussion: the importance of using good migration tools and the need for good naming standards.

Migrating all your NT 4.0 accounts to Win2K first usually minimizes the requirement for using AD Cleanup. All the major migration tools rely on the ClonePrincipal API from Microsoft. This API lets the SID History be populated into the migrated account and, subsequently, the ADC can match on it, thereby eliminating the potential for duplicate objects.

However, it's unlikely that you can wait for a complete migration of your NT 4.0 domains before you put the ADC in place; therefore, the likelihood of object duplication is real. Many migration tools are becoming more Exchange aware and are capable of matching the SID of an NT 4.0 account being migrated to Win2K against the msExchMasterAccountSID of an existing ADC-created object. Clearly, this capability reduces the need to run AD Cleanup. Similarly, some tools also are becoming good at matching on name terms. Therefore, having NT 4.0 account-naming data in line with naming data from Exchange Server 5.5 becomes important, too. Any effort spent on sanitizing your existing environment (e.g., reconciling names such as Rich and Richard or Steve and Steven) reduces headaches during migration.

In Perspective
You're likely to discover duplicate objects in AD during or after a migration to Exchange 2000. The Active Directory Account Cleanup Wizard provides an invaluable way to detect and merge these troublesome duplicates, preserve attribute and access control information, and ensure the integrity of groups and distribution lists (DLs).

Of course, AD Cleanup is not all-powerful. The utility can't merge objects between forests. Nor can it merge enabled objects in different domains within the same forest (you must move them into the same domain first). AD Cleanup can't merge two objects that both have Exchange mailboxes associated with them or two objects that are both mail-enabled.
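If you build your own pre-merge report, you can encode these limitations as a simple check. The sketch below is my interpretation of the restrictions listed above, not the utility's logic; the field names (forest, domain, enabled, has_mailbox, mail_enabled) and the sample values are invented for illustration, and I read the domain restriction as applying when both objects are enabled.

```python
def merge_blockers(a, b):
    """Return the reasons, if any, why two objects can't be merged."""
    reasons = []
    if a["forest"] != b["forest"]:
        reasons.append("objects are in different forests")
    elif a["enabled"] and b["enabled"] and a["domain"] != b["domain"]:
        reasons.append("enabled objects are in different domains")
    if a.get("has_mailbox") and b.get("has_mailbox"):
        reasons.append("both objects have mailboxes")
    if a.get("mail_enabled") and b.get("mail_enabled"):
        reasons.append("both objects are mail-enabled")
    return reasons


# Hypothetical pair: ADC-created disabled user and migrated enabled user.
disabled_user = {"forest": "corp.example", "domain": "emea", "enabled": False,
                 "has_mailbox": True, "mail_enabled": True}
enabled_user = {"forest": "corp.example", "domain": "emea", "enabled": True,
                "has_mailbox": False, "mail_enabled": False}
other_forest = dict(enabled_user, forest="partner.example")

print(merge_blockers(disabled_user, enabled_user))
print(merge_blockers(disabled_user, other_forest))
```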

Although AD Cleanup reduces many migration headaches, don't use its availability as an excuse for inconsistent Exchange Server 5.5 and NT 4.0 account data or careless migration practices. Take the time up front before any migration activity to clean up your existing sources of data to reduce the likelihood of duplicates or, in the worst case, to increase the likelihood of AD Cleanup finding a match.

By selecting the right migration tools for the job and carefully planning your move to Exchange 2000, you might get away with never having to use AD Cleanup at all.