The vital link between the Directory Store and AD

Chalking up more than 5 million new seats every quarter, Microsoft Exchange Server is on a roll. Today, more than 30 million people use Exchange Server for their email system. Exchange 2000 Server (formerly code-named Platinum) represents a fundamental shift in Microsoft messaging technology because it depends on Windows 2000's (Win2K's) Active Directory (AD), which usurps the role of Exchange Server 5.5's Directory Store. Users, contacts, and groups become mail-enabled AD objects that take the place of the Directory Store's mailboxes, custom recipients, and distribution lists (DLs). Whereas the Directory Store today holds information about servers and sites, AD holds all configuration data about Exchange servers, routing groups, and administrative groups.

An important technical fact to remember in your Exchange 2000 design phase is that Exchange Server places all its configuration data in AD's configuration naming context (NC). You can replicate this configuration data only inside a forest, so an Exchange 2000 organization can span only one Win2K forest. If you want to operate multiple Exchange 2000 organizations (each with its own set of servers), you'll need multiple Win2K forests. Today, some companies operate multiple organizations when no trust relationship exists between two user communities and their technical staffs. You can run two Exchange Server organizations within one Windows NT 4.0 domain, but this capability won't be possible in an Exchange 2000 environment.

Major upgrades require comprehensive planning. You can't deploy Exchange 2000 without a solid Win2K infrastructure—particularly AD—already in place. Some companies will deploy Exchange 2000 immediately after they stabilize Win2K. Other companies will be content to run Exchange Server 5.5 (with Service Pack 2—SP2—or later) on Win2K and assess Exchange 2000's performance in other environments. However, these companies need to remember that they can upgrade from Exchange Server 5.5 SP2 or later to Exchange 2000 only after they've upgraded their servers from NT to Win2K. No upgrade paths to Exchange 2000 exist for Exchange Server 5.0 or 4.0 servers in NT.

Your Exchange 2000 deployment requires a link between the Exchange Server 5.5 Directory Store and AD. This link, the Active Directory Connector (ADC), lets users attached to different versions of Exchange Server communicate with one another. The ADC is an important tool for any Exchange Server administrator because it will be necessary in all but green-field deployments (i.e., deployments of new servers with no migration requirement). To begin a successful implementation of the ADC, you need to understand the basics of Lightweight Directory Access Protocol (LDAP), the reason for synchronization, the importance of connection agreements (CAs), and the role of replication.

LDAP Provides the Base
LDAP is the Internet protocol for directory access. Exchange Server 5.0 was the first version of the messaging software to support LDAP. Initially, Exchange Server supported just the read-only mode, which was sufficient to let POP3 and Web browser clients consult the Directory Store to check email addresses. Exchange Server 5.5 enhanced LDAP support to allow read-write access through LDAP 3.0. Exchange Server 5.5 also introduced performance enhancements that let the server process large queries efficiently. These collective LDAP enhancements let Exchange Server 5.5 support bidirectional synchronization between the Directory Store and other LDAP-compliant directories, such as AD. Finally, Exchange Server 5.5 SP2 applied some schema extensions to the Directory Store that let the Directory Store hold AD attributes for objects and that can track whether the ADC has synchronized an object.

Win2K relies heavily on LDAP, which is the basic protocol clients use to query AD and retrieve information. LDAP is also the protocol that the ADC uses to read data from and write data to the Directory Store and AD. Directory synchronization requires read-write access; otherwise, you could never update a directory. Because Exchange Server 5.5 is the first version to support read-write LDAP access, you must run Exchange Server 5.5 before you can use the ADC. This requirement doesn't mean that you must immediately upgrade every Exchange Server 5.0 and 4.0 server to version 5.5. You need only one Exchange Server 5.5 machine in each site that you want to connect. This server is similar to the bridgehead servers that connect Exchange Server sites by means of directory replication connectors (DRCs). However, given that you can upgrade to Exchange 2000 only from Exchange Server 5.5, plan to move all your organization's Exchange servers to version 5.5 (ideally with the latest service pack) as soon as possible.

LDAP lets clients access the contents of a directory, but the DirSync control establishes a method to compare directory contents and decide what data requires synchronization. In the Internet-draft Microsoft LDAP Control for Directory Synchronization, Microsoft describes the DirSync control as a potential solution to the problem of synchronizing different directories according to a common standard. You can find the draft, which might become an Internet Engineering Task Force (IETF) Request for Comments (RFC), at http://www.ietf.org/internet-drafts/draft-armijo-ldap-dirsync-00.txt. Over the next few years, Microsoft will probably use DirSync extensively to synchronize AD and other directories, such as the Lotus Notes address book, Novell GroupWise, and Internet Yellow Pages.

Why Synchronize?
Clients depend on a Global Address List (GAL) to address and route email correctly. Exchange Server 5.5 derives the GAL from the Directory Store; the GAL contains entries for every object (e.g., mailbox, custom recipient, DL, public folder) that has an email address and that the administrator hasn't hidden. Every Exchange server holds a copy of the Directory Store that contains information from all the organization's servers. AD takes the place of the Directory Store when you install Exchange 2000, which derives the GAL from the complete copy of user information that resides on an AD Global Catalog (GC). Exchange 2000 stores configuration information in AD's configuration NC, and this data replicates across the forest so that all the organization's servers have access to it. In addition to server details, the configuration data includes such data as Exchange 2000 administration and routing topologies and connectors. The GC holds copies of every mail-enabled object in a Win2K forest, so Exchange 2000 uses the GC to create a GAL that clients can access. The Exchange 2000 System Attendant service automatically executes an LDAP query every 10 minutes to create the GAL.
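
The GAL rule described above (every object that has an email address and that the administrator hasn't hidden) can be sketched as a simple filter. The following Python fragment is only an illustration over in-memory records; it is not the LDAP query that the System Attendant issues, and the field names mail and hidden are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirectoryEntry:
    name: str
    kind: str                    # e.g. "mailbox", "custom recipient", "DL"
    mail: Optional[str] = None   # hypothetical field: email address, if any
    hidden: bool = False         # hypothetical field: hidden by the administrator


def build_gal(entries):
    """Return the entries that belong in the GAL, sorted by display name."""
    visible = [e for e in entries if e.mail and not e.hidden]
    return sorted(visible, key=lambda e: e.name.lower())


entries = [
    DirectoryEntry("Sales DL", "DL", "sales@example.com"),
    DirectoryEntry("Alice", "mailbox", "alice@example.com"),
    DirectoryEntry("Audit Mailbox", "mailbox", "audit@example.com", hidden=True),
    DirectoryEntry("Print Server", "computer"),  # no email address
]
gal_names = [e.name for e in build_gal(entries)]
```

Only the two visible, mail-enabled entries survive the filter; the hidden mailbox and the object without an address drop out.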

Migration from Exchange Server 5.5 to Exchange 2000 won't happen overnight. You need to provide users with a unified directory during the migration period. Because Exchange 2000 builds the GAL from a GC (which depends on AD), synchronization must occur between the Exchange Server 5.5 Directory Store and AD. This synchronization creates a complete picture of all the users in an Exchange Server organization during the migration period. The ADC is the simplest and best way to accomplish the synchronization.

The Active Directory Connector
Two versions of the ADC are available. Win2K provides one version (in the Windows 2000 Server—Win2K Server—CD-ROM's valueadd\adc directory), and Exchange 2000 provides an enhanced version (in the Exchange 2000 CD-ROM's \adc directory). The Win2K version supports only replication of objects from the Directory Store site NC (e.g., the contents of Recipients containers), whereas the Exchange 2000 version can also process configuration data. Both versions can provide a synchronized GAL that includes full details about Exchange Server 5.5 mailboxes, custom recipients, and DLs. However, you need the Exchange 2000 version if you have a mixed-mode Exchange Server organization (i.e., an organization that includes Exchange 2000 servers and earlier-version Exchange Server computers). To support mixed-mode organizations, the Exchange 2000 ADC version lets configuration data (e.g., site topology, information about servers) replicate between the Directory Store and AD. The Exchange 2000 ADC also supports downstream routing (i.e., the ability to use messaging connectors attached to Exchange Server 5.5, 5.0, or 4.0 servers).

You can start using the Win2K version of the ADC, then upgrade to the Exchange 2000 version after Exchange 2000 ships. This strategy lets you start populating AD with Exchange Server user data so that a GAL is ready for your first Exchange 2000 server (which will probably be a test server so that you can become accustomed to the new software). Later, when you're ready to begin your migration, you can upgrade the ADC to allow configuration data sharing. Deploying the ADC twice is hard work, so the best strategy is to deploy the Exchange 2000 version first.

As Screen 1 shows, the ADC process runs as a service, as do all the other services that constitute Exchange 2000. Screen 1 also shows some of the new services that Exchange 2000 introduces, such as the new SMTP-based routing service and the IMAP4 protocol stub.

Connection Agreements
Although the ADC controls synchronization, one or more CAs tell the ADC how synchronization needs to occur. To fine-tune synchronization and accommodate the needs of even the most complex Exchange Server deployments, one ADC can support many CAs. An ADC that must manage more CAs will consume more system resources and create a more complex overall synchronization environment.

According to Microsoft, even though no architectural restrictions exist for the number of CAs that one ADC can support, system resources become a practical constraint after you establish about 50 CAs. If you need to run more than 10 CAs, consider allocating a dedicated server for ADC operation. Best practice in this area will evolve over the next couple of years. Today, you need to approach ADC projects with the goal of restricting the number of CAs.

To host the ADC, distributed Exchange Server organizations with large numbers of sites might require multiple servers with CAs shared among the servers. Organizations that already operate central Exchange Server routing sites (typically around a hub-and-spoke network) will likely find that the central site is a good location for synchronization. Because an Exchange server can make changes only to objects that belong to its site, you need a separate CA for bidirectional synchronization between AD and each site. Following the same logic, if you want to apply updates from the Directory Store to Win2K domains, you need a separate CA for each domain. Companies that run large multisite Exchange Server organizations will therefore find that they require multiple CAs, even if they want to synchronize with only one Win2K domain.
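
To get a feel for how quickly the CA count grows, you can enumerate the site-and-domain pairs that need a connection. The sketch below assumes, as a simplification of the rules above, that each CA links exactly one Exchange Server site to exactly one Win2K domain; the site and domain names are invented, and the 10-CA threshold is the dedicated-server guideline quoted earlier.

```python
def plan_connection_agreements(site_domains):
    """Return one (site, domain) pair per CA.

    site_domains maps each Exchange Server site to the Win2K domains
    that hold the accounts for that site's mailboxes.
    """
    return sorted(
        (site, domain)
        for site, domains in site_domains.items()
        for domain in domains
    )


def needs_dedicated_server(ca_count, threshold=10):
    """Apply the rule of thumb: more than 10 CAs suggests a dedicated server."""
    return ca_count > threshold


site_domains = {
    "Dublin": {"emea.example.com"},
    "London": {"emea.example.com"},
    "Boston": {"na.example.com", "emea.example.com"},
}
plan = plan_connection_agreements(site_domains)
dedicated = needs_dedicated_server(len(plan))
```

Three sites and two domains already produce four CAs here, which shows why a large multisite organization needs several CAs even when it synchronizes with only one domain.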

The ADC supports full bidirectional connections—changes synchronize from AD to the Directory Store, and vice versa. However, you can decide to use a one-way CA. For example, if you want to populate AD based on the Directory Store and establish a test environment for Win2K, use a one-way connection from the Directory Store.

Each CA specifies several important parameters, including

  • The names of the Win2K and Exchange servers on each side of the connection, and the account names to use and the passwords of those accounts. You enter this information on the CA's Connections tab, which Screen 2 shows.
  • The LDAP port that the system uses to connect the two directories. For LDAP connections, Exchange Server 5.5 typically uses port 389. However, if the server is running Win2K, the OS takes port 389 for base LDAP connectivity. Therefore, you must allocate another port for Exchange Server's LDAP use (most people opt to use port 390). To choose a different port, select LDAP from the server's Protocols container and update the number.
  • The schedule by which the system establishes connections.
  • How you create new objects in AD for Exchange Server mailboxes. Your choices are new user objects, disabled user objects, or mail-enabled contacts.
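
The parameters in this list can be pictured as one record per CA. The Python sketch below is a planning aid only: the class and field names are hypothetical rather than the ADC's own property names, passwords are deliberately left out, and the port helper simply encodes the 389-versus-390 convention described above.

```python
from dataclasses import dataclass
from enum import Enum


class NewObjectPolicy(Enum):
    USER = "new user object"
    DISABLED_USER = "disabled user object"
    CONTACT = "mail-enabled contact"


def pick_exchange_ldap_port(exchange_runs_on_win2k, alternative_port=390):
    """Win2K claims port 389, so Exchange Server 5.5 needs another port there."""
    return alternative_port if exchange_runs_on_win2k else 389


@dataclass
class ConnectionAgreement:
    name: str
    win2k_server: str
    exchange_server: str
    win2k_account: str
    exchange_account: str
    exchange_ldap_port: int = 389
    schedule: str = "Always"   # or "Never" / "Selected Times"
    new_object_policy: NewObjectPolicy = NewObjectPolicy.DISABLED_USER


ca = ConnectionAgreement(
    name="Dublin to AD",
    win2k_server="dc1.example.com",
    exchange_server="exch55.example.com",
    win2k_account="EXAMPLE\\adc-service",
    exchange_account="EXAMPLE\\exch-service",
    exchange_ldap_port=pick_exchange_ldap_port(exchange_runs_on_win2k=True),
)
```

Writing the agreements down in this form before you create them makes it easier to spot two CAs that point at the same server and port by mistake.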

CAs process only mail-enabled objects that reside in the Exchange Server Directory Store. Outlook clients can also send email to contacts that reside in a folder in a private mailbox or the public store. Although Microsoft Outlook contacts and AD contacts have similar purposes, they're different. If you depend on Outlook contacts (e.g., to store details about external correspondents in a public folder), you'll need to manually apply these addresses in AD until public folder replication is in place between Exchange 2000 and Exchange Server 5.5.

Prime CAs
AD is a multimaster directory that lets any controller within a domain create and update objects. The Directory Store is also a multimaster directory, but a server can create and update objects only within its home site. (This limitation is one of the reasons that you need a separate CA per site.) Using primacy, the ADC supports different scopes for write access. A prime CA lets you create objects on either side of the connection, and changes will replicate back to the other directory. You can set CAs to prime for Exchange Server or Win2K. For example, if you mark a CA as prime in Exchange Server, you can update Exchange Server 5.5 objects in AD and the changes will replicate back to the Directory Store. The first CA that the ADC establishes is always a prime CA because you need at least one prime CA for the ADC to operate. A nonprime CA lets you update objects only if the objects already exist.

One CA will suffice if you require only one-way replication (e.g., from the Directory Store to AD). However, if you begin to deal with bidirectional updates involving several Exchange Server sites, multiple Win2K domains that hold user accounts, or object remapping across containers, then several CAs are necessary. If you need to run more than one CA, ensure that synchronization doesn't cause Exchange Server to create duplicate objects in each directory—a possibility that exists if multiple CAs are performing updates.

One-way synchronization is straightforward. Bidirectional synchronization is another matter. The advantage to bidirectional synchronization is that it allows changes to objects in either Win2K or Exchange Server 5.5 and replicates those changes back to the appropriate directory. The ability to use one utility, such as the Active Directory Users and Computers console, to manage users, contacts, and groups—even if the object belongs to Exchange Server 5.5—is attractive. However, the potential difficulties of managing synchronization across a distributed network offset the value of that ability. Some companies preparing for Exchange 2000 have decided that managing each set of objects with native administration tools is the simplest strategy. This method results in one-way CAs across the board.

Bidirectional synchronization works best when you have strong centralized management models; weaknesses appear when you have more distributed models. Setting up one-way CAs is easier if you have more than 10 sites, multiple Recipients containers that lead to complex CAs, and multiple servers running the ADC. With one-way synchronization, administrators must manage Exchange Server 5.5 objects through the Exchange Server administration program and Win2K objects through Active Directory Users and Computers. Although one-way synchronization might seem like a step backward, it's the way administrators have always managed the two sets of objects.

Mapping
Both the Directory Store's Recipients containers and AD's organizational units (OUs) are containers that store objects, and both serve as a subdivision of their respective directory. By default, each Exchange Server site has one Recipients container, but many sites require multiple Recipients containers. Some Recipients containers hold specific objects, such as custom recipients that Exchange Server synchronizes into the Directory Store from a foreign directory (e.g., Lotus Notes). Other Recipients containers split the directory into organizational departments (e.g., Marketing, Finance, Sales).

If you want to move objects between Exchange Server 5.5 containers, you need to delete and recreate the objects. However, Exchange Server has provided Address Book Views (ABVs) since version 5.0. ABVs generate dynamic views of the Directory Store according to preset criteria, so you can move objects simply by changing the values of one or more properties—a much easier process than the delete/create alternative. Because of dynamic view generation, creating containers for organizational reasons is no longer common. Exchange Server 4.0 doesn't support ABVs, so many early deployments used multiple Recipient containers.

The Directory Store uses a distinguished name (DN)—partially based on the container name—as a key to an object. AD also uses DNs, but objects use globally unique IDs (GUIDs) as the primary key in the directory. An object's DN can change as you move the object within AD (e.g., from one OU to another), but an object's GUID always remains the same. This GUID consistency simplifies the act of moving an object between OUs or from one server to another.
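
The difference between a key that embeds the container name and a key that doesn't is easy to show in a few lines. This Python sketch is a toy model, not AD's storage format; the class name and the example OU paths are invented.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class AdObject:
    common_name: str
    ou_path: str   # e.g. "OU=Sales,DC=example,DC=com"
    guid: uuid.UUID = field(default_factory=uuid.uuid4)

    @property
    def dn(self):
        # The DN is derived from the object's current location.
        return f"CN={self.common_name},{self.ou_path}"

    def move(self, new_ou_path):
        # Moving changes the location (and so the DN) but never the GUID.
        self.ou_path = new_ou_path


user = AdObject("Alice", "OU=Sales,DC=example,DC=com")
guid_before, dn_before = user.guid, user.dn
user.move("OU=Marketing,DC=example,DC=com")
```

After the move, anything that stored the old DN holds a stale reference, whereas anything that stored the GUID still finds the object.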

Screen 3 shows how to select the Exchange Server containers that you want to synchronize with AD. A CA can direct updates from multiple Exchange Server containers to one OU. However, if you want to direct objects from one container to a specific OU and objects from another container to a different OU, you need multiple CAs.

How Replication Occurs
Both the Directory Store and AD store a set of attributes for each object. Replication ensures that the same objects exist in both directories, and that the same values exist in the objects' attributes. However, because replication occurs on a scheduled basis, changes on one side might not flow immediately to the other directory. Attaining a state of loose consistency—in which two directories are 99 percent synchronized—is fairly easy, thanks to the nature of networked systems and the experience gained from using the Directory Store since 1996. Tighter consistency requires frequent synchronization across reliable network links.

Exchange Server administrators are accustomed to the mechanism of replication. Update sequence numbers (USNs), which change as you update objects, drive replication. Each time you make a change within a directory, the directory increments the server USN by 1 and stores the new value as an attribute of the object. Servers that synchronize together know their partners' USNs and can make requests for outstanding updates by asking for all changes that have occurred since a specific USN value. Suppose server A has received all updates up to USN 1755 from server B. Server A can simply request outstanding updates by asking server B to provide data for every change that has a USN value higher than 1755. Exchange Server uses this scheme (with small variations) to synchronize changes in the Directory Store and public folders; Win2K uses this scheme to synchronize AD between Win2K domain controllers. That the ADC uses the same mechanism isn't surprising.
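
The USN exchange between server A and server B can be modeled in a few lines of Python. This is a minimal sketch of the general scheme, not the code that Exchange Server or Win2K runs; the class and method names are invented, and the starting USN is chosen so that the numbers match the example above.

```python
class DirectoryServer:
    def __init__(self, name, starting_usn=0):
        self.name = name
        self.usn = starting_usn
        self.objects = {}   # object name -> (attributes, USN of last change)

    def update(self, obj_name, **attributes):
        """Apply a change: bump the server USN and stamp it on the object."""
        self.usn += 1
        current, _ = self.objects.get(obj_name, ({}, 0))
        self.objects[obj_name] = ({**current, **attributes}, self.usn)

    def changes_since(self, usn):
        """Return (usn, name, attributes) for every object changed after usn."""
        changed = [
            (stamp, name, attrs)
            for name, (attrs, stamp) in self.objects.items()
            if stamp > usn
        ]
        return sorted(changed, key=lambda change: change[0])


server_b = DirectoryServer("B", starting_usn=1753)
server_b.update("Alice", title="Engineer")   # USN 1754
server_b.update("Bob", title="Manager")      # USN 1755
server_b.update("Alice", title="Architect")  # USN 1756
server_b.update("Carol", title="Analyst")    # USN 1757

high_water_mark = 1755                       # server A has seen up to here
outstanding = server_b.changes_since(high_water_mark)
changed_names = [name for _, name, _ in outstanding]
new_high_water_mark = outstanding[-1][0] if outstanding else high_water_mark
```

Server A receives only Alice and Carol, then records 1757 as its new high-water mark for server B, so the next request starts from there.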

When replication occurs from the Directory Store to AD, the ADC executes an LDAP query to discover outstanding changes in the Directory Store. The ADC then examines each object that the Directory Store returns to retrieve values from the DN, NTGUID, and PrimaryNTAccount attributes. The NTGUID is a unique numeric value that serves as the key for the object within AD, and the PrimaryNTAccount attribute holds the SID for the account associated with the mailbox.

Because the NTGUID is the primary key, the ADC uses it first to locate the object. If this search fails, the ADC indicates that the object doesn't exist under this GUID but that the object might reside under a different GUID. The ADC then uses the object's DN to search against the LegacyExchangeDN attribute. The LegacyExchangeDN attribute holds the DN in the Directory Store, so a successful search will bring up objects that the ADC has synchronized into AD—perhaps through another CA. If the ADC still can't find the object, the ADC uses the object's SID (which resides in the Directory Store) to perform a final search through the AD SIDHistory. Failure at this stage means that the object doesn't exist in AD, so the ADC creates a brand-new object according to the CA policy (i.e., user, contact, or disabled user). However, if the ADC finds the object at any of these stages, the ADC updates the object by merging its attributes.
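
The three-stage search reads naturally as a cascade of lookups. The following Python sketch models it over in-memory lists; the class names, the sample DNs and SIDs, and the createdAs attribute are hypothetical. The merge step here simply lets Directory Store values overwrite AD values, which is an assumption, because the exact merge rules aren't spelled out above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DsObject:
    """A changed Directory Store object as the ADC sees it."""
    dn: str
    nt_guid: Optional[str] = None
    primary_nt_account_sid: Optional[str] = None
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class AdObject:
    guid: str
    legacy_exchange_dn: Optional[str] = None
    sid_history: List[str] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)


def find_in_ad(ds_obj, ad_objects):
    """Search by NTGUID, then LegacyExchangeDN, then SIDHistory."""
    if ds_obj.nt_guid is not None:
        for ad_obj in ad_objects:
            if ad_obj.guid == ds_obj.nt_guid:
                return ad_obj, "NTGUID"
    for ad_obj in ad_objects:
        if ad_obj.legacy_exchange_dn == ds_obj.dn:
            return ad_obj, "LegacyExchangeDN"
    if ds_obj.primary_nt_account_sid is not None:
        for ad_obj in ad_objects:
            if ds_obj.primary_nt_account_sid in ad_obj.sid_history:
                return ad_obj, "SIDHistory"
    return None, "not found"


def synchronize_to_ad(ds_obj, ad_objects, new_object_policy="disabled user"):
    """Update the matching AD object, or create one according to CA policy."""
    match, matched_by = find_in_ad(ds_obj, ad_objects)
    if match is None:
        created = AdObject(
            guid=str(uuid.uuid4()),
            legacy_exchange_dn=ds_obj.dn,
            attributes={**ds_obj.attributes, "createdAs": new_object_policy},
        )
        ad_objects.append(created)
        return created, "created"
    match.attributes.update(ds_obj.attributes)  # simplified merge (assumption)
    return match, matched_by


base = "/o=Org/ou=Dublin/cn=Recipients/cn="
ad = [
    AdObject(guid="guid-1", attributes={"title": "Engineer"}),
    AdObject(guid="guid-2", legacy_exchange_dn=base + "Bob"),
    AdObject(guid="guid-3", sid_history=["S-1-5-21-1000"]),
]
_, how_alice = synchronize_to_ad(
    DsObject(base + "Alice", nt_guid="guid-1", attributes={"phone": "555-0100"}), ad)
_, how_bob = synchronize_to_ad(DsObject(base + "Bob"), ad)
_, how_carol = synchronize_to_ad(
    DsObject(base + "Carol", primary_nt_account_sid="S-1-5-21-1000"), ad)
_, how_dave = synchronize_to_ad(DsObject(base + "Dave"), ad)
```

Each of the four sample objects exercises a different branch: a GUID match, a DN match, a SID match, and finally the creation of a new object.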

Whereas AD supports attribute-level replication, the Directory Store performs replication at the object level. If you change one attribute on a Directory Store object, the complete object replicates. Replication must occur at the lowest common denominator, so it occurs between AD and the Directory Store at the object level.

Synchronization is somewhat simpler when it occurs from AD to the Directory Store. AD can contain many types of objects, many of which the Directory Store doesn't support, so to ensure that synchronization occurs only with mail-enabled objects, the ADC applies a filter to its query to locate outstanding updates.

When the ADC synchronizes AD objects into the Directory Store, the Directory Store allocates each object a DN, which the object stores in its LegacyExchangeDN attribute. This DN value is the key into the Directory Store. For each outstanding update, the ADC attempts to locate a Directory Store object that has a matching NTGUID. If the ADC locates the object, the object has already been synchronized with AD, and the ADC performs an update by merging attributes. Otherwise, the ADC creates a new object.

AD builds groups, and the Directory Store builds DLs, from a set of backward pointers to other objects. Incomplete replication poses a problem for the ADC: The ADC can feasibly request to create a new group in AD (based on an Exchange Server DL) before all the group's members exist in AD through synchronization. In this situation, the set of backward pointers is incomplete, because you can't create a pointer to an object that doesn't exist. To cope, the ADC uses a special attribute of the group, UnmergedAtts, to keep track of latent members that the ADC finds in the Exchange Server DL when the ADC creates the group. Each time the ADC synchronizes the directories, the ADC searches AD for the latent members. If the ADC finds the latent members, it establishes the members' true membership. An ADC detail such as this proves that Microsoft has learned important lessons from 4 years of Directory Store replication.
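
The latent-member bookkeeping can be sketched as two sets per group: members that already resolve to AD objects, and members parked until they arrive. In the Python fragment below, the unmerged set merely stands in for the UnmergedAtts attribute; how the ADC actually encodes that attribute isn't covered here, and the names are invented.

```python
class Group:
    def __init__(self, name):
        self.name = name
        self.members = set()    # members that exist in AD
        self.unmerged = set()   # latent members (stand-in for UnmergedAtts)


def create_group_from_dl(name, dl_members, ad_objects):
    """Create the group now; park any member that AD doesn't hold yet."""
    group = Group(name)
    for member in dl_members:
        if member in ad_objects:
            group.members.add(member)
        else:
            group.unmerged.add(member)
    return group


def resolve_latent_members(group, ad_objects):
    """On each synchronization pass, promote latent members that now exist."""
    found = {member for member in group.unmerged if member in ad_objects}
    group.members |= found
    group.unmerged -= found
    return found


ad_objects = {"alice"}
group = create_group_from_dl("Sales", ["alice", "bob"], ad_objects)
members_at_creation = set(group.members)
latent_at_creation = set(group.unmerged)

ad_objects.add("bob")   # a later synchronization cycle brings Bob into AD
promoted = resolve_latent_members(group, ad_objects)
```

The group is usable as soon as it is created, and its membership converges on the DL's membership as the remaining objects arrive.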

Scheduling Synchronization
As I stated earlier, a replicated directory is typically in a state of loose consistency. Synchronization achieves a similar state of consistency between two directories; you determine the exact level of consistency by how often the contents of the directories undergo synchronization. The ADC synchronizes AD and the Directory Store according to the schedule you establish in the CA. Your options are Never, Always (i.e., every 15 minutes), and Selected Times (i.e., according to the schedule you set on the CA's Schedule tab, which Screen 4 shows). The CA schedule is identical to the schedules for intersite directory and public folder replication in earlier versions of Exchange Server.

When you determine a suitable schedule, you'll juggle the need to replicate changes quickly and the desire to dedicate a reasonable amount of system resources to the job. If you activate a CA every 10 minutes, the directories will stay in a higher state of consistency than if you activate the CA every 2 hours. However, each CA activation, which involves the two directories communicating with each other and processing updates, generates network traffic. A default schedule (e.g., every 15 minutes) will probably suffice if you're processing a limited number of Exchange Server sites and Recipients containers. However, if you have a more complex deployment that requires bidirectional replication from 20 or 30 sites and involves some degree of object remapping, you need to ensure that the CA activation schedule doesn't exhaust system resources. In a complex deployment, each CA will probably have a schedule that is linked to an overall schedule that doesn't interfere with the schedules of other CAs.
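
Two small calculations help when you weigh these trade-offs: how many activations an interval generates per day, and how to offset several CAs so that they don't all start at once. The Python sketch below is a planning aid under the assumption that every CA shares the same interval; the function names are invented, and the ADC itself doesn't provide this staggering.

```python
def activations_per_day(interval_minutes):
    """How many times a CA runs in 24 hours at a fixed interval."""
    return (24 * 60) // interval_minutes


def stagger_offsets(ca_names, interval_minutes):
    """Spread CA start times evenly across one interval (offsets in minutes)."""
    if not ca_names:
        return {}
    step = interval_minutes / len(ca_names)
    return {name: round(i * step) for i, name in enumerate(ca_names)}


runs_at_10_minutes = activations_per_day(10)
runs_at_2_hours = activations_per_day(120)
offsets = stagger_offsets(["Dublin", "London", "Boston"], 15)
```

Moving from a 2-hour interval to a 10-minute interval multiplies the activations by 12, which is the cost you pay for tighter consistency.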

The Start of the ADC Journey
We're in the early days of ADC design. Exchange Server experience shows that design and deployment techniques evolve quickly after Microsoft releases the technology and companies put the software into production. The ingredients for a successful ADC design include experience from other synchronization projects, knowledge of AD and the Directory Store, and a certain amount of guesswork.

The ADC is an important part of Exchange Server's journey into the Win2K era. Even if you plan to run Exchange Server 5.5 on Win2K and delay your Exchange 2000 upgrade, you still need the ADC to avoid operating separate directories. And when the time comes to migrate to Exchange 2000, the ADC will help synchronize the action. Much information about this subject will become available as systems administrators gain experience from early deployments. So keep your ear to the ground and make sure you're fully informed before you begin your implementation.