Naming Contexts
In "Active Directory in Windows 2000," Winter 1999/2000, I introduced the concept of AD naming contexts (I've also seen them referred to as partitions), which are different paths Win2K uses to replicate different types of information between domain controllers in a forest. For each domain in the forest, Win2K replicates the domain naming context to all the domain controllers within that domain.
Win2K replicates the schema naming context (i.e., the AD schema) and the configuration naming context (i.e., site and subnet configuration information and other replication metadata) to all domain controllers in the forest. In addition, the GC, which is a partial replica of all objects in a forest, replicates to domain controllers that you designate as GC servers. The schema and configuration naming contexts contain information that isn't likely to change often for most enterprises, so these two naming contexts don't affect your site topology as much as the domain naming context and GC do.
Microsoft supports two replication protocols in AD. Standard RPC, the more common protocol by far, supports replicating the three naming contexts and the GC and can compress data for intersite replication. Standard SMTP is only for intersite replication of the schema and configuration naming contexts and the GC. SMTP is useful for intersite connections that are slow, unreliable, or even unavailable for large parts of the day. However, SMTP replication sends at least twice as many bytes across your network as RPC replication does. You use the AD Sites and Services MMC snap-in to define site links and specify which protocol to use for a given site link.
Connection Objects
After you decide on a site topology, Win2K creates the replication connections for you. AD provides a service called the Knowledge Consistency Checker (KCC) that runs on all domain controllers and builds connection objects between all domain controllers in your forest. Connection objects handle replication traffic between domain controllers. The KCC builds intrasite connection objects such that no more than three replication hops exist between any two domain controllers. Figure 3, page 62, shows the AD Sites and Services snap-in window, with a connection object (in the right pane) that the KCC generated on SERVERA in the Branch site.
Connection objects are one-way paths; if you have two domain controllers, each has a separate connection object to the other. However, if you have large numbers of domain controllers replicating with one another, not all the domain controllers might have two one-way connections with every other domain controller. A server initiating an intrasite replication event notifies the server at the other end of its connection object that the initiating server has changes. The target server then pulls the changed data from the initiating server.
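To make the one-way nature of connection objects and the three-hop limit concrete, here is a minimal Python sketch that measures the worst-case hop count across a set of one-way connections. It is not the KCC's algorithm; the helper names (max_hops, two_way_ring) and the ring-shaped sample topology are assumptions made only for illustration.

```python
from collections import deque

def max_hops(connections):
    """Return the worst-case hop count between any two domain controllers.

    connections is an iterable of (source, destination) pairs, one pair per
    one-way connection. Returns None if some controller cannot be reached.
    """
    graph = {}
    for src, dst in connections:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    worst = 0
    for start in graph:
        # Breadth-first search gives the fewest hops from start to each peer.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        if len(dist) < len(graph):
            return None
        worst = max(worst, max(dist.values()))
    return worst

def two_way_ring(names):
    """Hypothetical sample topology: each controller linked both ways to its neighbor."""
    pairs = []
    for i, name in enumerate(names):
        nxt = names[(i + 1) % len(names)]
        pairs += [(name, nxt), (nxt, name)]
    return pairs

seven = max_hops(two_way_ring([f"DC{i}" for i in range(7)]))
eight = max_hops(two_way_ring([f"DC{i}" for i in range(8)]))
print(seven <= 3, eight <= 3)
```

In this sample, a plain two-way ring stays within three hops for seven controllers but not for eight, which is why a larger site needs additional connection objects beyond a simple ring.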
The KCC also builds connection objects for intersite replication. When you create a site, the KCC picks a server to act as the bridgehead server for communication between the new site and remote sites. The bridgehead server uses its connection objects to replicate to remote sites at the times you specify in the site link object. Another server takes over the bridgehead responsibility if the regular bridgehead fails.
Designing a Site Topology
To design an effective AD site topology, you need to know how your company will use AD and how the different features in the Win2K infrastructure affect replication traffic. You can examine how you use your NT 4.0 domains to estimate your AD use. For example, look at the number of user accounts you create per day, the frequency of password changes, the average number of users in your user groups, and the frequency of user logons. Your unique traffic requirements should drive your site topology design.
Microsoft can tell you how many bytes of data are generated on the network when you create a new user or when users change their passwords (this information is available from resources such as AD Sizer and the Microsoft Windows 2000 Resource Kit). However, only you know how often you create new users and how often they change passwords. AD replication can take place at the attribute level (e.g., when a user changes his or her password, AD replicates only the password attribute, not the entire user object). But AD provides many more attributes per object than the NT 4.0 SAM provides, so depending on how you use objects, AD might generate more or less traffic than the SAM.
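The arithmetic behind such an estimate is simple: multiply how often each event happens by the bytes it generates, then add the results. The sketch below shows the idea in Python. The per-event byte counts and daily event counts are placeholder values I made up for illustration, not Microsoft's figures; substitute the numbers from AD Sizer or the resource kit and your own NT 4.0 usage data.

```python
# Hypothetical per-event replication costs in bytes. These are placeholders,
# not published figures; replace them with measured values.
BYTES_PER_EVENT = {
    "user_created": 4000,
    "password_changed": 400,
    "group_member_changed": 1500,
}

def daily_replication_bytes(events_per_day, bytes_per_event=BYTES_PER_EVENT):
    """Sum (events per day * bytes per event) over every event type."""
    return sum(count * bytes_per_event[event]
               for event, count in events_per_day.items())

# Hypothetical daily activity for one domain.
estimate = daily_replication_bytes(
    {"user_created": 20, "password_changed": 300, "group_member_changed": 50}
)
print(estimate)
```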
Your site topology design must answer the following questions:
- At which points do I need to establish site boundaries?
- How much bandwidth do I need to provide low latency of AD replication?
- At what point do I need to deploy a local domain controller instead of having users authenticate remotely?
Earlier, I stated that sites control when and how often AD replication takes place. (Intrasite replication takes place at 5-minute intervals; intersite replication takes place at 15-minute or longer intervals.) People often ask how slow a network link between two domain controllers can be before each controller needs to have its own site. No hard and fast rule exists. When a link to a remote location becomes saturated with intrasite AD replication and other traffic during typical operations, build a new site for one of the domain controllers and schedule replication at a longer interval. Intersite replication uses compression when a transaction is larger than 32KB. This compression can be very efficient, resulting in as much as a 90 percent decrease in data size (password changes don't benefit much from compression because they're encrypted).
After you roughly calculate how much data AD will replicate across your network, you can set replication frequency for your site links. Remember that site links are collections of sites in which the sites are connected by network links of roughly equal bandwidth. So, if sites A, B, and C are in one site link, they all replicate on the schedule you set for that site link. Set your replication schedule to take advantage of intersite compression while minimizing latency between domain controllers. For example, you might be tempted to set all of your site links to replicate at the minimum interval, every 15 minutes, to keep latency low. However, if you don't generate many changes, the amount of data transmitted in each replication might not be sufficient to trigger compression, and you might end up sending more data per replication than if you had spaced your replications further apart.
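The trade-off between interval and compression can be sketched with a few lines of Python. The model below is a simplification under stated assumptions: changes accumulate at a steady rate, everything pending is sent in one replication, per-replication protocol overhead is ignored, and compressed_fraction (the share of the original size left after compression) is a hypothetical value. Only the 32KB threshold comes from the discussion above.

```python
COMPRESSION_THRESHOLD_BYTES = 32 * 1024  # compression applies above 32KB

def bytes_per_replication(change_bytes_per_hour, interval_minutes,
                          compressed_fraction=0.2):
    """Estimate bytes sent in one intersite replication cycle."""
    pending = change_bytes_per_hour * interval_minutes / 60
    if pending > COMPRESSION_THRESHOLD_BYTES:
        # Enough data accumulated to trigger compression.
        return pending * compressed_fraction
    return pending

def bytes_per_day(change_bytes_per_hour, interval_minutes,
                  compressed_fraction=0.2):
    """Estimate total bytes sent per day across all replication cycles."""
    cycles = (24 * 60) // interval_minutes
    return cycles * bytes_per_replication(
        change_bytes_per_hour, interval_minutes, compressed_fraction)

# Hypothetical link carrying 60,000 bytes of changes per hour.
for minutes in (15, 60):
    print(minutes, bytes_per_replication(60000, minutes),
          bytes_per_day(60000, minutes))
```

With these made-up numbers, the 15-minute schedule never accumulates enough data to compress, so it sends more bytes both per replication and per day than the 60-minute schedule does. The price of the longer interval is higher latency.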
In addition to replication traffic that AD domain naming context changes cause, you must consider other bandwidth consumers in your site topology design. Here are some of the more obvious ones.
GC servers. Servers you designate as GC servers receive changes from every domain in your forest. The amount of data these servers receive is slightly less than the sum of changes from each individual domain because the GC holds only a partial replica of each domain's domain naming context.
SYSVOL shares. The contents of the SYSVOL share replicate among all of a domain's domain controllers, and Group Policy Objects also keep data in SYSVOL. The NT File Replication Service (NTFRS) performs this replication and uses the existing site and replication topology to propagate the changes.
DNS zone records. If you're using AD-integrated DNS, DNS keeps zone records in the domain naming context for the domain in which the DNS servers are running. DNS zone data can change frequently (for example, if you have a large population of mobile users whose workstations change IP addresses on a regular basis).
Group members. AD group objects store their members as one multivalue attribute. Thus, when you change one user in a 500-user list, AD must replicate the whole attribute. In fact, Microsoft recommends keeping group membership below 5000 because above that number, replicating the entire attribute in a single replication event becomes difficult. If you need to support more users or computers in a group, use nested groups.
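To see why large groups are costly and how nesting helps, consider the following Python sketch. The bytes_per_member figure and the subgroup size of 4,000 are hypothetical values chosen for illustration; only the 5000-member guideline comes from the recommendation above.

```python
def membership_replication_bytes(member_count, bytes_per_member=100):
    """Estimate bytes replicated when one member of a group changes.

    The whole multivalue attribute replicates, so the cost scales with the
    total member count. bytes_per_member is a placeholder, not a measured value.
    """
    return member_count * bytes_per_member

def split_into_nested_groups(members, max_per_group=4000):
    """Chunk a member list into subgroups that stay under the 5000 guideline."""
    return [members[i:i + max_per_group]
            for i in range(0, len(members), max_per_group)]

users = [f"user{i}" for i in range(10000)]
subgroups = split_into_nested_groups(users)
print([len(group) for group in subgroups])
print(membership_replication_bytes(len(users)),
      membership_replication_bytes(len(subgroups[0])))
```

A parent group would then hold only the subgroups as members, so changing one user replicates the membership attribute of a single subgroup rather than the full list.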
Finally, you need to consider, on a case-by-case basis, whether you want to place a domain controller physically close to client workstations in a remote location. In general, if the traffic that AD and related services generate to a local domain controller is greater than the traffic remote workstations generate to authenticate across the network link, you probably should not place a domain controller close to the remote workstations. Consider also that if you build a site around a remote set of users and put a domain controller in that site, you'll likely need to make that domain controller a GC server as well. This action will immediately increase your bandwidth requirements to that site. (For more information about designing AD sites, see Sean Deuby, "AD Sites, Part 1," June 2000, and "AD Sites, Part 2," July 2000.)
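The placement rule of thumb reduces to a comparison of two traffic estimates, as in this Python sketch. The function name, its parameters, and the sample figures are all hypothetical; the gc_bytes_per_day parameter models the extra load if the local domain controller also becomes a GC server.

```python
def should_place_local_dc(replication_bytes_per_day, logons_per_day,
                          bytes_per_logon, gc_bytes_per_day=0):
    """Return True if a local domain controller would reduce link traffic.

    Compares the traffic a local domain controller pulls across the link
    (replication plus optional GC replication) with the traffic remote
    workstations would generate by authenticating across the same link.
    """
    local_dc_cost = replication_bytes_per_day + gc_bytes_per_day
    remote_auth_cost = logons_per_day * bytes_per_logon
    return local_dc_cost < remote_auth_cost

# Hypothetical branch office: 200 logons per day at 30,000 bytes each.
without_gc = should_place_local_dc(5_000_000, 200, 30_000)
with_gc = should_place_local_dc(5_000_000, 200, 30_000,
                                gc_bytes_per_day=3_000_000)
print(without_gc, with_gc)
```

In this made-up case, the local domain controller pays off only until you add the GC role, which shows why the GC requirement deserves its own line in the estimate.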
As you can see, moving AD from theory to practice requires careful planning to produce a thoughtful design. You'll need to base many of your design decisions on factors specific to your environment and your network. But with a little help from the resource kit and tools such as AD Sizer, implementing AD isn't completely black magic. The key is to understand thoroughly how and how much you'll use AD, then double those numbers and design accordingly, just to be safe.
George Lara December 21, 2000