It's Friday afternoon, and 5 o'clock is fast approaching. You're just about ready to head home for the weekend, and your phone rings. You glare at the phone with a sense of foreboding, and reluctantly you pick it up. The Help desk has just accidentally deleted an organizational unit (OU) containing several top executives' user accounts. None of the execs can log on to the domain to access resources, including email and calendars. You feel a knot in your stomach because although you've tested the restoration of some Active Directory (AD) objects, you haven't had to do it under fire and you haven't had to use tapes controlled by the backup group in a remote datacenter location. After 2 hours of wrangling with the backup group and finally getting the correct tapes in the tape loader, you're ready to perform the restore.
If you've done a good job of testing and documenting your backup and restore procedures, you'll need just another 2 hours before you can finally restore the directory information tree file on the domain controller (DC) and proceed with the authoritative restore of the objects. And all along, you're thinking, "There must be a faster and easier way."
Many companies rely on AD not only as the domain-authentication mechanism that permits access to resources on the network but also as the email directory and in some cases the company's authoritative directory. Needless to say, the integrity of the data in AD needs to be heavily guarded and disaster recovery must be a priority. Every hour necessary to restore a deleted object can translate into thousands of dollars of lost productivity. Enter the delayed-replication recovery site.
Delayed Replication
The basic concept of delayed replication is simple: Imagine a pair of DCs that replicate with the rest of the forest only once per week on a staggered schedule. This lengthy replication cycle, in a multimaster directory, lets an administrator turn back the hands of time in the event of a disaster. For example, one recovery DC might replicate every Tuesday at 11:00 am, and the other every Friday at 11:00 pm. These two replication times fall exactly 3.5 days (84 hours) apart, so at least one recovery DC still holds a deleted item for a minimum of 3.5 days after the deletion.
If you implemented only one recovery DC per domain, the timing might be such that the deletion would immediately precede the replication to the recovery DC, in which case you'd lose your opportunity to recover the item. Consider a scenario in which an object gets deleted on Tuesday at 10:00 am, but the deletion isn't noticed until Tuesday afternoon. If you had only one recovery DC replicating at 11:00 am on Tuesday, the DC would have replicated with the rest of the domain and received the deletion of the object. You would have missed the opportunity to recover the object. If you have a second DC replicating on a staggered schedule (e.g., Friday at 11:00 pm), you can still recover the object from that DC.
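The timing argument above can be checked with a little arithmetic. The following is a minimal Python sketch, not part of any AD tooling: the function names are made up for illustration, times are rounded to whole hours, and a deletion that happens in the same hour as a replication is conservatively assumed to be picked up by that replication.

```python
HOURS_PER_WEEK = 7 * 24
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def hour_of_week(day, hour):
    """Convert a day name and a 24-hour clock hour to an hour index in the week."""
    return DAYS.index(day) * 24 + hour


def hours_until_replication(deleted_at, replicates_at):
    """Hours from the deletion until a recovery DC next replicates (0 = same hour)."""
    return (replicates_at - deleted_at) % HOURS_PER_WEEK


def recovery_window(deleted_at, schedule):
    """Hours during which at least one recovery DC still holds the deleted object."""
    return max(hours_until_replication(deleted_at, s) for s in schedule)


def worst_case_window(schedule):
    """Shortest recovery window over every possible deletion hour in the week."""
    return min(recovery_window(t, schedule) for t in range(HOURS_PER_WEEK))


# The article's example: one DC replicates Tuesday 11:00 am, the other Friday 11:00 pm.
schedule = [hour_of_week("Tue", 11), hour_of_week("Fri", 23)]
deleted = hour_of_week("Tue", 10)

print(recovery_window(deleted, schedule))   # 85 (until Friday 11:00 pm)
print(worst_case_window(schedule))          # 84 (3.5 days)
print(worst_case_window(schedule[:1]))      # 0  (a single recovery DC can miss the deletion)
```

With only the Tuesday DC, the worst case drops to zero hours, which is the missed-opportunity scenario described above. Adding the Friday DC raises the worst case to 84 hours.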
You could implement more than two delayed-recovery DCs and establish any number of replication timing scenarios. You could have seven delayed-replication DCs that replicate on different days, permitting you to restore objects with greater precision. In this article's 3.5-day scenario, it's possible that the copy of the object you restore is more than 3 days out of date. Any changes made to the object after the recovery DC last replicated would be lost on restoration.
In the past, recovering a deleted object took as long as 6 to 8 hours and involved several folks from several support areas. With delayed replication, you can recover a deleted user account in less than an hour, using only one support person. In my company, the first time we used our delayed-replication DCs to recover a deleted account, even the user was surprised by how quickly we restored the account to working order.
Another advantage of putting DCs on a delayed-replication schedule is that you can query the directory on the delayed DC to find information about the deleted item (e.g., the distinguished name, or DN, which is required to restore the object). You might need this functionality, for example, if a user account was deleted but nobody knows the user account's OU path location. In a typical restore scenario, you would need to access a restored directory offline (not on the network) so that you could query for the object and gather the DN for use in the authoritative restore process.
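If you script the DN lookup against the delayed DC, you'll need an LDAP search filter for the deleted account. The following Python sketch only builds the filter string. It assumes that you know the account's logon name (sAMAccountName), and the helper names are hypothetical. Special characters are escaped as RFC 4515 requires so that an unusual account name can't break the filter. You would pass the result to whatever LDAP client or query tool you already use, pointed at the delayed DC, and read the distinguishedName attribute from the entry that comes back.

```python
def escape_filter_value(value: str) -> str:
    """Escape the characters that RFC 4515 reserves inside an LDAP filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def user_dn_filter(sam_account_name: str) -> str:
    """Build a filter that matches one user account by its logon name."""
    safe_name = escape_filter_value(sam_account_name)
    return f"(&(objectCategory=person)(objectClass=user)(sAMAccountName={safe_name}))"


# Hypothetical account name, for illustration only.
print(user_dn_filter("jsmith"))
# (&(objectCategory=person)(objectClass=user)(sAMAccountName=jsmith))
```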
Building a Delayed-Replication Site
The mechanism that controls the schedule a DC uses to replicate is the placement of DCs in separate AD replication sites. If you want to have two DCs for each domain that replicate on different schedules, you'll need to create two AD sites and configure site links from those sites to another well-connected AD site. You'll need to configure these site links to replicate at the times you desire. In the previous example, one site replicates only on Tuesday at 11:00 am and another replicates only on Friday at 11:00 pm. Ensure that your new AD sites are configured to use only the delayed-schedule links you specify and not the default site link.
You also should ensure that the new delayed-site links are configured to have a higher cost than any other site links in your forest. Doing so will ensure that Microsoft Exchange Server's DSAccess component doesn't choose the delayed-replication DCs for directory lookups. This procedure is important even if no Exchange servers exist in the delayed-replication sites, because Exchange will sometimes choose DCs in other sites. You'll need to use the Active Directory Sites and Services snap-in to perform this work. Navigate to the Inter-Site Transports\IP container to create your site links. After you create a link, open the link's properties and set the cost to the value you want. Next, click the Change Schedule button and choose the times you want replication to occur. Figure 1 shows an example of setting the link schedule.
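Before you make these changes in the snap-in, it can help to write the planned links down and sanity-check them. The Python sketch below is a planning aid only. It doesn't read from or write to AD, and the SiteLink class, the link names, and the cost of 500 are all assumptions made for illustration. It checks the two rules described above: each delayed link costs more than every other link, and each delayed link is open for exactly one hour per week.

```python
from dataclasses import dataclass, field

DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


@dataclass
class SiteLink:
    """A planned site link: a name, a cost, and the (day, hour) slots it is open."""
    name: str
    cost: int
    open_hours: set = field(default_factory=set)


def always_open():
    """Every hour of the week, as for a link with no schedule restriction."""
    return {(day, hour) for day in DAYS for hour in range(24)}


def check_delayed_links(delayed, others):
    """Return a list of problems found in the planned delayed-replication links."""
    problems = []
    highest_other = max((link.cost for link in others), default=0)
    for link in delayed:
        if link.cost <= highest_other:
            problems.append(
                f"{link.name}: cost {link.cost} is not above {highest_other}")
        if len(link.open_hours) != 1:
            problems.append(
                f"{link.name}: expected one weekly window, found {len(link.open_hours)}")
    return problems


# Hypothetical plan matching the article's Tuesday/Friday example.
default_link = SiteLink("DEFAULTIPSITELINK", cost=100, open_hours=always_open())
tuesday = SiteLink("Recovery-Tue", cost=500, open_hours={("Tue", 11)})
friday = SiteLink("Recovery-Fri", cost=500, open_hours={("Fri", 23)})

print(check_delayed_links([tuesday, friday], [default_link]))  # []
```

An empty list means the plan follows both rules. Any strings in the list name the link that needs attention.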