Set up a replicated data-publication system on Windows Server 2003 R2
Any administrator who manages a distributed network knows what a pain it is to distribute data around the network to where it's needed and keep it all in sync. Historically, data distribution hasn't been a strong point of Windows: Most administrators use the old Robocopy utility to do their replication and synchronization work because the built-in File Replication Service (FRS) just isn't up to the task.
Robocopy works great, but it's just a simple command-line utility with limited features. Now, you have a better alternative. One of the coolest new features of Windows Server 2003 Release 2 (R2) is DFS Replication (DFSR). DFSR is a complete reworking of FRS, with none of the limitations of its predecessor. Using DFSR with R2's improved DFS (now called DFS Namespaces), it's really easy to set up a replicated, fault-tolerant data-publication system.
The Basics of DFS and DFSR
DFS is a utility that abstracts Universal Naming Convention (UNC) names into a folder hierarchy using names you choose. For example, you can take a collection of shares such as \\flaserver1\reports, \\tx-server4\reports, and \\caserver2\reports and create a namespace out of it. A namespace is a virtual tree of folders that begins with \\ServerOrDomainName\RootName. In our example, the namespace would have a top-level folder named Reports and three subfolders: Florida, Texas, and California. Thanks to DFS, an employee looking for Texas reports could simply connect to \\domain\reports and click the Texas folder, instead of trying to remember which server the share resides on.
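To make the abstraction concrete, here is a minimal Python sketch of the lookup a namespace performs. It is a toy model rather than how DFS is implemented: the folder names and share paths come from the example above, while the dictionary layout and the resolve function are hypothetical.

```python
# Toy model of a DFS namespace: virtual folders mapped to real UNC shares.
# Folder and share names follow the article's example; the data layout and
# the function name are hypothetical.
NAMESPACE_ROOT = r"\\domain\reports"

FOLDERS = {
    "Florida": r"\\flaserver1\reports",
    "Texas": r"\\tx-server4\reports",
    "California": r"\\caserver2\reports",
}


def resolve(virtual_path: str) -> str:
    """Translate a path under the namespace root into the real UNC path."""
    prefix = NAMESPACE_ROOT + "\\"
    if not (virtual_path + "\\").lower().startswith(prefix.lower()):
        raise ValueError(f"{virtual_path} is not inside {NAMESPACE_ROOT}")
    # First component after the root is the virtual folder; the rest is
    # passed through unchanged to the real share.
    folder, _, rest = virtual_path[len(prefix):].partition("\\")
    target = FOLDERS[folder]
    return target + ("\\" + rest if rest else "")


print(resolve(r"\\domain\reports\Texas\q1.xls"))
```

The user only ever types the namespace path; which server actually holds the Texas reports is a detail stored in the mapping.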
You can also use DFS to make data easily accessible by grouping it under a common UNC name, regardless of which server the data is stored on. DFS does this grouping when you map a DFS folder to multiple network shares (aka link targets) scattered in different locations. Because DFS is Active Directory (AD) site aware, a Windows XP or Windows 2000 client accessing a DFS folder will attempt to find the closest link target—a process Microsoft calls the data distribution scenario. (To learn about how DFS and AD integrate with each other, see the sidebar "How DFS and AD Work Together", page 34.) For the data distribution scenario to work properly, users must see the same files regardless of which link target they connect to, so all the network shares to which a folder is mapped must contain identical data.
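The idea of finding the closest link target can be sketched in a few lines of Python. This is an illustration only: real DFS builds its referrals from AD site information, and the share names, site names, and link costs below are all invented for the example.

```python
# Toy model of site-aware target selection. The shares, sites, and costs
# are hypothetical; real DFS derives this information from Active Directory.
TARGETS = {
    r"\\server-a\data": "SiteA",
    r"\\server-b\data": "SiteB",
    r"\\server-c\data": "SiteC",
}

# Hypothetical site-link costs (lower means closer), keyed by unordered pair.
SITE_COST = {
    frozenset(["SiteA", "SiteB"]): 100,
    frozenset(["SiteA", "SiteC"]): 300,
    frozenset(["SiteB", "SiteC"]): 50,
}


def cost(client_site: str, target_site: str) -> int:
    """A target in the client's own site is always the cheapest choice."""
    if client_site == target_site:
        return 0
    return SITE_COST[frozenset([client_site, target_site])]


def order_targets(client_site: str) -> list:
    """Return the link targets sorted from closest to farthest."""
    return sorted(TARGETS, key=lambda share: cost(client_site, TARGETS[share]))


print(order_targets("SiteC"))
```

Because every target in the ordered list must be interchangeable, this model only makes sense when the shares hold identical data, which is where replication comes in.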
DFSR is a multimaster replication engine used to distribute copies of data across multiple servers. It can run with or without DFS Namespaces, but its most popular use is to ensure that every member of a set of servers—a replica set—contains identical data and that replication is fast and bandwidth-efficient. It has many features, including bandwidth management, replication scheduling, and an innovative compression algorithm, that together dramatically decrease the amount of network bandwidth needed to keep data synchronized across your network. Microsoft reports that using DFSR results in up to a 300 percent improvement in the speed of large-file replication and 40 percent less administrative time spent managing the replication set.
A DFSR Scenario
Let's take a hypothetical software distribution system that uses DFS and FRS and rebuild it using DFS Namespaces and DFSR. HardwareTX is a small company with offices in Houston, Fort Worth, and Sweetwater, Texas. One of the services it provides to its clients is installing customized OS builds on notebooks, desktops, and servers. To support this service, the company has a software distribution system that keeps copies of CD-ROM and DVD ISO images at each office. When a build needs to be updated, the home office in Houston changes the master copy, creates a new image of it, and copies the image to the distribution system's local share at each location.
FRS keeps the network shares at the Fort Worth and Sweetwater branches synchronized. FRS is so troublesome, however, that Emily, the distribution system administrator, had planned to replace it with Robocopy. With the availability of Windows 2003 R2, she scrapped those plans and will use DFSR instead.
Before Emily can start using DFSR, her IT colleagues need to upgrade the company's AD schema to Windows 2003 R2 by running the Adprep utility. This schema upgrade, or extension, is necessary to support DFSR's required classes and attributes. (See the x86\setup\CMPNENTS\R2\ADPREP\SCH31.LDF file on the distribution media for more information about the schema changes.)
While she waits for the schema extension, Emily must upgrade her distribution servers to R2. If you're already running Windows 2003 and you've installed Service Pack 1 (SP1), you'll find R2 installation a snap. (For instructions on installing R2, see the Web-exclusive article "How do I install Windows Server 2003 R2 on a Windows 2003 Service Pack 1 installation?" March 2006, InstantDoc ID 49716.) Then, she needs to install DFS from the Windows Components section of the Add/Remove Programs Control Panel applet.
Like AD replication, DFSR is designed for a "read mostly" environment. Because the replication engine is loosely coupled, updates to a file on one member of a replica set don't lock that file on other members, nor are the updates transmitted immediately. Therefore, DFSR isn't suited for a highly active, update-rich system. You might work around this technical restriction by using business processes to restrict updates of a particular set of files to one replica member; file locking will then ensure that changes are made in a sustainable manner.
After the schema and distribution servers are upgraded, Emily chooses to remove, restructure, and rebuild her DFS namespace to a simpler configuration than the current production version. She isn't required to rebuild her namespace; previous versions are compatible with R2 and can take advantage of new features when the participating servers are upgraded. Because she has to upgrade all of her servers, however, she takes advantage of this opportunity to restructure the DFS Namespace's logical configuration.
Before rebuilding her namespace on R2, she has to install the Microsoft Management Console (MMC) DFS Management snap-in. (If she's going to use the snap-in on a server other than the namespace servers—say, a management server—she first needs to upgrade that server to R2.) She goes to the Windows Components section of the Add/Remove Programs Control Panel applet, chooses Distributed File System, clicks the Details button, then selects the DFS Management and DFS Replication Diagnostic components and clicks OK.
On an XP client, installing the R2 administration tools isn't as straightforward: R2 is on two CD-ROMs, and if you try to install the administration tools using the usual adminpak.msi package, you'll go awry. Remember that the first R2 CD-ROM is actually Windows 2003 SP1, so the administrative pack on that disc simply installs the Windows 2003 administration tools. Emily needs to go to the \admin folder on the second CD-ROM, where she'll find four unhelpfully named files. She needs only two of them: Windowsxp-MMC30KB907265-X86.exe is a hotfix that upgrades MMC to MMC 3.0, and fsrmgmt.exe installs the File Server Resource Manager, Print Management, and DFS Management snap-ins.
Build a Namespace
After the AD schema and Emily's distribution servers are upgraded, she's ready to build her namespace. First, she takes down her existing namespace using either the new DFS Management snap-in or the legacy DFS administrative tool. Because she has FRS configured to replicate member-server data, it's best to use the legacy tool to disable FRS replication. She then removes the root targets by using either the snap-in or the legacy tool.
Next, she builds her new namespace. Because HardwareTX is a small company, its two domain controllers (DCs)—HoustonDC and FortWorthDataDC—will also be namespace servers for the new namespace. HoustonData, FortWorthDataDC, and SweetwaterData servers will provide file shares as link targets.
To build the new namespace, Emily uses the new DFS Management console. The left pane displays the familiar console tree that shows the elements you can manage. The right pane is the new MMC 3.0 Actions pane, which contains the same choices as the right-click context menu. The center detail pane displays two step-by-step guides that help the administrator configure DFS. It also displays illustrations of what namespaces and replication groups look like and provides links to the DFS Web site and newsgroups.
To create her namespace, Emily could right-click the Namespaces node and choose New Namespace. Because she's trying out new features of the MMC 3.0 console, however, she uses the Actions pane instead. Clicking New Namespace launches the New Namespace Wizard. The wizard's Steps pane shows how many steps she has to complete and which step she's on and lets her return to an earlier step by clicking that step rather than clicking Back numerous times.
Emily enters HoustonDC as the server in the Namespace Server step and clicks Next. If the selected server doesn't have the DFS service running, the wizard automatically starts the service.
In the next step, she enters a name for the namespace. The wizard creates the DFS folder structure on the server and sets the share permissions. Remember that this isn't the share for the servers that hold the data to be published in the namespace—it's the share for the namespace itself. Since the folder and share are managed by the DFS service, it's best to click Next and take the default of All users have read-only permissions. This default prevents two people from simultaneously updating the same file on different replica members.
The next step asks whether Emily wants to create a domain-based namespace or a standalone namespace. She creates a domain-based namespace. Unless you have a special requirement for a standalone namespace on a single server, you should choose the domain-based configuration because it provides scalability and fault tolerance that a standalone namespace can't. (The sidebar "How DFS and AD Work Together" details the advantages of a domain-based namespace.)
The next step presents Emily with a summary of her choices. Clicking Create triggers the namespace creation process. Individual steps are shown and confirmed when complete, and when the entire process is finished she gets a clear confirmation.
Add a Namespace Server
The Houston server is now a single point of failure for the entire namespace. Adding the Fort Worth server as a second namespace server provides fault tolerance in case the first server becomes unavailable. Because Fort Worth is closer than Houston to Sweetwater, the Fort Worth server also provides both local and Sweetwater users faster access to the namespace.
To add a namespace server, Emily clicks \\hardwaretx.net\Software in the left pane of the DFS Management snap-in, then selects Add Namespace Server in the right pane. She enters FORTWORTHDATADC, the name of her server, in the search dialog box to complete the addition. By clicking the Namespace Servers tab in the middle pane, she can see both of the DCs that support the namespace. The actions available in the right pane make it easy to delegate permissions on the namespace and view or modify properties of the namespace or a namespace server.
Add Folder Targets
To make her working namespace useful, Emily needs to add folders to it. She wants to add just one folder, ISOs, to the namespace. She clicks the \\hardwaretx.net\Software namespace, then clicks New Folder in the Actions pane. In the New Folder dialog box, she types the folder name. Clicking Add to add a folder target—the real server or servers hosting the data—lets her enter a server name, examine the server's existing shares, add a share if necessary, and even configure the share permissions. The ability to configure the link servers in the console is a big improvement over the earlier version of DFS. She adds the ISO network shares on her four servers as folder targets in her new namespace, as Figure 1 shows.
Because she adds more than one folder target, the DFS Management console assumes she wants to share data among her four folder targets and prompts her to set up a DFSR replication group. She chooses Yes, which launches the Replicate Folder Wizard.
Set Up DFS Replication
Because Emily implicitly chose to configure replication by immediately adding a second folder target to her namespace, the Replicate Folder Wizard prepopulates many fields with information from the folder. Clicking Next from the first screen causes the wizard to determine which folders are available to be members of the replica set. Any servers that contain folder targets and don't run R2 are ineligible to join the replica set.
Clicking Next again takes Emily to the Primary Folder Target step, where she chooses the folder that will initially be authoritative for the replica set. Designating a primary folder means that if there's any content already in the replica set, the contents of the primary folder will override the preexisting content in the replica set. Files that don't exist in the primary folder are removed from view. Those files aren't deleted, but are moved to a hidden system folder named DfsrPrivate\PreExisting under the folder root, as a DFSR log message tells you in clear English. (Concise, informative logging is one of the much-needed improvements DFSR offers over FRS.)
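The effect of designating a primary folder can be modeled with two dictionaries. This is a simplified sketch of the behavior described above, not DFSR's actual algorithm: each folder is represented as a hypothetical mapping of file name to content, and the function name is made up for the example.

```python
def initial_sync(primary: dict, member: dict):
    """Model the first synchronization of a member against the primary folder.

    Returns a tuple: (member contents after sync, files set aside).
    The second value stands in for the hidden DfsrPrivate\\PreExisting folder.
    """
    # Files the primary doesn't have are moved out of view, not deleted.
    preexisting = {name: data for name, data in member.items()
                   if name not in primary}
    # Everything else ends up matching the primary, which is authoritative.
    synced = dict(primary)
    return synced, preexisting


primary = {"build-a.iso": "v2", "build-b.iso": "v1"}
member = {"build-a.iso": "v1", "old-build.iso": "v0"}

synced, set_aside = initial_sync(primary, member)
print(synced)     # the member now mirrors the primary
print(set_aside)  # old-build.iso is kept, but out of view
```

The point to take from the model is that choosing the wrong primary doesn't destroy data outright, but it does make files vanish from the share until someone retrieves them.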
In the Topology Selection step, which Figure 2 shows, Emily can choose a hub-and-spoke replication topology or a full mesh topology, or she can build her own custom topology. A hub-and-spoke topology (where multiple branches replicate with one or more centralized servers) is more scalable, but a mesh topology is highly fault tolerant. In a mesh topology, each node is connected directly to the other nodes; consequently, in larger configurations a mesh topology can generate a lot of overhead for the servers. Since she has only four servers' content to replicate, however, Emily chooses the mesh topology.
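A quick calculation shows why mesh overhead grows with size. The following Python sketch counts the pairs of members that replicate directly with each other in each topology; the server names are placeholders, and a single hub is assumed for the hub-and-spoke case.

```python
from itertools import combinations


def full_mesh(members):
    """Every member replicates directly with every other member."""
    return list(combinations(members, 2))


def hub_and_spoke(hub, spokes):
    """Every spoke replicates only with the single hub (an assumption here)."""
    return [(hub, spoke) for spoke in spokes]


# Placeholder names for a four-member group.
servers = ["server1", "server2", "server3", "server4"]

print(len(full_mesh(servers)))                      # 6 pairs
print(len(hub_and_spoke(servers[0], servers[1:])))  # 3 pairs

# The gap widens quickly: compare a hypothetical 20-member group.
big = [f"server{i}" for i in range(1, 21)]
print(len(full_mesh(big)), len(hub_and_spoke(big[0], big[1:])))  # 190 19
```

With four members the difference is six pairs versus three, which is negligible. At 20 members a mesh needs 190 pairs against 19, which is the overhead the wizard's topology choice is meant to control.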
The next step lets Emily tailor the amount of bandwidth that replication consumes. Because she doesn't need to replicate data on less than 24 hours' notice and the company's WAN circuits are only lightly used after working hours, she decides to replicate with a bandwidth of 64Kbps (a trickle) during the day and with full bandwidth between 8 P.M. and 7 A.M. She clicks the Days and Times button, then clicks Edit Schedule. In the Edit Schedule dialog box, which Figure 3 shows, she chooses a day and time by dragging the cursor over a block of hours, then selects a bandwidth to use during that period and clicks Add. With the configuration complete, the final screen of the wizard appears. She reviews her choices, then clicks the Create button to create the replication group and schedule. The Errors tab provides details on any errors in her configuration.
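Emily's schedule, and the reason she calls 64Kbps a trickle, can be expressed as a short Python sketch. The function names are hypothetical, and the arithmetic rests on stated assumptions: 1Kbps is taken as 1,000 bits per second, the image is a 700MB CD ISO, and the savings from DFSR's compression are ignored.

```python
from typing import Optional

DAYTIME_LIMIT_KBPS = 64
FULL_BANDWIDTH = None  # None stands for "no throttle"


def bandwidth_limit(hour: int) -> Optional[int]:
    """Return the throttle in Kbps for a given hour (0-23), or None for full."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    # Throttled from 7 A.M. up to 8 P.M.; unthrottled overnight.
    return DAYTIME_LIMIT_KBPS if 7 <= hour < 20 else FULL_BANDWIDTH


def transfer_hours(size_bytes: int, kbps: int) -> float:
    """Hours needed to move size_bytes at a constant rate, no compression."""
    return size_bytes * 8 / (kbps * 1000) / 3600


cd_image = 700 * 1024 * 1024  # assumed size of one CD ISO, in bytes

print(bandwidth_limit(14))   # 64
print(bandwidth_limit(22))   # None
print(round(transfer_hours(cd_image, DAYTIME_LIMIT_KBPS), 1))  # 25.5
```

At the daytime rate a single uncompressed CD image would need more than a full day to cross the WAN, so in practice nearly all of the real transfer happens in the overnight window.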
Figure 4 shows the DFS Management snap-in focused on the ISOs replication group Emily created. In the Actions pane, she can add members, change or verify the topology, delegate permissions, edit the replication schedule, and pretty much alter anything she initially configured. The Connections tab in the middle pane provides details of the replication connections, which were previously available only through the complicated Ultrasound utility for monitoring FRS.
Replication might not begin immediately, regardless of the schedule, because DFSR has preliminary work to do. It must create and populate a staging directory with files to be replicated, and it must establish both inbound and outbound connections with its other replica set members. The amount of time it takes to complete these steps depends on the amount of data on the primary originating share and on how much of that data already exists on the replica members. When it's ready, DFSR begins replicating data.
Emily wants to create a diagnostic report to see how the initial replication is coming along, so she clicks Create Diagnostic Report in the Actions pane. She can choose to look at disk space used, backlogged transactions and files, and replication efficiency, but accepts the default settings and gets the HTML report shown in Figure 5. The report reveals that she still has a little troubleshooting to do, because two servers aren't reporting and the others are low on disk space.
Not Your Father's DFS
Windows 2003 R2's DFS Namespaces and DFSR service represent major improvements in all aspects of DFS. Setting up and maintaining a namespace are much easier than they've ever been. If limitations in DFS have prevented you from using it in the past, it's time to revisit this capability.