Any administrator who manages a distributed network knows what a pain it is
to distribute data around the network to where it's needed and keep it all in
sync. Historically, data distribution hasn't been a strong point of Windows:
Most administrators use the old Robocopy utility to do their replication and
synchronization work because the built-in File Replication Service (FRS) just
isn't up to the task.
Robocopy works great, but it's just a simple command-line utility with limited
features. Now, you have a better alternative. One of the coolest new features
of Windows Server 2003 Release 2 (R2) is DFS Replication (DFSR). DFSR is a complete
reworking of FRS, with none of the limitations of its predecessor. Using DFSR
with R2's improved DFS (now called DFS Namespaces), it's really easy to set
up a replicated, fault-tolerant data-publication system.
The Basics of DFS and DFSR
DFS is a utility that abstracts Universal Naming Convention (UNC) names into
a folder hierarchy using names you choose. For example, you can take a collection
of shares such as \\flaserver1\reports, \\tx-server4\reports, and \\caserver2\reports
and create a namespace out of it. A namespace is a virtual tree of folders that
begins with \\ServerOrDomainName\RootName. In our example,
the namespace would have a top-level folder named Reports and three subfolders:
Florida, Texas, and California. Thanks to DFS, an employee looking for Texas
reports could simply connect to \\domain\reports and click the Texas
folder, instead of trying to remember which server the share resides on.
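One way to picture this abstraction is as a lookup table from namespace folders to the shares behind them. The following Python sketch is only an illustration, not how DFS is implemented: the share names come from the example above, while the resolve function and the sample file path are hypothetical.

```python
# Hypothetical model of a DFS namespace: each namespace folder maps to the
# UNC path of the share that actually holds the data.
NAMESPACE = {
    r"\\domain\reports\Florida": r"\\flaserver1\reports",
    r"\\domain\reports\Texas": r"\\tx-server4\reports",
    r"\\domain\reports\California": r"\\caserver2\reports",
}


def resolve(namespace_path: str) -> str:
    """Translate a namespace path into the UNC path of the underlying share."""
    wanted = namespace_path.lower()
    for folder, target in NAMESPACE.items():
        prefix = folder.lower()
        if wanted == prefix:
            return target
        if wanted.startswith(prefix + "\\"):
            # Keep whatever follows the namespace folder (subfolders, file name).
            return target + namespace_path[len(folder):]
    raise KeyError(f"no DFS folder matches {namespace_path}")


# The sample file path below is made up for the illustration.
print(resolve(r"\\domain\reports\Texas\2006\q1.xls"))
# prints \\tx-server4\reports\2006\q1.xls
```

The employee only ever types the left-hand side of the table; the server names on the right can change without anyone having to relearn a path.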
You can also use DFS to make data easily accessible by grouping it under a
common UNC name, regardless of which server the data is stored on. DFS does
this grouping when you map a DFS folder to multiple network shares (aka link
targets) scattered in different locations. Because DFS is Active Directory (AD)
site aware, a Windows XP or Windows 2000 client accessing a DFS folder will
attempt to find the closest link target—a process Microsoft calls the
data distribution scenario. (To learn about how DFS and AD integrate with each
other, see the sidebar "How DFS and AD Work Together.") For the
data distribution scenario to work properly, users must see the same files regardless
of which link target they connect to, so all the network shares to which a folder
is mapped must contain identical data.
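The site-aware behavior can be sketched as a simple ordering rule: offer the client targets in its own site first, then everything else. The Python below is a minimal sketch under that assumption. The LinkTarget type, the server names, and the site labels are all hypothetical, and the real DFS referral logic has more detail than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkTarget:
    unc: str   # UNC path of the share (hypothetical names below)
    site: str  # AD site the hosting server belongs to (hypothetical labels)


def order_targets(targets, client_site):
    """Return the targets with same-site ones first, otherwise in original order."""
    # sorted() is stable, and False sorts before True, so same-site targets
    # move to the front without disturbing the relative order of the rest.
    return sorted(targets, key=lambda t: t.site != client_site)


targets = [
    LinkTarget(r"\\server-a\data", "SiteA"),
    LinkTarget(r"\\server-b\data", "SiteB"),
]
print(order_targets(targets, "SiteB")[0].unc)
# prints \\server-b\data
```

Whichever target the client ends up on, it expects the same files, which is why the shares behind one folder have to be kept identical.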
DFSR is a multimaster replication engine used to distribute copies of data
across multiple servers. It can run with or without DFS Namespaces, but its
most popular use is to ensure that every member of a set of servers—a
replica set—contains identical data and that replication is fast and
bandwidth-efficient. It has many features, including bandwidth management,
replication scheduling, and an innovative compression algorithm, that together
dramatically decrease the amount of network bandwidth needed to keep data synchronized
across your network. Microsoft reports that using DFSR results in up to a 300
percent improvement in the speed of large-file replication and 40 percent less
administrative time spent managing the replication set.
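To see why sending only the changed parts of a file saves so much bandwidth, consider the following Python sketch. It is a deliberately simplified, fixed-block illustration and not DFSR's actual algorithm (Remote Differential Compression, which is more sophisticated); the block size and the test data are arbitrary assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # arbitrary block size chosen for this illustration


def block_hashes(data: bytes) -> list[str]:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def changed_blocks(old: bytes, new: bytes) -> list[int]:
    """Indexes of blocks in `new` that are missing from or differ in `old`."""
    old_h, new_h = block_hashes(old), block_hashes(new)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or h != old_h[i]]


old = bytes(1_000_000)   # 1 MB of zeros standing in for a large image file
new = bytearray(old)
new[500_000] = 1         # a one-byte change in the middle of the file
print(changed_blocks(old, bytes(new)))
# prints [122]
```

In this toy case, one 4KB block out of 245 would have to cross the WAN instead of the whole file.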
A DFSR Scenario
Let's take a hypothetical software distribution system that uses DFS and FRS
and rebuild it using DFS Namespaces and DFSR. HardwareTX is a small company
with offices in Houston, Fort Worth, and Sweetwater, Texas. One of the services
it provides to its clients is installing customized OS builds on notebooks,
desktops, and servers. To support this service, the company has a software distribution
system that keeps copies of CD-ROM and DVD ISO images at each office. When a
build needs to be updated, the home office in Houston changes the master copy,
creates a new image of it, and copies the image to the distribution system's
local share at each location.
FRS keeps the network shares at the Fort Worth and Sweetwater branches synchronized.
FRS is so troublesome, however, that Emily, the distribution system administrator,
had planned to replace it with Robocopy. With Windows 2003 R2 now available, she
has scrapped those plans and will use DFSR instead.
Prep Work
Before Emily can start using DFSR, her IT colleagues need to upgrade the company's
AD schema to Windows 2003 R2 by running the Adprep utility. This schema upgrade,
or extension, is necessary to support DFSR's required classes and attributes.
(See the x86\setup\CMPNENTS\R2\ADPREP\SCH31.LDF file on the distribution media
for more information about the schema changes.)
While she waits for the schema extension, Emily must upgrade her distribution
servers to R2. If you're already running Windows 2003 and you've installed Service
Pack 1 (SP1), you'll find R2 installation a snap. (For instructions on installing
R2, see the Web-exclusive article "How do I install Windows Server 2003 R2 on
a Windows 2003 Service Pack 1 installation?" March 2006, InstantDoc ID 49716.)
Then, she needs to install DFS from the Windows Components section of the Add/Remove
Programs Control Panel applet.
Like AD replication, DFSR is designed for a "read mostly" environment. Because
the replication engine is loosely coupled, updates to a file on one member of
a replica set don't lock that file on other members, nor are the updates transmitted
immediately. Therefore, DFSR isn't suited for a highly active, update-rich system.
You might work around this technical restriction by using business processes
to restrict updates of a particular set of files to one replica member; ordinary
file locking on that member will then keep simultaneous changes from colliding.
Namespace Considerations
After the schema and distribution servers are upgraded, Emily chooses
to remove, restructure, and rebuild her DFS namespace to a simpler configuration
than the current production version. She isn't required to rebuild her namespace;
previous versions are compatible with R2 and can take advantage of new features
when the participating servers are upgraded. Because she has to upgrade all
of her servers, however, she takes advantage of this opportunity to restructure
the DFS Namespace's logical configuration.
Before rebuilding her namespace on R2, she has to install the Microsoft Management
Console (MMC) DFS Management snap-in. (If she's going to use the snap-in on
a server other than the namespace servers—say, a management server—she
first needs to upgrade that server to R2.) She goes to the Windows Components
section of the Add/Remove Programs Control Panel applet, chooses Distributed
File System, clicks the Details button, then selects the DFS Management and
DFS Replication Diagnostic components and clicks OK.
On an XP client, installing the R2 administration tools isn't as straightforward:
R2 ships on two CD-ROMs, and if you try to install the tools by using the usual
adminpak.msi package, you'll go awry. Remember that the first R2 CD-ROM is actually
Windows 2003 SP1, so the administrative pack on that disk simply installs the Windows
2003 administration tools. Emily needs to go to the \admin folder on the second
CD-ROM, where she'll find four unhelpfully named files. She needs only two of
them: Windowsxp-MMC30KB907265-X86.exe is a hotfix that upgrades MMC to MMC 3.0,
and fsrmgmt.exe installs the File Server Resource Manager, Print Management,
and DFS Management snap-ins.
Build a Namespace
After the AD schema and Emily's distribution servers are upgraded, she's ready
to build her namespace. First, she takes down her existing namespace using either
the new DFS Management snap-in or the legacy DFS administrative tool. Because
she has FRS configured to replicate member-server data, it's best to use the
legacy tool to disable FRS replication. She then removes the root targets
by using either the snap-in or the legacy tool.
Next, she builds her new namespace. Because HardwareTX is a small company,
its two domain controllers (DCs)—HoustonDC and FortWorthDataDC—will
also be namespace servers for the new namespace. HoustonData, FortWorthDataDC,
and SweetwaterData servers will provide file shares as link targets.
To build the new namespace, Emily uses the new DFS Management console. The
left pane displays the familiar console tree that shows the elements you can
manage. The right pane is the new MMC 3.0 Actions pane, which contains the same
choices as the right-click context menu. The center detail pane displays two
step-by-step guides that help the administrator configure DFS. It also displays
illustrations of what namespaces and replication groups look like and provides
links to the DFS Web site and newsgroups.
To create her namespace, Emily could right-click the Namespaces node and choose
New Namespace. Because she's trying out new features of the MMC 3.0 console,
however, she uses the Actions pane instead. Clicking New Namespace launches
the New Namespace Wizard. The wizard's Steps pane shows how many steps she has
to complete and which step she's on and lets her return to an earlier step by
clicking that step rather than clicking Back numerous times.
Emily enters HoustonDC as the server in the Namespace Server step and selects
Next. If the selected server doesn't have the DFS service running, the wizard
automatically starts the service.
In the next step, she enters a name for the namespace. The wizard creates the
DFS folder structure on the server and sets the share permissions. Remember
that this isn't the share for the servers that hold the data to be published
in the namespace—it's the share for the namespace itself. Since the folder
and share are managed by the DFS service, it's best to click Next and take the
default of All users have read-only permissions. This default prevents
two people from simultaneously updating the same file on different replica members.
The next step asks whether Emily wants to create a domain-based namespace or
a standalone namespace. She creates a domain-based namespace. Unless you have
a special requirement for a standalone namespace on a single server, you should
choose the domain-based configuration because it provides scalability and fault
tolerance that a standalone namespace can't. (The sidebar "How DFS and AD Work
Together" details the advantages of a domain-based namespace.)
The next step presents Emily with a summary of her choices. Clicking Create
triggers the namespace creation process. Individual steps are shown and confirmed
when complete, and when the entire process is finished she gets a clear confirmation.
Add a Namespace Server
The Houston server is now a single point of failure for the entire namespace.
Adding the Fort Worth server as a second namespace server provides fault
tolerance in case the first server becomes unavailable. Because Fort Worth is
closer than Houston to Sweetwater, the Fort Worth server also provides both
local and Sweetwater users faster access to the namespace.
To add a namespace server, Emily clicks \\hardwaretx.net\Software in the left
pane of the DFS Management snap-in, then selects Add Namespace Server in the
right pane. She enters FORTWORTHDATADC, the name of her server, in the
search dialog box to complete the addition. By clicking the Namespace Servers
tab in the middle pane, she can see both of the DCs that support the namespace.
The actions available in the right pane make it easy to delegate permissions
on the namespace and view or modify properties of the namespace or a namespace
server.
Add Folder Targets
To make her working namespace useful, Emily needs to add folders to it. She
wants to add just one folder, ISOs, to the namespace. She clicks the \\hardwaretx.net\Software
namespace, then clicks New Folder in the Actions pane. In the New Folder dialog
box, she types the folder name. Clicking Add to add a folder target—the
real server or servers hosting the data—lets her enter a server name,
examine the server's existing shares, add a share if necessary, and even configure
the share permissions. The ability to configure the link servers in the console
is a big improvement over the earlier version of DFS. She adds the ISO network
shares on her four servers as folder targets in her new namespace, as Figure
1 shows.
Because she adds more than one folder target, the DFS Management console assumes
she wants to share data among her four folder targets and prompts her to set
up a DFSR replication group. She chooses Yes, which launches the Replicate Folder
Wizard.
Set Up DFS Replication
Because Emily implicitly chose to configure replication by immediately adding
a second folder target to her namespace, the Replicate Folder Wizard prepopulates
many fields with information from the folder. Clicking Next from the first screen
causes the wizard to determine which folders are available to be members of the
replica set. Any servers that contain folder targets and don't run R2 are
ineligible to join the replica set.
Clicking Next again takes Emily to the Primary Folder Target step, where she
chooses the folder that will initially be authoritative for the replica set.
Designating a primary folder means that if there's any content already in the
replica set, the contents of the primary folder will override the preexisting
content in the replica set. Files that don't exist in the primary folder are
removed from view. Those files aren't deleted, but are moved to a hidden system
folder named DfsrPrivate\PreExisting under the folder root, as a DFSR log message
tells you in clear English. (Concise, informative logging is one of the much-needed
improvements DFSR offers over FRS.)
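The effect of designating a primary folder can be sketched as a comparison between the primary's contents and each other member's contents. The Python below is a rough model of the behavior described above, not DFSR's code: the plan_initial_sync function, the file names, and the version labels are hypothetical, and each folder is reduced to a mapping of file name to a content fingerprint.

```python
PREEXISTING = r"DfsrPrivate\PreExisting"  # hidden folder named in the text above


def plan_initial_sync(primary: dict, member: dict) -> dict:
    """Describe what happens to each file on a nonprimary member.

    Both arguments map a file name to some fingerprint of its content.
    """
    plan = {}
    for name in sorted(primary.keys() | member.keys()):
        if name not in primary:
            plan[name] = f"move to {PREEXISTING}"       # hidden, not deleted
        elif name not in member:
            plan[name] = "copy from primary"
        elif primary[name] != member[name]:
            plan[name] = "replace with primary's version"
        else:
            plan[name] = "already in sync"
    return plan


# Hypothetical folder contents for the illustration.
primary = {"build-a.iso": "v2", "build-b.iso": "v1"}
member = {"build-a.iso": "v1", "old-build.iso": "v1"}
for name, action in plan_initial_sync(primary, member).items():
    print(f"{name}: {action}")
```

The practical lesson is to pick the primary carefully: anything that exists only on another member drops out of sight once initial replication runs.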
In the Topology Selection step, which Figure 2 shows, Emily can choose a
hub-and-spoke replication topology or a full mesh topology, or she can build
her own custom topology. A hub-and-spoke topology (where
multiple branches replicate with one or more centralized servers) is more scalable,
but a mesh topology is highly fault tolerant. In a mesh topology, each node
is connected directly to the other nodes; consequently, in larger configurations
a mesh topology can generate a lot of overhead for the servers. Since she has
only four servers' content to replicate, however, Emily chooses the mesh topology.
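A little arithmetic shows why a mesh gets expensive as it grows. The Python sketch below counts connections under two assumptions that aren't stated in the text: every replicating pair needs one connection in each direction, and the hub-and-spoke design has a single hub.

```python
def mesh_connections(members: int) -> int:
    """One-way connections when every member replicates with every other member."""
    return members * (members - 1)


def hub_and_spoke_connections(members: int) -> int:
    """One-way connections when every spoke replicates only with a single hub."""
    return 2 * (members - 1)


for n in (4, 20):
    print(n, mesh_connections(n), hub_and_spoke_connections(n))
# prints:
# 4 12 6
# 20 380 38
```

At four members the difference is small, which is why a mesh is a reasonable choice here; at 20 members the mesh needs ten times as many connections as the hub-and-spoke design.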
The next step lets Emily tailor the amount of bandwidth that replication consumes.
Because she doesn't need to replicate data on less than 24 hours' notice and
the company's WAN circuits are only lightly used after working hours, she decides
to replicate with a bandwidth of 64Kbps—a trickle—during the day
and with full bandwidth between 8 P.M. and 7 A.M. She clicks the Days and Times
button, then clicks Edit Schedule. In the Edit Schedule dialog box, which Figure
3 shows, she chooses a day and time by dragging the cursor over a block
of hours, then selects a bandwidth to use during that period and clicks Add.
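Emily's schedule boils down to a rule that maps each hour of the day to a bandwidth limit. The Python sketch below models that rule and estimates how much data the daytime trickle can move. It is only an illustration: the function name is hypothetical, 1Kbps is assumed to mean 1,000 bits per second, and protocol overhead and compression are ignored.

```python
DAYTIME_KBPS = 64   # throttled rate during working hours
FULL = None         # None stands for "full bandwidth" (no throttle)


def bandwidth_at(hour: int):
    """Bandwidth limit in Kbps for an hour of the day (0-23); None means unthrottled."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    # Full bandwidth from 8 P.M. (20:00) until 7 A.M.; 64Kbps otherwise.
    return FULL if hour >= 20 or hour < 7 else DAYTIME_KBPS


throttled_hours = sum(1 for h in range(24) if bandwidth_at(h) is not None)
daytime_megabytes = DAYTIME_KBPS * 1000 * throttled_hours * 3600 / 8 / 1_000_000
print(throttled_hours, daytime_megabytes)
# prints 13 374.4
```

Under these assumptions, the 13 throttled hours can carry roughly 374MB, about half a CD-ROM image, so the bulk of any new ISO will move during the overnight window.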
The final screen of the wizard appears. With the configuration complete, she
reviews her choices, then clicks the Create button to create the replication
group and schedule. The Errors tab provides details on any errors in her configuration.
Figure 4 shows the DFS Management
snap-in focused on the ISOs replication group Emily created. In the Actions
pane, she can add members, change or verify the topology, delegate permissions,
edit the replication schedule, and pretty much alter anything she initially
configured. The Connections tab in the middle pane provides details of the replication
connections, which were previously available only through the complicated Ultrasound
utility for monitoring FRS.
Replication might not begin immediately, regardless of the schedule, because
DFSR has preliminary work to do. It must create and populate a staging directory
with files to be replicated, and it must establish both inbound and outbound
connections with its other replica set members. The amount of time it takes
to complete these steps depends on the amount of data on the primary originating
share and on how much of that data already exists on the replica members.
When it's ready, DFSR begins replicating data.
Run Reports
Emily wants to create a diagnostic report to see how the initial replication
is coming along, so she clicks Create Diagnostic Report in the Actions pane.
She can choose to look at disk space used, backlogged transactions and files,
and replication efficiency, but accepts the default settings and gets the HTML
report shown in Figure 5. The report
reveals that she still has a little troubleshooting to do, because two servers
aren't reporting and the others are low on disk space.
Not Your Father's DFS
Windows 2003 R2's DFS Namespaces and DFSR service represent major improvements
in all aspects of DFS. Setting up and maintaining a namespace are much easier
than they've ever been. If limitations in DFS have prevented you from using
it in the past, it's time to revisit this capability.
SOLUTIONS SNAPSHOT
PROBLEM: You need an easy way to set up a replicated, fault-tolerant
data-publication system.
SOLUTION: Use the new DFS Replication (DFSR) feature in Windows Server 2003 R2.
WHAT YOU NEED: Windows Server 2003 R2
DIFFICULTY: 3 out of 5
SOLUTION STEPS:
- Do prep work, including installing Windows 2003 R2 and upgrading the AD
schema.
- Build a namespace.
- Add a namespace server.
- Add folder targets.
- Set up DFSR.
- Monitor replication by running reports.
Just an FYI there is a GUI version out for Robocopy. He mentions that it's only command line... that isn't so.