Monitor data-sync problems between replication partners
Lately I've noticed more companies using replication to copy data to remote sites. Another common replication scenario involves copying data from a source to a destination location before a company switches over to a new server or storage area. Replication often occurs over VPNs and sometimes over slower network links. If replication isn't working correctly and synchronizations stall or fail outright, the systems administrator or data owner might be unaware of the problems until users complain about obsolete or missing data. I've written a batch script, ReplicationTest, that makes such difficulties easier to deal with by notifying administrators and data owners of data-variation problems that occur during replication and that warrant further investigation. Let's delve into the workings of the script.
The Problem: Compare Source and Destination Files
Often, data on the master or source location is in a writeable state, whereas the remote (i.e., destination) location is read-only. Therefore, changes to files at the source location aren't reflected at the destination location until replication finishes. If you take a snapshot of the directory size and number of files on the source, these values could differ from those on the destination until replication is triggered, either by changes or by a time-of-day scheduled copy operation. Consequently, comparing the source and destination locations at any point in time might reveal small differences in the directory size and file count between the two areas, even if replication is working correctly. However, if the size and file count on the master and destination locations differ significantly, you'd probably need to investigate further to determine whether you have a replication configuration problem or a network link that can't adequately support replication traffic.
ReplicationTest compares the total directory size and number of files on the source and destination devices within a predetermined threshold of variation. To do this, the script first invokes the Diruse command-line utility (diruse.exe), which displays directory-size information, to capture the point-in-time directory sizes at the source and destination locations. You can find Diruse in the Windows XP and Windows Server 2003 Support Tools and in the Microsoft Windows 2000 Resource Kit. The syntax for the Diruse command is
find " TOTAL:"
(Some commands in this article wrap to several lines because of space limitations; you should type commands on one line.) ReplicationTest uses a default threshold of 10 percent, although this value is configurable. If the destination-location values vary by more than 10 percent compared with the source location, the script notifies the administrator of the discrepancy by sending a page or email message.
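The threshold test described above comes down to simple arithmetic. The following is a minimal sketch in Python, not an excerpt from ReplicationTest (which is a batch script). The function names and the sample snapshot values are hypothetical, and measuring the variation as a percentage of the source value is an assumption about how the comparison is defined.

```python
def percent_variation(source_value, dest_value):
    """Return how far dest_value is from source_value, as a percentage of the source."""
    if source_value == 0:
        # An empty source gives no baseline: report 0% if both are empty, else 100%.
        return 0.0 if dest_value == 0 else 100.0
    return abs(source_value - dest_value) / source_value * 100.0


def exceeds_threshold(source_value, dest_value, threshold_pct=10.0):
    """True when the destination varies from the source by more than threshold_pct."""
    return percent_variation(source_value, dest_value) > threshold_pct


# Hypothetical snapshot values: (directory size in MB, file count)
source = (2048.0, 15000)
dest = (1792.0, 14800)

size_alert = exceeds_threshold(source[0], dest[0])    # 12.5% variation
count_alert = exceeds_threshold(source[1], dest[1])   # about 1.3% variation
print(size_alert, count_alert)
```

With these sample values, only the directory-size difference crosses the default 10 percent threshold, so only that comparison would trigger a notification.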
In most cases, merely comparing the size and file count on the replication partners provides an adequate comparison of the file differences between them. However, if any of the replication partners has limited changes (for example, the overall size or file count changes infrequently, new files are seldom introduced, or old files are overwritten with new versions), this approach might not work well for you, because such changes barely move the totals that the script compares. Outside of those cases, in the replication testing I've done for clients, I've found that comparing total directory size and file count can be an accurate way to gauge data concurrency. The alternative, performing file-for-file comparisons, is a high-overhead operation compared with the snapshot-based technique that ReplicationTest uses.
ReplicationTest performs three primary tasks: It launches Diruse commands on the replication partners simultaneously, performs math operations to determine whether the variation in the Diruse results exceeds the threshold, and, if the threshold is exceeded, sends a pager or email notification to the administrator.
Launching two Diruse commands simultaneously. If you launched the Diruse command on one share and waited for it to finish before launching it on the second location, the results could be skewed because of timing. If the commands were launched many minutes apart and replication was in progress, a false variation might result. Ideally, the script should launch the Diruse command for the source and destination areas simultaneously, then return the results of Diruse while the script idles and waits for a signal to move ahead and compare the results.
When you run a command such as Diruse in a script, the command finishes executing before script flow passes to the next command. We want to start Diruse on the source location and simultaneously launch a second instance against the destination location. There are several ways to accomplish this, but the method I've chosen is to have ReplicationTest create two scripts, then use the Start command to launch them. The output from these two spawned scripts is sent to two output files, then returned into the main script to make the directory-size and file-count comparisons.
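The idea of starting both measurements before waiting on either one can be sketched as follows. This is an illustration in Python rather than the spawned-script-plus-Start approach that ReplicationTest uses, and the two commands are harmless stand-ins (they only print a line) so that the sketch runs anywhere; in practice each command would be the Diruse invocation for one location.

```python
import subprocess
import sys


def run_concurrently(commands):
    """Start every command before waiting on any, then collect each one's output."""
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for cmd in commands
    ]
    # communicate() waits for the process to exit and returns (stdout, stderr).
    return [proc.communicate()[0].strip() for proc in procs]


# Stand-in commands (assumption): each just prints one line of text.
source_cmd = [sys.executable, "-c", "print('source snapshot')"]
dest_cmd = [sys.executable, "-c", "print('destination snapshot')"]

results = run_concurrently([source_cmd, dest_cmd])
print(results)
```

Because both processes are already running by the time the first wait begins, the two snapshots are taken at roughly the same moment, which is the property the timing discussion above is after.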
ReplicationTest creates .bat and .tmp files to run and retain the Diruse run results. Therefore, the script needs a way to ensure that the created files have unique names to avoid accidentally overwriting files. For example, if you launched a second overlapping instance of a script and didn't use unique filenames for .tmp files, you could overwrite another temporary file. The sidebar "Naming Temporary or Output Files" explains several techniques for creating unique filenames. After ReplicationTest compares the information in the .tmp files, it performs a cleanup operation and removes the two spawned scripts and their two temporary output files.
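As a rough illustration of the unique-name and cleanup requirements, the Python sketch below lets the standard library reserve the filenames. It isn't one of the techniques from the sidebar; the `repltest_` prefix and the helper names are hypothetical.

```python
import os
import tempfile


def make_unique_file(suffix):
    """Create an empty file whose name no other process is using; return its path."""
    fd, path = tempfile.mkstemp(prefix="repltest_", suffix=suffix)
    os.close(fd)  # only the reserved name is needed here, not the open handle
    return path


def cleanup(paths):
    """Remove the temporary files that still exist."""
    for path in paths:
        if os.path.exists(path):
            os.remove(path)


source_out = make_unique_file(".tmp")
dest_out = make_unique_file(".tmp")
names_differ = source_out != dest_out

cleanup([source_out, dest_out])
leftovers = [p for p in (source_out, dest_out) if os.path.exists(p)]
print(names_differ, leftovers)
```

An overlapping second run would receive different names from `mkstemp`, so neither run could overwrite the other's output.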
A caveat about the Diruse utility is that it might fail when invoked inside a For command because double quotes are used in the command path. (Generally, you'd enclose a command path in double quotes to accommodate spaces.) The sidebar "Handling Spaces in Command Paths" discusses methods for dealing with spaces in command paths.
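The quoting problem is specific to how a command line is assembled as a single string. For comparison only, the Python sketch below passes each argument as a separate list element, so a path with spaces arrives intact without manual quoting. The share path is hypothetical, and the child process is a stand-in that simply echoes back the argument it received.

```python
import subprocess
import sys

# Hypothetical share path containing spaces.
share = r"\\server01\Shared Data\Project Files"

# Each list element reaches the child process as one argument, spaces included.
echo_args = [sys.executable, "-c", "import sys; print(sys.argv[1])", share]
completed = subprocess.run(echo_args, capture_output=True, text=True)

received = completed.stdout.strip()
print(received == share)
```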
Perform math operations to analyze the variation. After the Diruse results are returned, the script must perform math operations to determine whether the variation exceeds the specified threshold. You could use the Windows Set command with the /a switch to perform simple calculations. However, Set /a has some constraints in dealing with nonintegers: It can't handle nonintegers as input and outputs only integers. It's likely the Diruse command will produce directory-size output that contains decimals, which Set /a doesn't support. I enabled ReplicationTest to perform the necessary math operations by doing some creative coding and invoking Mathomatic, a short Perl script I wrote to handle simple math operations. The script uses Mathomatic in a couple of ways; one of the more interesting uses is to perform the equivalent of an If x GTR y statement, as the ReplicationTest excerpt in Listing 1 shows. If you're new to Perl, you'll probably find it helpful to review the comments in the Mathomatic script.
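To make the noninteger problem concrete, here is a minimal sketch of the same kind of calculation in Python, using the standard `decimal` module. It is not the Mathomatic script; the function names and sample size strings are hypothetical, and treating commas as thousands separators is an assumption about the input.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_decimal(text):
    """Convert a size string such as '1,024.75' to a Decimal, dropping commas."""
    return Decimal(str(text).replace(",", ""))


def variation_pct(source_text, dest_text, places=4):
    """Percentage difference relative to the source, rounded to `places` decimals."""
    source, dest = to_decimal(source_text), to_decimal(dest_text)
    if source == 0:
        raise ValueError("source size is zero; no baseline for a percentage")
    pct = abs(source - dest) / source * 100
    return pct.quantize(Decimal(1).scaleb(-places), rounding=ROUND_HALF_UP)


def is_greater(x, y):
    """Equivalent of 'If x GTR y' for values that may contain decimals."""
    return to_decimal(x) > to_decimal(y)


pct = variation_pct("1,024.75", "972.25")
print(pct, is_greater(pct, "10"))
```

The `places` argument plays the same role as the decimal-places setting in the configuration steps later in the article.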
Send a pager or email notification. A useful notification mechanism is to send an email message, which you can direct either to an email address or to a pager or cell phone as a text message via an email address. ReplicationTest uses Blat, a popular command-line SMTP mailer program, to send the notification, as Listing 2 shows. You can download Blat (the current version is 2.5.0) at http://www.sourceforge.net. When you unzip the Blat download file, look for the syntax.html file in the \docs folder. This file contains descriptions of the utility's 80-plus command-line switches. Blat has more switch options than almost any other command-line utility I've used and even has its own Yahoo! discussion group (http://groups.yahoo.com/group/blat) of Blat devotees. If you're completely overwhelmed by Blat's many switch options, see the examples.txt file in the \docs folder for examples of a few simple switch options.
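For readers who want to see the shape of such a notification outside of Blat, the following Python sketch builds a comparable message with the standard library. It doesn't reproduce Blat's switches or Listing 2; the addresses, subject, and body text are placeholders, and the send function is defined but deliberately not called because it needs a reachable SMTP server.

```python
import smtplib
from email.message import EmailMessage


def build_notification(sender, recipients, subject, body):
    """Assemble a plain-text message addressed to one or more recipients."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_notification(msg, smtp_server):
    """Hand the message to the given SMTP server (not called in this sketch)."""
    with smtplib.SMTP(smtp_server) as smtp:
        smtp.send_message(msg)


# Placeholder addresses and text (assumptions for illustration only).
msg = build_notification(
    "ReplTestScript@yourcompany.com",
    ["admin@yourcompany.com", "oncall-pager@yourcompany.com"],
    "ReplicationTest: variation above threshold",
    "Directory size differs by 12.5% (threshold 10%).",
)
print(msg["To"])
```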
ReplicationTest Scripting Tricks
In addition to the techniques I've described, ReplicationTest uses a couple of coding tricks that you might want to add to your scripting tool belt: code that anticipates future requirements changes and code that combines the contents of two files. Let's look at these in a bit more detail.
Anticipating future requirements. After you've been creating scripts for a while, you'll begin to anticipate changes to scripts that users might request in the future. Often, you can easily add code to the original script to accommodate these anticipated changes. By coding proactively, you can minimize the productivity loss that can occur when, several months down the road, you need to refamiliarize yourself with the code and add and debug new sections to accommodate a user request. The original requirements for ReplicationTest were for the script to send an email notification only when the specified variation percentage was exceeded. However, I knew it would be only a matter of time before someone wanted ReplicationTest to return results on every script run and not just when the threshold was exceeded. To address this possible additional requirement, I used the %ScriptBehav% variable. Configuring this variable to A tells the script to always notify the mail recipients. Setting it to B tells the script to notify recipients only when the threshold is exceeded.
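The two-mode behavior switch can be sketched as a small decision function. This Python version is only an illustration of the A and B settings described above. The function name is hypothetical, and rejecting any other setting with an error is an assumption; the article doesn't say how ReplicationTest treats unexpected values.

```python
def should_notify(behavior, variation_pct, threshold_pct=10.0):
    """Decide whether to send a notification for this run.

    behavior "A": always notify.
    behavior "B": notify only when the variation exceeds the threshold.
    """
    mode = behavior.strip().upper()
    if mode == "A":
        return True
    if mode == "B":
        return variation_pct > threshold_pct
    # Assumption: any other setting is treated as a configuration error.
    raise ValueError(f"unknown behavior setting: {behavior!r}")


print(should_notify("A", 2.0), should_notify("B", 2.0), should_notify("B", 12.5))
```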
Combining two files. Sometimes you'll need to combine the contents of two files as I did in ReplicationTest. I needed to add a header to the Blat email message—but only if the mailmessage file was created. Thus, I needed to append text to the beginning of an existing file. Although you can use the Copy command to combine two files into a new third file, I wanted to avoid introducing in the email and pager messages the annoying characters that can result from a Copy operation. Instead, I used the Type command with a redirection operator (>>) to echo the lines from one file to a new file, as Listing 3 shows.
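The header-first, body-second approach can be illustrated as follows. This is a Python sketch of the same idea, not the Type-and-redirect code from Listing 3. The filenames and sample text are hypothetical, and the files are created in a temporary folder so that the sketch is self-contained.

```python
import os
import tempfile


def prepend_header(header_path, body_path, combined_path):
    """Write the header and then the body into a new file.

    Returns False without creating anything when the body file doesn't exist.
    """
    if not os.path.exists(body_path):
        return False
    with open(combined_path, "w", encoding="utf-8") as out:
        for part in (header_path, body_path):
            with open(part, encoding="utf-8") as src:
                out.write(src.read())
    return True


with tempfile.TemporaryDirectory() as workdir:
    header = os.path.join(workdir, "header.txt")
    body = os.path.join(workdir, "mailmessage.txt")
    combined = os.path.join(workdir, "combined.txt")
    with open(header, "w", encoding="utf-8") as f:
        f.write("ReplicationTest results\n")
    with open(body, "w", encoding="utf-8") as f:
        f.write("Size variation: 12.5%\n")

    created = prepend_header(header, body, combined)
    with open(combined, encoding="utf-8") as f:
        combined_text = f.read()

    # A missing message file means there is nothing to send, so no header is added.
    skipped = prepend_header(header, os.path.join(workdir, "missing.txt"), combined)

print(created, skipped)
```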
I tested ReplicationTest on a Windows XP Service Pack 1 (SP1) system. To get the ReplicationTest script working in your environment, follow these steps:
1. Download the ReplicationTest and Mathomatic scripts from the Windows Scripting Solutions Web site. Go to http://www.windowsitpro.com/windowsscripting, enter 48746 in the InstantDoc ID text box, then click the 48746.zip hotlink. Line-wrap limitations in the print article could cause errors if you try to copy and paste line-wrapped code into your script.
2. Configure the following items:
- Path to the four utilities that ReplicationTest uses: diruse.exe (version 1.20), sleep.exe (available in the Microsoft Windows Server 2003 Resource Kit and Microsoft Windows XP Professional Resource Kit), blat.exe, and Mathomatic. (Note: To use the Mathomatic script, you need to first install ActivePerl on the machine on which you'll run the ReplicationTest script. You can download ActivePerl at http://www.activestate.com/products/download/download.plex?id=activeperl.)
- Mail-message recipients: You can specify multiple recipients by using commas (not spaces) to separate the email addresses.
- From addressee: Check with your email administrator to determine whether your mail system requires valid email addresses. If it supports nonvalid addresses as well, you can use a sending address such as ReplTestScript@yourcompany.com.
- SMTP mail server through which you want the message sent.
3. Configure the size setting for Diruse to use: K (for KB) or M (for MB). You might need to fine-tune your size and decimal settings after running the script to attain more granular results.
4. Configure the number of decimal place values you want. For example, a setting of 4 would result in percentage values such as 5.1234%.
5. Configure the script behavior: A=Always notify or B=Notify only if threshold is exceeded.
6. Run the script following the syntax
For example, the command might look like
7. If the source or destination location paths have spaces, be sure to use double quotes to encapsulate these paths in your run command, which might look like
8. Test the script on a small local folder area before using it on larger remote locations.
9. Use a user account that has at least Read permissions on the source and destination folders.
10. If you need regular reports, set up ReplicationTest as a scheduled task and run it as often as you require.
Take Charge of Replications
The ReplicationTest script should give you a good handle on how data replication or a large data migration is proceeding. By using the script during a replication, you'll be able to detect and take action on directory-size and file-count variations before they become a problem.
Dick Lewis (firstname.lastname@example.org) is a senior systems engineer with Lewis Technology in Riverside, California. He is an MCSE and an MCT, specializing in enterprise management of Windows Server 2003, Windows 2000, and Windows NT servers and workstations.