Using Microsoft RSS to Manage Hierarchical Storage

One of the more subtle changes that Microsoft has introduced in Windows 2000 (Win2K), and yet one of the most significant, is in NTFS. Although NTFS might look the same cosmetically, Microsoft has added new features under the surface that give administrators more capabilities for managing storage on enterprise networks. Windows 2000 Magazine has covered many of these capabilities, which include encrypting, journaling, and using reparse points. (For more information, see "Related Articles in Previous Issues.") Although reparse points might not seem significant, they enable one of network storage's most advanced features: hierarchical storage. Using hierarchical storage, you can have, in theory, unlimited storage space available to your system at any time. And Microsoft has made the technology fairly easy to set up and administer.

Hierarchical Storage Management 101
To understand how hierarchical storage works, you need to understand reparse points and how they function in Windows 2000 Server (Win2K Server). Think of a reparse point as a special NTFS object. When an OS process or a user requests a file or directory, a reparse point effectively tells the file system, "Instead of finding that file (or directory) here, try this other location." The file system obeys the instruction and retrieves the file (or directory) from the alternative location. As far as the OS or the user is concerned, the file came from the original location, but it actually resides elsewhere. Figure 1 illustrates this file-retrieval process.
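
To make the redirection concrete, here's a minimal Python sketch of the idea. It's a toy model, not NTFS code: the ReparsePoint class, the read_file function, and the sample paths and tape names are all hypothetical, and a real reparse point carries a tag and data that a file-system filter driver interprets.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class ReparsePoint:
    """Stand-in for an NTFS reparse point: it stores only a redirect target."""
    target: str


# Hypothetical local volume: each path holds file contents or a reparse point.
Entry = Union[bytes, ReparsePoint]


def read_file(local: Dict[str, Entry], remote: Dict[str, bytes], path: str) -> bytes:
    """Return the file's contents, following a reparse point if one is present."""
    entry = local[path]
    if isinstance(entry, ReparsePoint):
        # The caller asked for `path`, but the data actually lives elsewhere.
        return remote[entry.target]
    return entry


local = {
    "E:\\reports\\q1.doc": b"recent data",
    "E:\\reports\\1998.doc": ReparsePoint(target="tape-01/1998.doc"),
}
remote = {"tape-01/1998.doc": b"historical data"}

# Both reads use the original path; only the second is redirected.
print(read_file(local, remote, "E:\\reports\\q1.doc"))
print(read_file(local, remote, "E:\\reports\\1998.doc"))
```

The caller never sees the alternative location, which is the property that hierarchical storage depends on.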

UNIX-based OSs have included this capability for years. Reparse points are important because Microsoft wants to compete in large enterprise environments, and one means of competing in the enterprise market is the ability to provide hierarchical storage services. Organizations both large and small simply can't afford (and don't need) to keep all their historical data online forever. Therefore, a product such as Remote Storage Service (RSS) is a welcome addition to Win2K Server.

If you're unfamiliar with hierarchical storage, here's a quick primer; the concept is simple. Most organizations never seem to have enough online storage. Although disks are relatively cheap today, servers always seem to fill up no matter how much storage administrators add. Therefore, many organizations now back up historical data to tapes, remove the data from online storage, then store the tapes permanently in case the organization ever needs the data again. This manual process is time-consuming, especially months later, when someone needs to access the historical data.

Hierarchical Storage Management (HSM) automates the entire process. HSM defines two types of storage systems, which I call online storage and nearline storage. Online storage typically comprises hard disks and other relatively high-cost media that can return data almost instantly. Nearline storage, such as magnetic tape, comprises media that cost less than online storage but are usually slower to return data. For HSM to function, all storage media must be readily accessible without requiring human intervention. Because typical tape backups (i.e., offline storage) require hand-mounting, offline storage doesn't play a role in HSM.

Microsoft's preferred terms for online storage and nearline storage are local storage and remote storage, respectively. Because Microsoft uses these terms, I use them throughout the remainder of this article.

HSM products can look at files stored in an NTFS file system, determine which files need to move from local storage to remote storage, then move them—all without requiring human intervention. Microsoft has licensed technologies from VERITAS Software and bundled them with Win2K Server as RSS.

Remote Storage Service
RSS is an important addition to Win2K Server. Essentially, RSS keeps an eye on your local media (i.e., disks) and moves data off them if they become too full. RSS moves (i.e., migrates) the data to remote storage, then puts an NTFS reparse point on the local media in place of each migrated file. For example, suppose you have 8GB of local storage and 24GB of remote storage available on a 12/24GB DAT (i.e., 12GB native, 24GB with compression). In theory, as far as your end users are concerned, you have 32GB of storage available because when a user requests a migrated file, RSS retrieves the file from remote storage and restores it to the appropriate location on local storage. If you have the right equipment, RSS is fairly easy to set up and lets you increase your storage capacity without moving data completely offline.

Removable Storage
Although the storage media that RSS uses are typically called remote, they're also usually removable. For example, the physical media (i.e., tapes) of an autoloading DLT library are removable. However, for RSS to work properly, you must not remove any media that you use in remote storage. And you can't use every type of removable storage as remote storage.

RSS doesn't work with just any tape drive. Microsoft's RSS documentation states that it "supports all SCSI class 4mm, 8mm, and DLT tape libraries." As far as Microsoft is concerned, a library can be an automated device with multiple media and multiple drives or a manually operated single-drive device. Microsoft doesn't support quarter-inch cartridge (QIC) tape libraries or optical disk libraries, an unfortunate exclusion because optical media are becoming quite prevalent. (If you need the capability to migrate to optical storage, you might want to take a look at other HSM products on the market.)

Setting Up RSS
Assuming that you have a compatible tape drive, you can begin working with RSS immediately. If the application isn't already installed on your server, open the Add/Remove Programs Control Panel applet and select Add/Remove Windows Components. Choose the Remote Storage option, then let Win2K install RSS. For this article, I ran Win2K Server on a Compaq machine with an external-SCSI HP SureStore 12/24GB DAT drive. If you have a working SCSI tape drive (or tape library) on your Win2K server, you can start using RSS by selecting the Remote Storage option from the Start menu's Administrative Tools group.

When RSS starts, Win2K takes you through the Remote Storage Setup Wizard, which tells you whether your tape drive will work. If you don't have a compatible drive, the wizard fails at its first step, when it attempts to detect compatible media. If the wizard's first step completes and you reach the second step, your tape drive will work.

The Remote Storage Setup Wizard asks you which NTFS volumes you want to manage. In RSS parlance, a managed volume is simply an NTFS partition that RSS monitors and migrates files from, if necessary. You can manage any NTFS volumes on your system, but you can't manage FAT volumes (including FAT32) because they don't support reparse points. Select the volumes you want to manage, then click Next to proceed to the Volume Settings step, which Screen 1 shows.

You perform about 80 percent of your RSS configuration in the Volume Settings step. In this step, you define, for the volumes you selected in the previous step, the parameters that RSS uses to determine whether and when to migrate a file from local storage to remote storage. To better understand the migration process and decide which values to use, you need to understand migration logic.

RSS monitors your managed volumes for the amount of remaining free space and for files that users haven't accessed for a while. Each monitoring process has a separate function, and the processes work together to provide a complete HSM solution. First, the system monitors which files users haven't used recently. If users haven't accessed a file in x number of days (a value you denote in the Volume Settings step's Not accessed in: __ days field), RSS premigrates the file from local storage to remote storage. In other words, RSS copies the file to remote storage (i.e., your tapes) but doesn't yet remove the file from local storage (i.e., your hard disks). Instead, RSS flags the file as premigrated.

RSS fully kicks in only when you start running out of space on your volume. When your volume's free space drops below x percent (which you denote in the Desired free space field), the RSS process starts removing premigrated files from your local storage and replacing them with NTFS reparse points. Therefore, if your environment has a sudden surge in disk usage, the RSS process doesn't need to worry about migrating files to make more space. Files have already premigrated, so to keep up with the sudden increase in disk space requirements, the RSS process needs only to remove the actual file from local storage and put a reparse point in its place. This process occurs quickly, so RSS can promptly cope with sudden increases in disk space requirements.
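
The two-stage logic described above can be sketched as a small simulation. This is an illustration under stated assumptions, not RSS's actual algorithm: the ManagedFile class and both function names are hypothetical, the sketch treats "not accessed in x days" as "x or more days," and it assumes that the least recently accessed premigrated files are removed first, which nothing in this article confirms.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ManagedFile:
    name: str
    size: int                  # bytes
    days_since_access: int
    premigrated: bool = False  # copied to remote storage, still on local disk
    migrated: bool = False     # local copy replaced by a reparse point


def premigrate(files: List[ManagedFile], not_accessed_days: int) -> None:
    """Stage 1: copy stale files to remote storage but keep the local copy."""
    for f in files:
        if not f.migrated and f.days_since_access >= not_accessed_days:
            f.premigrated = True


def reclaim_space(files: List[ManagedFile], capacity: int,
                  desired_free_pct: float) -> List[str]:
    """Stage 2: remove premigrated files until the desired free space returns."""

    def free_pct() -> float:
        used = sum(f.size for f in files if not f.migrated)
        return 100.0 * (capacity - used) / capacity

    # Assumption: least recently accessed files go first.
    candidates = sorted(
        (f for f in files if f.premigrated and not f.migrated),
        key=lambda f: f.days_since_access,
        reverse=True,
    )
    removed = []
    for f in candidates:
        if free_pct() >= desired_free_pct:
            break
        f.migrated = True  # no copying needed; the data is already on tape
        removed.append(f.name)
    return removed


files = [
    ManagedFile("a.dat", 400, 90),
    ManagedFile("b.dat", 300, 45),
    ManagedFile("c.dat", 200, 1),
]
premigrate(files, not_accessed_days=30)
removed = reclaim_space(files, capacity=1000, desired_free_pct=50)
print(removed)
```

In this run, both stale files are premigrated, but only the oldest one has to leave local storage to restore 50 percent free space. The second stage does no copying, which is why it can keep up with a sudden surge in disk usage.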

The final configurable parameter in the Volume Settings step is the minimum file size for migration. Microsoft's default value for this field is 12KB, which is probably suitable for most environments. However, you might want to take a full inventory of your system's files and determine how many files of each size your system has and what the average file size is. Migrating 1KB files to tape probably doesn't make sense because of the performance overhead involved with moving such a small file, but you'll need to make that judgment for your environment. Choose a suitable number for your environment, then click Next to go to the wizard's next step.
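
One way to take that inventory is with a short script. The following Python sketch is only an example of the approach, not a tool that ships with Win2K: the size_inventory function and its bucket names are hypothetical, and the sketch assumes that a file qualifies for migration when it's larger than the threshold.

```python
import os
from collections import Counter
from typing import Tuple


def size_inventory(root: str, threshold_bytes: int = 12 * 1024) -> Tuple[Counter, float]:
    """Count files above and at-or-below the threshold; return counts and mean size."""
    counts: Counter = Counter()
    total_bytes = 0
    total_files = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that vanish or can't be read
            bucket = "eligible" if size > threshold_bytes else "too_small"
            counts[bucket] += 1
            total_bytes += size
            total_files += 1
    average = total_bytes / total_files if total_files else 0.0
    return counts, average
```

Calling size_inventory on a volume's root with a few candidate thresholds shows how many files each setting would leave eligible for migration.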

In the Media Type step, the RSS wizard simply wants you to specify the type of media you want to use for remote storage. (You can't use the same tape drive you use for routine backups because the tapes that RSS uses must always be mounted.) After you make your selection, click Next to complete the wizard.

Initializing Media
RSS is now fully configured and ready to begin monitoring your volumes and migrating data as necessary. However, your work with RSS isn't finished. All RSS functions are available and operating, but until you initialize media and mark them for remote-storage use, RSS won't be able to migrate files off your system.

To initialize media, ensure that the tapes you want to use reside in the appropriate drives, and make sure that the tapes appear in the Remote Storage Microsoft Management Console (MMC) snap-in. Specifically, look in the Removable Storage, Physical Locations folder under your tape system's drive designation. Although Removable Storage sees that you have tapes in your system, Removable Storage doesn't presume that it can use them for remote storage. To make the tapes available to RSS, simply copy the icons that represent the media you want to use to the appropriate media type listed under Media Pools, Free. This action informs Win2K Server (and RSS) that these tapes are available for any purpose. Win2K initializes the tapes, and RSS is ready to run.

Modifying Managed Volumes
Following implementation of the application, you might want to fine-tune RSS's operating parameters, as well as the volumes you've told RSS to manage. You can easily modify these items' properties through the Remote Storage MMC snap-in.

To modify general RSS properties, right-click Remote Storage in the Remote Storage snap-in's left pane and select Properties. The Remote Storage (Local) Properties page, which Screen 2 shows, offers four tabs on which you can monitor how much data has migrated to remote storage, change the time and frequency that the service runs (Microsoft's default is daily at 2:00 a.m.), and modify media properties.

You can also change the properties of a specific managed volume after you set them. To do so, select Managed Volumes in the Remote Storage snap-in's left pane, right-click the volume you want to modify in the right pane, then select Properties. As Screen 3 shows, you can modify individual properties for each managed volume, so you can set different types of migration parameters for different types of data. In my example, I modified my E drive's properties to ensure that 50 percent of the volume is always available and that any files larger than 10KB (and that no one has accessed for 24 hours) can migrate.

Testing RSS
After I enabled the configuration you see in Screen 3, RSS began to quickly premigrate files off my system and onto tape. (Remote Storage logs its activities in the event log, which you can view through the Event Viewer or the Remote Storage MMC snap-in.) To force RSS to completely migrate the premigrated files, I copied about 500MB of new data onto my E drive. As I expected, storage on the drive dipped below the 50 percent threshold during the copy process but came back up as the RSS process removed the premigrated files from my system and replaced them with appropriate NTFS reparse points.

To ensure that RSS worked successfully, I decided to find some files that had completely migrated off my local storage (i.e., not just premigrated) and bring them back. To find a migrated file or directory, simply look at the migrated item's Properties page, which Screen 4 shows. If RSS has moved the file, you'll see distinctively different readings in the Size and Size on disk values. (As you can see in my example, nearly half of my system's \i386 directory had migrated to tape.) After I found a migrated file, I copied it back to my desktop. Win2K began the typical copy process, but RSS quickly intervened and presented the dialog box that Screen 5 shows, asking me to wait while the system retrieved my file from remote storage.
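
If you'd rather find migrated files with a script than check Properties pages one at a time, a sketch such as the following might help. Note that it uses a different technique from the Size and Size on disk comparison: it tests the Windows offline file attribute, which Microsoft's file-attribute documentation associates with Remote Storage. The function names are hypothetical, the code assumes a current Python interpreter on Windows, and you should verify the results against the Properties page before relying on them.

```python
import os
import stat
from typing import Iterator


def looks_migrated(attributes: int) -> bool:
    """True if the Windows attribute bits mark the file's data as offline."""
    return bool(attributes & stat.FILE_ATTRIBUTE_OFFLINE)


def find_migrated(root: str) -> Iterator[str]:
    """Yield paths under `root` whose data appears to be in remote storage."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # st_file_attributes exists only on Windows; use 0 elsewhere.
                attrs = getattr(os.lstat(path), "st_file_attributes", 0)
            except OSError:
                continue  # skip files that vanish or can't be examined
            if looks_migrated(attrs):
                yield path
```

Because the script reads only file metadata, it's a reasonable first pass before you open any file and wait for a recall from tape.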

Moving a 5MB file off a 12/24GB DAT took about 1 minute. (Most of that time probably consisted of the drive spinning the tape up to speed and finding the right location for the file.) One minute isn't terribly long, but if you decide to provide RSS to your users, you need to consider the service's inherent delays. If desktop applications time out while RSS is retrieving archived files, your users will be making some unnecessary Help desk calls. Before you implement RSS, you might want to first try it in a test environment, using the basic production applications that most users use.

Related Articles in Previous Issues
You can obtain the following articles from Windows 2000 Magazine's Web site at http://www.win2000mag.com/articles.

SEAN DAILY
"NTFS5 vs. FAT32," April 2000, InstantDoc ID 8294
MARK RUSSINOVICH
"Inside Encrypting File System," Part 2, July 1999, InstantDoc ID 5592
"Inside Encrypting File System," Part 1, June 1999, InstantDoc ID 5387
DOUGLAS TOOMBS
"Scalability Enhancements in Windows 2000," November 1999, InstantDoc ID 7275

Not a Backup Substitute
Although I haven't discussed backups, you need to understand that RSS is not a substitute for routine system backups. Because RSS doesn't migrate all your files (and won't migrate your \winnt directory), you can't use the service as a suitable backup in the event of a complete system failure. Routine backups are still necessary.

The ability to always keep archived data online is appealing and can eliminate some of the mundane tasks that administrators perform. In particular, the availability of a potentially unlimited amount of storage at any given time is quite gratifying. HSM technologies, which previously presented a significant add-on cost for any enterprise, are now free with Win2K Server, a great move on Microsoft's part.