Using Microsoft RSS to Manage Hierarchical Storage
One of the more subtle changes that Microsoft has introduced in Windows 2000 (Win2K), and yet one of the most significant, is in NTFS. Although NTFS might look the same cosmetically, Microsoft has added new features under the surface that give administrators more capabilities for managing storage on enterprise networks. Windows 2000 Magazine has covered many of these capabilities, which include encrypting, journaling, and using reparse points. (For more information, see "Related Articles in Previous Issues," page 82.) Although reparse points might not seem significant, they enable one of network storage's most advanced features: hierarchical storage. Using hierarchical storage, you can have (in theory) unlimited storage space available to your system at any time. And Microsoft has made the technology fairly easy to set up and administer.
Hierarchical Storage Management 101
To understand how hierarchical storage works, you need to understand reparse points and how they function in Windows 2000 Server (Win2K Server). Think of a reparse point as a special NTFS object. When an OS process or a user requests a file or directory, a reparse point effectively tells the file system, "Instead of finding that file (or directory) here, try this other location." The file system obeys the instruction and retrieves the file (or directory) from the alternative location. As far as the OS or the user is concerned, the file came from the original location, but it actually resides elsewhere. Figure 1 illustrates this file-retrieval process.
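The redirection can be pictured with a small Python sketch. This is a toy model only: real reparse points are NTFS metadata that the file system handles, and the class names, paths, and the dictionary that stands in for the alternative location are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class ReparsePoint:
    """Placeholder that says 'look in this other location instead'."""
    target: str  # hypothetical key into the alternative store


class ToyFileSystem:
    """Toy stand-in for a volume; not how NTFS is implemented."""

    def __init__(self, alternative_store: Dict[str, bytes]) -> None:
        self.entries: Dict[str, Union[bytes, ReparsePoint]] = {}
        self.alternative_store = alternative_store

    def read(self, path: str) -> bytes:
        entry = self.entries[path]
        if isinstance(entry, ReparsePoint):
            # Follow the redirection; the caller never sees it happen.
            return self.alternative_store[entry.target]
        return entry


# One ordinary file and one file that actually lives elsewhere.
store = {"tape-0001/report.doc": b"1998 annual report"}
fs = ToyFileSystem(store)
fs.entries["D:/docs/memo.txt"] = b"lunch at noon"
fs.entries["D:/docs/report.doc"] = ReparsePoint("tape-0001/report.doc")
```

Calling `fs.read("D:/docs/report.doc")` returns the data from the alternative store, yet the caller uses the original path throughout, which mirrors the behavior described above.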
UNIX-based OSs have included this capability for years. Reparse points are important because Microsoft wants to compete in large enterprise environments, and one means of competing in the enterprise market is the ability to provide hierarchical storage services. Organizations both large and small simply can't afford (and don't need) to keep all their historical data online forever. Therefore, a product such as Remote Storage Service (RSS) is a welcome addition to Win2K Server.
If you're unfamiliar with hierarchical storage, the concept is simple, so I'll start with a quick primer. Most organizations never seem to have enough online storage. Although disks are relatively cheap today, servers always seem to fill up no matter how much storage administrators add. Therefore, many organizations are now backing up historical data to tapes, removing the data from online storage, then storing the tapes permanently in case the organization ever needs the data again. Obviously, this manual process is time-consuming, especially months later, when someone needs to access the historical data.
Hierarchical Storage Management (HSM) automates the entire process. HSM defines two types of storage systems, which I call online storage and nearline storage. Online storage typically comprises hard disks and other relatively high-cost media that can return data almost instantly. Nearline storage, such as magnetic tape, comprises media that cost less than online storage but are usually slower to return data. For HSM to function, all storage media must be readily accessible without requiring human intervention. Because typical tape backups (i.e., offline storage) require hand-mounting, offline storage doesn't play a role in HSM.
Microsoft's preferred terms for online storage and nearline storage are local storage and remote storage, respectively. Because Microsoft uses these terms, I use them throughout the remainder of this article.
HSM products can look at files stored in an NTFS file system, determine which files need to move from local storage to remote storage, then move them, all without requiring human intervention. Microsoft has licensed technologies from VERITAS Software and bundled them with Win2K Server as RSS.
Remote Storage Service
RSS is an important addition to Win2K Server. Essentially, RSS keeps an eye on your local media (i.e., disks) and moves the data if the media become too full. RSS moves (i.e., migrates) the data to remote storage, then puts an NTFS reparse point on the local media in place of the file. For example, suppose you have 8GB of local storage and 24GB of remote storage available on a 12/24GB DAT. In theory, as far as your end users are concerned, you have 32GB of data available because when a user requests a migrated file, RSS retrieves the file from remote storage and restores it to the appropriate location on local storage. If you have the right equipment, RSS is fairly easy to set up and lets you increase your storage capacity without moving data completely offline.
Removable Storage
Although the storage media that RSS uses are typically called remote, they're also usually removable. For example, the physical media (i.e., tapes) of an autoloading DLT library are removable. However, for RSS to work properly, you must not remove any media that you use in remote storage. And you can't use every type of removable storage as remote storage.
RSS doesn't work with just any tape drive. Microsoft's RSS documentation states that it "supports all SCSI class 4mm, 8mm, and DLT tape libraries." As far as Microsoft is concerned, a library can be an automatic multimedia multidrive device or a manually operated single-drive device. Microsoft doesn't support quarter-inch cartridge (QIC) tape libraries or optical disk libraries, an unfortunate exclusion because optical media are becoming quite prevalent. (If you need the capability to migrate to optical storage, you might want to take a look at other HSM products on the market.)
Setting Up RSS
Assuming that you have a compatible tape drive, you can begin working with RSS immediately. If the application isn't already installed on your server, open the Add/Remove Programs Control Panel applet and select Add/Remove Windows Components. Choose the Remote Storage option, then let Win2K install RSS. For this article, I ran Win2K Server on a Compaq machine with an external-SCSI HP SureStore 12/24GB DAT drive. If you have a working SCSI tape drive (or tape library) on your Win2K server, you can start using RSS by selecting the Remote Storage option from the Start menu's Administrative Tools group.
When RSS starts, Win2K takes you through the Remote Storage Setup Wizard, which tells you whether your tape drive will work. If you don't have a compatible drive, the wizard fails at its first step, when it attempts to detect compatible media. If the wizard's first step completes and you reach the second step, your tape drive will work.
The Remote Storage Setup Wizard asks you which NTFS volumes you want to manage. In RSS parlance, a managed volume is simply an NTFS partition that RSS monitors and migrates files from, if necessary. You can manage any NTFS volumes on your system, but you can't manage FAT volumes (including FAT32) because they don't support reparse points. Select the volumes you want to manage, then click Next to proceed to the Volume Settings step, which Screen 1 shows.
You perform about 80 percent of your RSS configuration in the Volume Settings step. In this step, you define, based on the volumes you selected in the first step, the parameters that RSS uses to determine whether and when to migrate a file from local storage to remote storage. To better understand the migration process and decide which values to use, you need to understand migration logic.
RSS monitors your managed volumes for the amount of remaining free space and for files that users haven't accessed for a while. Each monitoring process has a separate function, and the processes work together to provide a complete HSM solution. First, the system monitors which files users haven't used recently. If users haven't accessed a file in x number of days (a value you denote in the Volume Settings step's Not accessed in: __ days field), RSS premigrates the file from local storage to remote storage. In other words, RSS copies the file to remote storage (i.e., your tapes) but doesn't yet remove the file from local storage (i.e., your hard disks). Instead, RSS flags the file as premigrated.
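The premigration rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not RSS's actual code: the TrackedFile record, the state names, the dictionary that stands in for tape, and the choice to treat a file last accessed exactly x days ago as eligible are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class TrackedFile:
    path: str
    data: bytes
    last_access: datetime
    state: str = "local"  # hypothetical states: "local" or "premigrated"


def premigrate(files: List[TrackedFile], now: datetime,
               not_accessed_days: int, remote: Dict[str, bytes]) -> List[str]:
    """Copy stale files to remote storage and flag them; local data stays put."""
    cutoff = now - timedelta(days=not_accessed_days)
    copied = []
    for f in files:
        # Assumption: a file last accessed at or before the cutoff qualifies.
        if f.state == "local" and f.last_access <= cutoff:
            remote[f.path] = f.data   # copy only; nothing is removed locally
            f.state = "premigrated"
            copied.append(f.path)
    return copied
```

With a 120-day setting, a file last opened in January would be copied and flagged on a June 1 pass, whereas a file opened the previous week would be left alone. In both cases, the local copy remains on disk.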
RSS fully kicks in only when you start running out of space on your volume. When your volume's free space drops below x percent (which you denote in the Desired free space field), the RSS process starts removing premigrated files from your local storage and replacing them with NTFS reparse points. Therefore, if your environment has a sudden surge in disk usage, the RSS process doesn't need to worry about migrating files to make more space. Files have already premigrated, so to keep up with the sudden increase in disk space requirements, the RSS process needs only to remove the actual file from local storage and put a reparse point in its place. This process occurs quickly, so RSS can promptly cope with sudden increases in disk space requirements.
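The second half of the logic, reclaiming space from files that have already premigrated, might look like the following Python sketch. The ManagedFile record and its fields are hypothetical, and two behaviors are assumptions made for illustration rather than documented RSS behavior: the least recently accessed files are truncated first, and truncation stops as soon as free space reaches the desired percentage.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ManagedFile:
    name: str
    local_bytes: int       # space the file currently occupies on disk
    last_access_day: int   # hypothetical day counter; smaller means older
    state: str = "local"   # "local", "premigrated", or "migrated"


def free_percent(capacity_bytes: int, files: List[ManagedFile]) -> float:
    """Percentage of the volume not occupied by file data."""
    used = sum(f.local_bytes for f in files)
    return 100.0 * (capacity_bytes - used) / capacity_bytes


def reclaim_space(files: List[ManagedFile], capacity_bytes: int,
                  desired_free_pct: float) -> List[str]:
    """Replace premigrated files with placeholders until enough space is free."""
    truncated = []
    # Assumption: oldest premigrated files go first.
    candidates = sorted((f for f in files if f.state == "premigrated"),
                        key=lambda f: f.last_access_day)
    for f in candidates:
        if free_percent(capacity_bytes, files) >= desired_free_pct:
            break
        # No copying is needed here; the data is already on remote storage.
        f.local_bytes = 0      # only the placeholder remains locally
        f.state = "migrated"
        truncated.append(f.name)
    return truncated
```

Note that the loop never copies data. That is the point of premigration: the slow step happened earlier, so freeing space amounts to swapping a file for its placeholder.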
The final configurable parameter in the Volume Settings step is the minimum file size for migration. Microsoft's default value for this field is 12KB, which is probably suitable for most environments. However, you might want to take a full inventory of your system's files and determine how many files of each size your system has and what the average file size is. Migrating 1KB files to tape probably doesn't make sense because of the performance overhead involved with moving such a small file, but you'll need to make that judgment for your environment. Choose a suitable number for your environment, then click Next to go to the wizard's next step.
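One way to take such an inventory is a short Python script that walks a directory tree, counts files by size range, and reports the average size. This is a minimal sketch that uses only the standard library; the bucket boundaries (1KB, 12KB, 100KB, and 1MB) are arbitrary choices for illustration, and the function names are hypothetical.

```python
import os
from collections import Counter
from typing import Tuple

# Upper bounds of the size buckets, in KB (illustrative values only).
BUCKETS_KB = (1, 12, 100, 1024)


def bucket_label(size_bytes: int) -> str:
    """Return the label of the first bucket whose upper bound exceeds the size."""
    for limit_kb in BUCKETS_KB:
        if size_bytes < limit_kb * 1024:
            return f"<{limit_kb}KB"
    return f">={BUCKETS_KB[-1]}KB"


def inventory(root: str) -> Tuple[Counter, float]:
    """Walk root and return (file count per bucket, average size in bytes)."""
    counts: Counter = Counter()
    total_bytes = 0
    file_count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that vanish or can't be read
            counts[bucket_label(size)] += 1
            total_bytes += size
            file_count += 1
    average = total_bytes / file_count if file_count else 0.0
    return counts, average
```

Calling `inventory` on the root of a managed volume returns the per-bucket counts and the average file size. If most files fall into the smallest buckets, a higher minimum size might reduce tape overhead, but that remains a judgment call for each environment.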