Protect applications and files and provide granular restore capability to admins and end users
Executive Summary:
Microsoft System Center Data Protection Manager (DPM) 2007 uses Microsoft Volume Shadow Copy Service (VSS) to provide in-depth protection for Microsoft SQL Server, Microsoft Exchange Server, SharePoint 2007, and Microsoft Virtual Server 2005 R2 SP1. DPM protects three types of information: transactional applications, nontransactional applications, and regular files; the type of data dictates how you set up DPM protection. DPM 2007 lets you enable end-user restores and perform bare-metal restores.
After spending some time with System Center Data Protection Manager (DPM) 2007, I’m sure you’ll see it’s a fantastic data protection solution for your Microsoft platform. It has a lot of clever stuff going on under the covers. In “DPM 2007: Set It Up and Get Started,” I gave an overview of DPM 2007 and described its requirements and how to get it up, running, and protecting data. In this article, I describe in more detail how DPM really protects your data and show you how to use DPM to restore protected data—both for administrators and end users.
Under the Covers
As you’ll see frequently in Microsoft presentations and white papers on DPM, Microsoft Volume Shadow Copy Service (VSS) is the “secret sauce” that makes DPM tick. It’s through VSS that DPM provides in-depth protection for Microsoft SQL Server, Microsoft Exchange Server, SharePoint 2007, and Microsoft Virtual Server 2005 R2 SP1. Let’s take a look at exactly how VSS is used to protect and store data in DPM.
The DPM agent injects an application-aware block-level filter on protected systems. The important part here is that DPM watches blocks of the file system. But rather than watching all blocks on a disk, the filter is application aware: It monitors only blocks that contain data for protected files, a process that leads to minimal protection overhead. This list of protected files is fluid and changes as the services write to different files. At all times, however, DPM watches only the protected blocks of files, no matter where those blocks currently are. Essentially, DPM lays down a bitmask over the disks it’s protecting; when a block changes, DPM flips a bit in the mask to signify the block has changed, and on the next express full backup, the content of the block is sent to the DPM server.
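The bitmask idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not DPM's actual filter driver: the names `ChangeTracker` and `express_full` are hypothetical, blocks are modeled as dictionary entries, and the set of protected blocks is fixed rather than fluid.

```python
class ChangeTracker:
    """Toy per-volume change bitmap (hypothetical; not DPM's real filter)."""

    def __init__(self, block_count, protected_blocks):
        self.block_count = block_count
        self.protected = set(protected_blocks)          # blocks holding protected data
        self.dirty = bytearray((block_count + 7) // 8)  # one bit per block

    def on_write(self, block):
        # Only writes to protected blocks flip a bit; other writes are ignored.
        if block in self.protected:
            self.dirty[block // 8] |= 1 << (block % 8)

    def changed_blocks(self):
        # Return the indexes of all blocks whose bit is set.
        return [b for b in range(self.block_count)
                if self.dirty[b // 8] & (1 << (b % 8))]

    def reset(self):
        # Clear the bitmap once the changed blocks have been sent.
        self.dirty = bytearray(len(self.dirty))


def express_full(tracker, volume, replica):
    """Copy only the changed blocks into the replica, then clear the bitmap."""
    sent = tracker.changed_blocks()
    for block in sent:
        replica[block] = volume[block]
    tracker.reset()
    return sent
```

In this sketch, a write to an unprotected block never sets a bit, so it is never transferred. That is the source of the low overhead described above.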
Remember that with express full backups, you get a full backup on the DPM server but copy only the information that has changed on the protected client since the last backup. Figure 1 shows the express full backup process. In step 1 in the figure, the red blocks contain data protected by DPM that has already been fully copied to the DPM server. In step 2, the yellow blocks indicate that data on those blocks has changed. The changed blocks that DPM protects are sent to the DPM server and stored on the DPM replica. VSS is critical to this process because DPM calls the VSS writers of protected clients (for example, the OS file storage VSS writer and the Exchange VSS writer), and these VSS writers make sure all the data on disk is in a consistent state and stays in a consistent state while DPM copies the changed blocks to the DPM server.
Does this process mean that while the data is copied to the DPM server, the data on disk can’t be modified and your applications won’t function? No. When DPM requests a VSS snapshot, the VSS writer ensures the data on disk is consistent, and VSS then goes into copy-on-write mode until the snapshot is complete. In copy-on-write mode, VSS monitors writes to the data on disk. If a block that’s part of the snapshot needs to be changed during the snapshot process, the current content of the block is first copied to another location on disk and the snapshot map is updated to point to the new location for that block. Thereafter, the data on the block in its original location can be changed without affecting the snapshot.
Obviously, copy-on-write mode causes a performance drop, but this mode is used only while the data is copied to DPM. When the copy is complete, copy-on-write mode ends, so the performance drop should be minimal. Copy-on-write is also much faster than copying all the blocks to another location, because only blocks that change while the snapshot exists are copied, and it uses much less disk space than a full copy.
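To make the copy-on-write behavior concrete, here is a minimal Python sketch. It is an illustration only and assumes a volume modeled as a list of blocks; `CowSnapshot` is a hypothetical name, and real VSS works at the volume level rather than on Python objects.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot over a list of blocks (illustrative only)."""

    def __init__(self, volume):
        self.volume = volume  # live blocks, still writable by applications
        self.saved = {}       # block index -> content as of snapshot time

    def write(self, index, data):
        # On the first write to a block after the snapshot, preserve the
        # old content elsewhere before the live block is overwritten.
        if index not in self.saved:
            self.saved[index] = self.volume[index]
        self.volume[index] = data

    def read_snapshot(self, index):
        # Snapshot view: the preserved copy if the block has changed,
        # otherwise the live block, which still holds the original data.
        return self.saved.get(index, self.volume[index])
```

Only blocks written during the life of the snapshot are ever duplicated, which is why this approach needs far less time and space than copying every block up front.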
But wait—there’s more! DPM protects three types of information: transactional applications such as Exchange and SQL Server, nontransactional applications such as SharePoint and Virtual Server, and just plain files such as, well, a file. The type of data dictates how you set up DPM protection.
When you’re protecting files with DPM, you configure how often to synchronize changes and at what times to create file restore points, which are the specific point-in-time views from which end users can restore files. You might synchronize every 30 minutes, for instance, but create restore points only at 8:00 A.M., 12:00 P.M., and 6:00 P.M. Users will see and be able to select only from the restore points instead of seeing every 30-minute synchronization as a possible restore time. Limiting the number of recovery times in this way is necessary because the previous versions client, the software that runs on client machines and displays point-in-time copies of data, can see only 64 recovery points. DPM therefore also limits protected files to 64 recovery points. The DPM server still has the latest content from the synchronization schedule, so in a disaster you should be able to restore to within 30 minutes of the failure (or whatever your synchronization interval is set to).
You use the same approach for nontransactional applications; because they don’t have transaction logs, you would typically perform multiple express full backups during the day. For example, if you set DPM to perform an express full backup every two hours, you would never lose more than two hours of data. And remember, you can have as many as 512 express full backups, with each of those 512 as a possible point to recover to.
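The recovery-point limits translate directly into how far back you can go. The short calculation below is a rough sketch that uses the example schedules from the text. It assumes points are kept until the limit is reached; in practice, the configured retention range and available disk space also apply. The function name `days_of_history` is hypothetical.

```python
def days_of_history(max_points, points_per_day):
    """Days of history available before the oldest point must be dropped."""
    return max_points / points_per_day


# File protection: 64 recovery points, three restore points a day.
file_days = days_of_history(64, 3)

# Nontransactional application: 512 express full backups, one every two hours.
app_days = days_of_history(512, 24 // 2)

print(f"Files: about {file_days:.1f} days; applications: about {app_days:.1f} days")
```

With these schedules, the file restore points cover roughly three weeks and the express full backups roughly six weeks. Creating points more often shortens that window proportionally.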
For transactional applications, you still perform express full backups; however, generally you perform only one a day. In addition, at a specified interval (for example, every 15 minutes), DPM pulls and stores the transaction logs of the application. In the event of a recovery, DPM restores the last express full backup, then applies all the transaction logs created since that backup was made. If only the database disk was corrupted and the transaction logs were on a separate disk on the live machine, DPM would also play back any transaction logs still on the live server that had not yet made it to DPM. This process means zero data loss for your transactional applications.
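The restore-then-replay sequence can be sketched as follows. This is a simplified model rather than DPM's implementation: the database is a plain dictionary, each log batch is a list of key/value changes, timestamps are any comparable values (minutes since midnight in the example), and the function name `recover` is hypothetical.

```python
def recover(full_backups, log_batches, target):
    """Restore the last full backup at or before target, then replay later logs.

    full_backups: list of (timestamp, dict snapshot of the database)
    log_batches:  list of (timestamp, list of (key, value) changes)
    Raises ValueError if no full backup exists at or before target.
    """
    base_time, base = max(
        (fb for fb in full_backups if fb[0] <= target),
        key=lambda fb: fb[0],
    )
    state = dict(base)
    # Apply log batches in time order, skipping any already contained in
    # the base backup or created after the requested recovery time.
    for stamp, changes in sorted(log_batches, key=lambda lb: lb[0]):
        if base_time < stamp <= target:
            for key, value in changes:
                state[key] = value
    return state


# One express full backup at midnight, then logs pulled every 15 minutes.
fulls = [(0, {"a": 1})]
logs = [(15, [("a", 2)]), (30, [("b", 3)]), (45, [("a", 9)])]
state_at_0030 = recover(fulls, logs, target=30)
```

Choosing a later target replays more log batches on top of the same base backup, which is how a single daily express full backup can still yield recovery points every 15 minutes.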
You need to be careful that you don’t have anything else running on the transactional application servers that might interfere with the transaction logs. If DPM sees something that could truncate transaction logs, such as another backup solution or log shipping or mirroring technology, the transaction log pull won’t be available. Nor will it be available if the configuration doesn’t support transaction log backups, such as a SQL Server database that uses the simple recovery model.
So DPM performs express full backups at certain intervals, and for transactional applications it also pulls the transaction logs more frequently. DPM uses VSS to manage the various previous version states you have. What does this level of protection mean when you need to restore data, and what can you restore?
As Figure 2 shows, when you need to restore information from a transactional application, you can restore from the times you’ve performed an express full backup but also from any time that you’ve collected transaction logs, which could be as often as every 15 minutes. For nontransactional applications, you can restore from the times you performed express full backups, and for file resources, you can restore from the file recovery points specified plus the latest synchronized version, which is most likely more recent than the last recovery point defined.
So, that tells you what points in time you can recover to, but what data can you recover, and where can you recover it to? With DPM you can set protection only at high-level containers (for example, you can protect an Exchange storage group but not an individual database or mailbox within a storage group), but when you restore, you select from the smaller units within the high-level protection, as you can see in Figure 2. Because the entire storage group was captured, you can select to restore everything in the storage group, only specific databases in the storage group, or specific mailboxes in a database; DPM gives you very granular restoration capabilities. With SharePoint, you can restore individual sites and even pages; with SQL Server, you can restore at the database level; with Virtual Server, you can restore a specific virtual machine (VM); and with file-based protection, you can recover all the way down to an individual file. When you restore data, only the necessary blocks are sent back, not a full snapshot, which would waste bandwidth and slow down the restore process.
You have a number of options for where to recover data to, in addition to the obvious choice of the original location. For SQL Server, you can restore a database over the database being protected, or you can recover to an alternate SQL Server instance, to a network folder, or to tape. If you’re restoring an Exchange database, you can restore to the original Exchange server, overwriting the existing copy of the database; restore to another database on an Exchange server; restore to the Recovery Storage Group (RSG) on an Exchange server; restore to a network location; or copy to tape. For file resources, you can recover to the original location, to an alternate location, or to tape.
You might not know where the data you want to restore is. As Figure 3 shows, DPM includes a search capability in the Recovery area that lets you search files and folders, Exchange mailboxes, and SharePoint sites or documents. The search returns a list of matches, which you can then select for recovery via the results window.
So far I’ve talked primarily about administrator recovery options. You can enable end-user recovery for files on shares, a feature that utilizes the shadow copy client software available for DPM. This functionality will seem familiar for administrators who have enabled the Windows Shadow Copies of Shared Folders feature to let end users restore previous copies of their files from a share. DPM uses an updated version of the shadow copy client, so if you’ve deployed the previous client, you need to replace it with the new DPM-compatible version.
End users familiar with Shadow Copies of Shared Folders won’t notice a difference when using DPM. However, instead of snapshots of the share staying on the original file servers and using disk resources, DPM uses its own data store. To enable this redirection, you need to make a minor change to the Active Directory (AD) schema, which you can initiate via the DPM Administrator Console.
The shadow copy client is part of Windows Vista, and downloads for Windows XP SP2 and Windows Server 2003 are available; you’ll find links for the specific downloads on the Microsoft web page “How to Install the Shadow Copy Client Software.” When the client is deployed, you can select Restore previous versions on the context menu of an item to open the Previous Versions tab on the Properties sheet. As Figure 4 shows, the Previous Versions tab is populated via the information in DPM.
What do you do if the server itself dies and requires a bare-metal restore? DPM 2007 includes DPM System Recovery Tool (SRT), a bare-metal recovery solution that uses a separate agent on the protected clients. Although DPM SRT can be installed on a DPM server, doing so isn’t recommended because the two products have substantially different I/O patterns. The great news is that you don’t need a separate license: Your DPM server license entitles you to install DPM on one server and DPM SRT on a separate server. The enterprise DPM license also lets you use both the DPM and DPM SRT agents; without an enterprise license, you’d need to license the agents separately.
DPM SRT not only protects servers from critical failures such as hard disk corruption but can also be used to restore a system that simply won’t boot and even to roll back changes such as updates. Essentially, DPM SRT has a complete copy of the entire content of a protected system. When you need to perform a restore, you use a customized Windows Preinstallation Environment (WinPE), which is provided as an ISO image as part of the DPM SRT solution. You can boot the WinPE image from media, or you can integrate it into a Preboot Execution Environment (PXE) solution such as Windows Deployment Services for network-based recovery.
The other great feature of DPM SRT is that it uses Single Instance Storage (SIS). If you’re protecting 50 Windows Server 2003 servers, chances are that 99 percent of the files from each are duplicates. With SIS, these duplicate files are stored only once, which saves significant disk space.
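The principle behind single-instance storage can be sketched in Python. How DPM SRT detects duplicates internally isn’t described here, so identifying identical files by a SHA-256 content hash is an assumption made for illustration, and `SingleInstanceStore` is a hypothetical name.

```python
import hashlib


class SingleInstanceStore:
    """Toy single-instance store: identical file content is kept only once."""

    def __init__(self):
        self.blobs = {}    # content digest -> file content, stored once
        self.catalog = {}  # (server, path) -> content digest

    def add(self, server, path, content):
        digest = hashlib.sha256(content).hexdigest()
        # setdefault keeps the existing copy if this content is already stored.
        self.blobs.setdefault(digest, content)
        self.catalog[(server, path)] = digest

    def read(self, server, path):
        # Every catalog entry resolves to the single stored copy.
        return self.blobs[self.catalog[(server, path)]]

    def stored_bytes(self):
        return sum(len(content) for content in self.blobs.values())
```

If 50 servers each contribute the same system file, the catalog holds 50 entries but the content is stored once. The savings grow with the number of similar servers.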
If you look at your DPM server by using the Microsoft Management Console (MMC) Disk Management snap-in, you’ll see two volumes for each protection group, as Figure 5 shows. The first volume is a copy of the latest data, and the second contains all the various recovery points. Although you shouldn’t manually mount or modify this data, understanding how it’s stored is useful in certain disaster-recovery scenarios: for instance, if DPM itself is unavailable but the protected data is on a shared store such as a SAN that provides alternate access.
Just the Beginning
DPM has more advanced capabilities that I haven’t touched on. For example, you can run pre-backup scripts prior to synchronizations and post-backup scripts afterward, which lets you protect types of applications and data that DPM doesn’t natively understand. Going forward, we’re going to see DPM support more Microsoft solutions out of the box. Hopefully, Microsoft will partner with other companies so that third-party applications can take full advantage of the DPM store as well.