Improve your backup system with disk-to-disk-to-tape and Microsoft VSS
When I first started working with Microsoft Exchange Server about a decade ago, the organization I worked for used a tape-based backup solution for its servers. Exchange Server has changed a lot since that time, but what hasn't changed much is the way most organizations perform backups. Tape-based backups have been around for decades, and they've always gotten the job done, but they are quickly becoming impractical because of the rising amounts of data that must be backed up and the need to be able to restore that data quickly. Fortunately, Exchange Server 2007's replication capabilities are pointing the way to more practical backup methods.
The Problem with Tape Backups: Performance
Although many administrators consider traditional tape backup techniques to be tried and true, these methods actually have several major logistical problems. Probably the biggest problem with traditional backups is that there is a huge potential for data loss. No, I'm not talking about bad tapes or hungry tape drives that eat tapes—although those things can happen. I'm talking about the fact that most companies run a backup only once a day.
Imagine, for instance, that your company runs a full backup every night at 11:00 P.M. Now suppose that a catastrophic disk failure occurs at 10:30 P.M. The daily backup hasn't yet run, so any data written to the disk since last night's backup is lost. Consider what such a situation might mean for an Exchange server: You could lose a full day's worth of email messages.
Safeguards are in place to keep such a catastrophic failure from occurring—Microsoft separates the database from the transaction logs for this very reason. But that doesn't mean this type of failure can't happen. In fact, it has happened to me.
Last June, I flew to Orlando for TechEd. I called my wife to tell her I had arrived safely and learned a big storm had hit shortly after my flight left. My home was struck by lightning. Because I work from my home, I've built the entire second floor into an enterprise-class data center. Although I had surge protection and battery backups in place, the lightning destroyed several servers, including an Exchange server. It didn't matter that my transaction logs were separated from the databases because the entire server was destroyed. So even though Microsoft has designed Exchange to minimize the chances of data loss, the protection Exchange provides isn't foolproof.
Traditional backups affect an Exchange server's performance. The performance hit isn't a big deal if an organization maintains a 9-to-5 schedule and the backups take place late at night. However, many organizations operate 24 hours a day, so decreased performance can be a problem.
Exchange Server doesn't cease functioning during a backup; new messages are written to the transaction logs just as they always are. When a full backup completes, the transaction logs whose contents have already been committed to the database are truncated. Transactions logged during or after the backup remain in the transaction logs until the next backup occurs. The backup process slows down Exchange because it requires the use of disk, CPU, and memory resources, but Exchange remains available.
It's unrealistic to expect a modernized backup solution to remove all the inefficiencies from your Exchange servers. Exchange spends several hours each night performing a set of automated maintenance tasks, which explains why it can slow to a crawl late at night. Switching to a more modern backup technique can keep the backup process from being so resource-intensive, but it won't do anything to alleviate the drain on resources caused by the nightly maintenance schedule. (See "10 Tips to Keep Your Microsoft Exchange Server Humming" for more information about Exchange's automated maintenance tasks and the steps you can take to make sure your servers run efficiently.)
Yet another problem with traditional backups is that they don't always account for the rate at which data accumulates. The Exchange Information Store tends to grow exponentially. The volume of email flowing through an organization constantly increases; email attachments have grown much more common and increased in size as the availability of broadband Internet connectivity has become more widespread.
Therefore, the data being backed up each night is growing exponentially, requiring both more time and more storage. Even so, most network administrators are allocated a finite amount of time to use as a backup window. Regardless of what upper management might think, it's unrealistic to expect to back up an ever-growing data set indefinitely within a static, or even shrinking, backup window.
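To make the squeeze concrete, here is a rough sketch of the arithmetic in Python. The store size, growth rate, tape throughput, and window length are all made-up numbers chosen for illustration, and the model assumes steady compound growth and constant throughput, which real environments rarely deliver.

```python
import math


def backup_hours(store_gb, throughput_gb_per_hour):
    """Hours needed to stream a full backup at a constant throughput."""
    return store_gb / throughput_gb_per_hour


def years_until_window_exceeded(store_gb, annual_growth,
                                throughput_gb_per_hour, window_hours):
    """Years until a full backup no longer fits in the window.

    Assumes the store grows by a fixed percentage every year.
    """
    capacity_gb = throughput_gb_per_hour * window_hours
    if store_gb >= capacity_gb:
        return 0.0  # the backup already overruns the window
    return math.log(capacity_gb / store_gb) / math.log(1 + annual_growth)


# Hypothetical figures, not measurements from any real server.
store_gb = 200.0     # current size of the Store
growth = 0.40        # 40 percent growth per year
throughput = 100.0   # GB per hour the tape drive can sustain
window = 6.0         # hours allowed for the nightly backup

print(f"Tonight's full backup: {backup_hours(store_gb, throughput):.1f} hours")
years = years_until_window_exceeded(store_gb, growth, throughput, window)
print(f"Window exceeded in about {years:.1f} years")
```

With these assumed numbers, a backup that takes two hours today outgrows a six-hour window in a little over three years, even though nothing about the hardware or the schedule has changed.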
In some organizations, the Store can grow to exceed the capacity of a backup tape. Sure, you can invest in an automatic tape loader to buy yourself time, but unless you implement strict mailbox quotas, the Exchange Store will continue to grow indefinitely. Mailbox quotas can be tricky for some organizations because business needs or regulations require long-term retention of data, yet going without mailbox quotas isn't practical for most organizations. Many organizations have turned to message archiving solutions as a supplement to their backups. Message archiving lets you archive older messages and remove them from Inboxes and the server so that they don't affect your regularly scheduled backups.
The Problem with Tape Backups: Recovery
Another aspect of traditional backups that I've always found frustrating is that they don't offer true point-in-time restore capabilities. For example, suppose that you create a backup at 11:00 P.M. and the server crashes at 2:00 P.M. the next afternoon. In such a situation, you have two options: restore the backup and let Exchange process the transaction logs that have accumulated since the last backup was made, bringing the Store back to a current state; or blow away the transaction logs and restore the backup, bringing Exchange back to its state at the time the backup was made.
Now suppose the server crashed because a virus unleashed havoc on it an hour before the crash. In this situation, it would be great if you could restore the server to its state just before the infection. Unfortunately, traditional backups don't offer this capability.
Finally, managing backup tapes can be difficult for Exchange organizations. Typically, when a backup is made, the tape is taken offsite so that it can be stored away from the facility. That way, if the office is wiped out by fire, hurricane, or some other catastrophe, the data is still safe because it's at another location.
Although storing backup tapes offsite is a good idea, it has its downside: Not having backup tapes immediately available can greatly increase the amount of time it takes to recover from a disaster. After all, you can't restore a backup if you don't have the tape in your possession. Furthermore, most tape-management services charge a hefty fee for emergency tape retrievals. Because of this variety of limitations, traditional tape backups simply aren't sufficient for most organizations these days.
Fortunately, there are new methods of backup to turn to. One solution that's been around a few years is disk-to-disk-to-tape backup. At its simplest, disk-to-disk-to-tape means that backups are written to a disk array somewhere on the network, then the array's contents are later written to tape. Disk-to-disk-to-tape solutions often offer advanced capabilities that you just can't achieve through traditional backups.
For instance, disk-to-disk-to-tape backup solutions offer continuous data protection (CDP). Snapshot backups are taken on a periodic basis throughout the day; the frequency of snapshots varies by product. Each snapshot typically contains only the data that has changed since the previous snapshot was created. This type of backup eliminates the need for a large backup window. Because backups are made throughout the day, there's no colossal backup late at night, although the system does need to copy the disk-based backup to tape. Also, because small backups are taken on a nearly continuous basis, the impact on the server's performance is often far less than during a traditional backup.
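The idea of a snapshot that holds only changed data is easy to sketch in Python. The example below is a toy, not a description of how any particular CDP product works: it compares fixed-size blocks by hash, whereas real products generally track changes at the file-system or volume level. The function names and the tiny block size are my own assumptions.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small so the example is easy to follow


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def incremental_snapshot(previous, current, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that changed."""
    old = block_hashes(previous, block_size)
    changed = {}
    for index, start in enumerate(range(0, len(current), block_size)):
        block = current[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(old) or old[index] != digest:
            changed[index] = block
    return changed


def apply_snapshot(base, changed, new_length, block_size=BLOCK_SIZE):
    """Rebuild the newer version from the older one plus changed blocks."""
    blocks = [base[i:i + block_size] for i in range(0, new_length, block_size)]
    for index, block in changed.items():
        blocks[index] = block
    return b"".join(blocks)[:new_length]


previous = b"AAAABBBBCCCC"
current = b"AAAAXXXXCCCCDD"

delta = incremental_snapshot(previous, current)
print(f"Blocks stored in this snapshot: {sorted(delta)}")
restored = apply_snapshot(previous, delta, len(current))
print(f"Restored matches current: {restored == current}")
```

Only the blocks that differ are stored, which is why each periodic snapshot can be small and quick even when the underlying data set is large.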
Storing tapes offsite can result in longer downtime because you need to retrieve the tapes before a recovery operation can begin. With a disk-to-disk-to-tape solution, you won't have this problem except in the most extreme circumstances. The disk-based backups are readily available, and you can initiate a recovery operation without waiting to retrieve a tape from offsite storage. Tapes are still stored offsite, but the tapes act only as a contingency against the destruction of the facility or against the loss of the storage array that contains the backups.
As I mentioned, tape capacity can be a problem with traditional backups because the Store tends to grow exponentially and tape capacity remains static. Unfortunately, you can never completely get away from this problem because even disk-based backups are eventually written to tape as an extra safeguard. However, the size of the backups when they're written to disk isn't as big a problem because in most cases you can add disks to the storage pool on the backup server as necessary.
Data Protection Manager
There are many different disk-to-disk-to-tape backup solutions on the market, such as Lucid8's DigiVault, CA XOsoft CDP Solo (formerly Enterprise Rewinder), and FalconStor Continuous Data Protector. Microsoft offers its own product, System Center Data Protection Manager (DPM), which is specifically designed for backing up Exchange Server. Of course, you can use DPM to back up your other servers as well.
DPM is a CDP solution. Rather than backing up your Exchange organization once a day, DPM performs backups continuously. In fact, you can configure DPM to make backups as frequently as every 15 minutes. Keep in mind that it doesn't back up the entire Store every 15 minutes, which would be impossible for just about any real-world Exchange installation. Instead, DPM backs up only the transaction logs during these frequent backups.
The Store itself is backed up once a day with an express full backup, which uses Microsoft Volume Shadow Copy Service (VSS) to take a snapshot of the database, then writes that snapshot to the backup storage array. Assuming your server has adequate disk space, you can retain as many as 512 express full backups, meaning that you can have that many backup points to choose from when you need to restore. So you can perform a point-in-time restore of the Exchange database and restore the server to the point just before a problem started.
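Choosing the right point to restore to is essentially a search through a sorted list of timestamps. The Python sketch below revisits the earlier virus scenario and assumes, purely for illustration, that a usable recovery point exists every 15 minutes. The date and function names are hypothetical, and none of this is DPM's actual interface.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def recovery_points(start, end, interval):
    """List the recovery points taken between start and end, inclusive."""
    points = []
    current = start
    while current <= end:
        points.append(current)
        current += interval
    return points


def latest_point_before(points, incident):
    """Return the newest recovery point strictly before the incident.

    Returns None if every recovery point is at or after the incident.
    """
    index = bisect_left(points, incident)
    return points[index - 1] if index > 0 else None


# Hypothetical day: the server crashes at 2:00 P.M., and the virus
# struck an hour earlier.
day_start = datetime(2009, 1, 5, 0, 0)
crash = datetime(2009, 1, 5, 14, 0)
infection = crash - timedelta(hours=1)

points = recovery_points(day_start, crash, timedelta(minutes=15))
restore_to = latest_point_before(points, infection)
print(f"Restore to: {restore_to:%I:%M %p}")
```

The search deliberately skips any recovery point taken at the moment of infection, because you can't be sure that point is clean. With a once-a-day tape backup, the same search would land on the previous night's backup.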
It takes much less time to create backups with this method than it does with traditional tape backup. After all, transaction logs are backed up throughout the day, so there's a lot less work to be done when the express full backup runs. Express full backups still require some system resources, but those resources are released much more quickly than they would be if you were running a traditional backup. Furthermore, express full backups back up only the portions of the database that have changed since the previous backup. So you get faster backups and much less disk space consumption than if you were backing up the entire database every time.
Most, if not all, CDP solutions for Exchange rely on VSS. VSS is the mechanism that allows snapshots of data to be made, regardless of whether a file is open. If you're using a CDP solution or a traditional backup that incorporates VSS, you need to be aware of some limitations. For starters, you can't make a VSS backup of an individual database. You must back up storage groups as a whole because all of the databases within a storage group share a common set of transaction logs. You also can't simultaneously back up multiple storage groups on an Exchange server.
When you run a VSS backup of Exchange, the backup software must ensure that no changes are made to the database or to the transaction logs during the backup, but it must also keep these items available to users. To accomplish this, the databases and transaction logs are temporarily locked; any messages sent or received while the database is locked are placed in a queue. The backup software quickly makes a read-only copy—a snapshot—of the database, then releases the lock.
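The lock, copy, and release sequence can be illustrated with a small Python model. This is a toy stand-in that I've written to show the ordering of the steps; it isn't how VSS or the Exchange writer is implemented, and the class and method names are invented for the example.

```python
import threading
from collections import deque


class MailStore:
    """Toy in-memory store that can be frozen briefly for a snapshot."""

    def __init__(self):
        self._lock = threading.Lock()
        self._messages = []
        self._pending = deque()
        self._frozen = False

    def deliver(self, message):
        """Write a message, or queue it if the store is frozen."""
        with self._lock:
            if self._frozen:
                self._pending.append(message)
            else:
                self._messages.append(message)

    def freeze(self):
        """Stop changes from reaching the store."""
        with self._lock:
            self._frozen = True

    def snapshot(self):
        """Return a read-only copy of the store as it stands now."""
        with self._lock:
            return tuple(self._messages)

    def thaw(self):
        """Release the freeze and write out anything that was queued."""
        with self._lock:
            self._frozen = False
            while self._pending:
                self._messages.append(self._pending.popleft())

    def messages(self):
        with self._lock:
            return list(self._messages)


store = MailStore()
store.deliver("message 1")

store.freeze()
store.deliver("message 2")   # arrives mid-backup, so it waits in the queue
snap = store.snapshot()
store.thaw()

print(f"Snapshot: {snap}")
print(f"Store after thaw: {store.messages()}")
```

The snapshot contains only what was in the store when the freeze began, and the queued message reaches the store as soon as the freeze is released. The shorter the freeze, the less likely users are to notice it.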
Snapshot data isn’t streamed to tape—at least not until later; it's written to a disk volume first. To make this process practical, the volume must be large enough to accommodate the snapshots and fast enough that the snapshots can be made before the database lock becomes disruptive to end users. Therefore, Microsoft recommends using a SAN for snapshot backups.
Continuous Replication Solutions
With Exchange 2007, Microsoft introduced the concept of continuous replication in the form of local continuous replication (LCR) and cluster continuous replication (CCR); Exchange 2007 SP1 introduced standby continuous replication (SCR). Although they're not complete backup solutions in themselves, these replication methods do provide you with a way of making your backups more efficient.
The continuous replication features in Exchange 2007 use a technique called log shipping to make backup copies of the transaction logs in an alternate location. When you use LCR, the logs are written to a separate volume on the same physical server. CCR works similarly, but the log files are written to a separate server. SCR replicates data to a remote site. The end result with each method is a backup copy of the database and its log files that you can use to restore data in the event of a failure.
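In spirit, log shipping amounts to copying each completed log file to the alternate location and replaying the copies in order. The Python sketch below models that with plain text files in temporary directories. Real Exchange transaction logs are binary files replayed by the database engine, so treat the file names, contents, and function names here as illustrative assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def ship_logs(active_dir, replica_dir):
    """Copy log files the replica doesn't have yet, in name order."""
    replica_dir.mkdir(parents=True, exist_ok=True)
    shipped = []
    for log in sorted(active_dir.glob("*.log")):
        target = replica_dir / log.name
        if not target.exists():
            shutil.copy2(log, target)
            shipped.append(log.name)
    return shipped


def replay(replica_dir):
    """Rebuild the replica's contents by reading shipped logs in order."""
    records = []
    for log in sorted(replica_dir.glob("*.log")):
        records.extend(log.read_text().splitlines())
    return records


with tempfile.TemporaryDirectory() as root:
    active = Path(root) / "active"
    replica = Path(root) / "replica"
    active.mkdir()

    # Hypothetical log names and contents.
    (active / "E0000001.log").write_text("msg 1\nmsg 2\n")
    first = ship_logs(active, replica)

    (active / "E0000002.log").write_text("msg 3\n")
    second = ship_logs(active, replica)

    replayed = replay(replica)

print(f"First pass shipped: {first}")
print(f"Second pass shipped: {second}")
print(f"Replica contents: {replayed}")
```

Whether the replica directory sits on another volume, another server, or a server at a remote site is the main difference between the LCR, CCR, and SCR scenarios described above.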
Obviously, the point of using continuous replication is to provide yourself with an extra safeguard for your data, but this type of protection alone isn't enough because you're not generating a removable backup. Continuous replication can, however, help make your backups more efficient. When you use continuous replication, you can run a traditional tape-based backup against the replicated database rather than the production database and thereby suffer much less of a performance impact during the backup process.
DPM is designed to work with continuous replication. If you use CCR or SCR, DPM lets you perform your backup on the copy of the Exchange database, which helps prevent load from being placed on your live mailbox server.
In today's economic climate, companies can't afford to rely on outdated systems when there are quicker and more reliable alternatives. If your organization still relies exclusively on traditional tape-based backups, it's time to investigate a more modern approach. Disk-to-disk-to-tape and CDP solutions can help you overcome some of the problems associated with traditional backups.