Maximize your Exchange deployment's availability

Hello all,
Several readers have asked me about replication-based recovery servers for Exchange Server. The topic is worthy of a white paper, but I'll attempt to address the basics here.

The goal is to provide a reliable and supportable recovery server that can mirror or copy Exchange Information Store (IS) data in real time or near real time. The primary server should also be able to transparently fail over to the recovery server, and the recovery server should be able to fail back to the primary server. Exchange doesn't inherently provide this type of functionality, but several third-party solutions exist. (Be aware, though, that Microsoft provides limited support for these types of setups.) The available solutions come in two flavors: server mirroring and data replication.

At the core of all the available solutions is some sort of shared storage or I/O mechanism—typically a Storage Area Network (SAN). When you're building a data replication or mirroring solution for a mission-critical application, shared storage or I/O interconnection is imperative. Also, many SAN implementations offer built-in volume or controller mirroring and cloning. This capability simplifies data replication because the SAN hardware and software already support such replication.

Server mirroring solutions are typically proprietary and involve both hardware and software. For example, Marathon Technologies' Endurance products combine special I/O interconnection hardware with specialized software to provide a server mirror of your Exchange server. In addition to mirroring the data at an I/O level, this solution provides a completely redundant hot-backup server that's paired to a production server. Because the mirror runs in virtual lockstep with the production server and its data, mirroring solutions are fault tolerant and are great for small server deployments or branch-office Exchange servers that need high availability. However, such products add too much complexity for most environments with large Exchange servers that have many users and stores.

Data replication solutions that build on SAN or Network Attached Storage (NAS) implementations are more mainstream and a bit more flexible. Solutions such as EMC's Symmetrix Remote Data Facility (SRDF) or Compaq's Data Replication Manager (DRM) replicate data volumes (e.g., the transaction log volume, the database files volume) between controllers that can be across the room or across town. From the perspective of Exchange and the Extensible Storage Engine (ESE)/Joint Engine Technology (JET) database engine, data replication from the Exchange server to the redundant data set is transparent. The key to successful implementation of this solution lies in tuning the I/O replication operations according to the physical distance between redundant data sets and the response time that Exchange requires. Data replication solutions also typically support both synchronous and asynchronous I/O; which mode you choose depends on factors such as replication distance, I/O load, and application requirements. As you might guess, Exchange is rather sensitive to I/O problems, so you must ensure that your implementation is well tested and well tuned. Otherwise, your high-availability solution will increase downtime instead of decreasing it.
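To get a feel for how replication distance interacts with response time, here is a rough Python sketch that estimates the delay synchronous replication might add to each write. It rests on assumptions that are mine rather than any vendor's: light travels through optical fiber at roughly 200,000 km/s (about 5 microseconds per kilometer one way), each synchronous write needs at least one round trip to the remote controller, and the function names and parameters are hypothetical. Real products add protocol and controller overhead on top of this, so treat the result as a lower bound, not a sizing tool.

```python
# Rough lower-bound estimate of the latency synchronous replication adds per write.
# Assumption (not from any vendor documentation): one-way propagation delay in
# optical fiber is about 5 microseconds per kilometer.
FIBER_US_PER_KM = 5.0


def sync_write_penalty_ms(distance_km, round_trips=1, remote_service_ms=0.0):
    """Return the estimated extra milliseconds added to one synchronous write.

    round_trips and remote_service_ms are hypothetical tuning knobs for
    protocols that need more than one exchange or for remote controller time.
    """
    if distance_km < 0 or round_trips < 1 or remote_service_ms < 0:
        raise ValueError("distance and service time must be >= 0, round_trips >= 1")
    propagation_ms = 2 * distance_km * FIBER_US_PER_KM * round_trips / 1000.0
    return propagation_ms + remote_service_ms


def max_serial_writes_per_second(local_write_ms, distance_km, **kwargs):
    """Upper bound on strictly serialized writes (such as log flushes) per second."""
    total_ms = local_write_ms + sync_write_penalty_ms(distance_km, **kwargs)
    if total_ms <= 0:
        raise ValueError("total write time must be positive")
    return 1000.0 / total_ms


if __name__ == "__main__":
    # Compare an across-the-room link with progressively longer ones,
    # assuming a 1 ms local write as a placeholder figure.
    for km in (0.1, 10, 100, 1000):
        penalty = sync_write_penalty_ms(km)
        rate = max_serial_writes_per_second(1.0, km)
        print(f"{km:>7} km: +{penalty:.3f} ms per write, ~{rate:.0f} serial writes/s")
```

Under these assumptions the penalty is negligible across the room but grows linearly with distance, which is one reason asynchronous replication tends to be considered for longer links.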

You can also implement other measures (with or without a data replication solution) to provide more rapid disaster recovery for your Exchange servers. For example, if you use a SAN, you can deploy a Redundant Array of Independent Servers (RAIS)—spare servers that you attach to the shared storage and boot from the SAN in the event that a production server or IS fails.

When you build a recovery solution, target the area of Exchange that causes the most downtime in your environment. If you want to protect your Exchange servers and data and can justify the cost, server mirroring or data replication solutions can help you maximize your Exchange deployment's availability.
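One simple way to find the area that causes the most downtime is to total your outage minutes by cause. The Python sketch below assumes you keep some record of incidents as (cause, minutes) pairs; the function name and the sample entries are invented purely for illustration and don't reflect any real environment.

```python
from collections import defaultdict


def downtime_by_cause(incidents):
    """Sum outage minutes per cause and return (cause, minutes) pairs, largest first."""
    totals = defaultdict(float)
    for cause, minutes in incidents:
        totals[cause] += minutes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder incident log, made up for this example only.
    sample = [
        ("database corruption", 240),
        ("hardware failure", 90),
        ("database corruption", 180),
        ("operator error", 45),
    ]
    for cause, minutes in downtime_by_cause(sample):
        print(f"{cause}: {minutes:.0f} min")
```

Whichever cause lands at the top of your own list is the first candidate for the recovery measures described here.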

Until the next time,
Jerry Cochran, News Editor, exadmin@winnetmag.com