Should you move ahead with an Exchange 2000 cluster deployment?

In "Clustering Exchange 2000, Part 1," December 2000, I explain the underlying concepts of Windows 2000's Cluster service as well as Exchange 2000 Enterprise Server's clustering enhancements. With that knowledge in hand, you're ready to evaluate the potential benefits and complexities of Exchange 2000 cluster storage design and administration. If you decide that a clustered deployment is right for your organization, several best practices can help even out the learning curve.

Storage Design: First and Foremost
Storage planning is one of the most challenging but also most important parts of deploying Exchange 2000 clusters. An Exchange Server 5.5 cluster contains only one Exchange Virtual Server (EVS), so storage allocation from the shared-cluster storage is simple. Exchange 2000's support of multiple EVSs and storage groups (SGs) per node significantly complicates cluster deployment and management. Despite this challenge, your cluster storage design must be right the first time if you are to achieve a successful implementation.

When you configure Exchange 2000 in a clustered environment, you need to carefully plan the volumes that you want to share between the cluster's nodes. However, "share" isn't the most appropriate word because Cluster service works in a shared-nothing model: Only one cluster member at a time can own, and therefore access, a volume. As a shared-storage mechanism for Cluster service, Storage Area Network (SAN) technology provides greater scalability, reliability, and data-management capabilities, thus enabling greater possibilities for storage design and allocation in a clustered environment. Become familiar with SAN technology's configuration in a cluster as well as SAN features, such as data replication and Business Continuance Volumes (BCVs).

The best way to plan your storage is to take a reverse approach to the design and setup of the cluster and the cluster hardware. In other words, start with the number of users per cluster and the required service levels, then work backward to allocate storage.

First, decide the user-load requirements for the entire cluster. For example, suppose you want to support 10,000 users on a four-node Exchange 2000 cluster. (Only Win2K Datacenter Server supports four-node clusters. For more information about Datacenter, see Greg Todd, "Win2K Datacenter Server," December 2000.) Evenly dividing users across the cluster will yield 2500 users per node. You can then design each cluster node to meet the performance and disaster-recovery requirements for 2500 users. (You can base these requirements on established service level agreements—SLAs.)

Second, determine the failover scenarios that the user and cluster configurations require. At initial release to manufacturing (RTM), Enterprise Server limits SGs to four per cluster node; therefore, no more than four Exchange Server SGs can run on a cluster node at one time. This limit is particularly important during a cluster failover. When a failure occurs and an EVS moves to another node, that node's total SG limit is still four. Microsoft might remove this limitation in future Exchange Server releases, when technologies such as 64-bit memory addressing become available. In the meantime, you must keep this limitation in mind when you design clusters. For any cluster, you must configure failover rules to prevent any one node from exceeding the four-SGs-per-node limitation.

After you consider the per-node SG limitations, you can determine how many users per SG to configure. Remember, however, that one EVS can contain multiple SGs, and take care not to exceed the per-node limit during either normal or failover conditions. To simplify the 10,000-user cluster example, plan one EVS per node and one SG per EVS (a ratio of 1:1). This ratio means that one EVS and one SG will service all 2500 users on each node.
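
The arithmetic in this example, and the check against the four-SGs-per-node limit, can be sketched in a few lines of Python. This is only an illustrative model, not a tool that ships with Exchange 2000 or Win2K; the node names and the function names are hypothetical.

```python
# Illustrative planning model; node names and helper names are hypothetical.
MAX_SGS_PER_NODE = 4  # Exchange 2000 RTM limit described in the article


def users_per_node(total_users, node_count):
    """Evenly divide the user load across the cluster nodes."""
    return total_users // node_count


def sgs_after_failover(sgs_by_node, failed_node, target_node):
    """Return per-node SG counts after failed_node's SGs move to target_node."""
    result = dict(sgs_by_node)
    result[target_node] += result.pop(failed_node)
    return result


def nodes_over_limit(sgs_by_node):
    """List the nodes that would exceed the per-node SG limit."""
    return [node for node, count in sgs_by_node.items() if count > MAX_SGS_PER_NODE]


# The article's example: 10,000 users, four nodes, one EVS and one SG per node.
plan = {"NODE1": 1, "NODE2": 1, "NODE3": 1, "NODE4": 1}
print(users_per_node(10_000, 4))           # 2500
after = sgs_after_failover(plan, "NODE1", "NODE2")
print(after)                               # {'NODE2': 2, 'NODE3': 1, 'NODE4': 1}
print(nodes_over_limit(after))             # []
```

With the 1:1 ratio, a single failover leaves the surviving node well under the limit. The same check becomes more useful when you plan several SGs per EVS.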

Continuing to work backward, you can now begin to plan storage requirements and configuration for each cluster node. I suggest you use well-known best practices for maximizing disk I/O; see "Best Practices for Maximizing Disk I/O" for suggestions such as separating sequential and random I/O. Figure 1 shows the cluster design of the example four-node cluster that supports 10,000 users.

Installation and Configuration Requirements for New Clusters
The steps to install and set up an Exchange 2000 cluster appear relatively simple. However, to achieve successful implementation you must understand the process. The most notable change between clustering Exchange 2000 and clustering Exchange Server 5.5 is the placement of installed files. Exchange 2000 setup no longer needs to place files on a shared-cluster disk. Exchange 2000 can run in active/active mode on all members of a cluster, so each server in the cluster needs a local copy of the binary files.

During installation, Exchange 2000 Setup recognizes the cluster configuration and prompts you to install a local cluster-aware product version on each cluster node. After you accept this prompt, the installation proceeds as usual and extends the Active Directory (AD) schema if you haven't already updated the schema. (You should update the AD schema before you install the first Exchange 2000 server in your organization. For more information about extending the AD schema for Exchange 2000, see Tony Redmond, "Exchange 2000 Server over Active Directory," September 2000.) After the Exchange 2000 installation on the cluster node has completed, you must restart the node.

After installation on the first cluster node has completed, you can install Exchange 2000 on the subsequent cluster members' local systems. I recommend you try to select the same options for subsequent installations that you used for the first member installation. During subsequent installations, Exchange 2000 Setup will recognize the cluster configuration and will recognize that the Exchange Server organization already exists. Don't attempt to run several cluster-node installations at the same time.

You use the Microsoft Management Console (MMC) Exchange System Manager snap-in to manage EVSs; you must create the EVSs in the cluster before the EVSs are visible to Exchange System Manager (ESM). (For information about ESM, see "Exchange 2000's New Administrative Tool," http://www.win2000mag.com, InstantDoc ID 8059.) To create an EVS, you must manually configure a resource group for that EVS. Each resource group must contain (at a minimum) an IP Address resource, a Network Name resource (i.e., the name under which the EVS will appear in the Exchange 2000 organization and to clients on the network), and one or more disk resources that the EVS will use to store transaction logs, databases, and temporary files.

After you create a resource group, you need to add the Exchange System Attendant resource to the group. Cluster Administrator will use the Exchange Cluster Administrator DLL (i.e., excluadm.dll) to create all the other resources that you need for an EVS. Cluster Administrator's resource-creation wizard will prompt you for the Exchange System Attendant's resource dependencies; these dependencies must include all the disk resources that the EVS will utilize. (For an explanation of resource dependencies, see "Clustering Exchange 2000, Part 1.") You must configure these disk resources before you create the EVS; this requirement is the reason that storage design is a key component of cluster deployment.

The wizard will also prompt you for the data directory's path. (You can initially select the default drive and directory and later make changes by using ESM to reflect the physical volumes on which the transaction logs and database files will reside.) Next, if more than one administrative or routing group exists, the wizard will prompt you for the administrative group and routing group that will contain the EVS. These groups are available only when your Exchange 2000 deployment is running in native mode.
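
The prerequisites for an EVS resource group lend themselves to a simple pre-flight checklist. The following Python sketch models a planned resource group as plain data and reports what is missing before you add the System Attendant resource. It doesn't call any Cluster service API; the resource names, the "Physical Disk" type label, and the function names are assumptions made for illustration.

```python
# Planning checklist only; this does not query or modify a real cluster.
# Resource names and the "Physical Disk" type label are assumptions.
REQUIRED_TYPES = {"IP Address", "Network Name", "Physical Disk"}


def missing_resource_types(resource_group):
    """Return the required resource types that the planned group lacks."""
    present = {resource["type"] for resource in resource_group}
    return sorted(REQUIRED_TYPES - present)


def missing_disk_dependencies(resource_group, sa_dependencies):
    """Return disk resources not listed as System Attendant dependencies."""
    disks = {r["name"] for r in resource_group if r["type"] == "Physical Disk"}
    return sorted(disks - set(sa_dependencies))


# Hypothetical plan for one EVS resource group.
evs1_group = [
    {"name": "EVS1 IP", "type": "IP Address"},
    {"name": "EVS1 Name", "type": "Network Name"},
    {"name": "Disk L:", "type": "Physical Disk"},  # transaction logs
    {"name": "Disk M:", "type": "Physical Disk"},  # databases
]
planned_sa_dependencies = ["EVS1 Name", "Disk L:"]

print(missing_resource_types(evs1_group))                               # []
print(missing_disk_dependencies(evs1_group, planned_sa_dependencies))   # ['Disk M:']
```

In this hypothetical plan, the check flags that the database disk was left out of the System Attendant's dependencies, which is the kind of omission that is easier to catch on paper than in the wizard.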

Upgrading Existing Clusters
The Exchange 2000 documentation provides the procedure to upgrade your cluster configuration from Exchange Server 5.5 clusters to Exchange 2000 clusters. In some cases, such as hardware replacement or upgrade, you might not want to perform an in-place upgrade. For those scenarios, I recommend a mailbox-relocation strategy. Also keep in mind that you must migrate to Win2K before you can complete these procedures.

In-place cluster upgrade strategy. If cost or other considerations dictate an in-place cluster upgrade, carefully investigate all aspects of the upgrade process and test the upgrade in a lab environment to reduce the risks involved. Don't attempt this procedure without adequate preparation. You also need a back-out plan to ensure that you can return the cluster to operational status without losing user data if the procedure fails. When you perform an in-place upgrade from an Exchange Server 5.5 cluster to an Exchange 2000 cluster, the process removes the Exchange Server 5.5 system files from the shared-storage cluster drives and installs the Exchange 2000 system files on each cluster node's local drive. The procedure configures Exchange Server to use the existing (i.e., Exchange Server 5.5) Information Store (IS) files (i.e., priv.edb and pub.edb), which remain on the shared-storage resource. After you create the EVS, you can bring the EVS online and proceed to perform an in-place upgrade of the Exchange Server 5.5 user data (i.e., databases).

This process converts the databases from the Exchange Server 5.5 Extensible Storage Engine (ESE) format to the Exchange 2000 format. The process can take some time depending on the size of your Exchange Server 5.5 database files.

Mailbox-relocation strategy. This strategy adds an Exchange 2000 cluster to the same site as an existing Exchange Server 5.5 cluster. Despite its additional cost, most organizations find this strategy more attractive because of the easier, more gradual migration and lower risk.

Because Site Replication Service (SRS), which an Exchange 2000 server needs in order to interoperate within an Exchange Server 5.5 site, isn't supported in a cluster, another nonclustered Exchange 2000 server must already exist in the Exchange Server 5.5 site. You can then move user mailboxes and public folders from the Exchange Server 5.5 cluster to the Exchange 2000 cluster. To accomplish this step, you can use Microsoft Exchange Administrator, the MMC Active Directory Users and Computers snap-in, or a tool such as Exmerge. (Keep in mind that you will lose permissions when you use Exmerge.)

I prefer the mailbox-relocation strategy because it permits a phased or gradual migration from an Exchange Server 5.5 cluster environment to an Exchange 2000 cluster environment. In addition, because you aren't performing an actual upgrade, the procedure is less risky and requires a less complex back-out plan than an in-place upgrade requires. The disadvantage to the mailbox-relocation approach is that operating two parallel systems requires an additional investment in hardware, software, and support resources. Don't forget that all other pieces of your Exchange 2000 migration (e.g., Windows NT 4.0 and AD account migration, Active Directory Connector—ADC—deployment) must already be in place. However, after you transfer all user and public data from the Exchange Server 5.5 cluster to the Exchange 2000 cluster and update user profiles to the new server, you can decommission the Exchange Server 5.5 cluster and remove Exchange Server services—assuming that no other caveats (e.g., the Exchange Server 5.5 cluster is the first in the site) exist.

Increased Complexity
Many systems administrators and managers are initially intimidated by the thought of managing clustered Exchange Server deployments. Microsoft has gone to great lengths to make the administration of clustered Exchange Server machines similar to that of standalone servers. (Figure 2 shows ESM's view of a four-node cluster; the view is similar to ESM's view of a standalone server.) Exchange 2000's dependence on AD simplifies this task and makes necessary administration fairly intuitive. Adding and deleting users, managing storage, and other administrative tasks are no different in a clustered environment than in a nonclustered environment. Permissions and administrative rights are also the same for clustered and nonclustered Exchange 2000 organizations. However, despite the similarities, several aspects of clustering Exchange 2000 can add to the system's administration challenge.

Resource administration. The primary resource-management differences involve Exchange Server services. Cluster resources administration has a somewhat steeper learning curve than does noncluster service and resource administration, but mastering this skill is key to successfully managing clustered applications. Before you deploy Exchange 2000 clusters, ensure that your operations and systems management staff understands the idiosyncrasies of Win2K clusters and services and how Exchange 2000 runs on top of this platform.

Load planning. Another management challenge is cluster load planning. Three basic load-planning strategies exist. I call these strategies Maximum Load-All Nodes, Maximum Load-Standby Node (i.e., N+1 failover), and Balanced Cluster Loading (of which there are two variants: cascading and distributed failover). The limit on the number of SGs that each node supports somewhat simplifies the problem by reducing the matrix of possibilities. Which strategy you select will depend on several factors. You might select the Standby Node strategy if you simply want one node held in reserve to take over whenever an active node goes down. You might select the Balanced Cluster Loading strategy if you need to protect your EVSs from multiple node failures; the cascading variant can tolerate more than one node failure. Cost constraints, performance considerations, storage design, and management factors can also affect your choice. Each strategy has strengths and weaknesses, so take the time to understand and test each strategy to determine which one best suits your environment. The clustering section of the Exchange 2000 documentation discusses these load-planning strategies.
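
One way to compare load-planning strategies on paper is to simulate node failures against a planned owner order for each EVS. The Python sketch below uses a deliberately simplified model in which each EVS lands on the first surviving node in its own ordered list. Cluster service's actual failover behavior has more rules than this, and the EVS names, node names, and function names are hypothetical.

```python
from collections import Counter

# Simplified planning model, not a description of Cluster service internals.
MAX_SGS_PER_NODE = 4  # Exchange 2000 RTM limit described in the article


def place_evs(owner_order, failed_nodes):
    """Return the first surviving node in the EVS's owner order, or None."""
    for node in owner_order:
        if node not in failed_nodes:
            return node
    return None


def sg_load(evs_plan, failed_nodes=frozenset()):
    """Count the SGs that each surviving node would host."""
    load = Counter()
    for evs in evs_plan.values():
        node = place_evs(evs["owner_order"], failed_nodes)
        if node is not None:
            load[node] += evs["storage_groups"]
    return dict(load)


# Hypothetical cascading plan: each EVS fails over to the next node in the ring.
evs_plan = {
    "EVS1": {"storage_groups": 1, "owner_order": ["NODE1", "NODE2", "NODE3", "NODE4"]},
    "EVS2": {"storage_groups": 1, "owner_order": ["NODE2", "NODE3", "NODE4", "NODE1"]},
    "EVS3": {"storage_groups": 1, "owner_order": ["NODE3", "NODE4", "NODE1", "NODE2"]},
    "EVS4": {"storage_groups": 1, "owner_order": ["NODE4", "NODE1", "NODE2", "NODE3"]},
}

two_failures = sg_load(evs_plan, failed_nodes={"NODE1", "NODE2"})
print(two_failures)  # {'NODE3': 3, 'NODE4': 1}
print(all(count <= MAX_SGS_PER_NODE for count in two_failures.values()))  # True
```

Running the same plan with two or more SGs per EVS quickly shows which failure combinations would push a node past the limit.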

Backup and restore. A solid and consistent backup-and-restore strategy should be an integral part of your Exchange 2000 deployment. In a cluster environment, you must address additional configuration considerations when you choose a backup method that meets your high-availability requirements. Most of these considerations result from tape-backup software that isn't cluster-aware and relate to performing scheduled backups during regular cluster operation as well as during failover. Because online backup and restore is the recommended method for Exchange Server (the backup software communicates with the IS through an API), disaster-recovery scenarios for Exchange Server clusters can be more complex than for standalone Exchange Server machines. Whether the backup device and backup software are local to the server or remote from it also changes the complexities you face in clustered Exchange Server environments. You need to study clustering's effect on your backup-and-restore solution; you might need to develop specific policies and procedures for your clustered Exchange Server machines because you must plan for not only OS and Exchange Server information recovery but cluster recovery as well. To determine the best disaster-recovery scenario for Exchange Server running on Cluster service, consult with your backup software and hardware vendors regarding backup-and-restore support for Cluster service and Exchange 2000.

Clustering tends to become more complicated in direct proportion to the number of cluster nodes. Managing, allocating shared storage for, and planning failover for four-node clusters will be the most challenging scenario but will also offer the highest return on investment (ROI). You can cope with this challenge by gaining a thorough understanding of Win2K clustering and by planning and testing your Exchange 2000 cluster before putting it into use.

Best Practices
Most organizations are still in the midst of deploying Exchange 2000 and have yet to discover or fine-tune many best practices or rules of thumb. Testing Exchange 2000 clustering at Compaq and working with customers over the past year, I've learned many things about deploying Exchange 2000 clusters. To help make your deployment smoother, I leave you with some of the important lessons I've learned.

Delegate proper permissions. Use care when configuring Exchange Admin accounts and privileges as well as the Cluster service account (which must have Exchange Full Admin rights). Creating and managing an EVS require Exchange Full Admin account privileges. You can use the Exchange Delegation Wizard to grant these rights.

Use standardized and simplified IP addressing and naming. In a clustered scenario, the cluster nodes and the services they host require IP addresses and unique names. Cluster service requires static IP addresses for the cluster, nodes, and services. (In other words, you can't use DHCP to assign IP addresses.) In addition, each EVS requires a unique IP address. You must preallocate IP addresses to all nodes and services (i.e., EVSs) before you set up and install Exchange 2000. Structure these addresses to permit simplified configuration and management of nodes and services. Likewise, name nodes and EVSs to permit simplified configuration and management. Table 1 shows a good IP addressing and naming strategy for a four-node Exchange 2000 cluster. Note that the IP addresses for the node name and EVS name are closely related.
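
A structured addressing plan of this kind is easy to generate and check before installation. The following Python sketch builds one possible plan in which node N and EVS N share the same final digit. The subnet, the offsets, and the EXNODE/EXEVS name prefixes are hypothetical values chosen for illustration, not the contents of Table 1.

```python
import ipaddress

# Hypothetical scheme: subnet, offsets, and name prefixes are illustrative only.


def build_address_plan(subnet, node_count, node_offset=10, evs_offset=20):
    """Preallocate static addresses so that node N and EVS N share a final digit."""
    base = ipaddress.ip_network(subnet).network_address
    plan = []
    for number in range(1, node_count + 1):
        plan.append({
            "node_name": f"EXNODE{number}",
            "node_ip": str(base + node_offset + number),
            "evs_name": f"EXEVS{number}",
            "evs_ip": str(base + evs_offset + number),
        })
    return plan


plan = build_address_plan("10.1.1.0/24", node_count=4)
for row in plan:
    print(row["node_name"], row["node_ip"], row["evs_name"], row["evs_ip"])

# Every node and EVS needs its own address, so confirm that none repeat.
addresses = [row["node_ip"] for row in plan] + [row["evs_ip"] for row in plan]
print(len(addresses) == len(set(addresses)))  # True
```

The first row of this plan pairs EXNODE1 at 10.1.1.11 with EXEVS1 at 10.1.1.21. A real plan also needs an address and name for the cluster itself.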

Configure resource ownership and failover. When you configure EVSs, Cluster Administrator will automatically configure any Exchange 2000 cluster node as a Possible Owner for each resource. However, if you create resources before a node has joined the cluster or before you've installed Exchange 2000 on the node, the EVS's resource properties won't list that node as a Possible Owner. In this situation, you must manually configure the resource properties to permit the resources to fail over to that node. Be careful when you configure hardware (i.e., Disk and Network), addressing (i.e., IP and Network Name), and Exchange Server (i.e., System Attendant) resources to ensure that each resource lists every node that should be able to host it as a Possible Owner. In addition, when you configure failover and failback scenarios, you must list all nodes in the Preferred Owners dialog box that Cluster Administrator provides in each resource group's Properties, in the order that your chosen failover scenarios and load strategy require. Doing so will ensure proper failover and failback operations.
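
If you record each resource's Possible Owners in a worksheet, a short script can find the gaps that arise when a node joins the cluster late. The Python sketch below works on hand-entered data rather than on a live cluster; the resource names, node names, and function name are hypothetical.

```python
# Worksheet check on hand-entered data; this does not query a live cluster.


def missing_possible_owners(resources, cluster_nodes):
    """Map each resource to the cluster nodes absent from its Possible Owners."""
    gaps = {}
    for name, owners in resources.items():
        missing = sorted(set(cluster_nodes) - set(owners))
        if missing:
            gaps[name] = missing
    return gaps


cluster_nodes = ["NODE1", "NODE2", "NODE3", "NODE4"]

# Hypothetical state: NODE4 joined after the EVS1 resources were created.
evs1_resources = {
    "EVS1 IP": ["NODE1", "NODE2", "NODE3"],
    "EVS1 Name": ["NODE1", "NODE2", "NODE3"],
    "Disk L:": ["NODE1", "NODE2", "NODE3", "NODE4"],
    "EVS1 System Attendant": ["NODE1", "NODE2", "NODE3"],
}

for resource, nodes in missing_possible_owners(evs1_resources, cluster_nodes).items():
    print(f"{resource}: add {', '.join(nodes)} as a Possible Owner")
```

An empty result means that every listed resource can fail over to every node in the worksheet.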

Take care when removing EVSs and binaries from a cluster. When you remove Exchange Server from a cluster, be careful not to interfere with other nodes' operations. When removing an EVS, take the cluster group offline first. Next, delete the Exchange Server resources. When you remove the System Attendant resource, Cluster Administrator will remove all other resources (according to resource dependency). After AD has been automatically updated with the configuration change, the EVS will no longer be visible in ESM. Finally, to remove the Exchange binary files from the cluster node, you must run the Exchange 2000 setup program and select the Remove option. The program will prompt you to choose whether to remove the Exchange Server cluster resources; remove these resources only when the node that you're removing is the last cluster node running Exchange Server services.

Design storage before you configure the cluster. Because of Exchange 2000's support for active/active clustering and multiple SGs and databases, storage design can be complex. Consider all aspects before you configure your cluster. In a cluster, Exchange 2000 SGs are a subset of an EVS, so each SG must fail over with the EVS. This requirement affects storage design. If you choose—based on performance considerations—to allocate separate physical storage arrays for transaction log files and database files, each SG will have a minimum of two arrays (one for logs and one for databases) that must provide the necessary independence and granularity to facilitate failover. You must carefully consider the implications of granularity, failover, and EVS-to-SG mapping. Also consider that Exchange 2000 limits the number of SGs to four per node, both before and after failover. Failure to follow this restriction might result in EVS problems or failures. The best recommendation is to start with SLAs in mind and design storage to meet those service levels.
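
To see how quickly the array count grows, you can enumerate the minimum storage layout from your EVS-to-SG mapping. The following Python sketch assumes one array for logs and one for databases per SG, as described above; the EVS names and the function name are hypothetical, and a real design might need more arrays per SG.

```python
# Minimum-layout sketch; EVS names are hypothetical.


def storage_layout(sgs_per_evs, separate_logs=True):
    """List the arrays each SG needs: one for logs and one for databases when separated."""
    roles = ["logs", "databases"] if separate_logs else ["logs+databases"]
    layout = []
    for evs, sg_count in sgs_per_evs.items():
        for sg_number in range(1, sg_count + 1):
            for role in roles:
                layout.append((evs, f"SG{sg_number}", role))
    return layout


# The article's example: four EVSs with one SG each.
layout = storage_layout({"EVS1": 1, "EVS2": 1, "EVS3": 1, "EVS4": 1})
print(len(layout))  # 8
for evs, sg, role in layout:
    print(evs, sg, role)
```

Because each SG must fail over with its EVS, every array in the list has to belong to the resource group of the EVS named beside it.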

New Opportunities and Challenges
Exchange 2000 offers greatly improved clustering capabilities, making clustering a viable option for significantly increasing availability and facilitating server consolidation. Although the new capabilities are worth investigating, they also create additional complexity. As part of your deployment planning, you need to evaluate clustering along with other new Exchange 2000 capabilities. By starting with a solid understanding of how Win2K implements and Exchange 2000 leverages Cluster service, you'll have a foundation on which to build a successful Exchange 2000 cluster.