According to speakers at the Microsoft Enterprise Conference (MEC) 2001 in Orlando, Florida (September 2001), memory fragmentation is the main reason Microsoft Exchange 2000 Server doesn't take full advantage of Windows 2000 clustering. Before Exchange 2000's launch, many enterprises eagerly anticipated the ability to build large clusters that take advantage of the latest hardware and move away from the expensive active-passive clustering mode that Exchange Server 5.5 supports. In an active-passive cluster, one physical system or node always actively supports users while its mate remains passive, waiting to be brought into action if the active system fails. Exchange Server 5.5, Enterprise Edition supports active-passive clustering on Windows NT 4.0, but this clustering solution is expensive because you need duplicate hardware and multiple licensed copies of the OS, the application, and any associated third-party utilities (e.g., backup or antivirus programs).

Exchange 2000 supports active-active two-node clustering, where all operational nodes in the cluster can host users concurrently, and store partitioning, which lets you swap storage groups (SGs) between active nodes. System designers want clustering to provide high levels of system resilience and availability and to consolidate several servers into a smaller set of large clusters. Exchange 2000 Service Pack 1 (SP1) extended clustering functionality with support for four-node clustering based on Win2K Datacenter.

Exchange 5.5 and later versions use Dynamic Buffer Allocation (DBA) to manage memory. Administrators often have heart palpitations when they see store.exe's memory usage grow to where Exchange seems to take over the system. This memory usage is by design: DBA balances the demands of Exchange against the needs of other applications while keeping as many Store buffers and as much data in memory as possible. On servers that run only Exchange, the Store reserves large amounts of memory because no other applications compete for this resource.

During typical operation, the Store allocates virtual memory in various sizes to map mailboxes and other structures. The Store sometimes allocates memory in contiguous chunks, such as the approximately 10MB of memory required to mount a database, but as time goes by, Windows might not be able to give the Store enough contiguous memory because virtual memory has become fragmented. This fragmentation is similar to the fragmentation that occurs on disks and usually doesn't cause many problems—except during cluster state transitions.
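The difference between total free memory and contiguous free memory is easy to see in a toy model. The sketch below is not how Windows or the Store tracks virtual memory; the 64MB address space, the allocation layout, and the function names are assumptions made up for illustration. Only the roughly 10MB mount request comes from the figure above.

```python
MB = 1024 * 1024


def free_regions(address_space_size, allocations):
    """Return the (start, length) gaps left between allocations.

    allocations is an iterable of non-overlapping (start, length) pairs.
    """
    regions = []
    cursor = 0
    for start, length in sorted(allocations):
        if start > cursor:
            regions.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < address_space_size:
        regions.append((cursor, address_space_size - cursor))
    return regions


def can_allocate_contiguous(regions, request):
    """True if any single free region is large enough for the request."""
    return any(length >= request for _, length in regions)


# Hypothetical layout: a 64MB address space in which a 2MB allocation
# sits at the end of every 8MB stretch, leaving eight separate 6MB gaps.
address_space = 64 * MB
allocations = [(i * 8 * MB + 6 * MB, 2 * MB) for i in range(8)]

regions = free_regions(address_space, allocations)
total_free = sum(length for _, length in regions)
largest_free = max(length for _, length in regions)
mount_request = 10 * MB  # approximate size cited in the text

print(total_free // MB, "MB free in total")
print(largest_free // MB, "MB in the largest contiguous region")
print("mount request fits:", can_allocate_contiguous(regions, mount_request))
```

In this layout, three quarters of the address space is free, yet no single region can satisfy the mount request. That is the condition a heavily loaded Store can reach over time.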

During a cluster state transition, the active SGs on a failed node must move to one or more other nodes in the cluster. SGs consist of databases, so Exchange must initialize the SG and then mount the databases so that users can access their mailboxes. On a heavily loaded cluster, the Store might not be able to mount the databases because not enough contiguous memory is available. Thus, a cluster state transition occurs, but the Store is essentially brain-dead because the databases are unavailable.

This situation occurs only on heavily loaded systems, but server consolidation and the building of large, highly resilient systems are why system designers consider clusters in the first place. Microsoft recognized the problem and started advising customers to limit the number of supported Messaging API (MAPI) clients (1000 in the original Exchange 2000 release, and 1500 in SP1 and perhaps SP2) when running in active-active clustering mode.

Microsoft also has advice about how active the clusters should be. Microsoft recommends that you keep a passive node available whenever possible, meaning that a two-node cluster will run in active-passive mode and a four-node cluster will be active on three nodes and passive on the fourth. Of course, this approach is most valid when the cluster supports the heavy loads that clients, connectors, or other processing generates. Clusters that support small client numbers and perhaps run only one SG with a few databases on each active node usually operate successfully in a fully active manner because virtual memory fragmentation is less likely to occur.

Because a fresh node is available in an active-passive configuration, clusters can support higher numbers of users per active node, perhaps up to 5000 mailboxes per node. The exact number of users a system will support depends on the system configuration, user load, and careful monitoring of virtual memory on the active nodes. You’ll need to work through a sizing exercise to determine the optimum production configuration. For more information about monitoring clustered systems, read the Microsoft white paper about Exchange 2000 clustering.
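As a starting point for that sizing exercise, the arithmetic behind the passive-node recommendation can be sketched in a few lines. The function name and parameters are hypothetical, and the per-node figure passed in below is only the "perhaps up to 5000" ceiling mentioned above, not a supported limit. Real numbers have to come from your own load testing and virtual memory monitoring.

```python
def plan_cluster(node_count, mailboxes_per_active_node, passive_nodes=1):
    """Split a cluster into active and passive nodes and estimate a ceiling.

    mailboxes_per_active_node is an assumption supplied by the caller;
    the result is an upper bound for planning, not a guarantee.
    """
    if node_count < 2:
        raise ValueError("a cluster needs at least two nodes")
    if not 0 <= passive_nodes < node_count:
        raise ValueError("at least one node must remain active")
    active_nodes = node_count - passive_nodes
    return {
        "active_nodes": active_nodes,
        "passive_nodes": passive_nodes,
        "mailbox_ceiling": active_nodes * mailboxes_per_active_node,
    }


# Two-node cluster run active-passive, and four-node cluster run
# with three active nodes and one passive node.
two_node = plan_cluster(2, 5000)
four_node = plan_cluster(4, 5000)
print(two_node)
print(four_node)
```

With one node held in reserve, the two-node layout has a single active node and the four-node layout has three, so the planning ceiling scales with the active nodes rather than with the total node count.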

Microsoft is trying to fix the memory fragmentation problems, but overall, Exchange 2000 clustering has been a disappointment. Similarly, Exchange 5.5 clustering promised a lot but ended up being an expensive solution for the value it delivered. The problems have convinced many who considered Exchange 2000 clusters to look at alternatives, notably standalone servers that share a Storage Area Network (SAN). In this environment, the major investment goes into building resilience through storage rather than clusters. A server problem will still affect users, but the theory is that most Exchange problems are disk related, so if you take advantage of the latest SAN technology to provide the highest degree of storage reliability, you'll have a reliable and robust solution without clustering. SAN technology also offers some long-term advantages because you can treat servers as discardable items, swapping them for newer computers, while your databases stay intact and available in the SAN.

I still feel optimistic about Exchange clusters. If you plan and manage clusters with the appropriate level of knowledge about the hardware, OS, and application environment, clusters work fine. I'm content to have my own mailbox on an Exchange 2000 cluster. But too many hiccups have occurred along the road, and clusters haven’t achieved the promise that Microsoft originally laid out. Developers continue to improve the technology, but in the interim, anyone interested in clustering Exchange servers needs to consider all options before making a final decision.