A Windows NT 4.0 cluster is a two-node failover solution that uses a shared-nothing model (i.e., each server owns and manages its own devices). A cluster isn’t a fault-tolerant solution but rather a high-availability solution; that is, it minimizes downtime instead of eliminating it.

The two nodes in a cluster work in either active/active mode (i.e., both servers perform meaningful work all the time) or active/passive mode (i.e., one server either sits idle or does other work but never runs the same application at the same time as the other server). Some applications can use the active/active mode (e.g., Microsoft SQL Server and File Services), but others must use an active/passive configuration (e.g., Exchange Server).

A key part of an NT cluster is the shared-storage subsystem. The external storage devices must be based on SCSI because the cluster uses SCSI commands to reserve and release devices and to reset SCSI buses. The connections to the devices can be based on either SCSI or SCSI over Fibre Channel.

Resources in a cluster can be physical disks, IP addresses, network names, file/print shares, generic services, or resources defined by a custom DLL. A custom DLL makes a server application cluster-aware: the application can report status back to the cluster, respond to requests to be brought online or taken offline, and respond more accurately to IsAlive and LooksAlive requests from Microsoft Cluster Server (MSCS).
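To make the difference between the two health checks concrete, here is a minimal conceptual sketch in Python. It is not the MSCS Resource API; the class, its attributes, and the check function are hypothetical names invented for illustration, and the "every 12th tick" ratio is an arbitrary assumption. The idea it models is that a LooksAlive-style check is cheap and frequent, whereas an IsAlive-style check is more thorough and runs less often.

```python
class GenericServiceResource:
    """Hypothetical stand-in for a cluster-aware resource (not the MSCS API)."""

    def __init__(self, name):
        self.name = name
        self.process_running = False
        self.answers_requests = False

    def online(self):
        # Bring the resource online: the process starts and serves requests.
        self.process_running = True
        self.answers_requests = True

    def offline(self):
        # Take the resource offline.
        self.process_running = False
        self.answers_requests = False

    def looks_alive(self):
        # Cheap check: is the process present at all?
        return self.process_running

    def is_alive(self):
        # Thorough check: the process exists and actually responds.
        return self.process_running and self.answers_requests


def check(resource, tick, is_alive_every=12):
    """Run the cheap check on most ticks and the thorough check every Nth tick."""
    if tick % is_alive_every == 0:
        return resource.is_alive()
    return resource.looks_alive()


# A hung service passes the cheap check but fails the thorough one.
service = GenericServiceResource("ExampleService")
service.online()
service.answers_requests = False  # simulate a hang
cheap_result = check(service, tick=1)
thorough_result = check(service, tick=12)
```

In this sketch a hung application still passes the cheap check, which is why an application that supplies its own thorough check can be monitored more accurately than one treated as a generic service.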

The resources can have dependencies on each other (e.g., the network name can be dependent on an IP address and a file share can be dependent on a network name). The cluster brings resources online and takes them offline in an order that respects these dependencies. The cluster manages resources collectively in groups and fails each group over to the other node as a single entity.
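Dependency ordering of this kind is a topological sort. The following Python sketch shows one way to derive an online order and an offline order for a group; it illustrates the concept only and is not how MSCS is implemented. The group contents are hypothetical, and the file share's dependency on the physical disk is an assumption added for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical resource group: each resource maps to the resources it depends on.
dependencies = {
    "Physical Disk": set(),
    "IP Address": set(),
    "Network Name": {"IP Address"},
    # The disk dependency below is an assumption made for this example.
    "File Share": {"Network Name", "Physical Disk"},
}


def online_order(deps):
    """Return an order in which every resource follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def offline_order(deps):
    """Taking resources offline reverses the online order."""
    return list(reversed(online_order(deps)))


bring_up = online_order(dependencies)
take_down = offline_order(dependencies)
```

Whatever exact order the sort produces, the IP address always comes online before the network name, and the network name before the file share; the offline order is the reverse.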

The cluster uses a special set of Registry keys as its configuration database. The keys reside under HKEY_LOCAL_MACHINE\CLUSTER and are stored on disk in the file C:\winnt\cluster\clusdb.

The log file for the clusdb database resides on a special resource called the quorum resource. The quorum resource can be on any shared disk device, but I recommend that you use separate disks or a 100MB hardware partition for this device. (The cluster stops operating if the disk device fills up.) The cluster also uses the quorum resource as a tiebreaker when the cluster nodes can’t communicate with each other over the network. In that case, the node that owns the quorum resource continues cluster operations (SCSI disk arbitration determines quorum resource ownership). For further information about NT clustering, see the resources listed in the sidebar "For More Clustering Information."
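The tiebreaker role of the quorum resource can be modeled in a few lines of Python. This is a deliberately simplified sketch: the class, the function, and the node names are hypothetical, and real SCSI arbitration also involves bus resets and timing that are omitted here. It shows only the outcome, namely that exactly one node ends up holding the reservation and carries on.

```python
class QuorumDisk:
    """Hypothetical model of a shared disk that one node at a time can reserve."""

    def __init__(self):
        self.owner = None

    def reserve(self, node):
        # Succeeds only if the disk is free or already held by this node.
        if self.owner in (None, node):
            self.owner = node
            return True
        return False

    def release(self, node):
        # Only the current owner can release the reservation.
        if self.owner == node:
            self.owner = None


def resolve_partition(disk, nodes):
    """Each node that lost contact tries to reserve the quorum disk in turn.

    Returns (survivor, stopped): the node that keeps running the cluster
    and the list of nodes that must stop cluster operations.
    """
    survivor = None
    stopped = []
    for node in nodes:
        if disk.reserve(node):
            survivor = node
        else:
            stopped.append(node)
    return survivor, stopped


# NODE_B already owns the quorum disk when the network link fails.
disk = QuorumDisk()
disk.reserve("NODE_B")
survivor, stopped = resolve_partition(disk, ["NODE_A", "NODE_B"])
```

Because only one reservation can exist at a time, both nodes can never continue independently against the same shared data.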