Distribute application workload among your servers

The Network Load Balancing (NLB) feature in Windows 2000 Advanced Server and Win2K Datacenter Server is a highly integrated version of the Windows NT Load Balancing Service (WLBS). NLB lets as many as 32 servers share the load of IP-based applications, such as Web, FTP, and VPN applications. NLB also lets you add and remove servers from an NLB cluster on the fly, and the feature then redistributes client workload among active members of the cluster. (For more information about load balancing IP-based applications, see Tao Zhou, "Web Server Load Balancers," April 2000.)

NLB is compatible with NT 4.0's WLBS but has several important differences. Win2K implements NLB as a driver between the NIC driver and the TCP/IP stack, and you activate NLB by selecting a check box at My Network Places, Properties, LAN Connection Properties on each server in the NLB cluster. (In contrast, NT 4.0 implements WLBS as a virtual NIC.) In addition, NLB works as soon as you configure it and doesn't require a reboot.

NLB uses a media access control (MAC) address-translation model to distribute an IP workload among cluster members. NLB uses a cluster MAC address in place of the preassigned MAC addresses that all NICs have. An Ethernet NIC driver operating at the Open System Interconnection (OSI) model's Layer 2 (i.e., the data-link layer) listens for its MAC address, forwards its packets up the protocol stack, and ignores other packets.

In Unicast mode, which disables network traffic between clustered hosts, NLB converts the cluster's Virtual IP (VIP) address, which you must specify as the same for all servers in the cluster, to a MAC address. NLB then replaces each server's MAC address with this new cluster MAC address. All servers in the cluster use the same MAC address, so every server receives and passes up the TCP/IP protocol stack every packet received at the cluster's IP address.
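As a rough illustration of that conversion, the sketch below assumes the commonly documented NLB convention in which the unicast cluster MAC address is 02-BF followed by the four octets of the VIP. The article doesn't spell out the format, and the function name and sample address are hypothetical.

```python
import ipaddress

def unicast_cluster_mac(vip: str) -> str:
    """Derive a unicast cluster MAC address from the cluster VIP.

    Assumption: the 02-BF-w-x-y-z layout, where w.x.y.z is the VIP.
    """
    octets = ipaddress.IPv4Address(vip).packed  # the VIP as four raw bytes
    return "-".join(f"{b:02X}" for b in (0x02, 0xBF, *octets))

print(unicast_cluster_mac("10.0.0.10"))  # 02-BF-0A-00-00-0A
```

Because every server derives the address from the same VIP, every server ends up with the same MAC address without any coordination.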

The next stop up the TCP/IP protocol stack from the NIC driver is the NLB driver, which uses a hashing algorithm to decide which server in the cluster will accept and process the packet. The NLB driver inspects the sender's IP address and the port to which the packet is addressed. Then, the NLB driver takes into account port rules and which servers in the cluster are active for the port rules. The NLB driver decides whether to send the packet up the TCP/IP stack to the application on the driver's server or to ignore the packet because another server in the cluster will accept it.
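The filtering decision can be pictured with a minimal sketch. It assumes a stand-in hash (SHA-256) and ignores port rules; NLB's actual hashing algorithm isn't described in this article, and the function names and addresses are hypothetical. The point is only that every host runs the same calculation on the same packet and exactly one host accepts it.

```python
import hashlib

def owner(src_ip: str, dst_port: int, active_hosts: list[int]) -> int:
    """Map a (sender IP, destination port) pair to one active host ID."""
    digest = hashlib.sha256(f"{src_ip}:{dst_port}".encode()).digest()
    ordered = sorted(active_hosts)
    return ordered[int.from_bytes(digest[:4], "big") % len(ordered)]

def should_accept(my_id: int, src_ip: str, dst_port: int,
                  active_hosts: list[int]) -> bool:
    """Each host runs this locally; no host asks another for the answer."""
    return owner(src_ip, dst_port, active_hosts) == my_id

hosts = [1, 2, 3]
decisions = [should_accept(h, "192.168.1.25", 80, hosts) for h in hosts]
print(decisions)  # one True and two False; which host wins depends on the hash
```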

NLB's Architectural Limitations
The load-balancing process requires the NLB administrator to set up the server system in a specific way. The cluster's VIP address applies to all servers in the cluster, so all servers must be on the same IP subnet. Every server in a cluster needs to receive all network packets that the cluster receives, so all servers must be part of the same collision domain (i.e., servers need to connect to the same Ethernet hub) or must connect to a switch that will broadcast all packets to all ports on the switch. To force the switch to broadcast all packets to all ports, you need to configure NLB to prevent the switch from learning the cluster's MAC address.

A server's TCP/IP protocol stack intercepts packets addressed to one of the server's own MAC addresses, so the stack never sends those packets to the network. Because all cluster adapters share one MAC address in Unicast mode, cluster members can't reach one another through those adapters. NLB's Unicast mode therefore requires that each server in the NLB cluster have two network adapters: a cluster adapter for load-balanced application traffic and a second adapter for server-specific and intracluster traffic. The second adapter's unique MAC address enables traffic between cluster members.

A server can't be a member of an NLB cluster and a server cluster at the same time. (Win2K server clusters support application failover for cluster-aware applications running on servers that share a common storage system.) Token-Ring networks don't support NLB.

NLB doesn't monitor applications to make sure they're running. At 1-second intervals, each server in an NLB cluster broadcasts through its cluster adapter a heartbeat message that the other servers in the cluster monitor. As long as a server broadcasts its heartbeat message, the server remains part of the cluster and must process its share of the workload, even if the application (e.g., Microsoft IIS) that is processing the packets fails. You need to use a separate application-monitoring tool to detect whether a load-balanced application has failed. (For information about using Perl scripts to monitor Web applications, see Curt Aubley and Troy Landry, "Monitoring Win2K Web Site Performance and Availability," http://www.win2000mag.com/, InstantDoc ID 8857.)

Configuring and Testing NLB
Win2K AS and Datacenter install NLB when you install a LAN connection. Understanding how NLB works and planning a network configuration to support NLB are the most difficult parts of NLB's setup process. After you set up the physical network configuration and supporting DNS (and WINS, if needed) entries, you can enable NLB for a NIC and configure each server in your cluster.

To test NLB, I used four Hewlett-Packard (HP) NetServer LH 6000 systems. Each system had six 550MHz Pentium III processors with 2MB of L2 cache and 1GB of SDRAM. After installing Win2K AS on each server, I copied to each server the Web application I had selected for testing. Each server had two HP 10/100 NICs, and I picked the first NIC (i.e., Local Area Connection) to connect to a segment of the Windows 2000 Magazine Lab's benchmark network and to serve as the load-balanced cluster adapter. I connected the second NIC in each server (i.e., Local Area Connection 2) to a second network segment.

To test NLB's Unicast mode, I selected the Network Load Balancing check box for my Local Area Connection NIC, then opened the Network Load Balancing Properties page (which you reach from LAN Connection Properties) and supplied the information that NLB needs to operate. The NLB Properties page has three tabs—Cluster Parameters, Host Parameters, and Port Rules. The Cluster Parameters tab, which Figure 1 shows, lets you identify the cluster and operating mode. The Host Parameters tab, which Figure 2 shows, lets you specify server-specific information. The Port Rules tab, which Figure 3, page 130, shows, lets you use a TCP or UDP port number to specify the application services that NLB will load-balance.

On the Cluster Parameters tab, I entered the IP address and fully qualified DNS name for the cluster. Application users use the cluster address and name to access a load-balanced application, so you need to enter the same address and name for each server in the NLB cluster. The Cluster Parameters tab refers to the IP address as the Primary IP address, but NLB users also refer to the address as the VIP address. To let users address the cluster by name, you need to create a DNS host entry for the cluster name and VIP. The Cluster Parameters tab also lets you enable multicast mode and remote control support.

On each server's Host Parameters tab, I entered a dedicated IP address and a unique host ID. NLB uses the unique host ID to prioritize servers in the cluster. The active server that has the highest priority (i.e., lowest number) takes responsibility for IP traffic that isn't load-balanced (i.e., IP traffic traveling to ports that port rules don't define). You can use the dedicated IP address to access individual servers in the cluster from nonclustered systems.

On the Port Rules tab, you define which IP applications NLB will load-balance. TCP and UDP port numbers direct packets to a particular application on a server. For example, Web servers usually listen for packets addressed to TCP port 80. For NLB to work properly, you need to configure every server in the cluster with the same set of port rules so that all servers load-balance the same set of applications. In my initial tests, I left the default port rule in place, balancing ports 0 to 65535 for TCP and UDP protocols with single IP address-based affinity.

As soon as you click OK on the Network Load Balancing Properties page, NLB begins converging the cluster, a process in which a server advertises its presence and configuration to other cluster members and all members agree how to share the load. To complete the server's NLB configuration, you add the VIP address as a second IP address under the Advanced section of TCP/IP properties for the load-balanced LAN connection.

To test load balancing, I installed an Active Server Pages (ASP)-based Web application to run from each of three clustered NetServer LH 6000 Web servers, and I used a Compaq ProLiant ML530 server to hold the applications' Microsoft SQL Server databases. I used Quest Software's Benchmark Factory to simulate loads from 16 client systems. My tests demonstrated that NLB allocated the workload among the three servers, but not equally. The disparity among server workloads happened because I configured NLB for single affinity and all traffic originated from only 16 computers, all of which were on one Class C network. When I set Affinity to None, the workload became uniform across cluster members.
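The uneven split is easy to reproduce in a toy simulation. The sketch below is not NLB's algorithm: it uses SHA-256 as a stand-in hash, and the client addresses, port range, and connection counts are invented for illustration. With single affinity, each client's entire load lands on one server, and 16 clients can't divide evenly among three servers. With no affinity, each connection is placed separately.

```python
import hashlib
from collections import Counter

def pick(key: str, n_hosts: int) -> int:
    # Stand-in hash; NLB's real algorithm is not reproduced here.
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_hosts

clients = [f"192.168.1.{i}" for i in range(1, 17)]  # 16 clients on one Class C network
src_ports = range(1024, 1224)                        # 200 connections per client

# Single affinity: only the client IP address feeds the hash.
single = Counter(pick(ip, 3) for ip in clients for _ in src_ports)
# Affinity None: the client IP address and port both feed the hash.
none = Counter(pick(f"{ip}:{p}", 3) for ip in clients for p in src_ports)

print("single affinity:", [single.get(h, 0) for h in range(3)])
print("affinity none:  ", [none.get(h, 0) for h in range(3)])
```

The single-affinity counts always come out in multiples of 200, so at least one server carries a full client's worth of extra load. The no-affinity counts cluster near one-third of the total each.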

Configuring Multicast Mode
NLB's multicast mode allows network traffic between clustered hosts. Multicast mode leaves the NIC's original MAC address in place and assigns a multicast MAC address for cluster traffic. Address Resolution Protocol (ARP) resolves the cluster adapter's dedicated IP address to the NIC's original MAC address and resolves the cluster's VIP to the multicast MAC address. For the ARP resolution process to work properly, the router upstream from the cluster must resolve the cluster IP address to the cluster's multicast MAC address.

Cisco routers don't accept an ARP reply that resolves a non-multicast (unicast) IP address to a multicast MAC address, so they fail to cache the cluster's multicast MAC address. To work around this problem, you can add a static entry to the router's ARP cache, mapping the cluster's VIP address to the cluster's multicast MAC address. According to Microsoft documentation, multicast mode is NLB's preferred operation mode. However, because of the problem with Cisco routers, multicast isn't NLB's default operation mode.
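The static entry can be generated rather than typed by hand. The sketch below rests on two assumptions that you should check against your own environment: that the multicast cluster MAC address is 03-BF followed by the VIP's four octets, and that the router accepts the IOS-style `arp <ip-address> <mac-address> ARPA` configuration line. The function names and sample VIP are hypothetical.

```python
import ipaddress

def multicast_cluster_mac(vip: str) -> str:
    """Assumed layout: 03-BF followed by the VIP's four octets, as 12 hex digits."""
    octets = ipaddress.IPv4Address(vip).packed
    return bytes((0x03, 0xBF, *octets)).hex()

def cisco_static_arp(vip: str) -> str:
    """Format an IOS-style static ARP line using dotted-hex MAC notation."""
    mac = multicast_cluster_mac(vip)
    dotted = ".".join(mac[i:i + 4] for i in range(0, 12, 4))
    return f"arp {vip} {dotted} ARPA"

print(cisco_static_arp("10.0.0.10"))  # arp 10.0.0.10 03bf.0a00.000a ARPA
```

Before applying a line like this, compare the generated MAC address with the one the cluster actually reports.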

To reconfigure the cluster from unicast mode to multicast mode, select the Multicast support check box on the Cluster Parameters tab of each cluster host and reboot the cluster servers. When I reconfigured the first server in the cluster, a "duplicate IP address" message informed me that the new multicast MAC address for the cluster's VIP conflicted with the unicast MAC address still in place on the other servers in the cluster. Shutting down and rebooting all the servers in the cluster solved the problem.

Load-Balanced Applications
NLB's Port Rules tab offers you several options (e.g., Filtering mode, Affinity) to configure load balancing for a particular application. Filtering mode determines whether NLB will allocate an application workload to multiple hosts or one host. When you select Multiple hosts filtering, NLB allocates application traffic among all the servers in the cluster in which the rule is active.

To override the default equal distribution of the workload and cause NLB to send more work to specified servers, configure the Load weight option on the Port Rules tab. NLB will sum the load weight for all active servers and allocate the workload in direct proportion to each server's share of the sum. For example, if you distribute a workload between two servers, you could set the first server's load weight to 50 and the second server's load weight to 100, and NLB would send one-third of the workload to the first server and two-thirds of the workload to the second server.
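The arithmetic generalizes to any number of servers. This short sketch, with hypothetical server names, computes each server's share as its load weight divided by the sum of the active servers' weights.

```python
from fractions import Fraction

def workload_shares(weights: dict[str, int]) -> dict[str, Fraction]:
    """Return each active server's fraction of the workload."""
    total = sum(weights.values())
    return {host: Fraction(w, total) for host, w in weights.items()}

shares = workload_shares({"server1": 50, "server2": 100})
print(shares["server1"], shares["server2"])  # 1/3 2/3
```

If a third server with a load weight of 50 joined, the sum would become 200 and the shares would shift to one-quarter, one-half, and one-quarter.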

When you load-balance an application among multiple hosts, Affinity causes NLB to send all traffic from one IP address (Single Affinity) or from any address in the same Class C IP address space (Class C Affinity) to the same cluster host. These affinity options let one server track Web session details. Affinity's other setting, None, causes NLB to use the source IP address and the TCP or UDP port number to decide which cluster host will accept the packet.
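One way to picture the three settings is as three different keys fed to the same placement decision. The sketch below is illustrative only: the function and setting names are hypothetical, and it assumes that the port used when Affinity is None is the client's source port.

```python
import ipaddress

def affinity_key(src_ip: str, src_port: int, affinity: str) -> tuple:
    """Return the packet fields that decide which host accepts the packet."""
    if affinity == "none":
        return (src_ip, src_port)  # every connection can land on a different host
    if affinity == "single":
        return (src_ip,)           # one client always reaches the same host
    if affinity == "class_c":
        # All clients that share the first three octets reach the same host.
        network = ipaddress.ip_network(f"{src_ip}/24", strict=False)
        return (str(network.network_address),)
    raise ValueError(f"unknown affinity: {affinity}")

print(affinity_key("192.168.1.25", 3051, "single"))   # ('192.168.1.25',)
print(affinity_key("192.168.1.25", 3051, "class_c"))  # ('192.168.1.0',)
```

Two packets with equal keys go to the same host, so the coarser the key, the stickier the client and the lumpier the load.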

Single-host filtering mode causes NLB to send all application traffic to one cluster member, regardless of where the application request originated. When configuring single-host filtering, you need to use the Handling priority option to assign each cluster server a unique priority for the port rule. Application traffic goes to the highest-priority server in the cluster in which the port rule is active.

Disabling a port rule on a load-balanced server causes NLB to reallocate application traffic to other active servers in the cluster. NLB rebalances multiple-host rules among remaining cluster members and directs single-host rules to the next-highest priority server in the cluster for that port rule. To disable a port rule on a server, open the Port Rules tab, select the port rule, select Disabled, and click Modify.
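For single-host rules, the selection reduces to choosing the best priority among the servers on which the rule is still enabled. The sketch assumes that a lower number means a higher priority, mirroring the host ID convention described earlier in this article. The server names are hypothetical.

```python
from typing import Optional

def single_host_owner(priorities: dict[str, int], enabled: set[str]) -> Optional[str]:
    """Pick the enabled server with the highest priority (lowest number)."""
    candidates = {h: p for h, p in priorities.items() if h in enabled}
    if not candidates:
        return None  # the rule is disabled everywhere
    return min(candidates, key=candidates.get)

priorities = {"web1": 1, "web2": 2, "web3": 3}
print(single_host_owner(priorities, {"web1", "web2", "web3"}))  # web1
print(single_host_owner(priorities, {"web2", "web3"}))          # web2
```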

NLB can load-balance a variety of IP-based functions, including printing and read-only file serving, which use TCP and UDP port 139. These applications rely on NetBIOS access to the server, so the setup requires special considerations. Because the NetBIOS over TCP/IP (NetBT) protocol binds only to the first IP address on a NIC, the cluster VIP address must be the first (or only) address that you list in the TCP/IP configuration parameters for the cluster NIC. In this configuration, typical Server Message Block (SMB) communication through the cluster NIC to individual cluster servers is impossible, even in multicast mode. You need to configure another NIC with a unique IP address that will also bind to NetBT and provide a path for SMB traffic to reach individual servers in the cluster.

The cluster name isn't a NetBIOS name, and cluster servers neither register the cluster name with WINS nor respond to NetBIOS name resolution requests for the cluster name. Thus, client systems that need to access the cluster by a NetBIOS name must have another means to resolve a NetBIOS cluster name to the cluster's VIP. Win2K clients must use the cluster's VIP address to access NetBT-based applications because an additional security mechanism in Win2K compares the endpoint server's name to the NetBIOS name in the connection and rejects the connection when the names are different.

Remote Administration
The wlbs.exe utility and its subcommands let an administrator locally or remotely control NLB hosts and clusters. Because NLB doesn't monitor applications, you need to use other tools to monitor load-balanced applications and then use wlbs.exe to respond when an application fails. If you're hosting multiple IP-based applications and only one application fails, wlbs.exe lets you use the Disable subcommand to disable the port rule for the failed application. Then, NLB reallocates the workload among remaining servers.
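A monitoring script might tie the two pieces together along the following lines. This is a sketch under stated assumptions: the TCP connection test is a crude stand-in for a real application check, the command syntax is assumed to be `wlbs disable <port>`, and the function names are hypothetical. The code only builds the command line; it doesn't run it.

```python
import socket
from typing import Callable, Optional

def port_is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def disable_command(port: int) -> list[str]:
    """Build the command line; the syntax `wlbs disable <port>` is assumed."""
    return ["wlbs", "disable", str(port)]

def react(host: str, port: int,
          probe: Callable[[str, int], bool] = port_is_up) -> Optional[list[str]]:
    """Return the command to run if the application looks dead, else None."""
    return None if probe(host, port) else disable_command(port)

# Simulate a failed Web application without touching the network.
print(react("127.0.0.1", 80, probe=lambda h, p: False))  # ['wlbs', 'disable', '80']
```

On a cluster host, you could pass the returned list to `subprocess.run`. A successful TCP connection proves only that something is listening, so a production check would request a real page and validate the response.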

You can also use wlbs.exe to stop or drain cluster operations on one or all servers in the cluster. The Drain subcommand doesn't allow any new connections to the cluster or the specified cluster member but lets users remain connected so that they can complete their work.

Other wlbs.exe subcommands let you display cluster status information: Query tells you which servers are in the cluster, and Display lists detailed configuration information for one server in the cluster. You can run Display only from the server that you're seeking information about.

Testing Failover
While using Benchmark Factory to generate a workload to the cluster from one simulated user, I simulated failure of a cluster server by using wlbs.exe to remove and re-add one server to the cluster. From a computer connected to the clustered and nonclustered networks, I used Win2K's Performance Monitor to watch CPU utilization at each server in the cluster.

I set the default port rule to use single affinity. Single affinity causes NLB to allocate traffic from a particular IP address to the same server as long as the same set of servers is balancing the application's workload. When you add or remove servers from the cluster or disable a port rule at the server, the allocation algorithm might assign a different server to process requests from a particular IP address. My system's workload originated from one computer, so Performance Monitor showed that only one member of the NLB cluster was working. When I used wlbs.exe to stop that cluster member, the workload shifted to a different member of the cluster. After I used wlbs.exe to restart the stopped cluster member, it resumed processing the workload.
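The behavior I observed follows from any scheme that maps a client onto the current set of active servers. The sketch below uses a stand-in hash (SHA-256) and a simple modulo over the active host list, which is an assumption and not NLB's documented algorithm. The client address and host IDs are hypothetical.

```python
import hashlib

def owner(src_ip: str, active_hosts: list[int]) -> int:
    """Single affinity: only the client IP address feeds the stand-in hash."""
    h = int.from_bytes(hashlib.sha256(src_ip.encode()).digest()[:4], "big")
    ordered = sorted(active_hosts)
    return ordered[h % len(ordered)]

client = "192.168.1.25"
before = owner(client, [1, 2, 3])
survivors = [h for h in (1, 2, 3) if h != before]
during = owner(client, survivors)  # the client's server was stopped
after = owner(client, [1, 2, 3])   # the stopped server rejoined the cluster

print(before, during, after)  # the first and last values always match
```

In this simplified scheme, stopping a server that wasn't handling the client can also move the client, which is consistent with the note above that a membership change might assign a different server to a given IP address.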

I set Affinity to None and performed another test. One computer generated a workload that NLB allocated among all cluster members. After I stopped one cluster member, that member's workload rotated among remaining cluster members. After I restarted the cluster member, workload distribution returned to its original pattern.

Understanding NLB Is Crucial
NLB works well, is easy to configure, and provides two modes for client-to-server affinity to preserve user sessions during a load-balanced application session. However, to avoid mistakes in your configuration, you need to understand how NLB works before you begin to use it. Learn how packets flow and how NLB processes application workloads, and you'll discover that NLB can help you create a scalable, fault-tolerant server system to support your applications.