In the October 1996 Lab Guys column, "Overloading Your Server with Multi-Homing," John Enck and I raised the issue of putting multiple IP addresses or multiple NICs in your Windows NT server: Does multi-homing work? We invited readers to submit ideas about how to make multi-homing work, because the Windows NT Magazine Lab ran into serious operational difficulties. Thanks to everyone who suggested ideas or echoed the Lab's concerns. We're happy to report that we have a solution--not the solution, mind you--but one possible solution. Let's look at some issues the Lab faced and how we resolved them.
One problem is how to increase network bandwidth for application servers. We test servers of all sizes in an enterprise environment designed to mimic corporate networks. We need to know two things: 1) that we can build fat data pipes into our test servers for thousands of users on one network, and 2) that network I/O is not a bottleneck in our test environment.
You can take two approaches to network administration: Spend a lot of money and buy the biggest, fastest hardware; multiple Class-C IP licenses; and so forth, or make do with what you have and hope your users are patient. The Lab aimed somewhere in the middle: We limited IP addresses, servers, and network cards, but we used some expensive equipment.
Our setup is not unique. We need to run one NT domain, using TCP/IP, Dynamic Host Configuration Protocol (DHCP), and Windows Internet Name Service (WINS) for up to 50 workstations (simulating 500 to 2000 users) and 10 servers on 4 independent subnets, while maintaining trusts with other domains and access to our 10Mbps corporate network and the Internet. We need to operate on 100Base-TX so that our virtual users don't get bogged down by thin network pipes and we get a clear picture of server performance. Compared with a corporate organization that has 2000 client workstations, 100+ servers, and who knows how many separate networks, the Lab's setup is like a walk in the park.
I won't describe all the things the Lab tried, but I will mention two things we learned not to do:
* Don't assign IP addresses to two cards in the same server on the same subnet (e.g., 188.8.131.52 and 184.108.40.206) without subnet masking. Such an arrangement hopelessly confuses NT, even if you run Multi-Protocol Routing (MPR), because the OS doesn't know which card to use for which packets.
* Don't cascade hubs. You can chain only a limited number of 10Base-T hubs, you cannot chain Class-1 100Base-TX repeaters at all, and you can cascade only two Class-2 hubs. If you cascade hubs, you end up with contention between ports; that is, systems (IP packets) earlier in the chain get priority over later ones. Instead, use switches, which boost network performance in other ways (e.g., add full-duplex capabilities and high-speed backbones).
What should you do? Your solution depends on your situation. Our limitations dictated our course of action. Figure 1 depicts our logical network layout, and "Windows NT Magazine Lab's Enterprise Testing Environment" details our resource and testing requirements. Working closely with engineers from Adaptec and Cisco Systems, who each supplied a piece to our puzzle, we set up our network solution as follows:
* We had only one Class-C license, so without setting up a proxy server (also an option), we divided our 256 available addresses among four IP subnets (one resource and three client virtual LANs--VLANs, same physical network, different collision domains).
* To accommodate four separate networks talking to the same resource server, we needed a router. We used a server running Microsoft's NT MPR software (built into NT 4.0; available for NT 3.51 from Microsoft's Web site at http://www.microsoft.com). Using an NCR Globalyst S40 with two 133MHz Pentiums, 128MB of RAM, three 2.1GB Fast and Wide SCSI-2 drives, and an Adaptec Quartet 4-port 100Base-TX full-duplex NIC, we installed MPR and DHCP services to route among our four networks and assign IP addresses to all clients and resource servers.
* Because our setup was a domain, we needed a Primary Domain Controller (PDC). We used a Digital Equipment Prioris HX 5133DP, with two 133MHz Pentiums, 64MB of RAM, two 2.1GB SCSI-2 drives, and a single Adaptec 100Base-TX NIC. This system also functioned as the WINS server for NetBIOS name resolution (our testing software referenced machines by name).
* We set up five 100Base-TX hubs for our network: one 8-port Netelligent from Compaq and four 12-port hubs from Adaptec. We also needed a fast Ethernet switch--we started with a 6-port Netelligent 100-TX switch from Compaq, but we outgrew it. So we acquired a Cisco Catalyst 5000--an expensive piece of networking hardware (with a base price of $11,000), but worth the price if you want the power. For our test environment, we needed expandability, flexibility, and performance. The Catalyst 5000 came with one supervisor card (with two 100Base-TX ports) and three line cards (12 ports each), giving us a total of 38 configurable full-duplex switch ports. We separated 32 of these ports into 4 VLANs, giving us 4 essentially independent 8-port switches. (We could have used 4 separate, smaller boxes, but the Cisco gave us more flexibility.)
Making It Work
Now that you know about the physical components, let's look at how we put them together to set up our multi-homed network. We used TCP/IP, MPR, DHCP, and WINS to implement our solution. Let's look at what you need to know about using these services in a multi-homed environment.
TCP/IP. TCP/IP presents some unique problems (e.g., multi-homing, security, administration) compared to IPX/SPX. So, the logical conclusion is don't use TCP/IP! Unfortunately, the Lab doesn't have that luxury. Microsoft based NT on TCP/IP and included IPX/SPX for compatibility with NetWare, but IPX/SPX is better than TCP/IP for certain things. For example, if we set up our testing environment with IPX, we wouldn't need multiple networks. (However, we couldn't access the Internet without an appropriate TCP/IP proxy server.) To increase server network throughput, we'd just add NICs and make sure their IPX network numbers didn't conflict. But we couldn't do that, so off we went.
In the October 1996 Lab Guys column, we suggested using a proxy server as a firewall between you and the outside world so that you can use any address range you want. Another option is IP subnet masking--breaking your network into logical segments within a single address range. If you have more than 256 computers, you need more than one Class-C license, but you can still group users into separate networks going into the resource servers to manually load balance.
Setting up subnet masking is as easy as entering a single number into the Subnet Mask field on the IP Address tab in TCP/IP Properties (as shown in Screen 1). If you have only one logical network, you enter a 0 in the rightmost (last) octet field of the subnet mask. (An octet is an 8-bit identifier--an IP address has four octets that define its value. You can use all four octets to define a subnet mask, but we used only the last octet to keep things simple.) If you want two segments, enter 128; for four segments, enter 192; for eight segments, enter 224. Let's review how to determine these values.
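The mask values above follow a simple pattern: each doubling of the segment count sets one more high-order bit in the last octet. Here is a minimal Python sketch of that arithmetic. The function name last_octet_mask is ours, chosen for illustration, and the sketch assumes the segment count is a power of two and that only the last octet is being subdivided.

```python
def last_octet_mask(segments: int) -> int:
    """Return the last-octet subnet mask value for a power-of-two segment count."""
    # Reject counts that are not powers of two or that leave no usable hosts.
    if segments < 1 or segments > 64 or segments & (segments - 1):
        raise ValueError("segments must be a power of two between 1 and 64")
    bits = segments.bit_length() - 1      # how many leading 1 bits the mask needs
    return (0xFF << (8 - bits)) & 0xFF    # keep only the top `bits` bits of the octet


if __name__ == "__main__":
    for count in (1, 2, 4, 8):
        value = last_octet_mask(count)
        print(f"{count} segment(s): mask octet {value} (binary {value:08b})")
```

Running the loop prints the same 0, 128, 192, and 224 values you would type into the Subnet Mask field.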
The 8 bits of the octet define 256 possible logical network segments, but not all of them are usable. When the destination IP address of an IP packet gets ANDed bit by bit with the subnet mask (see Table 1A for the Boolean logic), which happens regardless of whether you have multiple networks, the destination network is calculated. If you have multiple networks, the subnet mask tells the packet which one to go to, based on the IP address.
To break the 256 possible IP addresses into two segments (0 through 127 and 128 through 255), you enter 128, which is binary 10000000, in the last octet field of the subnet mask. The 1 in the most significant bit (MSB) position of the subnet mask's last octet tells the protocol that the MSB of the destination address carries an identifier (0 or 1) that defines the network segment.
Thus, the two logical network definitions are 10000000 and 00000000 for the two address ranges. The MSB of the result of the AND operation defines which network to route to. The remaining 7 bits of the original address define the address within the network. Table 1B gives an example calculation.
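To make the bit arithmetic concrete, here is a small Python sketch that splits the last octet of an address into its network identifier and its address within that network. It uses a bitwise AND with the mask, which is how IP implementations apply a subnet mask. The function name and the sample octets (200, 77, and 97) are illustrative assumptions, not values taken from Table 1B.

```python
def split_last_octet(address_octet: int, mask_octet: int) -> tuple:
    """Split an address's last octet into (network part, host part)."""
    network_part = address_octet & mask_octet          # bits selected by the mask
    host_part = address_octet & (~mask_octet & 0xFF)   # bits the mask leaves out
    return network_part, host_part


if __name__ == "__main__":
    # Two-segment mask (128): the MSB alone picks the segment.
    for octet in (200, 77):
        network, host = split_last_octet(octet, 128)
        print(f"{octet:3d} = {octet:08b} -> network {network:08b}, host {host}")
    # Eight-segment mask (224): the top three bits pick the segment.
    print(split_last_octet(97, 224))
```

With the mask set to 128, octet 200 lands in the 10000000 segment and octet 77 lands in the 00000000 segment.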
A value of 0 in the last octet of a subnet mask signals one logical network, and all 8 bits define the IP address (instead of the MSB and 7 remaining bits defining the network number and address, respectively). If you want four networks, you need four sequential identifiers: 00, 01, 10, and 11. To tell IP that the two MSBs are identifiers, you need to set the subnet mask to 192--binary 11000000.
We needed eight subnets--four for the testing environment and four for the rest of the Lab--resulting in a subnet mask of 255.255.255.224 (224 is binary 11100000). This subnet mask passes the packet to one of eight networks and strips away the first three octets in the address.
The IP addresses assigned to the card (in this case, statically assigned addresses for the ports in the MPR server because the MPR and DHCP server needs to know the numbers for the NIC that does the assigning) determine which logical network each port is on. In the Lab, we broke up our eight networks as addresses 0 through 31, 32 through 63, 64 through 95, 96 through 127, 128 through 159, 160 through 191, 192 through 223, and 224 through 255. You cannot use the starting or ending addresses in each range, either via static assignment or DHCP, because the TCP/IP specification reserves them for special use (the first address in each range identifies the subnet itself; the last is the broadcast address for that subnet). So our NCR server with the 4-port card had X.X.X.65, X.X.X.97, X.X.X.129, and X.X.X.161 assigned to it (leaving the other address ranges for future use), defining it in networks 3, 4, 5, and 6. When each port gets a DHCP request, the subnet mask defines which network the address belongs to; all networks are logically separate, just as if they were separate Class-C ranges.
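If you want to check this breakdown yourself, the following Python sketch uses the standard ipaddress module to carve a Class-C range into eight subnets and to list, for each, the reserved first and last addresses and the usable range in between. The 192.168.1.0/24 range is an assumption that stands in for the Lab's X.X.X prefix, and the helper names are ours.

```python
import ipaddress

# Assumption: 192.168.1.0/24 stands in for the Lab's Class-C range (X.X.X).
CLASS_C = ipaddress.ip_network("192.168.1.0/24")


def subnet_table(network=CLASS_C, new_prefix=27):
    """Return (number, first reserved, first usable, last usable, last reserved)."""
    rows = []
    # A /27 prefix is the same as the mask 255.255.255.224.
    for number, subnet in enumerate(network.subnets(new_prefix=new_prefix), start=1):
        hosts = list(subnet.hosts())  # excludes the two reserved addresses
        rows.append((number, str(subnet.network_address), str(hosts[0]),
                     str(hosts[-1]), str(subnet.broadcast_address)))
    return rows


def subnet_number(address, network=CLASS_C, new_prefix=27):
    """Return the 1-based number of the subnet that contains the address."""
    target = ipaddress.ip_address(address)
    for number, subnet in enumerate(network.subnets(new_prefix=new_prefix), start=1):
        if target in subnet:
            return number
    raise ValueError(f"{address} is not in {network}")


if __name__ == "__main__":
    for row in subnet_table():
        print(row)
    for port in ("192.168.1.65", "192.168.1.97", "192.168.1.129", "192.168.1.161"):
        print(port, "is in network", subnet_number(port))
```

Under that assumed prefix, the four port addresses fall in networks 3, 4, 5, and 6, matching the assignments described above.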
MPR. Microsoft's MPR is a networking service that lets you route IP packets between multiple logical networks, collision domains, and address ranges, or route network data between networks running different protocols (e.g., TCP/IP vs. IPX/SPX), different network hardware (e.g., 10Mbps vs. 100Mbps), and disparate wiring types. You can set up an NT server with multiple NICs to function as the pathway joining your different networks. Because TCP/IP or IPX/SPX is just the underlying transport protocol, MPR enables different NT domains or networks to communicate with one another as if they were one big happy network.
MPR includes several services you need to install: DHCP (BOOTP) Relay Agent (if you plan to route DHCP requests), Routing Information Protocol (RIP) for TCP/IP, and RIP for NWLink IPX/SPX-compatible transport (depending on which protocols you're running). Installation is easy enough: Go to the Control Panel Network applet, open it (or right-click your Network Neighborhood icon and select Properties), click the Services tab, click Add, select the service, and hit OK (install one service at a time). When you exit the Network applet, NT will want to restart--let it.
Next, you need to do some basic configuration (you can't configure the software components for DHCP Relay Agent or the RIP for IP services). Bring up the Network applet again, go to Protocols, double-click TCP/IP to bring up TCP/IP Properties, and select the DHCP Relay tab, as shown in Screen 2. Click Add and enter the addresses for the cards and IP addresses you assigned.
One last check box you need to select is Enable IP Forwarding, which you find on the Routing tab of TCP/IP Properties. If your system has more than one NIC or multiple addresses assigned to one NIC, IP Forwarding lets your system receive packets via one NIC and retransmit them out another, or take packets from one network (or subnet) and forward them to another. People disagree about whether to enable this setting on nonrouting servers and workstations, but you need to enable IP Forwarding on your MPR system--if you have only one NIC and experience IP problems, turn IP Forwarding off.
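Conceptually, forwarding comes down to matching a packet's destination address against the subnet attached to each port. The Python sketch below illustrates that decision for a four-port router. It is not how NT implements IP Forwarding internally; the port names and the 192.168.1 prefix are hypothetical stand-ins for our setup.

```python
import ipaddress

# Hypothetical port table: names and the 192.168.1 prefix are assumptions.
INTERFACES = {
    "port1": ipaddress.ip_interface("192.168.1.65/27"),
    "port2": ipaddress.ip_interface("192.168.1.97/27"),
    "port3": ipaddress.ip_interface("192.168.1.129/27"),
    "port4": ipaddress.ip_interface("192.168.1.161/27"),
}


def outgoing_port(destination, interfaces=INTERFACES, default=None):
    """Return the port whose attached subnet contains the destination."""
    target = ipaddress.ip_address(destination)
    for name, interface in interfaces.items():
        if target in interface.network:
            return name
    return default  # no directly attached subnet; a real router uses a default gateway


if __name__ == "__main__":
    for address in ("192.168.1.100", "192.168.1.140", "10.0.0.1"):
        print(address, "->", outgoing_port(address))
```

A packet for an address on none of the attached subnets falls through to the default, which is where the gateway settings discussed later come in.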
NT will want to reboot again; let it. Then ping from one card to another to determine whether the cards can talk over the cables; next, ping to and from another system. IP routing should be working fine at this point. If not, there are a couple of possibilities you can try. First, create an lmhosts file. This text file resides in the %systemroot%\system32\drivers\etc directory and is a list of NetBIOS names and their associated IP addresses. If you create this file, put each system on a separate line with the IP address in the first column and the name in the second. For more information, see Microsoft Windows NT Server Resource Kit: Windows NT Server Networking Guide. The lmhosts file defines your PDC, MPR Server, and any other important servers with statically assigned IP addresses. You need to put this file on each system listed in the file, so that they all know where they are and don't use the router to identify each other. If it still doesn't work, you might need to set up routing tables on the MPR server, which you can do from the command line with the route command (refer to the NT Server networking guide for a detailed discussion).
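If you have more than a handful of static servers, you might generate the lmhosts entries rather than type them. The Python sketch below writes one line per system, with the IP address in the first column and the NetBIOS name in the second, as described above. The machine names (LABPDC, LABMPR) and the 192.168.1 addresses are hypothetical, and the sketch only returns the text; copying it into the etc directory on each system is left to you.

```python
def render_lmhosts(entries):
    """Build lmhosts text from (ip_address, netbios_name) pairs."""
    lines = []
    for ip_address, name in entries:
        # Address in the first column, padded so the names line up in the second.
        lines.append(f"{ip_address:<16}{name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical names and addresses for illustration only.
    servers = [
        ("192.168.1.66", "LABPDC"),
        ("192.168.1.65", "LABMPR"),
    ]
    print(render_lmhosts(servers), end="")
```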
If you also want to access the Internet, you need to run Domain Name System (DNS) somewhere. Either point your MPR system to your DNS server and Internet gateway, or install DNS on your router and configure your IP settings accordingly (DNS server IP address and domain name).
DHCP. DHCP (also installed from the Services tab in the Control Panel Network applet) lets you dynamically assign IP addresses to servers and workstations, rather than manually administer a huge number of static addresses across your enterprise. When you set up a new workstation or server, you simply tell it to use DHCP; when it comes online, it broadcasts a request for an address assignment. The first DHCP server to respond tells the networked systems about the gateway, the WINS server, and other related services.
To use DHCP on multiple networks or subnets of the same physical network, you must either set up a separate DHCP server for each network or cover them all with one multi-homed server. Using a multi-homed server is an extension of creating multiple DHCP scopes (ranges of addresses) under a single Class-B or Class-C license.
For the Lab's purposes, dedicating four servers to take care of IP addresses didn't make sense--and we certainly didn't want to manage all the addresses on our own. Using the server with the multi-port NIC and MPR to service DHCP requests did make sense. As the MPR server, each port on the card already required a static address, and the subnet masking broke our Class-C range into eight logical groups.
NT is intelligent in the way it handles DHCP. A server with one NIC can be a DHCP server for one large range of addresses or several small ranges with exclusions. By combining DHCP with multiple NICs and subnet masking, you can service DHCP-enabled systems across multiple subnets or even across multiple Class-C IP licenses. If you need to service multiple subnets, properly setting the DHCP scopes lets NT match the scope to a specific port. However, you don't have to follow our example of using multiple NICs for a DHCP server. If you are using standard network routers, you can set up a DHCP server with one NIC and multiple scopes, and use the DHCP Relay Agent to route DHCP requests from the different logical--or physical--networks. This setup lets you use one DHCP server for your entire enterprise, because each router forwards its appropriate subnet information along with the DHCP request. This way, systems across your network will not be assigned the wrong addresses from the wrong subnets.
In the Lab, our eight-segment network required eight DHCP scopes (four scopes per server; one scope can cover multiple subnets, but you must exclude any reserved or statically assigned addresses). Screen 3 shows the four scopes for our second DHCP server. You create these scopes with the DHCP Manager applet installed with the DHCP service (double-click the defined server to set up a new scope). We defined the DHCP scopes on 66 through 94, 98 through 126, 130 through 158, and 162 through 190, and we excluded 66 for the domain controller and WINS server. (We could have started the scope at 67 to accomplish the same result.) Table 2 lists other DHCP scope configuration settings we used.
With these scopes defined and the addresses 65, 97, 129, and 161 statically assigned to the card, by a little sleight of hand, NT can assign a request that comes in over a particular card to an address from the correct matching pool. For example, if a request comes into the port at address 65, it gets an address from the 66 through 94 range, not the 130 through 158 range.
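One way to picture the matching is to take the subnet of the port that received the request and pick the scope that lies inside it. The Python sketch below does exactly that with the scopes we defined. It is an illustration of the idea, not NT's actual DHCP code, and the 192.168.1 prefix and the function name scope_for_port are assumptions.

```python
import ipaddress

PREFIX_LENGTH = 27  # same as the subnet mask 255.255.255.224

# The four scopes described above, under an assumed 192.168.1 prefix.
SCOPES = [
    ("192.168.1.66", "192.168.1.94"),
    ("192.168.1.98", "192.168.1.126"),
    ("192.168.1.130", "192.168.1.158"),
    ("192.168.1.162", "192.168.1.190"),
]


def scope_for_port(port_address, scopes=SCOPES, prefix_length=PREFIX_LENGTH):
    """Return the (first, last) scope on the same subnet as the receiving port."""
    port_network = ipaddress.ip_interface(f"{port_address}/{prefix_length}").network
    for first, last in scopes:
        if ipaddress.ip_address(first) in port_network:
            return first, last
    return None  # no scope is defined for that port's subnet


if __name__ == "__main__":
    for port in ("192.168.1.65", "192.168.1.129"):
        print(port, "->", scope_for_port(port))
```

A request arriving on the port at 65 resolves to the 66 through 94 scope, and one arriving at 129 resolves to 130 through 158.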
We used physical wiring segmentation (creating separate collision domains) to logically delineate the Lab's network. This method works well if you want to optimize your network throughput, but the hardware costs extra money. You can set up a flat network using hubs instead of switches between the MPR system and the clients, and even put systems from disparate subnets into the same hub (you could no longer use DHCP), but you will end up with significantly increased packet collisions, lower throughput, and decreased performance. The better choice is to use fast switches to physically separate the logical networks and keep like-addressed systems on the same hubs. You can combine the two approaches, based on your priorities in the tradeoff between cost and performance.
Figure 1 shows the Lab's network, with links from the DHCP/MPR system to the four VLANs on the switch, hubs that handle the workstations, and direct switched connections for loaded resource servers. We'd get the best performance if every computer had a switched port (and money was no object). The arrangement we set up allows for the greatest expandability and flexibility, without a huge cost.
The physical segmentation of the network solves one problem: a DHCP client request that arrives at the wrong card and causes the system to receive the wrong address and end up on the wrong network. If all systems were on the same physical LAN (attached to the same hubs or the same switch), you couldn't predict the behavior of the DHCP server. By physically separating the workstations and servers according to the networks they belong on, DHCP can assign addresses as I described previously, and you don't have to worry about requests ending up in the wrong place.
One minor problem we experienced deserves mention. On a network, the first DHCP server that responds to an address assignment request gets the prize. The Lab's network has two DHCP servers (covering two different IP ranges)--one for the 10Mbps side and the other for the MPR/DHCP system. We experience occasional random address assignments: A machine mistakenly gets an address from the 100Mbps side when it is plugged into the 10Mbps side because the MPR system is the first to respond (this situation always occurs for requests off VLAN 1, the one connected to the outside world). Setting up another MPR system to act as a broadcast packet filter between the 10Mbps and 100Mbps networks or using a dedicated network router are the only solutions we've found to this problem. By the way, if you double-click a particular scope in the DHCP Manager tool, you can see all the systems with registered IP addresses on that subnet.
WINS for Multiple Subnets. WINS, which handles NetBIOS name resolution for NT (i.e., WINS lets NT systems see each other by name no matter what network they are on), is another service that you install from the Services tab of the Control Panel Network applet. The WINS Manager applet installed with the service (shown in Screen 4) lets you manage WINS Server properties. However, WINS can be complicated to set up and make functional--debugging the system took us a while.
At first, we had difficulties putting WINS and DHCP on the same system, so we separated them onto two servers: the MPR system (NCR) and the PDC (Digital Equipment). As the WINS Help files state, WINS will not work on multi-NIC (or multi-homed) systems. We weren't able to properly ping across the router using NetBIOS names. The MPR system could ping the other clients, but the clients had trouble doing the reverse (pinging worked--but very slowly).
After you've installed the service, you can configure the primary WINS server and secondary or replication servers. Go to the WINS Manager applet under Administrative Tools, and open Add WINS Server under the Server menu. Enter the IP address for the system (in this case, the statically assigned address of our PDC) in the dialog box, and press Enter--you're finished. The system is now available for WINS name registration--or you can initiate scavenging (i.e., force the system to go out and look for machine names) under the Mappings menu to speed the process.
Assigning gateways is an issue that covers several of the topics discussed here. Gateway assignments affect what happens to lost packets, Internet access, and name resolution. For example, if you want to get from your desktop computer onto the Internet, you need to know who and where the gateway is. If you are using DHCP, the DHCP server tells your computer how to route the packets.
But you have to tell the DHCP and WINS machines where to find the gateways. For name resolution, tell the DHCP server the WINS server's IP address (even if they are on the same machine). Tell the WINS server that the MPR system is the default gateway, and use the IP address of the port or VLAN that's attached to the outside world. Make the default gateway for each network segment on the MPR server point to the system that functions as your Internet gateway (or if your MPR server is your gateway and DNS machine, use the appropriate VLAN port's IP address). Table 3 shows the address assignments that we used for the four Lab VLANs. By the way, nothing prevents one physical server from functioning as PDC, DHCP server, DNS, and MPR; because the Lab environment was so complicated, we decided to separate them--which also solved some intermediary problems and simplified troubleshooting.
Performance note. WINS takes forever to configure: Hours pass before the WINS server builds complete browser lists. After each system is registered in the domain and given an IP address, you'll wait awhile before the system is added to the list and made available in the Network Neighborhood (although you can manually force access using the Run dialog: \\machinename). You can try to force the WINS Manager to go out and scavenge for names, but this action takes just about as long to rectify as waiting for it to happen on its own. About the only way to speed this process is to configure a secondary WINS server, which not only distributes the load for managing the names, but provides fault tolerance if the primary server fails (which otherwise stops name access from one computer to another on the network).
Extending network bandwidth in your server has some tradeoffs. Extra overhead and I/O processing (e.g., interrupt handling) come into play when you add cards. Each PCI NIC you add is another interrupt that the CPU must handle to service I/O requests. More CPU interrupts mean slower processing.
Adding bandwidth increases the amount of information the system must process. Additional bandwidth consumes extra resources such as memory and CPU time--often requiring that you augment them to maintain your desired performance level. As your system handles more data, you need to look at your disk subsystem as the next potential I/O bottleneck.
All you can do is hope that putting the extra cards into your application server doesn't slow it and negate the benefit you get by increasing network throughput. You can help this problem somewhat by using full-duplex cards in your servers and giving them direct connections to your fast Ethernet switches. In our environment, we saw drastic improvements in performance by adding network segments to the database servers under test, such as a quad-processor Pentium Pro system going from a maximum of about 100 transactions per second to more than 1800.
What you have in the end is a resource server, with a nice fat data pipe into it and your users segmented into multiple logical networks to optimize I/O. You can accomplish this functionality with 10Mbps hardware instead of 100Mbps, but you will still want to use switched 100Mbps hardware for the backbone (10Mbps hubs or switches with 100Mbps uplinks).
For example, if you have a SQL server that needs more network bandwidth, you can segment your network, separating your users into discrete groups to balance your load. Put one or more extra cards in your SQL server, assign them the appropriate addresses (you don't have to use DHCP) to match your groupings, add some switching or routing hardware, and away you go. Keep in mind the packet collision considerations, and you have successfully increased your bandwidth.
Don't forget to download and install Service Pack 2 (SP2) for NT 4.0. This update fixes several bugs in NT's networking components, such as duplicate addresses from DHCP, IPX performance issues, and NetBIOS session conflicts. SP2 also adds the ability to handle BOOTP DHCP requests directly from the clients. SP2 states that you must reapply it if you change or reload any network services; otherwise, the system uses the wrong libraries from the NT distribution CD-ROM. (For more information and some caveats about SP2, see Jonathan J. Chau, "Service Pack 2," and Mark Minasi, "Recovering from a Network Disaster," March 1997.)
See Also "Enterprise Testing Environment"