Fault tolerance, RFC 1542, and more

Last month, I talked about how Dynamic Host Configuration Protocol (DHCP) works. For those just joining us, DHCP simplifies setting up workstations on TCP/IP. You just set up a Windows NT Server machine as a DHCP server, tell it about your network, and your NT server can hand out unique IP addresses to each PC on your network, greatly simplifying IP setup. The server assigns these addresses for a limited time, so DHCP clients (PCs that get IP addresses from DHCP servers) lease their addresses.

DHCP is a terrific facility, with a few quirks. However, if you understand them, you can work around them.

A New Lease
When learning DHCP, you wonder what happens when the lease runs out. Well, you're supposed to stop using the IP address. But the lease is not likely to run out. When it's half over, the DHCP client begins renegotiating the IP lease by sending a DHCP request to the server that issued the expiring IP address.

The DHCP server responds with a DHCP ACK. It contains all the information--domain name, DNS server, etc.--that the original DHCP ACK had. This information lets you change the Domain Name System (DNS) server, Windows Internet Name Service (WINS) server, subnet mask, and the like, and the new information will periodically be updated at the clients (you can specify a period, but it can be no more than 50% of the lease time).

Well, you can change the information in theory. Sometimes the DHCP ACK doesn't do the job. Suppose the client renews its lease, but the server doesn't transfer the new information to the workstation. Your best bet is to open a command line and type ipconfig /release and then ipconfig /renew. (No, I don't know why the renewal doesn't always work, but I can show you Network Monitor captures where it doesn't, and these captures led me to the release/renew technique.)

Now, if the DHCP ACK doesn't appear, the DHCP client keeps resending the DHCP request every two minutes until the IP lease is 87.5 percent expired. (Don't you wonder where Microsoft gets these numbers?) Then the client goes back to the drawing board, broadcasting DHCP discover messages (requests for an IP address) until someone responds. If the lease expires, the client will stop using the IP address, disabling the TCP/IP protocol on that workstation.
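
To make those percentages concrete, here is a minimal Python sketch that turns a lease into the three moments described above. The function name, the three-day lease, and the start date are my own illustrative assumptions, not anything NT exposes. The 50 and 87.5 percent figures are the ones quoted in the text, and they match the default renewal and rebinding timers in the DHCP specification.

```python
from datetime import datetime, timedelta


def lease_milestones(start, lease):
    """Return the three moments in a lease's life described in the text."""
    return {
        # Halfway through, the client starts asking its own server to renew.
        "renew": start + lease * 0.5,
        # At 87.5 percent, it gives up on that server and starts broadcasting.
        "broadcast": start + lease * 0.875,
        # At 100 percent, it must stop using the address.
        "expire": start + lease,
    }


# Hypothetical example: a three-day lease granted at 9:00 a.m.
granted = datetime(1996, 6, 1, 9, 0)
for name, when in lease_milestones(granted, timedelta(days=3)).items():
    print(f"{name:10} {when}")
```

With a three-day lease, the client has a day and a half before it even asks for a renewal, which is why a lease rarely runs out on a healthy network.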

Fault Tolerance
That possibility leads me to wonder about fault tolerance. Can a backup DHCP server hand out IP addresses if the primary goes down? Not really, unfortunately. On the same subnet, you absolutely cannot run two DHCP servers that assign addresses from the same range.

However, on the same subnet, you can have two DHCP servers that assign addresses from different ranges. Suppose you have a class C network, 200.100.100.0, and DHCP will give out addresses 200.100.100.20 through 200.100.100.120. You can run two DHCP servers on the subnet and let one distribute addresses .20 through .90 while letting the other pass out .91 through .120.

Notice that you create two scopes (ranges of IP addresses on a subnet) that do not overlap. If they overlap, you run into trouble, because you have no way to make two DHCP servers coordinate which addresses to give out. Telling both servers to assign addresses in the entire .20 through .120 range and to talk to each other to make sure they don't give out the same address to two clients would be nice. But that's not possible--yet (Cairo, the version of NT that will likely appear in 1998, will probably let you). So, you can create two DHCP servers on a subnet and give them scopes that don't overlap. If one DHCP server is down when a workstation needs a lease, the other one (you hope!) has an address to spare.
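
Because the servers won't check each other's work, you have to check the scopes yourself before you create them. The sketch below is one way to do that with Python's standard ipaddress module. The helper names scope and scopes_overlap are hypothetical; the address ranges are the ones from the example above.

```python
from ipaddress import IPv4Address


def scope(first, last):
    """Describe a scope as a (lowest, highest) pair of addresses."""
    low, high = IPv4Address(first), IPv4Address(last)
    if low > high:
        raise ValueError("a scope's first address must not come after its last")
    return low, high


def scopes_overlap(a, b):
    """True if any address could be handed out by both scopes."""
    return a[0] <= b[1] and b[0] <= a[1]


# The split from the example: .20-.90 on one server, .91-.120 on the other.
server_one = scope("200.100.100.20", "200.100.100.90")
server_two = scope("200.100.100.91", "200.100.100.120")
print(scopes_overlap(server_one, server_two))

# The tempting mistake: giving one server the whole .20-.120 range.
whole_range = scope("200.100.100.20", "200.100.100.120")
print(scopes_overlap(whole_range, server_two))
```

The first check should report no overlap and the second should report one, which is the configuration to avoid.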

What happens if two machines get the same IP address? DHCP avoids that. Right after a DHCP client gets an IP lease, it tests the lease by trying to send a message to the address. A response means the DHCP server gave the client an address that someone else is using. The client's response is to tell the user that it received a duplicate IP address, and then to stop using TCP/IP. This approach has always seemed odd to me--why not negotiate further with the DHCP server to get an acceptable address?

RFC 1542
DHCP can simplify the IP assignment problem for each subnet, and you can create a weak kind of fault tolerance with multiple DHCP servers per subnet, but gosh, that sounds like a lot of servers! If a DHCP server doesn't have to be physically on the same subnet that it serves, you can dedicate a couple of machines to handing out DHCP addresses, and they can serve all your network's subnets.

You can do that, at least in theory. Recall how DHCP works. A workstation broadcasts a discover message. A DHCP server responds with an IP address in an offer message. The workstation accepts the offer and is ready to go. But wait--if the initial "gimme an IP address" message is a broadcast, how can a DHCP server on another subnet hear it? A router (or perhaps several routers) has to retransmit that broadcast for the DHCP server to hear it in the first place, and most routers don't retransmit broadcasts!

The answer is in a Request for Comments (RFC) concerning DHCP or, rather, a predecessor to DHCP, bootp. RFC 1542 describes how a router can recognize the special broadcasts a DHCP client generates, so that the router knows to retransmit those broadcasts. To create DHCP servers that serve clients from across routers, the routers must run software to make them RFC 1542 compliant, or the routers must support bootp forwarding--the two phrases mean the same thing. If your routers won't cooperate, then yes, you must have at least one DHCP server on each segment that you want to put DHCP clients on.

If you're running the NT 4.0 beta or have installed Service Pack 4 (ftp.microsoft.com/bussys/winnt/winntpublic/fixes/usa/NT351/ussp4) on NT 3.51, either NT Server or Workstation can be an IP router. Also, with the Multi-Protocol Routing (MPR) in the Service Pack 4 directory or with any version of NT 4.0, you can enable bootp forwarding. In theory, that ability means you can make an NT machine into an IP router that supports bootp forwarding, but my success with getting NT bootp forwarding has been uneven. If you experiment with it, remember that if one DHCP server is providing addresses to multiple subnets, you'll need one scope for each subnet. (The DHCP server will not let you create multiple scopes on one subnet on a given server--you can have multiple scopes on one subnet, but you need separate machines running DHCP server to do it.)

Another consideration: If a DHCP server serves several subnets and its adjacent routers support bootp forwarding, the server must expect to receive DHCP discover broadcasts from any one of those subnets. So how does the DHCP server know which subnet the broadcast came from--how does the server know which subnet range to draw from when assigning an IP address to a client?

The answer lies in how bootp forwarding works. A bootp forwarding-enabled router will retransmit (forward) a DHCP discover broadcast. But when this router forwards the broadcast, it adds data, a note saying, "To anyone who hears this: This is a broadcast that I originally found on a different subnet, subnet x.y.z.a." Then, if a DHCP server receives a broadcast that was retransmitted over one or more routers, the server will know what subnet to direct the response back to and which scope to pull a number from for its offer.
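
The router's "note" is a field in the bootp header that holds the relay's address, and the server matches that address against its scopes. Here is a rough Python sketch of both halves. It assumes the standard bootp header layout (hop count in the fourth byte, relay address at byte offset 24), and it invents a second subnet, 200.100.101.0, plus the scope labels for illustration. The function names are hypothetical, and this is not how NT's DHCP server is actually written.

```python
from ipaddress import IPv4Address, IPv4Network

HOPS_OFFSET = 3      # fourth byte of the bootp header
GIADDR_OFFSET = 24   # relay ("gateway") address field, 4 bytes long
NO_RELAY = bytes(4)  # all zeros: no router has stamped the packet yet


def forward(packet, router_address):
    """What a bootp-forwarding router does before retransmitting a discover."""
    buf = bytearray(packet)
    field = slice(GIADDR_OFFSET, GIADDR_OFFSET + 4)
    if bytes(buf[field]) == NO_RELAY:
        # Only the first router leaves the note; later routers keep it intact.
        buf[field] = IPv4Address(router_address).packed
    buf[HOPS_OFFSET] += 1  # count this router as one more hop
    return bytes(buf)


def choose_scope(packet, scopes, local_subnet):
    """Pick the scope whose subnet contains the address in the router's note."""
    relay = packet[GIADDR_OFFSET:GIADDR_OFFSET + 4]
    if relay == NO_RELAY:
        return scopes.get(local_subnet)  # heard directly, never forwarded
    origin = IPv4Address(relay)
    for subnet, scope_name in scopes.items():
        if origin in subnet:
            return scope_name
    return None  # the server has no scope for that subnet


# Hypothetical setup: the 200.100.101.0 subnet and both labels are invented.
local = IPv4Network("200.100.100.0/24")
remote = IPv4Network("200.100.101.0/24")
scopes = {local: "scope for 200.100.100.x", remote: "scope for 200.100.101.x"}

discover = bytes([1, 1, 6, 0]) + bytes(232)  # bare 236-byte bootp request header
forwarded = forward(discover, "200.100.101.1")

print(choose_scope(discover, scopes, local))
print(choose_scope(forwarded, scopes, local))
```

The sketch leaves out the hop-count limit a real router enforces, and it ignores the DHCP options that follow the fixed header.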

So these are the main DHCP quirks and how to work around them. For more information about DHCP, see "Implementing and Administering DNS," page 121, and John Enck, "Take a Number," October 1995.