Highly available and efficient transport

Most organizations rely on email for internal communications as well as interactions with peers, clients, partners, and constituents. Because so many of us expect, and even need, email to be fully available 24 x 7 x 365, the challenge is not only to maintain highly available mail servers but also to ensure highly efficient and fault-tolerant email transport.

Exchange 2000 Server's connectivity options can help you establish fault-tolerant routing of Internet mail traffic. You can use mail exchanger (MX) records, smart hosts, SMTP connectors, and Routing Group Connectors (RGCs) to establish a set of servers that act as an email gateway to the Internet and at the same time provide fault-tolerant transport for internal and external communications.

In and Out
When an SMTP server needs to determine how to route a message to a recipient domain, the server queries DNS for MX records. SMTP's ability to use the information in these records to make routing decisions and to route mail to alternate servers is one of the protocol's most powerful features.

In its most basic format, an MX record has four fields. The first field specifies the host or domain being queried. The second field defines the DNS Resource Record (RR) type (i.e., MX). The third field specifies a numeric preference value (a 16-bit integer from 0 through 65535). The fourth field lists the SMTP host that's responsible for accepting mail for the destination host or domain. For example, suppose you send an email message to me at joseph.neubauer@compaq.com. Your SMTP server queries DNS, which returns the MX records that Figure 1 shows.

These records show that more than one mail exchanger is available to accept email for the domain, so the SMTP process uses the associated preference values to decide which server to try first. A lower preference number indicates a higher preference for use, so SMTP will use the mail2.compaq.com server first because its preference value is 10. If that server is unreachable, then SMTP will try the mail1.compaq.com server, which has a preference value of 20.
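The selection logic just described can be sketched in a few lines of Python. This is only an illustration of the ordering rule, not code taken from any SMTP server: the function names and the is_reachable callback are hypothetical, and the two records are the ones discussed above. Real senders typically also randomize among hosts that share the same preference value, which this sketch doesn't attempt.

```python
from typing import Callable, List, Optional, Tuple

# An MX record reduced to the two fields that drive routing:
# (preference, mail host). Lower preference values are tried first.
MxRecord = Tuple[int, str]


def order_by_preference(records: List[MxRecord]) -> List[str]:
    """Return the hosts in the order a sending server should try them."""
    return [host for _, host in sorted(records, key=lambda r: r[0])]


def pick_host(records: List[MxRecord],
              is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable host in preference order, or None.

    is_reachable is a hypothetical stand-in for a real connection attempt.
    """
    for host in order_by_preference(records):
        if is_reachable(host):
            return host
    return None


# The two records from the example, listed in arbitrary order.
records = [(20, "mail1.compaq.com"), (10, "mail2.compaq.com")]
```

With both servers up, pick_host returns mail2.compaq.com; if a connection to that host fails, the same call falls back to mail1.compaq.com.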

Get Smart
A generally recommended practice is to maintain at least two servers to handle inbound and outbound SMTP mail so that your email capabilities won't be cut off if one server fails. Because of the ever-present threat of spam, viruses, and other unwanted content, many IT shops also elect to set up perimeter or buffer SMTP servers to provide extended functionality (e.g., virus scanning, content filtering) and an extra level of isolation for the Exchange environment.

Figure 2 shows a common approach to this type of configuration. Several Exchange mailbox servers (i.e., Mail1, Mail2, and Mail3) and two Exchange connector servers (i.e., Hub1 and Hub2) handle external SMTP mail, and two other servers (i.e., Per1 and Per2) provide perimeter services. Per1 handles primarily inbound messages, and Per2 handles primarily outbound messages. When an incoming message arrives at Per1, the software on that server filters the message content and scans the message for viruses. The server then passes the message on to the Exchange servers. Like Per1 and Per2, Hub1 and Hub2 are responsible primarily for handling inbound and outbound email, respectively. However, each of these four servers is configured to act as a backup to its partner server (e.g., Per2 can mirror Per1's functionality, including content filtering). This configuration provides redundancy and a path in or out of the system should any one of the servers fail.

When you place a perimeter system between an Exchange server and the Internet, you don't want Exchange to use DNS to route email. If Exchange used DNS to route email in such a situation, Exchange would deliver messages directly to their destination, bypassing the perimeter system. Instead, you want Exchange to hand messages off through an SMTP connector to the perimeter server so that the perimeter server can perform its task. The perimeter system then uses DNS to route the message to its destination. To define this route between your Exchange servers and your perimeter servers, you can use the SMTP connector's smart-host definition.

Most Exchange documentation states that you can specify one or more smart hosts by name or IP address. When you specify multiple smart-host definitions, the connector uses them in a rotating order so that the smart hosts forward email equally. This method provides a measure of fault tolerance but prevents you from using one server to handle most of the traffic in one direction. However, Exchange 2000 (as well as Exchange Server 5.5 and some perimeter-system software) performs a DNS query on the value in the smart-host field to see whether the name has MX records. For example, Figure 3 shows the smart-host configuration for an Exchange 2000 SMTP connector, which is configured to forward email through the smart host out.your.com. Looking at Figure 4, you can see that the DNS MX records for out.your.com specify that a sending SMTP server should first attempt to connect to the Per2 server to process messages directed to out.your.com, then use the Per1 server as backup. By using the smart-host option in combination with MX record preference settings, you can use one name (e.g., out.your.com) for your perimeter servers and another name (e.g., in.your.com) for your Exchange servers in such a way that you provide fault-tolerant routing and also maintain preferred inbound or outbound server roles.
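To make the difference between the two smart-host approaches concrete, here is a rough Python sketch that models which perimeter server receives each outbound message. It is a simulation under stated assumptions, not Exchange's implementation: the host names per1.your.com and per2.your.com and the preference values 10 and 20 are hypothetical stand-ins for the records in Figure 4, and both function names are made up for illustration.

```python
from itertools import cycle
from typing import AbstractSet, List, Tuple


def rotating_targets(smart_hosts: List[str], message_count: int) -> List[str]:
    """Model a list of smart hosts used in rotating order (equal sharing)."""
    rotation = cycle(smart_hosts)
    return [next(rotation) for _ in range(message_count)]


def mx_targets(mx_records: List[Tuple[int, str]], message_count: int,
               down: AbstractSet[str] = frozenset()) -> List[str]:
    """Model one smart-host name that resolves to MX records.

    Every message goes to the lowest-preference host that is still up,
    so the backup is used only when the preferred host is unavailable.
    """
    ordered = [host for _, host in sorted(mx_records) if host not in down]
    if not ordered:
        return []
    return [ordered[0]] * message_count


# Hypothetical records for out.your.com: Per2 preferred, Per1 as backup.
out_records = [(20, "per1.your.com"), (10, "per2.your.com")]
```

Calling rotating_targets with both hosts alternates between them, whereas mx_targets(out_records, n) sends everything to per2.your.com and switches to per1.your.com only when per2.your.com is listed in down.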

Group to Group
This MX-based smart-host configuration gives you a way to route inbound and outbound messages through your perimeter systems, but you also need a way to direct Internet email between servers within your organization. You need to configure connectors so that internal servers use the Hub2 server primarily for outbound mail and use the Hub1 server only as a fault-tolerant backup.

Usually, the most efficient way to move messages from sender to recipient is to deliver the messages directly from the sender's server to the recipient's server. By default, logically grouped Exchange servers move messages directly from source to destination server. I use the term logically because servers grouped according to routing or administrative purposes are often geographically separated from one another. Having such servers communicate point-to-point might not be optimal (e.g., when the only connection between the geographic points is one dial-up line, when bandwidth between the points is limited, or when a security policy permits only designated servers to pass traffic through a firewall).

To let you better control traffic movement in your network, Exchange 2000 provides routing groups. The routing group's primary purpose is to define the boundaries of full-time, high-bandwidth network connectivity. For example, servers that connect to the same LAN or subnet usually form one routing group. Servers that connect over a WAN are usually in separate routing groups. Many organizations, especially those with geographically distributed servers, deploy their Exchange 2000 servers as a series of routing groups. Just as Site Connectors are the easiest way to link Exchange 5.5 sites, RGCs are the easiest way (and in most cases, the best way) to link Exchange 2000 routing groups. Like a Site Connector, an RGC defines a one-way path for messages to follow, defines the bridgeheads that are responsible for handling message transport, and has a cost value that lets you define preferred message paths.

For example, suppose an organization has its headquarters in Houston, Texas, and offices in Denver and Washington, DC. Each site has full-time, reliable connectivity for all local servers but uses T1 lines for connectivity between sites. The company therefore maintains three Exchange 2000 routing groups: Houston, Denver, and Washington. Figure 5 shows the company's logical routing group layout and Exchange connector definitions. RGCs between the three sites form multiple paths for email delivery. A message can move directly from the Washington routing group to the Denver routing group, or it can be delivered through the Houston routing group. In this example, the direct route is less preferable because the cost assigned to the link between Washington and Denver is higher than the combined cost of the links between Washington and Houston and between Houston and Denver. (The cost of the direct link is 10; the cost of the indirect route is 2.) This type of situation might occur when the link is billed per byte sent, when the link is extremely slow, when the link is accessible only through a dial-on-demand router, or when the added load might affect other applications already using the bandwidth.
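The cost comparison in this example amounts to a least-cost path search over the connectors. The Python sketch below works through it with a standard shortest-path (Dijkstra) search. It illustrates the arithmetic only and isn't a description of how Exchange's routing engine is implemented. I assume that each leg of the indirect route has a cost of 1 (the example states only the combined cost of 2), and because an RGC is a one-way path, each direction appears as its own entry. The function name least_cost_route is hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple


def least_cost_route(links: Dict[Tuple[str, str], int], source: str,
                     target: str) -> Optional[Tuple[int, List[str]]]:
    """Return (total cost, routing groups traversed) for the cheapest route.

    links maps a one-way connector (from_group, to_group) to its cost.
    Returns None when no route exists.
    """
    graph: Dict[str, List[Tuple[str, int]]] = {}
    for (start, end), cost in links.items():
        graph.setdefault(start, []).append((end, cost))

    queue = [(0, source, [source])]  # (cost so far, group, path taken)
    visited = set()
    while queue:
        cost, group, path = heapq.heappop(queue)
        if group == target:
            return cost, path
        if group in visited:
            continue
        visited.add(group)
        for next_group, link_cost in graph.get(group, []):
            if next_group not in visited:
                heapq.heappush(
                    queue, (cost + link_cost, next_group, path + [next_group]))
    return None


# Connector costs: the direct link costs 10; each indirect leg is assumed to cost 1.
links = {
    ("Washington", "Denver"): 10, ("Denver", "Washington"): 10,
    ("Washington", "Houston"): 1, ("Houston", "Washington"): 1,
    ("Houston", "Denver"): 1, ("Denver", "Houston"): 1,
}
```

With these costs, the search from Washington to Denver goes through Houston at a total cost of 2. If you remove the Washington-to-Houston entry to model a failed link, the same call falls back to the direct connector at a cost of 10.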

Like an Exchange 5.5 Site Connector, an Exchange 2000 RGC provides a way to define a bridgehead server that acts as the focal point for messages flowing out of a routing group. Just as you can use a smart host to focus the flow of SMTP message traffic, you can use a bridgehead server to assign roles (e.g., routing, hosting mailboxes) to specific servers. Whereas the Site Connector lets you specify only one bridgehead server, the RGC lets you specify multiple bridgeheads. For a fault-tolerant configuration, then, you should specify at least two servers as bridgeheads for each RGC. However, you can't assign a preference for bridgehead use; Exchange 2000 chooses among the specified bridgeheads at random.

Tie It All Together
When you add a connector to your organization, Exchange updates the link state table with information defining the available routes that a particular type of message can take. The link state table is essentially a map of how to get from point to point, given a specific address type (e.g., Exchange, X.400, SMTP). When a user sends a message, the Advanced Queuing (AQ) engine and the Message Categorizer use information from the link state table to determine the address type and to find the available routes for message delivery. The AQ engine then works with the Routing Engine to see which routes are available and chooses the route with the least cost. The trick to building a fault-tolerant configuration is to make sure that the link state table contains information about multiple routes for the SMTP address spaces. Although you can create one SMTP connector and add multiple Exchange servers (e.g., Hub1 and Hub2 in our example) as bridgeheads, that configuration uses both servers equally for outbound mail. To assign roles to the servers, add multiple SMTP connectors.

Suppose that in the configuration that Figure 5 shows, the Houston routing group contains two SMTP connectors named Internet 1 and Internet 2. You can assign the Exchange server Hub2 as a bridgehead for the Internet 1 connector, then define an SMTP address space with a cost of 1. In the same way, you can assign Hub1 as a bridgehead for the Internet 2 connector, but assign that SMTP address space a higher cost of 10. After this configuration replicates to all servers, the link state table will contain SMTP address space routing definitions that give preference to Hub2 for outbound mail. If Hub2 is down, mail will automatically reroute to Hub1.
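The effect of the two connectors can be modeled as picking the lowest-cost connector whose bridgehead is still available. The following Python sketch is a simplified model under that assumption, not Exchange's link state algorithm. The SmtpConnector class and the choose_connector function are hypothetical names, and the connector names, bridgeheads, and costs are the ones from the example above.

```python
from dataclasses import dataclass
from typing import AbstractSet, List, Optional


@dataclass(frozen=True)
class SmtpConnector:
    name: str
    bridgehead: str  # the server that carries mail for this connector
    cost: int        # cost of the connector's SMTP address space


def choose_connector(connectors: List[SmtpConnector],
                     servers_down: AbstractSet[str] = frozenset()
                     ) -> Optional[SmtpConnector]:
    """Return the lowest-cost connector whose bridgehead is up, or None."""
    available = [c for c in connectors if c.bridgehead not in servers_down]
    return min(available, key=lambda c: c.cost, default=None)


connectors = [
    SmtpConnector(name="Internet 1", bridgehead="Hub2", cost=1),
    SmtpConnector(name="Internet 2", bridgehead="Hub1", cost=10),
]
```

In normal operation, choose_connector(connectors) returns Internet 1, so outbound mail flows through Hub2. Passing {"Hub2"} as servers_down returns Internet 2, which mirrors the automatic reroute to Hub1.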

Scratching the Surface
With a good understanding of how Exchange routes mail and much careful planning and consideration, you can build efficient and fault-tolerant solutions. However, building a good routing design for Exchange isn't a trivial undertaking, especially when you must deal with a variety of connectors and sites. The Web sidebar "A Sample Design" presents the necessary steps to configure the type of design that this article describes. (To read the sidebar, go to http://www.exchangeadmin.com, and enter InstantDoc ID 22859.) However, I strongly suggest that you plan, gather information, build a prototype, test, and experiment to see what works best for your situation. In the next installment of this series, I'll explain some of the configurations you can make to help fortify your Exchange servers against threats such as unauthorized relay.