You can dance with the mainframe dinosaur

Several years ago, experts predicted that IBM mainframes would die a quiet death as most businesses moved to PC-based networks. But time has proven that mainframe hosts continue to play a mission-critical role in the enterprise. Today, about 70 percent of business data--from Wall Street firms to fast-food chains--resides on mainframes. However, business has changed the way it accesses mainframes. Business now uses Windows on the desktop to access networks instead of coaxing old dumb terminals to mainframes. In addition, the computer industry is developing GUI and Web-based applications, letting users interface with mainframe information intuitively.

Microsoft SNA Server is an SNA gateway service that can link your IP-based Windows NT network to the SNA-based glass house (i.e., mainframe data center). Although IBM mainframe and AS/400 operating systems (OSs) can speak TCP/IP now, SNA Server lets you offload TCP processes from mainframe and AS/400 hosts to prevent performance degradation. SNA Server also lets you integrate your traditional mainframe programs with modern NT applications via Microsoft's component object model (COM) and ActiveX technologies without having to rewrite the host applications to understand TCP.

As soon as you add SNA Server to your network, you'll likely find that almost everyone in your company wants to access it and requires 24 x 7 availability. Therefore, before you deploy SNA Server in your enterprise, consider how the SNA Server service's fault tolerance and high availability will function within your network design. Fortunately, Microsoft has done a good job of incorporating fault-tolerant features into SNA Server. In this article, I'll help you understand these fault-tolerant features, and I'll discuss how you can build fault-tolerant SNA services within and across SNA servers and subdomains for your NT network. (See the sidebar "A Quick Glance at SNA Server 4.0 Features," page 178, for a listing of additional SNA Server features and functions.)

SNA Subdomains
SNA Server uses a domain concept that's similar to NT's domain concept. An SNA subdomain is a group of SNA servers within an NT domain. An SNA subdomain comprises a primary server, backup servers, and member servers. One primary server (similar to a Primary Domain Controller--PDC) contains the entire subdomain's master configuration file, which reflects the SNA resources in the subdomain (e.g., SNA servers, link services, physical units--PUs, logical units--LUs, users, and security information). You must designate the first SNA server in a subdomain as the primary server. A backup server (similar to a Backup Domain Controller--BDC) keeps a read-only copy of the configuration file. The backup server starts the SNA Server service by loading this local configuration file. You can promote the backup server to primary server if the primary server fails. Therefore, in order to provide a fault-tolerant configuration file, you must have at least one backup server in an SNA subdomain. An SNA member server doesn't have a local copy of the configuration file. It receives the configuration information from the primary server or a backup server and loads the information in memory. A maximum of 15 SNA servers can exist within an SNA subdomain, but you can create as many SNA subdomains within an NT domain as you want. If you have 15 or fewer SNA servers, you can group them in the same subdomain on the same LAN.
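The role rules in this section can be summarized in a small model. The following Python sketch is only an illustration of those rules, not SNA Server code: the `Subdomain` class and its methods are hypothetical names, and demoting the old primary to a backup during promotion is my assumption.

```python
from dataclasses import dataclass, field

MAX_SERVERS = 15  # per-subdomain limit stated in the article


@dataclass
class Subdomain:
    """Hypothetical model of SNA subdomain roles (not an SNA Server API)."""
    name: str
    roles: dict = field(default_factory=dict)  # server name -> role

    def add_server(self, server, role="member"):
        if len(self.roles) >= MAX_SERVERS:
            raise ValueError("a subdomain holds at most 15 SNA servers")
        if not self.roles:
            role = "primary"  # the first server must be the primary server
        elif role == "primary":
            raise ValueError("the subdomain already has a primary server")
        self.roles[server] = role

    def promote(self, server):
        # Only a backup server keeps a local copy of the configuration file.
        if self.roles.get(server) != "backup":
            raise ValueError("only a backup server can become primary")
        for name, role in self.roles.items():
            if role == "primary":
                self.roles[name] = "backup"  # assumption: old primary is demoted
        self.roles[server] = "primary"


subdomain = Subdomain("SNADOMAIN")
subdomain.add_server("SNASERVER1")            # becomes primary automatically
subdomain.add_server("SNASERVER2", "backup")  # holds a read-only config copy
subdomain.promote("SNASERVER2")               # e.g., after SNASERVER1 fails
```

A subdomain that contains only a primary server and member servers has nothing to promote, which is why you need at least one backup server.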

When you need to make a change to your SNA subdomain configuration, you can modify the master configuration file on the primary server from any SNA server or an SNA Server Manager workstation (an optional component in the SNA Server Client). The primary server will replicate the configuration file to all backup servers after it commits the change. All SNA servers in the same subdomain communicate with one another through server broadcasts to update the configuration information and status of SNA resources. Server broadcasts occur every 60 seconds by default. Microsoft refers to the protocol you choose for server broadcasts (e.g., TCP/IP) as the server-to-server protocol.

Client to Subdomain Access
SNA Server Client, which runs on a workstation as a service or program, lets the workstation use a client-to-server protocol (e.g., TCP/IP, IPX, or NetBEUI) to communicate with SNA servers in the subdomain. Most third-party terminal emulators, Advanced Program-to-Program Communications (APPC) applications, and Microsoft's COM Transaction Integrator (COMTI) running on remote computers use SNA Server Client to connect to SNA servers. A client can search for SNA servers by subdomain name or by server name. To search by subdomain name, the client must specify the subdomain name; however, the client workstation must reside on the same subnet as the SNA servers the client is searching for. This condition is often not the case in a multisubnet network. Searching for SNA servers by server name doesn't carry the same-subnet requirement.

When a client searches by server name, the client needs to specify one or more SNA server names or IP addresses in the SNA Client Configuration dialog box, as Screen 1 shows. A server defined in the client configuration server list, a sponsor server, can be any SNA server in an SNA subdomain. When a client needs to establish a 3270 session with a mainframe, or an APPC session with a mainframe or AS/400 host, the client sends the request to a sponsor server the client chooses from the server list, either randomly or by progressing through the list in order. The server receiving the request sets up a special sponsor connection with the client and returns a list of all available SNA servers in the subdomain. The client can try to connect to every server in the list until it finds a server that can establish the requested 3270 or APPC session.
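The two-step lookup described above (find a sponsor, then try the servers the sponsor reports) can be sketched in Python. This is a simplified simulation under my own assumptions: `find_session_server`, the `subdomain` dictionary, and the `can_host_session` callback are hypothetical stand-ins, not part of SNA Server Client.

```python
import random


def find_session_server(sponsor_list, subdomain, can_host_session,
                        pick_randomly=False):
    """Return (sponsor, session_server), or (None, None) if no sponsor answers.

    subdomain maps each server name to True (available) or False (down).
    can_host_session(server) says whether that server can set up the
    requested 3270 or APPC session.
    """
    candidates = list(sponsor_list)
    if pick_randomly:
        random.shuffle(candidates)  # random choice instead of list order
    for sponsor in candidates:
        if not subdomain.get(sponsor, False):
            continue  # sponsor is unreachable, so try the next one
        # The sponsor returns every available server in the subdomain.
        available = [name for name, up in subdomain.items() if up]
        for server in available:
            if can_host_session(server):
                return sponsor, server
        return sponsor, None  # sponsor answered, but no server can help
    return None, None


subdomain = {"SNASERVER1": False, "SNASERVER2": True, "SNASERVER3": True}
result = find_session_server(
    ["SNASERVER1", "SNASERVER2"],
    subdomain,
    can_host_session=lambda server: server == "SNASERVER3",
)
```

In this example the client skips the failed SNASERVER1, uses SNASERVER2 as its sponsor, and ends up with a session on SNASERVER3, a server that never appeared in its own list.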

An option on the Client Mode tab in the Client Configuration dialog box lets the SNA server update the sponsor server list dynamically so that clients have the widest possible choice of sponsor servers. When you select this option, the sponsor server that establishes the sponsor connection will add all available SNA servers in the subdomain to the client configuration server list, including any new SNA servers you added to the subdomain after the initial SNA Server implementation.

Backup Subdomains
In addition to letting clients choose a server to establish a sponsor connection, SNA Server lets clients automatically access a server in the secondary backup subdomain if all sponsor servers in the primary subdomain are unavailable. For example, a client's primary subdomain is SNADOMAIN, and you want the client to fail over to the backup subdomain SNADOMAIN2 when SNADOMAIN fails. To implement this cross-domain fault tolerance, you define the backup subdomain name or the server names in the backup subdomain on the Client Backup Configuration tab of the SNADOMAIN Properties dialog box. Screen 2 shows a backup configuration with the names of the servers in the backup subdomain defined in the Backup Sponsor Server text box.

The client automatically receives and accepts backup subdomain information by default when it connects to a sponsor server in the primary subdomain. If the client's sponsor connection is active, the client will also automatically receive updated backup subdomain information when you change the client backup configuration of the primary subdomain.

Although SNA Server's cross-domain backup feature works well, if you have enabled dynamic updating of the client configuration sponsor server list, you must carefully plan how to use cross-domain backup with dynamic updating. When you enable the dynamic updating option, the client will replace the sponsor servers listed in its client configuration with the backup sponsor servers from the client backup configuration when the client connects to the backup subdomain. For example, if a client fails to connect to any sponsor servers in its primary subdomain, it will connect to a backup sponsor server in its backup subdomain. At that point, the client will think the primary subdomain doesn't exist and will regard the backup subdomain as its new primary subdomain. The client will then update its primary sponsor server list with the backup information it receives. Even when the primary subdomain comes back online, the client will consider its backup subdomain to be its primary subdomain.

You can resolve this sponsor-recognition problem in three ways. The first solution is to not use the dynamic update option. The second way is to manually change the primary sponsor server list back to the original sponsor servers in client configuration. The third way is more complicated. When a client's original primary subdomain comes back online, you can configure the client's acting primary subdomain (originally the backup subdomain) to use the original primary subdomain as the acting backup subdomain. As I described before, an active client automatically receives the backup subdomain information from the connected subdomain. Next, you take the acting primary subdomain offline manually to fail over the client to the original primary subdomain (masquerading as the current backup subdomain). Using dynamic updating, the client will automatically change the primary sponsor server list back to the original.
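To make the sponsor-recognition problem and the third workaround concrete, here is a minimal Python simulation. It reflects my reading of the behavior described above rather than SNA Server's actual logic. The `Client` class, the helper functions, the `network` dictionary layout, and the name SNASERVER3 for the server in SNADOMAIN2 are all hypothetical.

```python
def subdomain_of(network, server):
    """Return the name of the subdomain that contains server, or None."""
    for name, info in network.items():
        if server in info["servers"]:
            return name
    return None


def set_subdomain_state(network, name, up):
    """Bring every server in a subdomain online (True) or offline (False)."""
    for server in network[name]["servers"]:
        network[name]["servers"][server] = up


class Client:
    """Hypothetical model of a client's sponsor server lists."""

    def __init__(self, sponsors, dynamic_update):
        self.sponsors = list(sponsors)   # primary sponsor server list
        self.backup_sponsors = []        # learned from the connected subdomain
        self.dynamic_update = dynamic_update

    def connect(self, network):
        """Try primary sponsors, then backup sponsors; return the one reached."""
        for server in self.sponsors + self.backup_sponsors:
            home = subdomain_of(network, server)
            if home is None or not network[home]["servers"][server]:
                continue  # unknown or unavailable sponsor, try the next one
            if self.dynamic_update:
                # The connected subdomain's servers overwrite the primary list.
                self.sponsors = sorted(network[home]["servers"])
            backup = network[home]["backup"]
            self.backup_sponsors = (
                sorted(network[backup]["servers"]) if backup else []
            )
            return server
        return None


def demo_network():
    return {
        "SNADOMAIN": {"servers": {"SNASERVER1": True, "SNASERVER2": True},
                      "backup": "SNADOMAIN2"},
        "SNADOMAIN2": {"servers": {"SNASERVER3": True}, "backup": None},
    }


network = demo_network()
client = Client(["SNASERVER1"], dynamic_update=True)
client.connect(network)                            # normal operation
set_subdomain_state(network, "SNADOMAIN", False)   # primary subdomain fails
client.connect(network)                            # fails over to SNADOMAIN2
set_subdomain_state(network, "SNADOMAIN", True)    # primary returns
stuck_on = client.connect(network)                 # still uses SNADOMAIN2

# Third workaround: point SNADOMAIN2's backup at SNADOMAIN, let the client
# learn that, then take SNADOMAIN2 offline to force the client back.
network["SNADOMAIN2"]["backup"] = "SNADOMAIN"
client.connect(network)
set_subdomain_state(network, "SNADOMAIN2", False)
restored_on = client.connect(network)
```

After the primary subdomain returns, the client in this model keeps connecting to SNASERVER3 because its primary list no longer names any server in SNADOMAIN. Only the forced failover puts SNASERVER1 and SNASERVER2 back into that list.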

Fault Tolerance for Native Host Access
So far, you've seen that SNA Server provides a good client-to-server fault-tolerant mechanism that lets clients connect to an available server in the primary subdomain or a backup subdomain. Now let's look at the fault-tolerant methods SNA Server offers from server to host when a connection or server fails.

From the host perspective, an SNA server appears as a device in the hierarchical SNA network or the peer-oriented APPC network. An SNA server connects to a mainframe through a Token-Ring network, Ethernet, Fiber Distributed Data Interface (FDDI), Synchronous Data Link Control (SDLC), X.25, or Enterprise Systems CONnection (ESCON) channel. Likewise, an SNA server connects to an AS/400 host through any of these links except the ESCON channel. You can have many host connections (i.e., PUs) on each physical link. You can also have multiple physical links on one server, which lets an SNA server connect to multiple hosts.

For fault tolerance, you must define two or more connections on one or more SNA servers for mainframe 3270 access, and multiple LU6.2 pairs for AS/400 5250 access or APPC applications. SNA Server uses a pooling method to hot-backup mainframe connections on a single server or multiple servers, and an LU alias pair mechanism to hot-backup AS/400 access among multiple servers.

To understand these fault-tolerance mechanisms for mainframes, let's investigate the scenario Screen 3 illustrates. Suppose your SNA subdomain SNADOMAIN contains two servers: SNASERVER1 and SNASERVER2. A client needs to connect to the mainframe through the subdomain, so you create a connection, 1DEMO1, on SNASERVER1, and another connection, 2DEMO1, on SNASERVER2. (Screen 3 doesn't show 2DEMO1.) You insert 253 or fewer LUs in each connection. Then you create an LU pool, 3270POOL, for the subdomain, and you drag all LUs from both connections to the pool. The right panel in Screen 3 shows all the LUs I defined for the pool 3270POOL. The highlighted LUs in Screen 3 belong to the connection 1DEMO1 on SNASERVER1, and the remaining LUs are from the connection 2DEMO1 on SNASERVER2. When a client requests a 3270 session, the SNA subdomain assigns the client an available LU from the pool. If a connection (e.g., 1DEMO1) or a server (e.g., SNASERVER1) stops working because of a hardware or software failure, the connection 2DEMO1 on SNASERVER2 can still provide mainframe access for the subdomain. You can add as many LUs from different connections to an LU pool as you like. SNA Server supports up to 30,000 sessions per server, which can turn into a very large number of sessions if you have several SNA servers in a subdomain. The LU pool can also provide load balancing between multiple connections and servers.
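The pooling behavior in this scenario can be sketched as follows. The sketch is a simplified Python model, not SNA Server's allocation algorithm: the `LUPool` class and the LU names are hypothetical, and it hands out the first free LU instead of balancing load across connections.

```python
MAX_LUS_PER_CONNECTION = 253  # per-connection limit mentioned in the article


class LUPool:
    """Hypothetical model of an LU pool that spans connections and servers."""

    def __init__(self, name):
        self.name = name
        self.lus = {}      # LU name -> (server, connection)
        self.in_use = {}   # LU name -> client holding the session
        self.down = set()  # names of failed servers or connections

    def add_connection(self, server, connection, lu_names):
        if len(lu_names) > MAX_LUS_PER_CONNECTION:
            raise ValueError("a connection holds at most 253 LUs")
        for lu in lu_names:
            self.lus[lu] = (server, connection)

    def _is_down(self, lu):
        server, connection = self.lus[lu]
        return server in self.down or connection in self.down

    def fail(self, name):
        """Mark a server or connection as failed; return interrupted clients."""
        self.down.add(name)
        dropped = [lu for lu in self.in_use if self._is_down(lu)]
        return [self.in_use.pop(lu) for lu in dropped]

    def acquire(self, client):
        """Assign the first free LU on a working connection, or None."""
        for lu in self.lus:
            if lu not in self.in_use and not self._is_down(lu):
                self.in_use[lu] = client
                return lu
        return None


pool = LUPool("3270POOL")
pool.add_connection("SNASERVER1", "1DEMO1", ["LU1001", "LU1002"])
pool.add_connection("SNASERVER2", "2DEMO1", ["LU2001", "LU2002"])
first_lu = pool.acquire("user1")    # an LU on 1DEMO1
interrupted = pool.fail("1DEMO1")   # user1's session is terminated
second_lu = pool.acquire("user1")   # the new session lands on 2DEMO1
```

The model shows that a failed connection drops its sessions, and that the pool can immediately hand out an LU from a surviving connection when the user reconnects.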

To set up fault tolerance for AS/400 or APPC application access, you can define a local (i.e., SNA server) and remote (i.e., AS/400 or APPC application) LU pair on both servers (SNASERVER1 and SNASERVER2) in Screen 3. You must use matching local and remote LU alias names for both LU pairs. For example, suppose you have one LU pair on SNASERVER1 (LOCAL1 and REMOTE1) and another LU pair on SNASERVER2 (LOCAL2 and REMOTE2). You use the same local LU alias name, such as LOCAL, for LOCAL1 on SNASERVER1 and for LOCAL2 on SNASERVER2. Similarly, you use the same remote LU alias name, such as REMOTE, for REMOTE1 on SNASERVER1 and for REMOTE2 on SNASERVER2. When the client asks for the alias LU pair, the SNA subdomain chooses an available physical LU pair on either SNASERVER1 or SNASERVER2.

You can extend the alias LU concept to more than two servers in the subdomain, and you can have multiple alias LU pairs for a server. You can assign default alias LU pairs to users, groups, or everyone. If the default alias LU pair is not available, a client can choose another alias LU pair, or the SNA subdomain can dynamically assign an available alias LU pair to the client. The alias LU pair mechanism provides fault tolerance and hot backup, as well as load balancing.
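As a rough illustration of alias resolution, the following Python sketch maps one alias pair to whichever physical LU pair sits on a running server. The `LUPair` type and the `resolve_alias_pair` function are hypothetical, and choosing the first match is my simplification. The article doesn't say how the subdomain picks among several available pairs.

```python
from typing import NamedTuple, Optional


class LUPair(NamedTuple):
    server: str
    local_lu: str
    remote_lu: str
    local_alias: str
    remote_alias: str


def resolve_alias_pair(pairs, local_alias, remote_alias,
                       servers_up) -> Optional[LUPair]:
    """Return the first physical LU pair that matches the aliases and lives
    on an available server, or None if every matching server is down."""
    for pair in pairs:
        matches = (pair.local_alias == local_alias
                   and pair.remote_alias == remote_alias)
        if matches and pair.server in servers_up:
            return pair
    return None


pairs = [
    LUPair("SNASERVER1", "LOCAL1", "REMOTE1", "LOCAL", "REMOTE"),
    LUPair("SNASERVER2", "LOCAL2", "REMOTE2", "LOCAL", "REMOTE"),
]
# SNASERVER1 is down, so the alias pair resolves to SNASERVER2's LUs.
chosen = resolve_alias_pair(pairs, "LOCAL", "REMOTE", servers_up={"SNASERVER2"})
```

Because the client asks only for LOCAL and REMOTE, it doesn't need to know which server or which physical LU pair ends up carrying the session.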

SNA Server's LU pooling and alias pairing mechanisms give users fault-tolerant SNA services. However, the failover process is not transparent. A failed connection will terminate all active sessions on the connection, interrupting users participating in those sessions. The interrupted users must restart the sessions. Fortunately, SNA Server can use available LUs from the LU pool or available alias pairs to immediately establish new sessions, according to the requirements of the requested sessions.

Fault Tolerance for TN3270 and TN5250 Clients
SNA servers can function as TN3270 servers to let TN3270 clients on IP machines access mainframe hosts. SNA servers can also function as TN5250 servers to let TN5250 clients on IP machines access AS/400 hosts. You use LU pooling and alias mechanisms to implement fault tolerance for an SNA TN3270 or TN5250 server. For an SNA TN3270 server, the LU type you use is LU Application (LUA). You can add a maximum of 253 LUAs to a connection to a server. You group all LUAs from multiple connections across multiple servers into an LUA pool for the subdomain, and you assign the defined LUA pool to an SNA TN3270 server. You assign an LU alias pair to an SNA TN5250 server.

SNA Server's fault tolerance for TN3270 and TN5250 clients is not perfect. The SNA subdomain can't automatically switch the TN3270 or TN5250 client to another TN3270 or TN5250 server if the primary server fails. In addition, a TN3270 or TN5250 client can talk to only one TN3270 or TN5250 server to initialize a session. This single-server contact becomes a single point of failure, even though you group multiple connections for the TN3270 server and assign an alias LU pair to the TN5250 server. To overcome the single-server-contact shortcoming, you need to dedicate one LUA pool to each TN3270 server and one alias LU pair to each TN5250 server in the SNA subdomain. That way, if one server fails, the client can switch to another server manually. However, the manual-switching process can be tedious for end users. The good news is that you can automate the switching process.

One way to automate SNA TN3270 and TN5250 server switching is to use Microsoft Cluster Server (MSCS). To use MSCS to automate switching, install it on two SNA servers, create a resource group, add an IP address for the resource group, define the TN3270 or TN5250 service as a generic service in the resource group, and specify a primary server for the resource group. Set the TN3270 or TN5250 service to start manually on both SNA servers on which you installed MSCS. When MSCS starts, it brings the primary TN3270 or TN5250 service online. Then, if the primary TN3270 or TN5250 server fails, MSCS automatically starts the backup server. To connect to the TN3270 or TN5250 server, a client uses only the single IP address you defined for the cluster resource group, or the Domain Name System (DNS) name of that IP address.

Another automatic-switching method that will be possible with NT 5.0 is to map a host name to the IP addresses of TN3270 or TN5250 servers in your DNS database. DNS's round-robin function, which rotates through the IP addresses associated with a host name, will not provide the expected fault tolerance by itself, because DNS keeps handing out a server's address even after that server fails. (For information about round-robin functionality in DNS, see Douglas Toombs, "Load Sharing for Your NT Web Server," April 1998.) However, with NT 5.0's dynamic DNS you might be able to develop a custom utility to dynamically add and remove host name and address records in the DNS database when the status of monitored TN3270 and TN5250 services changes.
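A short Python simulation shows why plain round-robin resolution falls short. The `round_robin` helper is a hypothetical stand-in for a DNS server's rotation, and the addresses come from a range reserved for documentation.

```python
from itertools import cycle


def round_robin(addresses):
    """Return a resolver that hands out the addresses in rotation,
    without checking whether the server at each address is running."""
    ring = cycle(addresses)
    return lambda: next(ring)


resolve = round_robin(["192.0.2.10", "192.0.2.11"])
alive = {"192.0.2.11"}  # the TN3270 server at 192.0.2.10 has failed

answers = [resolve() for _ in range(4)]
dead_answers = [address for address in answers if address not in alive]
```

Half of the lookups in this example still return the failed server's address. The custom dynamic DNS utility mentioned above would have to remove that record when it detects the failure.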

One automatic-switching solution currently in development is a TN3270 or TN5250 proxy server for networks. A proxy server contains a list of TN3270 or TN5250 servers and automatically returns the name of an available server to a client request. OpenConnect Systems, a company known for its host-access products, is planning to implement proxy server fault-tolerance in its WebConnect product. WebConnect will let a Microsoft Internet Explorer (IE) or Netscape Communicator browser access mainframes and AS/400 hosts through SNA Server's TN3270 and TN5250 services.
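The core of such a proxy, returning the first server in a list that answers, can be sketched in a few lines of Python. This is a generic illustration and has nothing to do with how WebConnect works internally. The `pick_available` and `tcp_probe` functions and the host names are hypothetical, and port 23 (the usual Telnet port) is an assumption about where the TN3270 service listens.

```python
import socket


def tcp_probe(host, port=23, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.
    Port 23 is an assumed default; adjust it to match your TN3270 service."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_available(servers, probe=tcp_probe):
    """Return the first server in the list that answers the probe, or None."""
    for host in servers:
        if probe(host):
            return host
    return None


# Usage with a stand-in probe, so the example runs without a network:
running = {"tn3270-b.example.com"}
choice = pick_available(
    ["tn3270-a.example.com", "tn3270-b.example.com"],
    probe=lambda host: host in running,
)
```

A successful TCP connection proves only that something is listening on the port. A production-quality check would also confirm that the server can actually establish a host session.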

Three Deployment Models
SNA Server provides three deployment models--branch-based, centralized, and distributed--that let you flexibly integrate your NT network with mainframe and AS/400 hosts. Microsoft refers to these models as SNA Open Gateway Architecture (SOGA).

In the branch-based model, you place SNA servers in branch offices (a term used in SOGA) that are close to end users. An SNA server communicates with hosts located in a data center, central site, or headquarters with SNA protocols through a WAN link, such as a leased line, frame relay, or asynchronous transfer mode (ATM). You need two SNA servers in the branch office subdomain for fault tolerance and high availability. If you have only one SNA server, you might have to let one branch office fail over to another branch office's SNA subdomain, which can create difficulties because of bandwidth or administrative constraints.

If you have hundreds of remote offices but no SNA-enabled network administrators, SNA Server's centralized model will fit your network's administration plan. With the centralized model, you place SNA servers in a central location and simply group them into a big SNA subdomain to provide fault tolerance and hot backup. However, the centralized model can result in a slow client response if your WAN doesn't have adequate bandwidth, because end users must access the central SNA servers through the WAN.

SNA Server's distributed model combines the branch-based and centralized models to provide the best performance and fault tolerance. Figure 1, page 176, illustrates the distributed model, in which you place one or more SNA servers in a branch office, and two or more SNA servers in a data center. The central servers connect to local hosts through a very fast link, such as an FDDI. The branch servers, which are close to end users, use a LAN adapter to connect their local link services to the distributed link services on the central servers through a routing protocol (e.g., TCP/IP) across a WAN link instead of an SNA protocol. This way, branch servers can reach hosts faster. In configuring the branch server, you can enable multiple distributed link services in the central site for fault tolerance and hot backup. The branch servers can use alternative link services at a secondary site, such as a backup data center, in case the primary site fails. With the distributed model, you can easily implement a cross-domain failover. The central SNA subdomain can be a backup subdomain for all branch offices. If the branch SNA subdomain is not available, the remote client can fail over to the central subdomain to establish a new session with minimum interruption.

Dancing with the Dinosaur
Now you know about the fault-tolerant features in SNA Server 4.0. Before you move your mainframe and AS/400 users to an NT-based network, you need to include SNA Server's fault tolerance and hot backup in your network plan. When you combine a carefully planned SNA Server deployment with high-quality server hardware and Microsoft cluster services for Microsoft Transaction Server (MTS), Internet Information Server (IIS), and COMTI, you can deliver reliable SNA services to your users and application developers for 24 x 7 access. Then you can safely dance with the mainframe dinosaur without carrying a pager at night.