Fibre channel Storage Area Network enhances storage management

Managing and administering large amounts of storage can be challenging when you're performing installations, configurations, backups, and failure recoveries. However, Storage Area Networks (SANs) make your task easier because SANs let servers access network-based storage as though the storage were local. SANs also let servers access network-based storage faster than an Ultra SCSI interface can. The Amdahl LVS 4600 Storage System is a SCSI or fibre channel-based SAN that offers high availability through full redundancy, support for Windows NT and several UNIX platforms, and the performance and scalability that large enterprise applications need. The system I tested supported a dual fibre channel host attachment and included twenty 18GB Ultra 2 SCSI disk drives in two cabinets.

Amdahl designed its SAN architecture to provide servers with high-speed access to SAN-attached storage. The LVS 4600 uses a network that is separate from the LAN that client computer systems use. Each server is connected to a client network (typically by Fast Ethernet) and has a separate fibre channel or Ultra SCSI connection to the SAN. SAN-support software on each server lets the server see and communicate with SAN-attached storage as though the storage were a locally attached SCSI subsystem. For example, when you want your servers to access shared storage, you can install both a traditional network that your client computers use to access each server and a second network that your servers use to access the shared storage. That second network can be fibre channel-based (to give your servers 100MBps full-duplex access) or Ultra SCSI-based (to give your servers 40MBps data access).
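To put the two attachment rates in perspective, the following minimal sketch estimates the best-case time to move a file at each quoted rate. It assumes that the link sustains its full nominal bandwidth and that 1GB equals 1,024MB; real transfers take longer because of protocol overhead and disk limits. The function name is illustrative only.

```python
# Nominal link rates quoted in the article, in megabytes per second.
FIBRE_CHANNEL_MBPS = 100
ULTRA_SCSI_MBPS = 40


def best_case_seconds(size_mb: float, rate_mbps: float) -> float:
    """Return the ideal transfer time, ignoring protocol overhead and disk limits."""
    if rate_mbps <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mbps


if __name__ == "__main__":
    size_mb = 5 * 1024  # a 5GB file, assuming 1GB = 1,024MB
    for name, rate in (("fibre channel", FIBRE_CHANNEL_MBPS),
                       ("Ultra SCSI", ULTRA_SCSI_MBPS)):
        print(f"{name}: {best_case_seconds(size_mb, rate):.1f} seconds")
```

Under these assumptions, the fibre channel link moves the same file in 40 percent of the time that the Ultra SCSI link needs.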

The network heart of the LVS 4600 is McDATA's ED-5000 Enterprise Fibre Channel Director, a fibre channel switch that you can configure for as many as 32 fibre channel ports. (For more information about the ED-5000 Director, visit McDATA's product Web site at http://www.mcdata.com/products/datasheets/5000web.html.) The LVS 4600 consists of a base unit and one or more disk drive trays. Each base unit includes five Ultra SCSI channels that connect the disk drives, and slots for two Availability Managers. Amdahl offers Availability Managers for multimode fibre channel or Ultra SCSI SAN-host attachments. Both versions include processors and logic that support failover when you install redundant Availability Manager modules. You can configure two Availability Managers as a fault-tolerant pair that the support software refers to as Controller A and Controller B. Each Availability Manager can have as much as 128MB of read-write cache and can include battery backup. With redundant Availability Managers, you can configure the cache as mirrored for additional fault tolerance or unmirrored for larger cache capacity. Each drive tray can hold 10 low-profile disk drives. You can connect all 10 of a tray's disks to one SCSI channel, or you can split the tray across two SCSI channels with five disks on each channel. You can connect a maximum of 50 disk drives to one base unit when you connect one drive tray with 10 drives to each of the five Ultra SCSI channels. Amdahl currently supports 10,000rpm drives in 9.1GB, 18.2GB, and 36.4GB capacities, and 7200rpm drives in 9.1GB, 18.2GB, and 50.1GB capacities.
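The drive-count limit and the resulting raw capacity follow from simple arithmetic. The sketch below works through it with the figures quoted above; the function names are illustrative, and the results are raw capacity before any RAID overhead.

```python
SCSI_CHANNELS = 5      # Ultra SCSI channels per base unit
DRIVES_PER_TRAY = 10   # low-profile drives per drive tray


def max_drives(channels: int = SCSI_CHANNELS, per_tray: int = DRIVES_PER_TRAY) -> int:
    """Maximum drives per base unit: one full tray on each SCSI channel."""
    return channels * per_tray


def raw_capacity_gb(drive_count: int, drive_gb: float) -> float:
    """Raw (pre-RAID) capacity in GB, rounded to one decimal place."""
    return round(drive_count * drive_gb, 1)


if __name__ == "__main__":
    print(max_drives())                          # drives in a fully populated base unit
    print(raw_capacity_gb(max_drives(), 50.1))   # largest supported drive size
    print(raw_capacity_gb(20, 18.2))             # drive count and size of the tested system
```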

Amdahl's A+LVS software provides support for NT and several UNIX versions. A+LVS software installs components that let you configure disk drives in RAID-based logical volumes and that provide the driver support that the host OS needs to access the volumes. A+LVS software also supports failover when the host includes installed redundant host bus adapters (HBAs). Amdahl provides two A+LVS software versions. The A+LVS host version communicates over the host-to-LVS 4600 connection (which is the fibre channel in my test unit) to configure arrays. The A+LVS/NetDirect version uses an Ethernet 10Base-T connection to the LVS 4600 base unit for configuration.

Installation
McDATA engineers who represent Amdahl helped me with my initial LVS 4600 installation. The product includes two sets of documentation—one for the ED-5000 Director and one for the LVS 4600. The LVS 4600 documentation needs an expanded explanation of fault-tolerant features and their configuration.

After you're familiar with the LVS 4600 architecture and configuration tools, you'll find setting up and configuring the SAN a breeze. A complete installation includes several steps. You first need to cable all the LVS 4600 components together and power them up. Next, connect the servers that will use the SAN storage. This step requires that you install one or two fibre channel HBAs, connect to the ED-5000 Director with fiber-optic cables, install the HBA driver, and install the LVS support software in each computer. You next configure the ED-5000 Director's zones, which are groups of devices that will talk with one another. Your last step is to use the host's LVS software to configure the disk drives in RAID volumes.

The physical setup is straightforward. You use standard 68-pin, single-ended SCSI cables to connect the drive trays to the base unit, and you use a pair of full-duplex, multimode fiber-optic cables to connect the base unit to the ED-5000 Director. Amdahl provides an IBM PS/2 computer with the ED-5000 Director as a configuration and control console, although you can install the Enterprise Fabric Connectivity (EFC) Management software on NT PCs that support a 10Base-T connection to the ED-5000 Director. The ED-5000 Director control console includes preinstalled, ready-to-use EFC Management software. One control console can configure and control as many as 36 switches. Your first step in the configuration process is to give the ED-5000 Director a user-friendly name, which is handy when you manage several ED-5000 Directors from one EFC Management console. You can choose from several views of the ED-5000 Director. For example, the Physical View displays a diagram of the ED-5000 Director's front, each slot, and each installed module, and virtual LEDs that indicate each module's health and active state. The Port List View displays a list of the installed fibre channel ports and their active state, as Screen 1 shows.

When the Availability Managers in the LVS 4600 base unit are cabled to the ED-5000 Director and both units are powered on, link lights on the ED-5000 Director's port card show the port to be in Active mode. Every fibre channel interface has a unique World Wide Name (WWN) number, which is analogous to a NIC's media access control (MAC) address. When you click an active port in the ED-5000 Director's EFC Management software's Port List View, the port's properties display, including the switch port's and the attached device's WWNs. You need to know the WWN of each device that is attached to the ED-5000 Director to configure which devices will communicate with one another.
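Because you end up copying WWNs between the Port List View and your zone definitions, it helps to keep them in one consistent form. A WWN is a 64-bit identifier that is commonly written as eight colon-separated hexadecimal bytes. The following sketch normalizes a WWN string to that form; the function name and the sample value are hypothetical and don't come from the test system.

```python
import re

# A WWN is 64 bits long, which is 16 hexadecimal digits.
_SIXTEEN_HEX_DIGITS = re.compile(r"[0-9a-fA-F]{16}")


def normalize_wwn(text: str) -> str:
    """Return a WWN as eight lowercase, colon-separated hex bytes."""
    digits = text.strip().replace(":", "").replace("-", "")
    if not _SIXTEEN_HEX_DIGITS.fullmatch(digits):
        raise ValueError(f"not a 64-bit WWN: {text!r}")
    digits = digits.lower()
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))


if __name__ == "__main__":
    # Hypothetical WWN, used only to show the formatting.
    print(normalize_wwn("10000000C9123456"))
```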

After the LVS 4600 can communicate with the ED-5000 Director, you can set up the servers that will use storage on the SAN. The physical fibre channel interface is an HBA. For my test, Amdahl provided LightPulse LP7000 PCI HBAs that feature a 32-bit PCI interface, and LightPulse LP8000 PCI HBAs that feature a 64-bit PCI interface. (For information about these Emulex products, visit the company's Web site at http://www.emulex.com/index/html.) I installed the HBAs in 32-bit slots in each of the servers that I used in my test. For high availability, the LVS 4600 supports redundant HBAs in servers. I used the Emulex SCSI Mini Port 4.31 driver and initially installed one HBA in each server for my test. The LVS 4600 fully supports the Emulex miniport driver and offers limited support for a full port driver in multihost contexts. From Control Panel, click the SCSI Adapters icon and select the Drivers tab to install the miniport driver. This procedure is similar to the way you install a standard SCSI controller driver. Reboot to complete the miniport driver installation. Figure 1 illustrates a typical LVS 4600 implementation.

You can use the A+LVS CD-ROM to install the software on the servers. After the installation completes (i.e., after the OS reboots), you need to change the topology driver parameter so that the HBA can negotiate a link with the ED-5000 Director. You need to update the firmware in the LVS 4600 to be consistent with the A+LVS software version your system is using. The update is an easy process and takes only a few minutes. The LVS 4600 firmware files ship on a separate CD-ROM, and you copy the firmware files to the LIB directory that the A+LVS software creates during the installation process. Select the Firmware Update function from the Management & Tuning program, which the A+LVS software installed during installation, to complete the update process.

Next, configure the ED-5000 Director so that the server can communicate with the LVS 4600. You define zones and zone sets to control which ED-5000-connected devices can communicate with one another. A zone is a group of devices that you configure to communicate with one another, and a zone set is a collection of one or more zones that will operate together. Zone members can be a specific port on the ED-5000 Director, a WWN, or an alias that you define to include one or more WWNs. For my test, I created an alias for the WWN of each ED-5000 Director port and each server HBA. Doing so let me use the names that I created rather than the WWN numbers. You place servers that have only one HBA in zones that contain both Availability Managers. You place servers with redundant HBAs in two zones, each with one HBA and one Availability Manager.
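The zoning rules can be modeled in a few lines. The sketch below is a simplified model and not the EFC Management software's interface: a zone is a set of aliases, a zone set is a collection of named zones, and two devices can communicate only when at least one zone in the active zone set contains both. All alias and zone names are hypothetical.

```python
from typing import Dict, FrozenSet

Zone = FrozenSet[str]  # a zone is a set of device aliases


def can_communicate(a: str, b: str, active_zone_set: Dict[str, Zone]) -> bool:
    """Two devices can talk only if at least one active zone contains both."""
    return any(a in members and b in members
               for members in active_zone_set.values())


# A server with one HBA goes in a zone that contains both Availability Managers.
# A server with redundant HBAs gets two zones, each with one HBA and one
# Availability Manager.
ZONE_SET: Dict[str, Zone] = {
    "web_single": frozenset({"web_hba1", "am_a", "am_b"}),
    "sql_path_a": frozenset({"sql_hba1", "am_a"}),
    "sql_path_b": frozenset({"sql_hba2", "am_b"}),
}

if __name__ == "__main__":
    print(can_communicate("web_hba1", "am_b", ZONE_SET))
    print(can_communicate("sql_hba1", "am_b", ZONE_SET))
```

In this model, the single-HBA server reaches both Availability Managers through one zone, whereas each HBA in the redundant pair reaches only the Availability Manager in its own zone.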

The Catch-22 Zone
Zones that you configure on the ED-5000 Director let you define which ports on the switch can talk to one another. At a minimum, a zone must include the server's WWN and the WWN of the Availability Manager that controls the drive group on which the server's dedicated LUN is located. (I define drive groups shortly.) When a zone is active, the server sees all the LUNs assigned to the zone's Availability Manager and treats these LUNs as local SCSI volumes. However, if you have several servers and each server owns different LUNs that the same Availability Manager controls, a user at one server can unintentionally write to and corrupt a LUN that another server owns. If several servers share an LVS 4600, you want to give each server exclusive access to the LUNs it will be using. NT assumes that it has exclusive use of the devices it recognizes as local disk drives, so sharing one LUN among several servers won't work. However, when you use the Emulex miniport driver (the only driver that Amdahl fully supports), the LVS 4600 architecture doesn't include the ability to assign a LUN to a specific server.

Failover compounds the problem. In the LVS 4600, Availability Managers implement failover. When an Availability Manager fails or a link between an Availability Manager and an HBA fails, the second Availability Manager in the pair assumes control of the failed module's drive groups and their associated LUNs. When you set up zones so that a server has access only to the Availability Manager that failed, that server can no longer access its data. Each server must be able to access both Availability Managers to access its data after a failover. Because the miniport driver lets a server access every LUN on every Availability Manager that the driver can see, all servers can access all LUNs that the LVS 4600 controls. You need to be careful that one server doesn't unintentionally corrupt a disk that another server owns. Of course, when the LVS 4600 is dedicated to one server, you don't have this problem.
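The following sketch illustrates the exposure under the behavior described above, in which a server sees every LUN on each Availability Manager it is zoned with. It is a simplified model with hypothetical server, LUN, and controller names, not a representation of the driver's internals.

```python
from typing import Dict, Set

# Hypothetical layout: which Availability Manager controls each LUN.
LUN_OWNER: Dict[str, str] = {"lun1": "am_a", "lun2": "am_a", "lun3": "am_b"}

# Which Availability Managers each server is zoned with. Both servers need
# both controllers so that they can still reach their data after a failover.
SERVER_ZONING: Dict[str, Set[str]] = {
    "sql": {"am_a", "am_b"},
    "web": {"am_a", "am_b"},
}


def visible_luns(server: str) -> Set[str]:
    """LUNs a server sees: every LUN on each controller it is zoned with."""
    managers = SERVER_ZONING[server]
    return {lun for lun, owner in LUN_OWNER.items() if owner in managers}


def shared_exposure(server_a: str, server_b: str) -> Set[str]:
    """LUNs that both servers can write to and could therefore corrupt."""
    return visible_luns(server_a) & visible_luns(server_b)


if __name__ == "__main__":
    print(sorted(shared_exposure("sql", "web")))
```

In this model, zoning both servers to both controllers for failover leaves every LUN visible to both servers, which is the catch.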

Emulex has designed the SCSI Port 1.24 driver, a full port driver with more functionality than the SCSI Mini Port 4.31 driver. You use the full port driver to limit a server's view of the LVS 4600 to only the LUNs that the server needs. However, Amdahl didn't fully support the SCSI Port 1.24 driver at press time because the company hadn't completed its quality assurance testing on the product.

Data-Backup Software Vendors
COMPUTER ASSOCIATES 516-342-5224
http://www.cai.com
LEGATO SYSTEMS 650-812-6000
http://www.legato.com
VERITAS SOFTWARE 650-335-8000
http://www.veritas.com

Configuring Storage
The next step in my test was to define storage volumes on the LVS 4600 that my NT server could access. To configure the LVS 4600 disk drives, you first need to define drive groups and LUNs within a drive group. A drive group is a set of one or more disk drives that you define as one RAID array. Because you want to maintain fault tolerance when you define drive groups, you typically create a drive group with one disk drive from each SCSI channel. Use a RAID level other than RAID 0, which doesn't have fault tolerance. With this configuration, the RAID array can continue to operate even if one SCSI channel fails. You can create one or more LUNs in a drive group similarly to the way you create one or more partitions on a disk drive in NT. You use the Create LUN menu option in the A+LVS Configuration utility to configure LUNs and drive groups. The Configuration utility defaults to one drive on each SCSI controller as a RAID 5 array (i.e., striping with fault-tolerant parity data that is spread among all the disks) and one LUN filling the drive group's space. Because the system I tested had drives attached to four SCSI channels, A+LVS created a four-drive RAID 5 array and one large LUN. The system also supports RAID 0 (striping with no fault tolerance), RAID 1 (mirroring), and RAID 0+1 (mirrored stripe sets). You can override the utility's default choices for disk drive selection, segment size, cache use, and LUN size. If you don't allocate all of a drive group's space initially, you can create additional LUNs later.
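The usable capacity of a drive group depends on the RAID level you choose. The sketch below applies standard RAID arithmetic to the four-drive group described above, assuming 18.2GB drives; it isn't output from the A+LVS Configuration utility. Treating RAID 1 and RAID 0+1 alike (half the raw space) is a simplifying assumption, and the function name is illustrative.

```python
def usable_gb(level: str, drives: int, drive_gb: float) -> float:
    """Usable capacity of one drive group under standard RAID arithmetic."""
    if level == "0":
        data_drives = drives           # striping only, no redundancy
    elif level in ("1", "0+1"):
        if drives < 2 or drives % 2:
            raise ValueError("mirrored levels need an even number of drives")
        data_drives = drives // 2      # every drive has a mirror copy
    elif level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        data_drives = drives - 1       # one drive's worth of space holds parity
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return round(data_drives * drive_gb, 1)


if __name__ == "__main__":
    for level in ("0", "0+1", "5"):
        print(f"RAID {level}: {usable_gb(level, 4, 18.2)}GB usable")
```

With four drives, RAID 5 gives up one drive's worth of space to parity, whereas the mirrored levels give up half the raw space.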

You can configure the LVS 4600 with redundant Availability Managers. Each Availability Manager has a 128MB cache and a fibre channel interface. Two Availability Managers provide the fault-tolerant mirrored cache and a second fibre connection to the ED-5000 Director. When you install redundant Availability Managers, you use the A+LVS utility to assign each drive group to one module or the other of the pair as the primary communication path to hosts. The Configuration utility automatically assigns all newly configured drive groups to one Availability Manager; you can manually change this assignment to control the I/O workload that routes through each Availability Manager. When one Availability Manager fails, all of its drive groups fail over to the second Availability Manager, and applications continue uninterrupted. However, this scenario works only when the ED-5000 Director zones let each server communicate with both Availability Manager modules.
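A small model makes the zoning dependency concrete. In the sketch below, drive groups move from a failed Availability Manager to the survivor, and a server keeps access only if its zones include the surviving module. The drive-group names, module labels, and function names are hypothetical, and the model ignores the time that failover takes.

```python
from typing import Dict, Set


def fail_over(assignments: Dict[str, str], failed: str, survivor: str) -> Dict[str, str]:
    """Return a new mapping with the failed module's drive groups moved to the survivor."""
    return {group: (survivor if owner == failed else owner)
            for group, owner in assignments.items()}


def server_can_reach(group: str, assignments: Dict[str, str],
                     zoned_managers: Set[str]) -> bool:
    """A server reaches a drive group only through a module it is zoned with."""
    return assignments[group] in zoned_managers


if __name__ == "__main__":
    before = {"dg1": "A", "dg2": "B"}
    after = fail_over(before, failed="A", survivor="B")
    print(after)
    print(server_can_reach("dg1", after, {"A"}))       # zoned to the failed module only
    print(server_can_reach("dg1", after, {"A", "B"}))  # zoned to both modules
```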

ED-5000 Enterprise Fibre Channel Director
To meet the needs of high-availability applications, McDATA designed the ED-5000 Director with a high degree of fault tolerance. The system consists of the ED-5000 Director and an IBM PC (the EFC Server) that functions as a system manager. One EFC Server can monitor and manage as many as 36 ED-5000 Directors, which are connected by a 10Mbps Ethernet network. The EFC Server runs the EFC Management software under NT Workstation. The EFC Server also implements the Call Home feature. When you configure and enable this feature, it calls the Amdahl Worldwide Consumer Support Center when hardware failures occur. The EFC Server uses NT's Dial-Up Networking features and SNMP reporting so that the ED-5000 Director can take advantage of the Call Home feature.

McDATA draws a distinction between a fibre channel switch and a fibre channel director; director implies the highest fault-tolerance levels. The company designed the ED-5000 Director for high availability. A fully configured ED-5000 Director includes a chassis with redundant power supplies; eight four-port, full-duplex port cards for up to 10 kilometers of single-mode fibre or up to 500 meters of multimode fibre; two central processor (CTP) cards; two central memory module (CMM) cards; and two Message Path Controller (MPC) cards. The CTP cards include the 10Base-T interface for the EFC Server connection and coordinate the ED-5000 Director's activities, including power-up system initialization. The CMM cards provide the storage that supports each port's store-and-forward function. This function provides a temporary holding area for data that a device attached to the originating port sends to another port. The MPC cards are the heart of the switching function because they coordinate the data transfer between ports. All cards are hot-pluggable, so you can replace them without interrupting system operation. The system contains two of each of the three types of cards, so when a CTP, CMM, or MPC card fails during system operation, the spare of its card type takes over system operation. To accommodate port card failure, you can use two HBAs in each host and connect the HBAs to different port cards.

Failover Testing with the Miniport Driver
When you configure a server for high availability, you install two HBAs and the A+LVS Redundant Disk Array Controller (RDAC) software to support failover if one communication path fails. RDAC installs as part of the A+LVS software and as a separate component when you use the A+LVS/NetDirect software. I installed the A+LVS software and two LP8000 HBAs in a Compaq system that used the Emulex SCSI Mini Port 4.31 driver. I used the EFC Management software's Manager to create two zones. I defined one zone to let the first HBA communicate with the LVS 4600's primary Availability Manager and defined the other zone to let the second HBA communicate with the secondary Availability Manager. After I enabled this new zone set and rebooted the Compaq system, the OS was able to see all the LUNs on the LVS 4600. To test failover operation, I disconnected the cable between the second HBA and the ED-5000 Director while I copied a 5GB file from a LUN on the primary Availability Manager to a LUN on the secondary Availability Manager. The cable disconnection interrupted the copy operation for about 40 seconds. The LVS 4600 took about 30 seconds to notice the failed link and another 10 seconds to complete the failover process. The LVS 4600's failover process placed the secondary Availability Manager and its HBA offline. The primary Availability Manager then became responsible for the LUNs that the other Availability Manager, which was zoned to communicate with the unplugged HBA, had serviced.

This test simulated a cabling failure between the HBA and the ED-5000 Director. I plugged the cable back into the ED-5000 Director to simulate cable repair. You must use the A+LVS Recovery utility to complete recovery of the link, which is a manual process. The A+LVS software installs this GUI-based utility, which you can access from the Start menu. The Recovery utility includes the Recovery Guru feature, which lets you query Availability Managers and determine their operational status. In my test, the Recovery Guru reported that a RAID module was offline. After I selected the offline RAID module, I used the Recovery utility's Manual Recovery-Controller Pair option to bring the offline module back online. The recovery process took about 2 minutes, and the pattern of flashing LEDs on the ED-5000 Director showed that the copy operation was again using both data paths.

My test proved that failover is automatic in a system that has the Emulex miniport driver. The failover was invisible to the end user except for the brief pause in the application's ability to access data. Recovery, though a manual process, was clean and nondisruptive to ongoing data operations.

Failover Testing with the Port Driver
Amdahl provides limited support for the Emulex SCSI Port 1.24 driver for customers who implement multihost configurations on the LVS 4600. I wanted to test the driver's ability to grant individual LUN access to SAN storage. (The miniport driver permits access to all LUNs on the LVS 4600 controller and to all LUNs on both controllers when you install redundant HBAs for failover support.) I tested the port driver with an LP8000 HBA (firmware version 2.82). After I completed the configuration, the driver worked; however, the driver requires you to make LUN 0 visible to all HBAs. To satisfy this requirement, I created a small LUN 0 that no server would actively use. I then limited the server's view of the SAN to the LUNs that the server had permission to use. Failover succeeded in my test and completed faster than it did with the miniport driver. However, I couldn't restore operation after the failover. Amdahl needs to improve this driver so that it works as well as the miniport driver does. (For a detailed account of my Emulex SCSI port driver installation and test, see the sidebar "The Emulex SCSI Port Driver Failover Test.")

Performance Testing
I used single and multiple workloads to test whether the LVS 4600's performance would degrade when I added workloads to other LUNs in the system. I set up three networks: one to service a Microsoft SQL Server 7.0 application, one to service a Microsoft Internet Information Server (IIS) 4.0 Web workload, and one to service a file server. I created four four-drive LUNs to host the data for these applications. You can place more than one LUN on a drive group, but for my test, I dedicated all the space on each drive group to one LUN. I placed the SQL Server data on a RAID 5 array and SQL Server log files on a RAID 0+1 array. I also placed the Web site and file-server data sets on RAID 5 arrays.

A Gateway Advanced Logic Research (ALR) 9200 system with four Pentium II Xeon 400MHz processors hosted SQL Server 7.0 Service Pack 1 (SP1) with NT 4.0 SP6a. I installed Emulex's LP8000 in a 32-bit slot to provide the fibre channel attachment. To stress the SAN storage during testing, I limited storage on the Gateway system to 1GB and defined a 22MB pagefile to keep paging activity on the local disk from becoming a bottleneck. I used Benchmark Factory software to generate a workload with an online transaction processing (OLTP) suite that works with a warehouse-model database. (For information about Benchmark Factory software, visit the company's Web site at http://www.benchmarkfactory.com.) I generated a 7GB database and unpinned several tables that typically stay in a system's memory to enhance application performance. Unpinning these tables helped to generate more I/O against the SQL database so that I could see how the LVS 4600 handled high I/O rates.

For my Web test, I used a Compaq system with two Pentium II Xeon 450MHz processors and 256MB of RAM. I installed Emulex's LP8000 in a 32-bit slot to provide the fibre channel attachment. This system ran IIS 4.0 with NT 4.0 SP6a. The workload was Benchmark Factory 2.0's Standard File workload, which can retrieve static HTML pages ranging from 500 bytes to 5MB.

I used an Intergraph TDZ-2000 workstation with dual-Pentium II 333MHz processors and 128MB of RAM as a file server for the third workload. I installed Emulex's LP7000 in a 32-bit slot to provide the fibre channel attachment for this test. The Benchmark Factory Random Reads/Writes test performed 1KB random reads and writes to separate 1MB files that Benchmark Factory creates for each virtual user.

I conducted the test in two phases. In the first phase, I ran the SQL Server test alone and varied the client workload from 150 to 400 users in increments of 50 users. Each simulated client generated transactions with a mean interarrival time of 300ms in a negative exponential distribution. (Interarrival time is the time that elapses from the start of one transaction to the start time of the next transaction.) To promote consistency, I restored the 7GB database between benchmark test runs. Benchmark Factory recorded a transactions per second (tps) metric for each test run. I used the Benchmark Factory control console's Performance Monitor to monitor the ALR 9200 server and record Physical Disk: Bytes/Second and Physical Disk: Average Disk Queue Length for the LUNs that the SQL Server log and data files were on. For the test's second phase, I ran the same SQL Server workload that I ran in the first phase, but I also ran a constant file-server and Web workload against other LUNs on the LVS 4600 at the same time.
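To see what a negative exponential distribution with a 300ms mean looks like, the following sketch draws sample interarrival times with Python's standard random module and estimates the load the simulated clients would offer. It is an illustration of the distribution, not Benchmark Factory's generator. The offered-load figure assumes that each client starts transactions at the mean rate regardless of response time, so treat it as an upper bound; the function names and seed are arbitrary.

```python
import random
from typing import List

MEAN_INTERARRIVAL_S = 0.300  # 300ms mean, as in the test description


def interarrival_times(count: int, mean_s: float = MEAN_INTERARRIVAL_S,
                       seed: int = 1) -> List[float]:
    """Draw `count` interarrival times from a negative exponential distribution."""
    rng = random.Random(seed)
    # expovariate takes the rate (lambda), which is the reciprocal of the mean.
    return [rng.expovariate(1.0 / mean_s) for _ in range(count)]


def offered_load_tps(users: int, mean_s: float = MEAN_INTERARRIVAL_S) -> float:
    """Transactions per second submitted if no client ever waits on the server."""
    return users / mean_s


if __name__ == "__main__":
    samples = interarrival_times(20000)
    print(f"sample mean: {sum(samples) / len(samples):.3f} seconds")
    for users in range(150, 401, 50):
        print(f"{users} users: up to {offered_load_tps(users):.0f} tps offered")
```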

The test results were interesting. First, I saw only minor differences in SQL Server's performance between the standalone SQL Server test and the corresponding test with constant Web and file-server workloads running. The SQL Server OLTP workload was hammering the LVS 4600 storage hard. Disk I/O queue lengths for the data LUN consistently exceeded 125, which indicated that I succeeded in making the disks a bottleneck in the test. Typically, a queue length that is more than two times the number of disks in the array (four disks in this case) identifies a bottleneck. The LVS 4600 was able to keep up with the predominantly sequential SQL Server log I/O, and the queue length consistently hovered around 1. Figures 2 through 6 present the performance testing results. (For more detailed test results, see the Web-exclusive sidebar "Amdahl LVS 4600 Performance Test Results," http://www.win2000mag.com/articles, InstantDoc ID 8626.)
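The queue-length rule of thumb is easy to apply to Performance Monitor readings. The sketch below encodes it and checks the two figures reported above; the function name is illustrative, and the rule is a heuristic rather than a hard limit.

```python
def is_disk_bottleneck(avg_queue_length: float, disks_in_array: int) -> bool:
    """Rule of thumb: a queue longer than twice the disk count signals a bottleneck."""
    if disks_in_array <= 0:
        raise ValueError("an array needs at least one disk")
    return avg_queue_length > 2 * disks_in_array


if __name__ == "__main__":
    # Four-disk arrays, so the threshold is a queue length of 8.
    print(is_disk_bottleneck(125, 4))  # data LUN reading
    print(is_disk_bottleneck(1, 4))    # log LUN reading
```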

Data Backup
When you attach a tape library to the SAN, you can share a high-capacity backup device among several servers. Each server can access a SAN-based tape library as a local SCSI-attached device at the SAN's full speed. It therefore takes you less time to back up your data compared with the time data backup takes when you use traditional SCSI or LAN-based backup. Several backup software vendors, including Computer Associates (CA), Legato Systems, and VERITAS Software, provide SAN-specific support modules. (For vendor contact information, see "Data-Backup Software Vendors," page 139.)

Price
The LVS 4600's flexibility, scalability, and ease of management have a price. When you configure the system with two disk-drive cabinets, twenty 18.2GB disk drives, redundant fibre channel Availability Manager modules, and two 32-bit HBAs, your cost is $107,060. The price includes Amdahl's installation and a 12-month warranty with 24 x 7 service and a 2-hour response time. A 16-port, fully redundant ED-5000 Director fibre channel switch adds $141,000.

The LVS 4600 Solution
The LVS 4600 is an enterprise-class solution for managing large storage volumes and one that will meet the needs of storage-hungry e-commerce applications. The system supports as many as 50 disk drives that communicate over five SCSI channels, 100MBps full-duplex transfer rates, and full data-path failover. The ED-5000 Director offers full fault tolerance, as many as 32 fibre channel ports, and easy manageability. Although Amdahl provides only limited support for the Emulex SCSI port driver for customers who implement multihost configurations on the LVS 4600, Amdahl representatives said that the company is addressing this limitation. Upgrades to Ultra 3 and Ultra 160 SCSI technologies would be welcome because they have faster data transfer rates and are technologies that vendors have implemented in other products (e.g., the new Adaptec SCSI controllers). If you're not using multihost configurations, I think you'll find the LVS 4600 a pleasure to work with.

Amdahl LVS 4600 Storage System
Contact: Amdahl, 408-746-6000
Web: http://www.amdahl.com
Price: $107,060, plus $141,000 for McDATA ED-5000 Enterprise Fibre Channel Director
Decision Summary:
Pros: Creates a scalable, manageable, and high-availability Storage Area Network solution that supports high storage volume and peak performance
Cons: Creates disadvantages for multihost configurations because the product provides full support for only the SCSI miniport driver