Pushing the pace for file-sharing performance

Recently, the Windows NT Magazine Lab handed me this hot assignment: Produce performance benchmarks for Microsoft Windows NT Server 4.0 and Novell NetWare 4.1. Why is this topic so hot? Because chances are good that at least one of the two industry giants will not like the results. Undaunted by the prospect of attracting the wrath of a major corporation, I gathered up my gear and headed to the Lab's speedway.

For my performance tests, I used the Lab's standard configuration: a set of client machines on a 100Mbps Ethernet network that simulates the workload of multiple users. (For details about the Lab's test environment, see the sidebar, "The Benchmarking Speedway," page 65.) One server, running either NT Server 4.0 or NetWare 4.1, tackled the workload of the client machines. Because I wanted to test file-sharing performance, I employed Bluecurve's Dynameasure for File Services 1.5 as the workload engine. (For information about this product, see Lab Reports: "Dynameasure Enterprise 1.5," September 1997.) The combination of Dynameasure and the Lab's test environment let me simulate workloads that users typically perform and pinpoint potential bottlenecks to ensure that I stressed the server, not the network or clients. After a month of testing and reviewing reams of data, graphs, and tables, I determined a clear winner: The checkered flag goes to...NT Server 4.0!

Preparing the Track
I wrestled with several background issues going into these tests. My biggest concern was how to create a fair comparative evaluation. Different people often have different priorities when they evaluate a product. For example, when my wife and I decide to buy a new car, we define our needs and select criteria (e.g., price, speed, appearance, hauling capacity, headroom) to measure a car's potential value. Even if we agree on the criteria, we don't always agree on the importance of each item; picking a clear, overall winner is difficult. Fortunately, we agree that one criterion outweighs all others: performance.

Performance is also a primary criterion that people use to evaluate file servers. Although other factors (price, support, compatibility, interoperability, etc.) matter, performance is of utmost concern. I based my evaluation and conclusion only on how well NT Server 4.0 and NetWare 4.1 performed in the Lab's test environment under Dynameasure for File Services 1.5. If criteria other than performance are more important to you (e.g., your primary concern is to find a file server operating system that runs on a 386), you might come to a different conclusion.

The Speedway Judges
In the performance tests, I measured three main areas: throughput, average response time (ART), and motors per step (MPS). Throughput (measured in kilobytes per second--KBps) is the total number of bytes all the motors copy during the measurement phase of a step divided by the elapsed time of the measurement phase. Throughput measures system capacity. The type of transaction, the number of motors, and the hardware capacity of the system influence throughput. Higher throughput means greater system capacity.

ART is the average time in seconds to complete a transaction during the measurement phase of each step. ART measures the speed of the test system. The type of transaction, the number of motors, and the hardware capacity of the system also influence ART. Lower ART means the system is faster.

The third measurement, MPS, is the number of motors that report results for each step of the test. MPS measures the total number of assigned motors in a step that complete the transaction. MPS is a direct measure of load on the system. Higher MPS means greater load.
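As a rough illustration of how the three measures relate, the following Python sketch computes throughput, ART, and MPS for one step from a list of transaction records. The record layout, the function name, and the 1,024-bytes-per-kilobyte conversion are assumptions made for this example; the article does not describe how Dynameasure stores or reports its raw data.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """One completed copy transaction (hypothetical record layout)."""
    motor_id: int       # which simulated user (motor) ran the transaction
    bytes_copied: int   # bytes moved during the measurement phase
    seconds: float      # time the transaction took to complete


def summarize_step(transactions, measurement_seconds):
    """Return (throughput in KBps, ART in seconds, MPS) for one step."""
    if not transactions:
        return 0.0, 0.0, 0
    total_bytes = sum(t.bytes_copied for t in transactions)
    # Throughput: total bytes copied divided by elapsed measurement time.
    # Assumption: 1KB = 1,024 bytes.
    throughput_kbps = total_bytes / 1024 / measurement_seconds
    # ART: mean time to complete a transaction.
    art_seconds = sum(t.seconds for t in transactions) / len(transactions)
    # MPS: number of distinct motors that reported results.
    mps = len({t.motor_id for t in transactions})
    return throughput_kbps, art_seconds, mps


# Made-up sample data, not taken from the benchmark runs.
sample = [
    Transaction(motor_id=1, bytes_copied=512_000, seconds=2.0),
    Transaction(motor_id=1, bytes_copied=512_000, seconds=4.0),
    Transaction(motor_id=2, bytes_copied=1_024_000, seconds=3.0),
]
throughput, art, mps = summarize_step(sample, measurement_seconds=10.0)
print(f"{throughput:.1f} KBps, ART {art:.1f} s, MPS {mps}")
```

Note that a motor that never completes a transaction drops out of both the MPS count and the ART average, which is why the three measures need to be read together.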

I approached the benchmarking process as if I were testing two unique racing teams and pit crews. Each team (NT and NetWare) used the same physical track (the Lab's network) and the same physical cars (clients and motors). Each team had equal time (about 16.5 minutes) to complete as many laps as possible (throughput). I tracked average lap times (ART) and the number of cars that completed the race in a given time frame (MPS).

Start Your Engines
Establishing a test environment that could run both NT and NetWare was the first order of business. For the test server, I used a generic PC clone with the following hardware configuration: a 120MHz Pentium, 64MB of RAM, a master 2.1GB hard disk (EIDE), a slave 2.1GB hard disk (EIDE), and a Novell NE 2000 Socket EA network adapter. Software configuration included NetWare 4.1 with 10 license connections and NT 4.0 with Service Pack 3 (SP3). I partitioned the slave 2.1GB hard disk into two equal areas--one for NTFS and the other for Novell's file system. I added this test system to the Lab test environment to measure performance with the Dynameasure software.

I chose six of the Lab's clients, running NT Workstation 4.0, as my user testbed. I configured each client with eight or nine motors to simulate a total of 50 network users. NWLink IPX/SPX bound all client adapters. I performed some preliminary tests with both the Microsoft NetWare client/protocol software and the Novell-supplied NetWare client/protocol software. I used the same testing specifications I planned to use for the benchmark tests, and I saw no performance difference between the two client/protocol software products. Throughout the benchmark performance tests, I used the Microsoft client/protocol software.
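The "eight or nine motors" per client follows from spreading 50 motors as evenly as possible over six machines. A minimal Python sketch of that arithmetic appears below; the function name is hypothetical, and which clients received the extra motor is an assumption, because the article does not say.

```python
def assign_motors(total_motors, client_count):
    """Spread motors across clients as evenly as possible."""
    base, extra = divmod(total_motors, client_count)
    # Assumption: the first `extra` clients each take one additional motor.
    return [base + 1 if i < extra else base for i in range(client_count)]


motors_per_client = assign_motors(50, 6)
print(motors_per_client)       # [9, 9, 8, 8, 8, 8]
print(sum(motors_per_client))  # 50
```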

Time Trials
I ran several initial Dynameasure tests just to warm up the track. At this point, I wanted to identify any bottlenecks that could affect the results. In particular, I wanted to eliminate the possibility that the client systems or the network bandwidth could degrade performance. The warm-up tests ran the Dynameasure for File Services Copy All Bidirectional test configured for a 5.6MB dataset; a 1KB block size; 10-second think time; and 6 steps, with the following number of motors assigned to each step: 5, 10, 20, 30, 40, and 50.

The Copy All Bidirectional test consists of 16 different transactions in which compressed data, uncompressed data, binary files, text files, and image files are copied between the server and the clients. Based on this test (and with help from Bluecurve's technical support team), I was able to ensure that I was not overstressing the client workstations or the network. The summary reports for NT and NetWare showed that server performance began to degrade after reaching approximately 25 motors--throughput leveled off, ART rose, and MPS declined.

During these warm-up tests, NT vastly outperformed NetWare, as the Dynameasure graphs in Screen 1 and Screen 2 show. In Screen 1, the left graph displays throughput, and the right graph displays ART. Screen 2 displays throughput in the left graph and MPS in the right graph. NT maintained higher throughput, lower ART, and higher MPS in these tests.

For several reasons, the Lab had hypothesized that NetWare would have higher throughput and be faster during certain types of transactions (such as copying small files). For example, NetWare 4.1 includes Packet Burst technology, which lets a server transmit several packets in a burst, without waiting for verification that each packet has been received. NetWare 4.1 also supports Large Internet Packets, which lets the server and workstation communicate using the largest possible frame size.

The Copy All Bidirectional test includes eight different types of data, with two files for each data type: one file for client-to-server transactions and one file for server-to-client transactions. Because the type of transaction influences the three benchmark measures, I decided to break out individual transactions and compare the results to get additional information. This view of the data would let me compare server performance based on the type of transaction each server completed.

Much to my surprise, the detailed results from the tests revealed that for every type of transaction, NT outperformed NetWare in throughput, ART, and MPS. The closest throughput values for NT and NetWare occurred during step 5 of the Copy Compressed Text Bidirectional test: 406KBps for NT and 401KBps for NetWare. During that step, NT's ART (0.17 seconds) was faster than NetWare's ART (0.51 seconds). The largest gap in throughput values occurred during step 6 of the Copy Data Bidirectional test: 1.21MBps for NT and 298KBps for NetWare. At that point, NT 4.0's ART (29 seconds) was more than five times as fast as NetWare's ART (166 seconds). In all tests, both network operating systems had the same MPS values, which matched the assigned specifications of 5, 10, 20, 30, 40, and 50 motors.
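The comparisons in this paragraph come down to simple ratios, which the following Python sketch reproduces from the figures quoted above. The helper name is hypothetical, and the conversion of 1.21MBps to kilobytes assumes 1MB = 1,024KB, which the article does not specify.

```python
KB_PER_MB = 1024  # assumption: the article does not define the MB-to-KB conversion


def compare(nt_kbps, netware_kbps, nt_art, netware_art):
    """Return (NT/NetWare throughput ratio, NetWare/NT ART ratio)."""
    return nt_kbps / netware_kbps, netware_art / nt_art


# Step 5, Copy Compressed Text Bidirectional (closest throughput values).
closest = compare(406, 401, 0.17, 0.51)
# Step 6, Copy Data Bidirectional (largest throughput gap).
largest = compare(1.21 * KB_PER_MB, 298, 29, 166)

print(f"Closest step: throughput x{closest[0]:.2f}, ART x{closest[1]:.1f}")
print(f"Largest gap:  throughput x{largest[0]:.2f}, ART x{largest[1]:.1f}")
```

Under these assumptions, the ART ratio for the largest-gap step works out to about 5.7, in line with the "more than five times as fast" description. Even in the closest step, where throughput differs by only about 1 percent, NetWare's ART is roughly three times NT's.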

Reality Check
To ensure that my testing parameters and system configuration were not in some way slanted toward NT, I presented my findings to Bluecurve's technical support team and to the other Lab technicians. One major concern pertaining to NT caching came up in those meetings. For bidirectional transactions, Dynameasure creates the data files on both the client and the server. For example, one of the files in the Copy Compressed Binary Files from Client to Server transaction is identical for every motor during a test. Because of the small file size and small block size being transported across the network, I decided to investigate whether NT Server and the NT clients were actually reading from or writing to the disk for every copy transaction.

On any desktop system, you can see how data caching affects performance with a simple experiment: Open any large data file stored on a floppy or a hard disk. The system takes a few seconds to read the requested data, open the viewing application, and display the data on the monitor. Close the application, and then open the same file. The data appears almost instantly, and you don't hear the characteristic spin of the system reading the disk. The system has cached the data (and possibly the application) in RAM; thus, no disk read occurs, and the whole operation is substantially faster.
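The same experiment can be approximated in a few lines of Python by timing two consecutive reads of one file. This is only a sketch: the function name and chunk size are arbitrary choices for illustration, and the result depends on the operating system's cache state. If the file was written or read recently, both reads may already come from RAM and show little difference.

```python
import time


def time_two_reads(path, chunk_size=1024 * 1024):
    """Read a file twice and return (first_seconds, second_seconds)."""

    def read_all():
        start = time.perf_counter()
        with open(path, "rb") as f:
            # Read in chunks so a large file is not held in memory all at once.
            while f.read(chunk_size):
                pass
        return time.perf_counter() - start

    first = read_all()   # may have to go to disk
    second = read_all()  # likely served from the operating system's cache
    return first, second


# Usage (hypothetical path; pick a large file that has not been opened recently):
# first, second = time_two_reads("large_file.dat")
# print(f"first read: {first:.3f} s, second read: {second:.3f} s")
```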

I needed to eliminate any server or client memory caching that could influence the tests. The idea was to force both NT and NetWare to access the hard disks as many times as possible during the copy transaction. (In racing terms, I needed to ensure that the cars made as many pit stops as possible.) I conducted the next series of tests in the same manner as the warm-up tests, except I increased the dataset scale and block size. Increasing these settings increased the amount of memory that the system paged and pushed previously cached data out of the system's RAM.

Armed with this idea, for the next series of tests, I used the following parameters: a 24.2MB dataset, a 100KB block size, a 10-second think time, and six steps (with 5, 10, 20, 30, 40, and 50 motors, respectively). I again selected the Copy All Bidirectional test, because the random order of the transactions makes caching data from one transaction to the next difficult. Graphs 1, 2, and 3 display the results of these tests.

As you can see from the graphs, NT 4.0 came out the clear overall winner in performance. During the test, peak throughput for NT (741KBps) was more than double the peak throughput for NetWare (356KBps). In every step of the test, NT's ART was more than twice as fast as NetWare's ART under comparable loads (MPS).

To the Winner's Circle
For the last series of benchmarks, I decided to use Dynameasure's Copy All Files to Server test. This test would eliminate file caching on the clients as a performance variable. For this test, I used the same test parameters that I used for the Copy All Bidirectional test: a 24.2MB dataset, a 100KB block size, a 10-second think time, and six steps (with 5, 10, 20, 30, 40, and 50 motors, respectively). This test and specification set transmitted a large amount of data across the network and maintained a high frequency of delivery. Graphs 4, 5, and 6 display the test results.

The benchmark data again favors NT. Graph 5 shows that the closest ART values occurred during step 6 of the test. In step 6, NetWare's ART (122 seconds) was 10 seconds faster than NT's ART (132 seconds), so you might be tempted to argue that NetWare outperformed NT in the part of the test simulating the heaviest load (50 motors). However, you must examine the ART data in relation to the other performance data.

In step 6, the NT racing team completed a maximum of 841,000 laps (i.e., 841KBps peak throughput), and all 50 cars (MPS) finished the race. The NetWare racing team completed 167,000 laps (i.e., 167KBps throughput), and only 33 cars (MPS) finished the race. NT's throughput was five times NetWare's throughput, and NT had 17 more motors running than NetWare.
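To see why ART alone is misleading here, it can help to put the step 6 figures side by side. The Python sketch below uses only the numbers quoted above; the completion rate and throughput per finishing motor are derived values added for illustration and are not metrics that Dynameasure reports.

```python
ASSIGNED_MOTORS = 50  # motors assigned in step 6

step6 = {
    "NT": {"throughput_kbps": 841, "art_seconds": 132, "motors_finished": 50},
    "NetWare": {"throughput_kbps": 167, "art_seconds": 122, "motors_finished": 33},
}


def completion_rate(result):
    """Fraction of assigned motors that reported results."""
    return result["motors_finished"] / ASSIGNED_MOTORS


def kbps_per_motor(result):
    """Throughput divided by the number of motors that finished."""
    return result["throughput_kbps"] / result["motors_finished"]


for name, result in step6.items():
    print(
        f"{name}: ART {result['art_seconds']} s, "
        f"{completion_rate(result):.0%} of motors finished, "
        f"{kbps_per_motor(result):.2f} KBps per finishing motor"
    )

throughput_ratio = step6["NT"]["throughput_kbps"] / step6["NetWare"]["throughput_kbps"]
print(f"NT/NetWare throughput ratio: {throughput_ratio:.2f}")
```

NetWare's slightly lower ART is averaged over only the 33 motors that finished, so it describes a lighter effective load than the 50 motors NT carried.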

What If ...
Someone in the Lab observed that for the last two steps of the test, NetWare's ART was getting faster, and NT's ART was getting slower. What would happen if the test continued with more motors? I reran the tests, with a 24.2MB dataset, a 100KB block size, a 10-second think time, and six steps (with 10, 20, 40, 60, 80, and 100 motors, respectively), and the results again favored NT. NetWare's maximum throughput was 208KBps with an ART of 142 seconds, running 83 motors. NT reached maximum throughput at 720KBps with an ART of 42 seconds, running 100 motors.

Post-Race Analysis
To keep the racetrack equal for both teams, I maintained the same physical network connections, the same protocols, the same physical clients, and the same physical test server--down to the same physical hard disks. Within Dynameasure for File Services, I kept identical test specifications (file size, type of transactions, and number of motors) for each operating system. I gave both racing teams the same track to race on, the same type and number of cars to drive, and the same amount of time to complete laps.

After running race after race and watching NT leave NetWare in the dust, I finally concluded that NT is indeed the better performing operating system for file services. Does that mean that you should throw out your NetWare servers and replace them with NT? You be the judge. After all, not every garage in the country has a new sports car. But the next time you go shopping, remember which operating system has the performance edge.

Windows NT Server 4.0
Contact: Microsoft * 425-882-8080
Web: http://www.microsoft.com
Price: $1129 for 10 users
System Requirements: 16MB of RAM, CD-ROM drive, VGA, Super VGA, or video graphics adapter compatible with NT Server 4.0
Intel-based systems: 486 33MHz or greater, Pentium, or Pentium Pro, 125MB available hard disk space
RISC-based systems: RISC processor compatible with NT Server 4.0, 160MB available hard disk space

NetWare 4.1
Contact: Novell * 888-321-4272 or 800-209-3500
Web: http://www.novell.com
Price: $2095 for 10 users
System Requirements: 386-based PC or greater, 8MB of RAM, 115MB available hard disk space