Here's what you can do to tune NT Server's network I/O, disk I/O, and CPU utilization
Now that you've made a strategic decision on a server that supports your requirements today and has room to grow tomorrow, you want it to run at its very best. In my June article, "The Beginner's Guide to Optimizing Windows NT Server, Part 1," I focused on optimizing Windows NT Server's memory subsystem. This month, I'll examine tuning NT Server's other subsystems, such as the network I/O, disk I/O, and CPU. Let's review some basic tuning concepts.

Basic Tuning Strategy
To optimize NT Server's performance, develop baselines for each subsystem. Monitor the system closely to identify bottlenecks and plan for future capacity requirements. Finding the cause of a bottleneck can be difficult because all the server's major resources--CPU, memory, disk I/O, network I/O, and applications--are interrelated. Solving one problem can cause another. Try one change at a time and compare the results to see whether the change was helpful. Always test your new configuration, and then test it again to be sure changes haven't adversely affected your server. I will discuss only NT Server's built-in monitoring tools: Performance Monitor (Perfmon), Network Monitor, and Task Manager.

Tuning the Hardware
Hardware tuning is as important for disk optimization as it is for memory optimization. Set your disk adapter's BIOS settings for maximum performance and stability. Be sure you have turned on write-back cache and set the SCSI channel to negotiate for the fastest setting the disks can handle. Many drives support the Ultra Fast/Wide SCSI (40MBps) standard. Many disk adapters sold in the last year support Ultra Fast/Wide SCSI but were set to the slower Fast/Wide setting of 20MBps, the speed that the typical disk drive supported. Ask your hardware manufacturer for the most recent BIOS release level and the optimum BIOS settings for your disk adapter/disk drive combination.

For network I/O connections, set the adapter appropriately for your network. Many of today's adapters support multiple network speeds, including autosense mode. In autosense mode, the network adapter tries to automatically determine your network's speed and then sets itself to operate at that speed. I prefer to set the adapter to the exact speed of the network devices the server will be interacting with to ensure the best performance possible. Not all servers and network devices want to play together nicely when you've set them to run at peak throughput settings. Be sure that you are using the most recent device drivers for disk and network adapters. You will often see improved throughput and efficiency just by using the most recent release. Most hardware manufacturers have this information on their Web sites. Microsoft keeps similar information on its Web site about NT Server patches or service packs. These patches attempt to fix known problems and occasionally include performance enhancements. Always test new drivers or patches to guarantee that they operate as advertised.

Network I/O
NT Server's network I/O subsystem becomes a bottleneck when demand for network resources outpaces what NT Server can provide. Clients and other server systems must be able to connect to NT Server with sufficient bandwidth and low enough latency to provide adequate response times to support customers' requirements. Therefore, you need to understand what type of workload your client systems generate and which key network architecture components are in use. To determine where your bottlenecks are and how to fix them, you must understand the type of network protocol (e.g., TCP/IP, NetBEUI) and physical network (e.g., Ethernet, Fiber Distributed Data Interface--FDDI) you're using.

Perfmon. The two primary tools for sleuthing out network I/O bottlenecks in NT Server are Perfmon and Network Monitor. You can use Perfmon in logging mode over a period of time to develop a baseline and to analyze the server's resources. Start Perfmon from Start, Programs, Administrative Tools. To enter logging mode, select View, Log, Edit, Add To Log. The key object to observe is Network Interface. (To access the Network Interface object, first install the Simple Network Management Protocol--SNMP--from Control Panel, Network, Services; then reboot.) Select the Network Interface object, click Add, and then Done. To begin the logging session, select Options, Log. Enter the name of your log file, specify a sampling interval, and then click Start Log. If you want to look at data the system is currently monitoring, you need to start a second copy of Perfmon and select Current Activity. (Perfmon stops logging when you view an active log file.)

To access the data that Perfmon collects in the log file, you need to stop logging and look at the log file. To stop collecting, select Options, Log, Stop Log. To read the log file, click Options, Data From, Log File and enter the name of the log file. From View, select Chart (or Report or Alert). Now choose Edit and Add To Chart and select the counters you want to observe. (For more information about using Perfmon for logging, see Part 1 of this article in the June issue and Michael D. Reilly, "The Windows NT Performance Monitor," March 1997.)

Perfmon collects data for each separate physical network adapter instance. Traditionally, the Network Interface object has had two notable counters:

  • Output Queue Length, which measures the length of the output packet queue
  • Bytes Total/sec, which measures all network traffic that moves through the particular network adapter and includes all overhead incurred by the protocol in use (TCP/IP, NetBEUI, etc.) and by the physical protocol (Ethernet, FDDI, etc.)

The first step in detecting a network bottleneck is looking for symptoms. Are users complaining about slow downloads from your server? Are applications that involve the network and the particular server you are investigating running sluggishly? In NT 3.51 the Output Queue Length counter is a good indicator. A value of 1 or 2 in the Output Queue Length counter is acceptable. However, if this measurement begins increasing (particularly above a value of 3 or 4), your network I/O adapter is waiting on the network and can't keep up with the server's requests to move data onto the network. If the length of the output packet queue is frequently higher than baseline, a bottleneck might be occurring. In NT 4.0, the help information states that because the Network Driver Interface Specification (NDIS) queues the requests, the counter's value is always 0. However, research and testing show that this statement is false. The Output Queue Length counter provides useful information in NT 4.0.
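As a rough illustration of the output-queue rule of thumb, the following Python sketch assumes you have exported Output Queue Length samples from a Perfmon log into a list. The function names, the 50 percent cutoff used for "frequently," and the sample readings are assumptions made for the example; they are not Perfmon features.

```python
# Sketch only: evaluates Output Queue Length samples exported from a Perfmon log.
# The names and the "frequent" cutoff are assumptions made for illustration.

def fraction_above(samples, level):
    """Return the fraction of samples strictly greater than level."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(1 for s in samples if s > level) / len(samples)


def queue_suggests_bottleneck(samples, baseline, frequent=0.5):
    """True when the queue exceeds baseline in more than `frequent` of the samples."""
    return fraction_above(samples, baseline) > frequent


samples = [1, 2, 4, 5, 3, 6]  # hypothetical Output Queue Length readings
print(fraction_above(samples, 2))                      # share of readings above the 1-to-2 range
print(queue_suggests_bottleneck(samples, baseline=2))  # True
```

Use your own baseline value in place of 2; the point is to compare the queue against what is normal for your server rather than against a fixed number.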

Observe the Bytes Total/sec counter to see whether NT Server is waiting for a slow client to receive data or waiting on an overloaded network. The information that this counter provides is not useful until you compare it with the network architecture in use. For example, if you are running a TCP/IP-based 10Base-T Ethernet network, it has a theoretical maximum throughput of 1.25MBps. If you take into consideration Ethernet and TCP/IP overhead, the maximum goes down to roughly 1MBps, and the network rarely achieves that speed. Ethernet uses a Carrier Sense Multiple Access/Collision Detection (CSMA/CD) scheme, which leads to a poor degradation curve under heavier loads. Thus, as the network utilization increases above the 40 percent to 60 percent range, clients on the network begin noticing slower response times to their network requests. These slower response times result from increased collisions on the physical layer cable, which in turn causes the adapter to retry the transmission after a random delay.

In a 10Base-T network, if the Bytes Total/sec counter shows a sustained output rate between 400,000 bytes per second and 600,000 bytes per second and your output queue continues to grow, your network adapter and architecture combination has become a bottleneck. Don't forget to check the other NT Server resources--memory, CPU, and disk subsystems--to be sure that they aren't becoming bottlenecks.
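The arithmetic behind these figures can be sketched in Python. The function names are illustrative assumptions; the 400,000 bytes-per-second floor comes from the 10Base-T rule above, and whether the output queue is growing is something you determine from your own Perfmon logs.

```python
# Sketch only: relates a Bytes Total/sec reading to the link's theoretical maximum.
# Function names are illustrative assumptions.

def theoretical_max_bytes_per_sec(link_megabits_per_sec):
    """Convert a link speed in megabits per second to bytes per second."""
    return link_megabits_per_sec * 1_000_000 / 8


def utilization_percent(bytes_total_per_sec, link_megabits_per_sec):
    """Percent of the theoretical maximum that a Bytes Total/sec reading represents."""
    return 100.0 * bytes_total_per_sec / theoretical_max_bytes_per_sec(link_megabits_per_sec)


def adapter_looks_saturated(bytes_total_per_sec, queue_is_growing, sustained_floor=400_000):
    """10Base-T rule of thumb: sustained 400,000+ bytes/sec AND a growing output queue."""
    return bytes_total_per_sec >= sustained_floor and queue_is_growing


print(theoretical_max_bytes_per_sec(10))       # 1250000.0 (10Base-T)
print(utilization_percent(500_000, 10))        # 40.0
print(adapter_looks_saturated(500_000, True))  # True
```

For a faster physical network, pass its link speed and choose a floor that matches the utilization range at which your clients start to notice slower responses.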

Network Monitor. NT Server includes a version of the Network Monitor tool from Microsoft's Systems Management Server (SMS). With the SMS version, you can monitor any system that has an active network monitor agent. However, NT Server's Network Monitor lets you view network traffic only with respect to the server it is running on. As you see in Screen 1, to install Network Monitor from the Control Panel, you select Network, Services, Add. Select Network Monitor Tools and Agent, and then reboot your server to finish the installation. Now you can start Network Monitor from the Start, Programs menu.

After you've launched Network Monitor, select Capture, Start to begin monitoring the network. Network Monitor is a powerful addition to NT Server: You can use it to debug network problems associated with the server down to the packet/protocol level.

To detect potential network bottlenecks, Network Monitor places the network card(s) into promiscuous mode; that is, the card analyzes every packet moving through the server's network interface. Under Network Monitor's default view, you can quickly obtain information such as network utilization, frames per second, bytes per second, and the originating systems of your network traffic. Beware: Network Monitor can use a fair amount of resources on a busy server. In my tests, I found Network Monitor consumed approximately 4 percent of the CPU resources on a 166MHz Pentium Pro with 512KB of cache. You set the amount of RAM that Network Monitor uses in the Capture buffer settings. The setting must be low enough that Network Monitor doesn't use too much memory but high enough that it doesn't drop any packets. Typically, a 2MB to 3MB buffer setting is sufficient.

Using Network Monitor under its default settings, as shown in Screen 2, observe whether the value of the % Network Utilization counter is consistently above the 40 percent to 60 percent range. If this range of utilization is common, you are using your current adapter close to its maximum capacity, and it is becoming a bottleneck. To alleviate this situation, consider adding another network adapter to segment your network. This action will add more network I/O bandwidth to your server by physically and logically separating your networks into two parts. You can activate routing under NT Server through Control Panel, Network, Protocol, Routing to pass data between networks. NT Server is capable of supporting basic routing functions under a light-to-medium load, but this routing function will add some overhead to your system.

Tuning Network I/O
Network Monitor earns its keep in this segmentation tuning process. In the default view, Network Monitor displays bytes sent and received between the server and the various systems demanding service. You can use this information to help you decide where to physically connect the various systems on your network; you want to balance your network load by distributing the more heavily used systems between your two network segments. Even though segmentation involves some new cabling (physical addressing) and logical subnetting, it is a proven technique to optimize your server's network I/O and is relatively easy to implement. Once you have used this divide-and-conquer technique, continue to monitor your server's network utilization and output queue to head off future bottlenecks.
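One simple way to turn Network Monitor's per-system byte counts into a segment plan is a greedy split: place the busiest systems first, always on the lighter segment. The Python sketch below assumes you have copied the byte counts into a dictionary by hand; the function name and the system names are hypothetical, and the greedy approach is only one reasonable heuristic, not a feature of Network Monitor.

```python
# Sketch only: greedy split of systems across two network segments by traffic volume.
# The function name and the sample data are hypothetical.

def balance_two_segments(bytes_by_system):
    """Assign each system to the currently lighter segment, busiest systems first."""
    segments = ([], [])
    totals = [0, 0]
    ordered = sorted(bytes_by_system.items(), key=lambda item: item[1], reverse=True)
    for name, load in ordered:
        target = 0 if totals[0] <= totals[1] else 1
        segments[target].append(name)
        totals[target] += load
    return segments, totals


traffic = {"A": 50, "B": 30, "C": 20, "D": 10}  # bytes sent and received, in arbitrary units
segments, totals = balance_two_segments(traffic)
print(segments)  # (['A', 'D'], ['B', 'C'])
print(totals)    # [60, 50]
```

Treat the result as a starting point. Systems that exchange most of their traffic with each other usually belong on the same segment, which this simple split does not consider.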

Another technique for optimizing your server's network I/O performance is to bind your network adapter to only those protocols that your network is using. Under Control Panel, Network, Protocol, you can check which protocols are currently installed. Removing unnecessary protocols lowers the amount of memory that NT Server allocates for network I/O and ensures that your network has no unnecessary traffic.

Finally, a popular optimization technique for a TCP/IP-based network is to adjust the TCP/IP window size. The TCP/IP sliding window is a dynamically set buffer for transmitting packets. You can easily adjust the window size in NT's Registry, but the optimal size depends on your network architecture. Typically, finding the best window size for your environment requires several network tests to improve on the default NT Server settings. (For more information about this topic, see Bill McLaren, "13 Tips for Optimizing Internet Information Server," April 1997.)

The techniques I've described help ensure that the network I/O subsystem on NT Server isn't the network bottleneck. But remember that many facets of a network's architecture can cause a network bottleneck. These facets include client configuration, application design, the physical network, network protocol, and network devices (e.g., routers, hubs). Many tools are on the market for network analysis, including network sniffers and network managers such as Unicenter TNG (Computer Associates), OpenView (HP), and CiscoWorks (Cisco).

Disk I/O
To tune disk I/O performance under NT Server, you can purchase additional drives and internal/external RAID units, if the server hardware can support them. Short of adding hardware, you can tune the disk subsystem in other ways. In particular, review your Perfmon logs regularly to be sure that you have evenly distributed the disk subsystem's load. A common source of contention is running all applications on the root NT Server disk, which can quickly become a bottleneck.

To find out if you have disk bottlenecks, first eliminate the possibility that the problem is due to insufficient memory. You can easily confuse a disk bottleneck with paging file activity when you have a memory shortage. To help distinguish between disk activity related to the virtual memory manager paging to disk and applications using the disks, keep the paging file systems on separate dedicated disks. This technique simplifies using Perfmon to distinguish which busy disks are not associated with paging activities. With Perfmon, review the Avg. Disk Queue Length and % Disk Time counters under the LogicalDisk object for each disk of interest. See Table 1 for definitions of these counters. (Remember to turn on the disk counters with the diskperf -ye command, as Part 1 of this article described.)

A high % Disk Time value is not unusual; a high percentage shows that you are receiving a good return on your disk investment. But beware, % Disk Time exceeding 60 percent can lead to increased response times from an overly busy disk. A problem can occur when the % Disk Time value is high and the Avg. Disk Queue Length is high or increasing. In this situation, a busy disk is not handling all of its requests well. When you find a hot spot on a particular disk subsystem, move some of the associated application files from their current disk location to another disk that is not as heavily loaded. Then continue to monitor the disk I/O subsystem to ensure that you haven't created a new disk hot spot.
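The hot-spot test described above can be expressed as a small Python sketch, assuming you have exported % Disk Time and Avg. Disk Queue Length samples for one logical disk. The 60 percent figure comes from the text; the queue baseline of 2.0, the function names, and the sample values are assumptions you would replace with numbers from your own baseline.

```python
# Sketch only: flags a disk as a hot spot when it is busy AND its queue is high or rising.
# The queue baseline default and all names are assumptions made for illustration.
from statistics import mean


def is_rising(samples):
    """True when the last sample is higher than the first."""
    return len(samples) >= 2 and samples[-1] > samples[0]


def disk_hot_spot(disk_time_pct, queue_lengths, busy_pct=60.0, queue_baseline=2.0):
    """Apply the rule: high % Disk Time plus a high or increasing Avg. Disk Queue Length."""
    busy = mean(disk_time_pct) > busy_pct
    queue_problem = mean(queue_lengths) > queue_baseline or is_rising(queue_lengths)
    return busy and queue_problem


print(disk_hot_spot([85, 90, 88], [1.0, 2.5, 4.0]))  # True: busy disk, queue climbing
print(disk_hot_spot([85, 90, 88], [0.5, 0.4, 0.3]))  # False: busy, but keeping up
```

The second call shows why a high % Disk Time alone is not a problem: the disk is busy but its queue stays short, so it is handling its requests.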

Another technique that helps isolate disk performance problems, improves performance, and lowers the head movement rate over the disks is to format only one logical drive per physical drive. For example, if you have three disk drives, create only three logical drives, such as C, D, and E.

To improve the performance of your disk drive subsystem, consider matching the file system Allocation Unit Size to the block size of the application you are using. For example, suppose SQL Server is using a 4KB block size. When you format a file system on a new disk drive, launch Disk Administrator, create the partition, commit the partition changes, select Format, and then set the Allocation Unit Size to 4096 bytes. Matching the file system block sizes can improve the efficiency of the disk transfers when you use the application.

For example, if you have four 4KB blocks of data to write to the disk and the disk is slightly fragmented, you might end up with eight separate 2KB writes to disk on a file system created with a 2KB Allocation Unit Size. When reading this file, the disk heads subsequently have to move to eight random locations. If you use a 4KB Allocation Unit Size, NT Server has to write to disk four times, and thus the disk heads have to move four separate times to complete a read of the data. Test your particular Allocation Unit Size configuration to determine your optimum file system layout, because each application and disk subsystem environment is a little different.
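The worst-case count in this example is simple arithmetic, sketched below in Python. The function name is an assumption, and the model is deliberately crude: it assumes every allocation unit lands in a separate location on a fragmented disk.

```python
# Sketch only: worst-case number of separate disk writes on a fragmented volume,
# assuming each allocation unit ends up in its own location.

def worst_case_writes(block_count, block_size, allocation_unit):
    """Each application block needs ceil(block_size / allocation_unit) units."""
    units_per_block = (block_size + allocation_unit - 1) // allocation_unit
    return block_count * units_per_block


print(worst_case_writes(4, 4096, 2048))  # 8 writes with a 2KB Allocation Unit Size
print(worst_case_writes(4, 4096, 4096))  # 4 writes with a 4KB Allocation Unit Size
```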

Another way to improve your disk I/O subsystem performance is to avoid using it. If you have so much RAM that you need disk I/O only to save your data permanently, you might not have a problem. But the more power users have available, the more power they use. As I mentioned in Part 1, which covered the NT Server memory subsystem, you can use the Control Panel, Network, Server option to set how NT Server uses RAM. By appropriately selecting RAM use, you can allocate more space for the dynamic allocation of the file system cache size. Be aware that the file system cache competes with other applications for main memory. An application that is hogging memory--which you can observe in Task Manager, Processes, as shown in Screen 3--can lower your file system cache hit rate. (You can observe the system cache hit rate in Perfmon, Cache object.) Obviously, how you allocate memory is a trade-off between better application performance and better file system and associated disk I/O performance. You must decide which goal is more important for your server.

Tuning CPU
To determine whether NT Server has a CPU bottleneck, first ensure that the system doesn't have a memory bottleneck. If the system is paging excessively or thrashing because the application or process has insufficient memory, the system is using CPU cycles to service all the paging transactions. If you don't find a memory shortfall, look for a CPU bottleneck. Use Perfmon to observe all processor instances and closely review the counters shown in Table 2.

A high % Processor Time (e.g., 91 percent) does not by itself mean the system has a processor bottleneck. If the CPU is servicing all the NT Server scheduler requests without building up the Server Work Queues or the Processor Queue Length, the CPU is servicing the processes as fast as they arrive. A processor bottleneck occurs when the Processor Queue Length is growing; % Processor Time is high; and memory, the network, and disks don't have bottlenecks. In other words, during a CPU bottleneck the CPU is running as fast as it can, yet requests still queue up waiting for CPU resources.
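The three-part test for a processor bottleneck can be sketched in Python. The text gives no numeric cutoff for a "high" % Processor Time, so the 90 percent default below is an assumption, as are the function names and sample data.

```python
# Sketch only: a CPU bottleneck requires a growing Processor Queue Length, a high
# % Processor Time, and no bottleneck elsewhere. The 90 percent default is an assumption.

def is_growing(samples):
    """True when the samples never decrease and the last is higher than the first."""
    return (len(samples) >= 2
            and all(later >= earlier for earlier, later in zip(samples, samples[1:]))
            and samples[-1] > samples[0])


def cpu_is_bottleneck(processor_time_pct, queue_lengths, other_bottlenecks, high_pct=90.0):
    """other_bottlenecks lists subsystems (memory, network, disk) already found constrained."""
    busy = sum(processor_time_pct) / len(processor_time_pct) > high_pct
    return busy and is_growing(queue_lengths) and not other_bottlenecks


print(cpu_is_bottleneck([95, 97, 99], [2, 3, 5], []))          # True
print(cpu_is_bottleneck([95, 97, 99], [2, 2, 2], []))          # False: busy but keeping up
print(cpu_is_bottleneck([95, 97, 99], [2, 3, 5], ["memory"]))  # False: fix memory first
```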

One way to diminish processor bottlenecks is to move to a faster CPU, which is particularly helpful if you have predominantly single-threaded applications. If you have a multiuser system using multithreaded applications, you can preserve your investment (i.e., not throw out the older CPU when the new one arrives) by adding processors.

You can also tune CPU performance by using Task Manager to identify the process that is consuming most of the CPU time and then adjusting its priority. A process starts with a base priority level, and its threads can vary two levels higher or lower than its base. If you have a busy CPU, you can boost a process's priority level from Task Manager, Processes. Right-click the process, choose Set Priority, and then select Realtime, High, Normal, or Low. (Be careful when you set a process's priority to Realtime--the process can become selfish and never release the CPU, possibly making your system unstable.)

By increasing a process's priority, you can ensure that the process will get more CPU time than the other user applications. The priority change takes effect immediately, but the process is manual; when you stop the application and then restart it, the application will return to its original priority. To let you launch applications from the command line at various priority levels, NT Server provides the Start command, which you can place in batch scripts that you run at server startup or directly from a command prompt. To review the Start command's options, enter

start /? | more

at the command prompt. For more information about the Start command, see Bob Chronister, "Ask Dr. Bob Your NT Questions," and Christa Anderson, "Foreground Application Handling in NT 4.0," June 1997.
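If you generate your startup batch scripts with a tool, a small helper can assemble the Start command line for each application. The Python sketch below only builds the text of the command and does not run it. The helper name and the program path are hypothetical, and you should confirm the priority switches against the output of start /? on your own server.

```python
# Sketch only: builds (does not run) a Start command line for a batch script.
# The helper name and the example program path are hypothetical.

PRIORITY_SWITCHES = {
    "low": "/low",
    "normal": "/normal",
    "high": "/high",
    "realtime": "/realtime",  # use with care: a Realtime process can starve the system
}


def build_start_command(program, priority="normal"):
    """Return a Start command line that launches program at the given priority."""
    key = priority.lower()
    if key not in PRIORITY_SWITCHES:
        raise ValueError("unknown priority: " + priority)
    return "start " + PRIORITY_SWITCHES[key] + " " + program


print(build_start_command("C:\\apps\\report.exe", "high"))  # start /high C:\apps\report.exe
```

Because the priority is part of the command line, the application starts at the chosen level every time the script runs, which avoids repeating the manual Task Manager change after each restart.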

Know Where You're Starting
Before you try to tune NT Server, you need to establish a baseline and find out where the bottlenecks are. The best way to tune your server is to understand the server hardware, the NT Server operating environment, your network architecture, and the applications running on your system, because these components are closely interrelated. Until you know how your system performs over time, you won't know how much you've improved your NT Server's performance.