App Servers Are the Mainframes of the '90s

Windows NT Server has found a place in corporate networks both as an effective file server and as an application server. We network types have years of experience running file servers, but application servers may be new to many of us. App servers are the mainframes of the '90s--powerful computers that many people rely on, not just for data but for basic processing power.

The operation of mainframes in the past taught some valuable lessons, ones that the client/server world is just learning now. One of those lessons is that you can't always solve a performance problem simply by throwing hardware at it--although sometimes that is a perfectly good solution. There are several ways to determine the right path to maximum application-server performance.

App servers make different demands on their hardware than do file servers; when looking for bottlenecks, the first things to examine are the CPU and the memory systems.

CPU "Telltales"
First, you'll want to examine CPU "telltales." In the Performance Monitor (Perfmon), watch the % Processor Time counter in the Processor object. If your CPU is being utilized 90% or more on a regular basis, then you're a candidate for a processor upgrade, even if it's just a clock doubler. When you buy servers, think seriously about those that have sockets for extra processors; adding a Pentium chip later is a much cheaper way to buy more power than having to build a whole new server.

Sometimes apparent CPU bottlenecks are hardware problems in disguise; take a look at the Interrupts/second counter in the Processor object. I run a moderately busy server (32-bit disk adapter and network card, several dozen users) and see an average of 100 interrupts per second. Even an NT server doing relatively little will idle around that number. So if interrupts per second climb to 600 or more, you probably have either a failing board that is spewing out interrupts or a badly written driver.

There's also a way to sniff out badly designed device drivers: Check that the Context Switches counter in the System object stays below 500. If it goes higher, then some device driver has built-in critical sections that are too long. If the context-switch count isn't high, the driver is probably not to blame; take your server off-line, and run a basic DOS-based diagnostic program, such as CheckIt or QAPlus, which can often point out the screaming hardware.
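The CPU rules of thumb above can be written down as a short checklist. The Python sketch below is only an illustration: the function name, the dictionary keys, and the idea of typing in averaged Perfmon readings by hand are my own assumptions, not anything Perfmon provides. The thresholds are the ones given in the text.

```python
# Rule-of-thumb thresholds taken from the text above.
CPU_BUSY_PERCENT = 90      # sustained utilization that suggests a processor upgrade
INTERRUPTS_PER_SEC = 600   # points to a failing board or a badly written driver
CONTEXT_SWITCHES = 500     # points to a driver with overly long critical sections


def check_cpu(sample):
    """Return a list of warnings for one set of averaged readings.

    `sample` is a hypothetical dict filled in by hand from Perfmon, with the
    keys 'cpu_percent', 'interrupts_per_sec', and 'context_switches'.
    """
    warnings = []
    if sample["cpu_percent"] >= CPU_BUSY_PERCENT:
        warnings.append("CPU at 90% or more: consider a processor upgrade")
    if sample["interrupts_per_sec"] >= INTERRUPTS_PER_SEC:
        warnings.append("High interrupt rate: suspect a failing board or a bad driver")
    if sample["context_switches"] >= CONTEXT_SWITCHES:
        warnings.append("High context switches: suspect a badly designed driver")
    return warnings


if __name__ == "__main__":
    busy = {"cpu_percent": 95, "interrupts_per_sec": 120, "context_switches": 300}
    for line in check_cpu(busy):
        print(line)
```

A reading that trips only the first rule points toward a processor upgrade; one that trips the second or third points toward hardware or drivers instead, which is exactly the distinction worth making before you spend money.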

And here's a CPU killer I ran across: One of my clients upgraded from Windows NT 3.1 to 3.5 and couldn't figure out why the performance took a huge dive. I poked and prodded, but I couldn't figure it out either. Then, as I was pondering the problem, the screen saver came on. It was one of the screen savers new to version 3.5--3DPipes. It has an interesting-looking display, but it uses an extremely CPU-intensive application programming interface (API) that supports 3D graphics. Perfmon showed that when it ran, it sucked up more than 90% of the CPU's power. Switching to another screen saver solved the problem.

Memory Grabbers
CPUs are only half of the app-server bottleneck picture; memory is the other. NT and NT apps are memory-hungry. You can literally throw memory at your NT Server system, and it will probably find a way to use it. How? Well, for one thing, NT uses an enormous amount of its memory--as much as half--as a disk cache. To look at memory status in Perfmon, check out the Memory object, Committed Bytes. It should be less than your server's actual physical memory. Then check to make sure that Available Bytes is at least 1MB.

Pages/second, also in the Memory object, should be five or less: If it's greater than that, you're paging in and out of virtual memory like crazy. You need to speed up the disk so that virtual memory will respond more quickly, add more RAM, or remove services to lower memory consumption. The TCP/IP services Dynamic Host Configuration Protocol (DHCP) and Windows Internet Name Service (WINS), for example, gobble up megabytes of RAM but don't use much CPU time, so you might be tempted to throw an extra DHCP server on your database machine--don't.
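The three memory checks (Committed Bytes, Available Bytes, and Pages/second) fit the same checklist pattern. Again, this is a minimal Python sketch under stated assumptions: the function and parameter names are hypothetical, and the readings are assumed to be copied by hand from Perfmon.

```python
MB = 1024 * 1024


def check_memory(physical_bytes, committed_bytes, available_bytes, pages_per_sec):
    """Apply the three memory rules of thumb from the text.

    All arguments are hypothetical hand-entered readings; nothing here
    talks to Perfmon itself.
    """
    warnings = []
    if committed_bytes >= physical_bytes:
        warnings.append("Committed Bytes is not below physical memory")
    if available_bytes < 1 * MB:
        warnings.append("Available Bytes is under 1MB")
    if pages_per_sec > 5:
        warnings.append("Pages/second is above five: heavy paging")
    return warnings


if __name__ == "__main__":
    # A 16MB server that has committed 20MB and is paging hard.
    for line in check_memory(16 * MB, 20 * MB, 512 * 1024, 12):
        print(line)
```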

One way to find out how much free memory is on your system is to look at the Memory Index in WinMSD. To access it, you start WinMSD and click on the Memory button. A memory-usage meter will appear on your screen indicating how heavily your system's memory is loaded.

Why am I talking about disk usage in a section on memory? Because NT relies so heavily on virtual memory. It seems to require at least some virtual memory, no matter how much RAM you have. Again, that's because of the way it does disk caching--it often allocates half of the system's RAM to a disk cache. As a result, NT often runs short of the physical memory it needs to run some programs. So, it starts paging.

To adjust your application server's memory, open the Control Panel and start the Network applet. Double-click on Server, and you'll see a dialog box controlling how NT uses RAM. Click on the radio button labeled "Maximize Throughput for Network Applications."

The worst part of the paging process, time-wise, is finding a place to put the data on the disk. So NT preallocates a large block of disk in an area that it calls pagefile.sys. Having this contiguous area of disk to work with allows NT to bypass the file system and perform direct hardware disk reads and writes. These are accomplished in a higher privilege level, one known as "Ring 0," for the sake of speed. NT can exceed the preallocated space if it needs more virtual memory, but having to stop and look around for free space slows it--and your server--down. You can control the amount of preallocated space on disk via the Control Panel; just click on System/Virtual Memory.

On a 16MB machine, NT sets the initial virtual memory file, pagefile.sys, to 27MB. So, when calculating free memory on such a system, NT knows that it can get up to 27MB of disk space with direct hardware reads and writes. When it needs memory, it can also use RAM. But how much RAM is left after you load the operating system? Despite all the grumbling about NT's memory requirements, they really aren't bad. NT 3.5 can run in about 5.5MB. (That's the part that can't be paged out to disk.)

If you have a 16MB machine, that leaves 10.5MB of free RAM for NT to play with--let's call it 11MB for simplicity. Add the 11MB remaining to the 27MB of pagefile.sys space, and NT has 38MB of working room, or, as Perfmon would refer to it, a 38MB "commit limit." If the sum of the programs that NT is running (called the "working set") remains at 38MB or less, then NT doesn't need to enlarge the paging file. On the other hand, if NT needs more memory, it can expand the paging file--that will, however, cost time. The paging file can grow up to 77MB on this system, so NT can execute up to 77MB plus 11MB, or 88MB, of programs on a 16MB system before it runs out of memory.
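The arithmetic is easy to redo for a machine with different numbers. Here is a small Python sketch of it; the function name is hypothetical, and rounding the free RAM up to a whole megabyte simply mirrors the "call it 11MB" shortcut used above rather than anything NT itself does.

```python
import math


def working_room_mb(ram_mb, nonpageable_os_mb, pagefile_mb):
    """Free RAM after the non-pageable part of NT, plus paging file space."""
    # 16MB - 5.5MB = 10.5MB, rounded up to 11MB as in the text.
    free_ram_mb = math.ceil(ram_mb - nonpageable_os_mb)
    return free_ram_mb + pagefile_mb


if __name__ == "__main__":
    print(working_room_mb(16, 5.5, 27))  # initial 27MB paging file -> 38
    print(working_room_mb(16, 5.5, 77))  # paging file grown to 77MB -> 88
```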

When NT does enlarge the paging file, it may not enlarge it enough. That can lead to a kind of "sawtooth" size for the paging file, as NT continues to "go back to the well" for more space until it finally runs out. You can avoid this by keeping an eye on your machine's commit limit. Add the counters Commit Limit and Committed Bytes to Perfmon, and watch the difference between them. When Committed Bytes exceeds Commit Limit, NT has to increase the paging file size. If you want to see the size of the data structures and processes that won't be paged, add the counter Pool Nonpaged Bytes.

One suggestion is to log Committed Bytes over a few weeks with Perfmon, then note the maximum value it reports. Increase that by a small amount--10% to 20%--and make your system's minimum paging file that size.
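That suggestion boils down to "peak plus a margin." The Python sketch below shows the calculation; the function name and the sample figures are made up for illustration, and the values are assumed to have been read off a Perfmon log by hand and converted to megabytes.

```python
def suggested_min_pagefile_mb(committed_log_mb, margin=0.15):
    """Peak logged Committed Bytes (in MB) plus a 10%-20% safety margin."""
    if not committed_log_mb:
        raise ValueError("need at least one logged value")
    if not 0.10 <= margin <= 0.20:
        raise ValueError("the text suggests a margin of 10% to 20%")
    return max(committed_log_mb) * (1 + margin)


if __name__ == "__main__":
    # Hypothetical weekly peaks, in MB, taken from a Perfmon log.
    weekly_peaks_mb = [31.0, 36.5, 40.0, 34.2]
    print(round(suggested_min_pagefile_mb(weekly_peaks_mb), 1))
```

Round the result up to the next whole megabyte when you enter it in the Virtual Memory dialog; erring on the high side costs only a little disk space.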

Another thing to check beyond memory-grabbing applications is memory-grabbing services. NT starts up a lot of services that you may not need. If your server doesn't have a printer, then the Spooler service is unnecessary. If all of your storage devices are SCSI, check to see whether ATDISK--the IDE interface--is active. In either case, you can shut down the unneeded service or device. Even better, get rid of any protocols you're not using.

Tweak Before You Buy
Before you simply throw another 32MB or another processor into your server, it's a good idea to use Perfmon to locate your bottlenecks and focus on them instead. Sometimes a little tweaking can solve the problem. If it can't and you do need more hardware, at least you'll know what hardware you need.