The Lab Guys fill you in on NT 4.0 and benchmarks

The new NT 4.0 Explorer interface has some neat features you'll want to learn. Table 1 lists these navigation tricks.

This capability is not new, but did you know that in User Manager, you can assign rights and permissions to several users at once by selecting all their names and then performing the administration functions? Beware, though! This action overwrites any existing permission setups: if certain users already have certain rights, you will erase those rights in favor of the new attributes. The technique is handy, though, for the initial setup of numerous users.

Changing Security
NT File System (NTFS) is a secure file system and generally easy to administer. To grant or deny specific users and groups the permissions to read, write, execute, delete, change permissions, or take ownership of the selected objects, you simply highlight certain files or directories and select a few menu items.

But what if you already have different permissions set up throughout your file system and you want to remove access from a group or a user without mucking up the existing set? If you glance at the menu items, you might think you have to go through all the files, one at a time, looking at the permissions and removing the particular user or group. This procedure is time-consuming, susceptible to error, and just plain painful with thousands of files.

Lucky for us, Microsoft has provided a command-line utility, cacls.exe, that lives in your %systemroot%\system32 directory. cacls stands for Change Access Control Lists. The utility lets you change the user and group access permissions for files and directories.

You can tighten your system security by removing the Everyone group from all the files and directories without wiping out the permissions that are in place. Work from the command prompt in the root directory. First, issue the command cacls *.* /T /E /G Administrator:F to ensure that the Administrator account can access all files and directories, so that you don't lock yourself out of modifying them later. Next, issue the command cacls *.* /T /E /R Everyone to remove the Everyone group from the permissions of every file and directory on your system. The /T switch applies the change to the current directory and all its subdirectories, and /E edits the existing access control list instead of replacing it.

To see the permissions in place, as in Screen 1, issue the command cacls *.* /T. Add | more to the end of the command line so the listing doesn't scroll by too fast.
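The three cacls commands above can be sketched as a small script that assembles the command lines in the safe order: grant first, remove second, list last. This is only an illustrative sketch: the helper names (grant_full_control, revoke_account, list_acls, safe_removal_plan) are hypothetical, and the script only builds and prints the commands rather than running them, so you can review them before typing them at the prompt.

```python
# Sketch: build the cacls command lines described in the text as argument lists.
# Nothing is executed; all helper names here are hypothetical.

def grant_full_control(account, target="*.*"):
    """/T recurses into subdirectories, /E edits the existing ACL, /G grants F (Full Control)."""
    return ["cacls", target, "/T", "/E", "/G", f"{account}:F"]

def revoke_account(account, target="*.*"):
    """/R removes the account's access rights; cacls accepts /R only together with /E."""
    return ["cacls", target, "/T", "/E", "/R", account]

def list_acls(target="*.*"):
    """With no change switches, cacls only displays the ACLs."""
    return ["cacls", target, "/T"]

def safe_removal_plan(remove, keep="Administrator"):
    """Grant the fallback account first, then remove the unwanted one, then list the result."""
    return [grant_full_control(keep), revoke_account(remove), list_acls()]

if __name__ == "__main__":
    for command in safe_removal_plan("Everyone"):
        print(" ".join(command))
```

Keeping the grant step ahead of the removal step in one function makes it harder to run the commands in the wrong order and lose access to your own files.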

As with most of the command-line utilities that Microsoft provides, online Help is available. However, as Screen 2 shows, the Help is a bit cryptic.

Birth of a Benchmark
And now for something completely different: The Lab's mandate is to give readers criteria for judging and selecting software and hardware products. To develop those criteria, we Lab guys test products, just as the labs of other magazines do. However, we've become frustrated with the methods of most testing and the standards for benchmarking. So we've decided to develop meaningful and repeatable benchmarks.

A realistic way to define the word benchmark is as a distortion of reality. No one has discovered a way to accurately duplicate real-world user loads without sitting down at every company in the world and testing each network, server, and workstation with its own application mix and user load.

With that reality check in mind, you can further define a benchmark as a reasonable simulation of average user activity: Shoot for the median, and you stand a good chance of representing a useful cross-section of user load and configurations in the real world.

Some benchmarking strategies, such as TPC, ServerBench, NetBench, AIM, RPMarks, and RDBMarks, are synthetic: They test system performance by generating loads or transactions that do not occur in the natural IS world. These strategies don't always relate cleanly to real, end-user system performance and activity, so you don't get a feel for scalability in a real environment. You are best served by not using these numbers to extrapolate system performance for your corporate environment. Some published results can mislead you.

The currently fashionable method of reporting system performance is with one number, such as TPC-C. Vendors and magazines use this number to say, "This machine is the fastest computer in the world," or, "This system is the best price/performer."

The trouble is that you frequently see system comparisons in environments that are not even close to real user environments, and evaluations stack systems against each other that have no business competing in the same market. You still see comparisons of $70,000 symmetric multiprocessing (SMP) platforms to $6,000 clones, with the statement that the SMP box performed only marginally better. The implication is that the SMP box is therefore a bad purchase relative to the clone. But how does the SMP box scale from low to high user counts or across different system configurations? What other features does it offer that add value for the customer? What is the test using the system for?

Lumping the entire system's performance characterization into one number doesn't really tell you anything, and if you base a buying decision on such a number, you may find that you went down the wrong path. Just because a system earned a high TPC or ServerBench score doesn't mean that when you plug that system into your network, it will scale infinitely. And the single number gives you no way of knowing where the new server's architectural bottlenecks are.

Looking at trends--using capacity planning tools and performance monitors--is a much better indicator than raw performance and maximum capacity. You use trends to analyze where potential problems are, where you can improve your system, where an upgrade is necessary, and so forth. Suppose you use the latest TPC-C result from IBM to buy a server with a specific configuration, and you expect it to outperform every other system because it carries this month's highest number. You'll probably find that, up to a certain point under a certain load, this expectation will be fulfilled.

However, any system will fall down at certain points, whether at low, medium, or high loads. You'll be caught unaware, $50,000 in the hole, with a system not performing the way you hoped. And you'll be spending countless hours on the phone with tech support trying to figure out why the system isn't doing what you expected.

A New Way of Thinking
Where does this situation leave you when you're trying to decide what system to buy? We hope we've debunked the urban myths about benchmarking. So what's the alternative?

In the last week of July, several server manufacturers and software vendors gathered at the Windows NT Magazine offices to discuss the answer, and other key industry players contributed to the discussion: IBM, Compaq, Microsoft, Tricord Systems, HP, Digital Equipment, Bluecurve, SQA, and Great Plains Software. The consensus was that synthetic benchmarks are difficult and expensive to run and tough to understand if you try to dig past that single rosy number.

The question remained: How can we performance-test systems and software (the OS and applications) in a real-world fashion that users will accept, understand, and be able to duplicate without incurring high cost and unreasonable complexity? The answer is to use industry-standard tools and automate user activity that represents what users actually do, instead of what they ought to do (i.e., we won't test with synthetic routines that move data blocks from one memory location to another). At the same time, we don't want to produce statistical nightmares that let us cook the numbers to say whatever we want. The key is to use simple metrics that answer what people want to know: How long do we have to wait for a process to run, and how many users can we support before system responsiveness disintegrates?
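To make the two metrics concrete, here is a minimal sketch of how you might reduce a set of timed test runs to those two answers. It is not the Lab's method: the function names, the 3-second limit, and the sample timings are all assumptions made up for illustration, and the sketch simply takes the median response time at each user count and reports the largest user count reached before that median first exceeds the limit.

```python
from statistics import median

def median_response(timings_by_users):
    """Map each user count to the median of its response times, in seconds."""
    return {users: median(times) for users, times in timings_by_users.items()}

def max_supported_users(medians, limit_seconds):
    """Largest user count reached before the median response first exceeds the limit."""
    supported = 0
    for users in sorted(medians):
        if medians[users] > limit_seconds:
            break  # responsiveness has broken down; stop counting
        supported = users
    return supported

if __name__ == "__main__":
    # Made-up timings purely for illustration; these are not measurements.
    sample = {
        10: [0.8, 0.9, 1.0],
        50: [1.1, 1.2, 1.4],
        100: [1.9, 2.1, 2.4],
        200: [5.5, 6.0, 7.2],
    }
    medians = median_response(sample)
    for users in sorted(medians):
        print(users, "users:", medians[users], "seconds")
    # The 3.0-second limit is an arbitrary assumption for this example.
    print("supported users:", max_supported_users(medians, limit_seconds=3.0))
```

Reporting the whole table of medians alongside the final count keeps the trend visible, which is the point of avoiding a single summary number.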

Right now, several tools are available for this kind of test, and more are under development: Microsoft Exchange via LoadSim, SQL Server 6.X via Bluecurve's Dynameasure, application serving with Citrix WinFrame, general accounting with SQA Suite and Great Plains Dynamics C/S+ SQL, and Internet applications with WebStone. To clarify the distinction between the existing industry benchmarks and what we propose to implement, the issue is a matter of what you want to learn: Do you want one performance number that seems easy to grasp but really doesn't give you any context for understanding its implications? Or do you want to know how the system will perform in your environment?

We will tell you exactly what the testing environment is, what we are testing for, where the potential bottlenecks are, how various aspects of the system perform under test, and where the areas of concern are. Then you can decide whether the transactions and applications we test represent what you do on your systems, and you can use this data to determine your direction of research. We can't tell you which system to buy, but we can suggest what to look for and consider and show you how a system performs under conditions that emulate your work environment.

Whether NT scales is a tough question, and Windows NT Magazine is going to work to answer it for you. In this issue, we begin the first round of performance analysis techniques and present test results of MS Exchange Server on Windows NT 4.0 Server (see "Optimizing Exchange to Scale on NT"). Perhaps we raise more questions than we answer, but future articles will address each issue and delve into the intricacies of client/server computing.