The design and construction of an innovative operating system

Windows NT's architecture influences everything from its API to its performance. In the late 1980s, Microsoft charged NT's developers with creating a new operating system, and the company mandated a hefty list of requirements to make NT the world's dominant desktop and enterprise-level operating system. NT's developers faced the constraints of supporting backward compatibility with DOS and Windows 3.x, as well as supporting a laundry list of capabilities intended to ensure NT's long-term success. What NT's developers produced was an operating system that made use of 1980s cutting-edge technologies but had roots in earlier operating systems. NT met Microsoft's broad requirement list, and that fact positioned NT to become widely adopted, no matter which of the popular operating system API sets, processor types, or network interfaces won market dominance.

This month I provide the first part of a two-part primer on NT architecture. I'll describe some of the design requirements that were goals for NT from the start. Then I'll outline in broad strokes the components that make up NT's base operating system and describe how they fit together. I conclude this month with a close look at NT operating system environments and system services. Next month I'll take an in-depth look at the NT Executive, Kernel, and hardware abstraction layer (HAL).

A Brief History of NT's Development
During NT's development period, from 1988 to 1993, the computing world was different from how it is today. DOS was the predominant PC operating system, and both Windows and OS/2 were gaining momentum. Servers and scientific and engineering workstations ran UNIX exclusively. Because all these operating systems were popular, Microsoft built support in NT for DOS, Windows 3.x, OS/2, and POSIX. This support created an upgrade and compatibility mode in NT for DOS and Windows users, and also enabled OS/2 and POSIX users to migrate to NT.

Microsoft realized that although NT's DOS and Windows 3.x support made its PC customers happy, enterprise-level (which at that time meant 32-bit) customers were more interested in the POSIX and OS/2 32-bit APIs. Microsoft saw that if it wanted to capture enterprise-level customers, it would have to develop its own 32-bit API. Thus, Win32, Microsoft's answer to OS/2 and POSIX, became NT's primary API.

Throughout NT's development period, most PCs used Intel's x86 processor. Several RISC processors competed for dominance of UNIX boxes: IBM and Motorola's PowerPC, MIPS processors, and Digital's Alpha. To keep its PC customer base, as well as to accommodate the desires of high-end users, Microsoft decided to make NT as portable as possible, and the company designed NT to run out of the box on any of the RISC processor chips. Microsoft reasoned that NT's portability across different processors would ensure its viability regardless of which chip came out on top in the market.

Windows 3.1 had no native networking support, and Windows 3.11's networking capabilities were cumbersome and relatively slow. As a result, Microsoft felt Novell NetWare's sting. Microsoft intended to avoid repeating these networking mistakes, and it outfitted NT with support for most of the APIs and networking protocols that were in widespread use in the 1980s. Those APIs included NetBIOS, remote procedure call (RPC), file server and redirector (Server Message Block, or SMB), mail slots, named pipes, and Berkeley sockets. The protocols included TCP/IP, NetBEUI (Microsoft's LanManager protocol), IPX/SPX (Novell's NetWare protocol), AppleTalk, Data Link Control (DLC), and SNA. By including protocols in NT that its competitors owned (i.e., AppleTalk and IPX/SPX), Microsoft opened the door to sites that were dominated by Macintosh or NetWare.

In addition to these high-level requirements, Microsoft included two important low-level operating system capabilities in NT. First, NT's developers designed its security subsystem as a centralized module that can be easily and thoroughly validated. This security configuration earned NT a C2 security rating.

Second, its developers gave NT a multitasking preemptive-scheduling system. Neither DOS nor Windows 3.x is capable of true multitasking with preemptive scheduling. Without add-on software, DOS can execute only one program, or task, at a time. Windows 3.x can execute several programs concurrently, but they must be well-behaved; that is, each program must be aware that other programs may need to run, and it must therefore yield the machine at regular intervals. This design means that a buggy or malicious program can halt the computer simply by entering an infinite loop in which it never yields. In NT, a centralized scheduling authority doles out CPU time to programs that need it. Once a program's turn has ended, the scheduler has the power to preempt it and give another program a turn.
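The difference between the two scheduling styles can be sketched with a toy simulation. This is a minimal model and not NT's scheduler: the function names, the task tuples, and the tick-based quantum are all assumptions made for illustration.

```python
from collections import deque


def run_cooperative(tasks, max_ticks):
    """Windows 3.x-style model: a task keeps the CPU until it chooses to yield.

    Each task is (name, work_ticks, yields). A task with yields=False models a
    program stuck in a loop that never gives up the machine.
    Returns the task names in the order they received CPU ticks.
    """
    timeline = []
    queue = deque(tasks)
    while queue and len(timeline) < max_ticks:
        name, remaining, yields = queue.popleft()
        if yields:
            # Well-behaved task: run one tick, then voluntarily yield.
            timeline.append(name)
            if remaining > 1:
                queue.append((name, remaining - 1, yields))
        else:
            # Misbehaving task: never yields, so nothing else ever runs.
            while len(timeline) < max_ticks:
                timeline.append(name)
    return timeline


def run_preemptive(tasks, max_ticks, quantum=2):
    """NT-style model: the scheduler takes the CPU back after each quantum."""
    timeline = []
    queue = deque(tasks)
    while queue and len(timeline) < max_ticks:
        name, remaining, yields = queue.popleft()
        ran = min(quantum, remaining, max_ticks - len(timeline))
        timeline.extend([name] * ran)
        if remaining - ran > 0:
            # Preempted: go to the back of the queue and wait for another turn.
            queue.append((name, remaining - ran, yields))
    return timeline


# "hog" models the buggy or malicious program that loops without yielding.
tasks = [("editor", 3, True), ("hog", 10**9, False), ("mailer", 3, True)]
coop = run_cooperative(tasks, max_ticks=12)
pre = run_preemptive(tasks, max_ticks=12)
```

In the cooperative run, the mailer never receives a tick once the hog gets the CPU. In the preemptive run, the editor and the mailer both finish their work even though the hog never yields.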

Microsoft wanted NT to incorporate one final major feature: It had to be truly 32-bit and provide protected address spaces, à la UNIX. DOS and Windows 3.x are 16-bit operating systems. Programs running on them cannot easily access large amounts of memory. Using 32-bit addressing enables programs on NT to access 4GB (2^32 bytes) of memory efficiently. (A 64-bit version of NT is due for release within 2 years, and it will let programs address even larger amounts of memory efficiently.)
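The arithmetic behind the 4GB figure is easy to check. The helper name below is an assumption for illustration; the 16-bit line shows how much memory a single 16-bit offset can reach, which is why 16-bit programs need extra addressing tricks to go beyond it.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses an address of this width can name."""
    return 2 ** address_bits


KB = 2 ** 10
GB = 2 ** 30

limit_16 = addressable_bytes(16)   # reach of a single 16-bit offset
limit_32 = addressable_bytes(32)   # reach of a flat 32-bit address
limit_64 = addressable_bytes(64)   # reach of a flat 64-bit address

print(limit_16 // KB, "KB")        # 64 KB
print(limit_32 // GB, "GB")        # 4 GB
```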

NT's developers ensured 32-bit system reliability by including protected address spaces in NT. Every program in Windows 3.x has a region of memory assigned to it. However, any program can scribble on the memory regions that belong to any other program--a program can even scribble on memory regions reserved for Windows, with disastrous effects. But with the protected address spaces in NT, all programs are confined to their memory regions and have no access (unless by permission) to the memory spaces of other applications. NT also prevents applications from accessing parts of memory owned by the Executive and by kernel-mode portions of the operating system, including device drivers.
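A toy model makes the contrast concrete. The class names and the list-based "memory" are assumptions for illustration; real protection is enforced by the processor's memory-management hardware under the operating system's control, not by a bounds check in software.

```python
class AccessViolation(Exception):
    """Raised when a process touches memory outside its own address space."""


class FlatMemory:
    """Windows 3.x-style model: one memory shared by every program."""

    def __init__(self, size):
        self.cells = [0] * size

    def write(self, program, address, value):
        # No ownership check: any program may overwrite any cell.
        self.cells[address] = value


class ProtectedMemory:
    """NT-style model: each process gets a private address space."""

    def __init__(self):
        self.spaces = {}

    def create_process(self, name, size):
        self.spaces[name] = [0] * size

    def _check(self, process, address):
        if not 0 <= address < len(self.spaces[process]):
            raise AccessViolation(f"{process} cannot access address {address}")

    def write(self, process, address, value):
        self._check(process, address)
        self.spaces[process][address] = value

    def read(self, process, address):
        self._check(process, address)
        return self.spaces[process][address]


flat = FlatMemory(8)
flat.write("editor", 3, 42)
flat.write("buggy", 3, 99)      # silently clobbers the editor's data

nt = ProtectedMemory()
nt.create_process("editor", 8)
nt.create_process("buggy", 8)
nt.write("editor", 3, 42)
nt.write("buggy", 3, 99)        # same address, but a different address space
```

In the flat model, the second write destroys the first program's data. In the protected model, address 3 means something different in each process, so the editor's value survives.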

An Overview of NT Architecture
Let's begin our look at NT's architecture by discussing the distinction between user mode and kernel mode. I discussed the user mode/kernel mode concept in "Inside the Blue Screen," December 1997, and I'll summarize it here (Figure 1 shows this architecture). User mode is the least-privileged mode NT supports; it has no direct access to hardware and only restricted access to memory. For example, when programs such as Word and Lotus Notes execute in user mode, they are confined to sandboxes with well-defined restrictions. They don't have direct access to hardware devices, and they can't touch parts of memory that are not specifically assigned to them. Kernel mode is a privileged mode. Those parts of NT that execute in kernel mode, such as device drivers and subsystems such as the Virtual Memory Manager, have direct access to all hardware and memory.

Other operating systems, including Windows 3.1 and UNIX, also use privileged and nonprivileged modes. What makes NT unique is where it draws the line between the two. NT is sometimes referred to as a microkernel-based operating system. Microkernel-based operating systems developed from university research in the mid-1980s. The idea behind the pure microkernel concept is that all operating system components except a small core (the microkernel) execute as user-mode processes, just as word processors and spreadsheets do. But the core components in the microkernel execute in privileged mode, so they access hardware directly. Figure 2, page 64, shows a pure microkernel operating system design.

Microkernel architecture gives a system configurability and fault tolerance. Because an operating system subsystem like the Virtual Memory Manager runs as a distinct program in microkernel design, a different implementation that exports the same interface can replace it. If the Virtual Memory Manager fails, thanks to the fault-tolerance possible in a microkernel design, the operating system can restart it with minimal effect on the rest of the system. In monolithic operating system design (e.g., DOS and Windows 3.1), the entire operating system must be rebuilt to change any subsystem. Figure 3, page 64, shows a monolithic operating system design. If the Virtual Memory Manager has a bug in a monolithic system, the bug is likely to bring down the machine.

A disadvantage to pure microkernel design is slow performance. Every interaction between operating system components in microkernel design requires an interprocess message. For example, if the Process Manager requires the Virtual Memory Manager to create an address map for a new process, it must send a message to the Virtual Memory Manager. In addition to the overhead costs of creating and sending messages, the interprocess message requirement results in two context switches: the first from the Process Manager to the Virtual Memory Manager, and the second back to the Process Manager after the Virtual Memory Manager carries out the request.

NT takes a unique approach, known as modified microkernel, that falls between pure microkernel and monolithic design. In NT's modified microkernel design, operating system environments, including DOS, Win16, Win32, OS/2, and POSIX, execute in user mode as discrete processes (DOS and Win16 are not shown in Figure 1). The basic operating system subsystems, including the Process Manager and the Virtual Memory Manager, execute in kernel mode, and they are compiled into one file image. These kernel-mode subsystems are not separate processes, and they can communicate with one another by using function calls for maximum performance.
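The cost difference between the two ways subsystems can talk to each other can be sketched by simply counting events. This is a bookkeeping model under stated assumptions: the counter class and function names are hypothetical, and the only thing modeled is the number of messages and context switches, not their actual cost in time.

```python
class CostCounter:
    """Tallies the overhead events incurred by a series of requests."""

    def __init__(self):
        self.messages = 0
        self.context_switches = 0


def create_address_map(process_id):
    """Stand-in for the Virtual Memory Manager's real work."""
    return {"process": process_id, "pages": []}


def request_via_message(counter, process_id):
    """Pure microkernel model: the two managers are separate processes."""
    counter.messages += 1
    counter.context_switches += 1   # Process Manager -> Virtual Memory Manager
    result = create_address_map(process_id)
    counter.context_switches += 1   # Virtual Memory Manager -> Process Manager
    return result


def request_via_call(counter, process_id):
    """Modified microkernel model: both managers share one kernel-mode image."""
    return create_address_map(process_id)   # plain function call, no switch


pure = CostCounter()
modified = CostCounter()
for pid in range(100):
    request_via_message(pure, pid)
    request_via_call(modified, pid)
```

After 100 requests, the message-passing path has accumulated 100 messages and 200 context switches, and the function-call path has accumulated none.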

NT's user-mode operating system environments implement separate operating system APIs. The degree of NT support for each environment varies, however. Support for DOS is limited to the DOS programs that do not attempt to access the computer's hardware directly. OS/2 and POSIX support stops short of user-interface functions and the advanced features of the APIs. Win32 is really the official language of NT, and it's the only API Microsoft has expanded since NT was first released.

NT's operating system environments rely on services that kernel mode exports to carry out tasks that they can't carry out in user mode. The services invoked in kernel mode are known as NT's native API. This API is made up of about 250 functions that NT's operating system environments access through software-exception system calls. A software-exception system call is a hardware-assisted way to change execution modes from user mode to kernel mode; it gives NT control over the data that passes between the two modes.

Native API requests are executed by functions in kernel mode, known as system services. To carry out work, system services call on functions in one or more components of NT's Executive. As shown in Figure 1, the Executive components include the I/O Manager, Object Manager, Security Reference Monitor, Process Manager, Local Procedure Call Facility, and Virtual Memory Manager. Each Executive component has a specific operating system responsibility. Device drivers are dynamically added NT components that work closely with the I/O Manager to connect NT to specific hardware devices, such as disks and input devices.

A handful of other components are not usually described in Microsoft's NT architecture literature (e.g., the Cache Manager and the Configuration Manager). I'll describe these components in Part 2 of this primer.

NT's Executive components use basic hardware functionality implemented in the microkernel. The microkernel, which is known in NT as the Kernel, contains the scheduler. The Kernel also manages the Executive's use of NT's hardware and software interrupt handlers and exports synchronization primitives.

Device drivers and the Kernel use the HAL to interact with the computer's hardware. The HAL exports its own API, which translates abstract data into processor-specific commands. NT is portable across processor types because processor-specific code is restricted to the Kernel and the HAL. This situation means that when NT is ported to a new processor, only the Kernel and the HAL must be converted. The rest of NT's code is written in C and C++ and can simply be recompiled for the new processor.

Those are the basics of NT's architecture. Now let's delve into NT's operating system environments and system services more deeply.

Operating System Environments
NT's operating system environments are implemented as client/server systems. As part of the compile process, applications are bound by a link-time binding to an operating system API that NT's operating system environments export. The link-time binding connects the application to the environment's client-side DLLs, which accomplish the exporting of the API. For example, a Win32 program is a client of the Win32 operating system environment server, so it is linked to Win32's client-side DLLs, including Kernel32.dll, gdi32.dll, and user32.dll. A POSIX program would be linked to the POSIX client-side DLL, psxdll.dll.

Client-side DLLs carry out tasks on behalf of their servers, but they execute as part of a client process. As Figure 4, page 66, shows, in some cases a client-side DLL can fully implement an API without having to call upon the help of the server; in other cases, the server must help out. The server's aid is usually necessary only when global information related to the environment must be updated. When the client-side DLL requires help from the server, the DLL sends a message known as a local procedure call (LPC) to the server. When the server completes the specified request and returns an answer, the DLL can complete the function and return control to the client. Both the client-side DLL and the server may use NT's native API when necessary. Operating system environment APIs augment the native API with additional functionality or semantics that are specific to themselves.

One example of an operating system environment API that the environment's server must service is a CreateProcess function, in which the server creates a relationship between the client process and a new process. To create such a relationship, the server's CreateProcess function must call NT's native API CreateProcess function. An example of an operating system environment API that does not require client-side DLL interaction with its server is the ReadFile function. The ReadFile function can be implemented entirely in the DLL with the aid of the native API's ReadFile function. Because the ReadFile function does not require the update of global information, the server's help is not necessary.
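The split between calls the client-side DLL handles alone and calls that need the server can be modeled in a few lines. Everything here is a simplified, hypothetical stand-in: the class names, the method names, and the in-memory "files" are assumptions for illustration, and an ordinary method call takes the place of a real LPC message.

```python
class NativeApi:
    """Stand-in for the kernel-mode native API (hypothetical, simplified)."""

    def __init__(self):
        self.next_pid = 100
        self.files = {"readme.txt": b"hello"}

    def create_process(self, image):
        pid = self.next_pid
        self.next_pid += 1
        return pid

    def read_file(self, name):
        return self.files[name]


class EnvironmentServer:
    """Holds environment-wide state, such as parent/child relationships."""

    def __init__(self, native):
        self.native = native
        self.children = {}       # parent pid -> list of child pids
        self.lpc_requests = 0

    def handle_lpc(self, request, **args):
        self.lpc_requests += 1
        if request == "create_process":
            child = self.native.create_process(args["image"])
            # Global information changes, so the server must be involved.
            self.children.setdefault(args["parent"], []).append(child)
            return child
        raise ValueError(f"unknown request: {request}")


class ClientDll:
    """Runs inside the client process and exports the environment's API."""

    def __init__(self, pid, native, server):
        self.pid = pid
        self.native = native
        self.server = server

    def create_process(self, image):
        # Needs the server: send it a message and wait for the answer.
        return self.server.handle_lpc(
            "create_process", parent=self.pid, image=image)

    def read_file(self, name):
        # No global state to update: go straight to the native API.
        return self.native.read_file(name)


native = NativeApi()
server = EnvironmentServer(native)
dll = ClientDll(pid=1, native=native, server=server)

data = dll.read_file("readme.txt")
requests_after_read = server.lpc_requests
child = dll.create_process("notepad.exe")
```

Reading the file leaves the server's request count at zero, while creating a process costs exactly one round trip to the server and updates its record of which process started which.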

Because some operating system environment APIs require messages between a client and its server, an assumption has developed that system calls in NT are expensive. However, NT's LPC facility is highly optimized and very efficient. Nevertheless, Microsoft removed the most LPC-intensive portion of NT 3.51's Win32 operating system environment. Figure 5 shows NT 3.51 Win32 architecture, and Figure 6 shows the change to this architecture in NT 4.0.

The Win32 environment includes graphics and user-interface functions, which are implemented in its graphics device interface (GDI) and User components. In NT 3.51, whenever a Win32 program makes a drawing or user-interface call, the GDI or User client-side DLLs make LPC calls to the Win32 server (CSRSS.EXE). Those LPC calls to the server cause Win32's sluggish performance--the bane of microkernel-based operating systems. In NT 4.0, the User and GDI components move from user mode into kernel mode as a new Executive subsystem, Win32K.SYS. When a drawing call is made, the client-side GDI DLL makes a new native system call into kernel mode, where the request is carried out (Win32 native system calls didn't exist in NT 3.5x). No messages are passed and no context switches occur--just a switch from user mode to kernel mode and back. This optimization has a dramatic effect on the performance of Win32 applications.

System Services
System services export the native API from kernel mode so that user-mode portions of NT can use it. The native API is intended for use by operating system environments, but nothing prevents an application from bypassing the operating system environment API and accessing the native API directly. However, the native API is usually undocumented, is very similar to (but more cumbersome than) the Win32 API, and would not give an application privileges or powers its operating system environment would not give it.

System services have names that begin with Nt. For example, Win32 has an API function called CreateProcess, which the Win32 server handles. CreateProcess calls the native API function, NtCreateProcess. The parameter lists for both functions are similar; however, CreateProcess performs significant amounts of work on behalf of the environment. For instance, it sets up the process's environment variables and command line and fills in the process address map with the program to be executed. System services validate parameters that are passed from user mode and then usually call functions within Executive subsystems. For example, NtCreateProcess calls the Process Manager Executive subsystem, invoking its PsCreateProcess function. Most system services are short because they serve primarily as thin interfaces between user mode and Executive subsystems. There can be a one-for-one correspondence between Win32 calls and native calls, but many Win32 functions make more than one native call to carry out a task.

Figure 7 shows the flow of control when an application calls a native API function. Applications and operating system environments that use the native API access it through a DLL named ntdll.dll. This DLL is linked to every process in an NT system and consists of entry points for every system service. These entry points do little more than prepare variables and cause a system service software exception. The System Service Exception Handler in kernel mode is executed in response to system service exceptions, and it uses a number associated with the requested service to index the System Service Table and find the function that implements the service. Thus, adding a new system service requires updating both that table and ntdll.dll. Microsoft continues to add to the number of system services in NT, and almost two dozen new calls will appear in NT 5.0.
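The dispatch path described above can be sketched as a table lookup. This is a user-level model only: the service numbers, the string results, and the simplified parameter lists are assumptions for illustration, and an ordinary function call stands in for the software exception that switches from user mode to kernel mode.

```python
def ps_create_process(image):
    """Stand-in for the Process Manager's PsCreateProcess function."""
    return {"image": image, "state": "created"}


def nt_create_process(image):
    """System service: validate user-mode parameters, then call the Executive."""
    if not isinstance(image, str) or not image:
        return "invalid parameter"
    return ps_create_process(image)


def nt_close(handle):
    """A second, equally thin system service (simplified for the example)."""
    if not isinstance(handle, int) or handle < 0:
        return "invalid parameter"
    return "closed"


# The position in the table is the service number (numbers are hypothetical).
SYSTEM_SERVICE_TABLE = [nt_create_process, nt_close]


def system_service_exception_handler(service_number, arguments):
    """Kernel-mode side: index the table and run the matching service."""
    if not 0 <= service_number < len(SYSTEM_SERVICE_TABLE):
        return "invalid system service"
    return SYSTEM_SERVICE_TABLE[service_number](*arguments)


# ntdll.dll-style entry points: package the arguments, name the service
# by number, and hand control to the handler.
def NtCreateProcess(image):
    return system_service_exception_handler(0, (image,))


def NtClose(handle):
    return system_service_exception_handler(1, (handle,))


created = NtCreateProcess("notepad.exe")
rejected = NtCreateProcess("")
```

Adding a service in this model means appending a function to the table and writing a matching entry point, which mirrors why a new system service requires changes in both places.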

Stay Tuned
Next month I'll conclude our two-part primer on NT's architecture with a look at the Executive and a description of the responsibilities and capabilities of each of its subsystems. I'll also take you inside the Kernel and down into the HAL.