Windows IT Pro is the leading independent community for IT professionals deploying Microsoft Windows server and client applications and technologies.
  
  


April 1999

Linux and the Enterprise



Linux 2.2 introduces a new mechanism that lets network server applications request notification of certain events in a more efficient way than select. A Linux server thread marks a communications endpoint as a notification endpoint; any new connections a client establishes through that endpoint cause the system to notify the application—the thread doesn't wait for the event. Furthermore, when such an event occurs, the kernel tells the application precisely which event is occurring, thus eliminating the searching that select requires. Unfortunately, this feature has two major limitations. First, the feature applies only to particular communications endpoints (TCP/IP and TTY devices) and events related to a new client connection—this mechanism does not notify a server thread of new requests over existing connections. Second, just as for select, the kernel wakes up all threads waiting for a new client request event—not just one thread, as efficient server applications require.
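To make the cost of select concrete, here is a minimal Python sketch, not code from any particular server. The helper name ready_connections is hypothetical. Python's select.select wraps the select system call: the application hands over the complete list of connections on every call and gets back only the ones that are ready. In C the caller scans the returned descriptor sets itself; Python's wrapper performs that scan and returns a list.

```python
import select
import socket


def ready_connections(connections, timeout=0.0):
    """Return the sockets in `connections` that have data waiting.

    The whole list is passed in on every call, so the work grows with
    the number of connections being watched, not with the number that
    are actually ready.
    """
    readable, _, _ = select.select(connections, [], [], timeout)
    return readable
```

A server that watches thousands of mostly idle connections pays for the full list on each call, which is the searching overhead the notification mechanism is meant to avoid.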

Linux fails to meet the second requirement for supporting scalable server applications because the OS currently has no asynchronous APIs. If a Linux server thread initiates an I/O request, the thread can't perform any useful work while the I/O is in progress. Instead, the thread waits.

Many Linux developers mistakenly believe that the existence of a form of I/O in Linux known as nonblocking I/O means that Linux supports asynchronous I/O. An application that requests nonblocking I/O can attempt to read from a network connection, for example, and the application doesn't wait until data is available on the connection before the application continues executing. A major difference exists between Linux nonblocking I/O and true asynchronous I/O, however: An application performing a nonblocking I/O call does not initiate an I/O operation if the I/O cannot be immediately satisfied. If the application wants to initiate I/O, the application must issue the I/O request when I/O is possible.

A quick example illustrates nonblocking I/O. If a server thread performs a nonblocking read from a network connection and no client data is ready on the connection, the server thread will not wait for data to become available. Instead, the thread can perform other work. However, the thread must issue further read operations until data is ready on the connection. By contrast, a server thread that issues an asynchronous read actually initiates a read I/O but can also perform other work without issuing additional reads. The OS notifies the thread when client data arriving at the connection can satisfy the issued request.
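The nonblocking half of that example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a server implementation: the helper names try_read and read_when_ready are hypothetical, and the socket is assumed to have been switched to nonblocking mode with setblocking(False).

```python
import select
import socket


def try_read(conn, nbytes=4096):
    """Attempt a nonblocking read.

    Returns the data if some is ready. Returns None if nothing is
    ready; in that case no I/O has been started on the caller's behalf.
    """
    try:
        return conn.recv(nbytes)
    except BlockingIOError:
        return None


def read_when_ready(conn, nbytes=4096, timeout=1.0):
    """Wait until the connection is readable, then reissue the read."""
    readable, _, _ = select.select([conn], [], [], timeout)
    return try_read(conn, nbytes) if readable else None
```

When try_read returns None, the thread is free to do other work, but it must come back and issue the read again itself. Nothing is pending in the OS in the meantime, which is the difference from a true asynchronous read.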

Overscheduling
Even in a network server implementation that has a small pool of threads that share the processing of new client connections and client requests, a syndrome known as overscheduling can adversely affect server performance. If a server thread takes a client request and actively uses a CPU to process the request, and the server starts processing another client request in a second thread on the same CPU, both threads will compete for CPU time. This situation introduces overhead when the OS switches between the threads to give each access to the processor. The higher the number of threads that actively compete for CPU time, the worse the overhead problem becomes. The goal of a high-performance server application is therefore to have as few threads competing for the CPU as possible. To achieve this goal, the application requires OS support.

The OS must make it likely that only one thread will process a request at a time, and that when that thread finishes with the request, the OS will choose the same thread to process the next request. Such support prevents a situation in which the thread finishing a request goes back to waiting while the OS launches another thread to handle the next request. A network server application that achieves this support will almost never have overhead that the OS scheduler causes when it switches the CPU among multiple threads.

To achieve this support, the OS scheduler must keep track of which server threads are active and which threads are waiting for events. NT integrates this knowledge into its completion ports and uses completion ports as gateways for threads to use the CPU. (Figure 2 shows an example of a completion port.) If a server thread begins processing a request after receiving notification from a completion port, the scheduler will not notify any other threads waiting on that completion port for client requests until the processing thread voluntarily gives up the CPU, usually by blocking on I/O. If the active thread finishes its processing without giving up the CPU and waits for another event at the completion port, the scheduler will immediately notify that thread of the next waiting request, and the thread will continue running. If a thread gives up the CPU while processing a request—for example, while it waits for some other event not associated with the completion port (such as transmitting a very large Web response)—the scheduler will notify another thread waiting at the completion port of the next client request. This thread-throttling mechanism helps the server application minimize the number of actively scheduled threads and get the most out of a CPU.
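The throttling rule just described can be modeled in user space. The following Python sketch is a toy model only: it is not NT's completion port API, and the class ThrottledPort and its methods post, get, and blocked are hypothetical names chosen for illustration. A request is handed out only while fewer than a fixed number of threads are marked as running.

```python
import threading
from collections import deque
from contextlib import contextmanager


class ThrottledPort:
    """Toy model of a completion port with a concurrency limit."""

    def __init__(self, concurrency=1):
        self._cond = threading.Condition()
        self._pending = deque()   # requests waiting for a thread
        self._running = set()     # ids of threads processing a request
        self._concurrency = concurrency

    def post(self, request):
        """Queue a request and wake any waiting threads."""
        with self._cond:
            self._pending.append(request)
            self._cond.notify_all()

    def get(self, timeout=None):
        """Wait for the next request; return None on timeout.

        A thread that calls get() has finished its previous request,
        so it gives up its running slot before waiting.
        """
        me = threading.get_ident()
        with self._cond:
            self._running.discard(me)
            self._cond.notify_all()
            ready = self._cond.wait_for(
                lambda: bool(self._pending)
                and len(self._running) < self._concurrency,
                timeout,
            )
            if not ready:
                return None
            self._running.add(me)
            return self._pending.popleft()

    @contextmanager
    def blocked(self):
        """Mark the calling thread as blocked on something else."""
        me = threading.get_ident()
        with self._cond:
            self._running.discard(me)
            self._cond.notify_all()
        try:
            yield
        finally:
            with self._cond:
                self._running.add(me)
```

With concurrency set to 1, a thread that calls get() again right after finishing a request receives the next request itself, and a second waiting thread is released only while the first is inside a blocked() section. The real mechanism lives in the scheduler, which knows when a thread blocks without being told; this model needs the explicit blocked() call to stand in for that knowledge.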

Without thread-throttling support in the Linux scheduler, Linux server applications must rely on less-precise methods in their application code to try to achieve the same goal. However, these Linux applications will not realize performance benefits to the extent that NT applications do with NT scheduler support.

Kernel Reentrancy
The Linux community heralded Linux 2.2's recent release as Linux's coming of age in the multiprocessing world. One big reason for this jubilation is that in Linux 2.2, parts of the kernel are reentrant. A reentrant kernel function is a function that can simultaneously execute on multiple CPUs in a multiprocessor. If one CPU is executing a non-reentrant function, another CPU wanting to execute the same function must wait until the first CPU is finished. This effect is known as serialization, because the two CPUs' execution of the function is sequential if viewed on a timeline, as Figure 3, page 98, shows. Serialized execution defeats the advantages of multiprocessor execution, because the non-reentrant functions execute as if they were on a uniprocessor.

Linux 2.2 is more reentrant than previous versions of Linux were. However, several major Linux functions are still not reentrant. These functions include read and write, the two most common functions network server applications use. A Linux server application will read client requests from a communications endpoint, read data from a file (such as a Web file or email database) to respond to the requests, and write the file to the client via the communications endpoint. Even if the data requested by the client is in a memory cache and a read from a file is not necessary, the write paths still serialize.

Figure 3 demonstrates the difference between the execution of a non-reentrant function and that of a reentrant function. In the top half of the figure, the kernel spends time waiting for both CPUs to execute the non-reentrant function; in the bottom half of the figure, the kernel doesn't spend time waiting for the reentrant function. The OS in the bottom half of the figure finishes executing the same code sooner than the OS in the top half of the figure does.
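The timeline difference can be imitated with a small Python sketch. It is an analogy under stated assumptions, not kernel code: two threads stand in for the two CPUs, time.sleep stands in for the time spent inside the function, and a single lock stands in for whatever makes the function non-reentrant. All of the names are hypothetical.

```python
import threading
import time

_big_lock = threading.Lock()  # stands in for the source of serialization


def non_reentrant_work(duration):
    """Only one caller at a time may be inside this function."""
    with _big_lock:
        time.sleep(duration)


def reentrant_work(duration):
    """Any number of callers may be inside this function at once."""
    time.sleep(duration)


def elapsed_with_two_threads(work, duration=0.2):
    """Run `work` on two threads at once and return the elapsed time."""
    threads = [
        threading.Thread(target=work, args=(duration,)) for _ in range(2)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

Calling elapsed_with_two_threads(non_reentrant_work) takes roughly twice as long as elapsed_with_two_threads(reentrant_work), because the second caller waits outside the lock until the first one leaves.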

Many members of the Linux development community believe this kernel-waiting-time difference to be an insignificant performance problem. This belief almost certainly persists because no performance studies of Linux 2.2 on enterprise workloads have yet taken place. To glimpse how serious a problem kernel waiting time will be for Linux, look at recent developments in NT. NT's write function for network I/O was reentrant except for the part of the function that NIC drivers (Network Driver Interface Specification, or NDIS, drivers) handled, wherein they transfer data to their network hardware. Leaving this small part of the write function non-reentrant was enough to prevent NT from competing effectively with Sun's Solaris OS on enterprise applications executing on 4-way multiprocessors. To remedy the situation, Microsoft let NDIS drivers have deserialized, or reentrant, write paths in NT 4.0 Service Pack 4 (SP4) and Windows 2000 (Win2K). In Linux 2.2, several functions, in addition to read and write, are still not reentrant, and each time a Linux server application uses such a function, the function's non-reentrancy hampers multiprocessor scalability.

Sendfile
A final area in which Linux is at a disadvantage is its implementation of the sendfile API. Sendfile is an API that Microsoft introduced to NT several years ago as a feature that network server applications can use to enhance their performance. Obvious candidates for the API are Web servers. Without sendfile, a Web server that receives a request from a browser for an HTML file must first read the contents of the file into its memory and then send the contents to the client via a communications endpoint. The process of reading the file into private application memory is wasteful, because the server application doesn't want the contents of the file—it merely wants to send the contents to the client.

Sendfile eliminates the necessity that a server application read a file before sending it. With sendfile, the server application specifies the file to send and the communications endpoint in the sendfile API, and the OS reads and sends the file. Thus, the server doesn't have to issue a read API or dedicate memory for the file contents, and the OS can use its file system cache to efficiently cache files that clients request. Soon after Microsoft implemented sendfile in NT, UNIX vendors implemented sendfile in their OSs.
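The two approaches look like this in a minimal Python sketch. The function names send_by_copying and send_with_sendfile are hypothetical. Python's socket.sendfile method uses the OS sendfile call where one is available and falls back to ordinary sends elsewhere, so the sketch shows the shape of the API rather than any one OS's implementation.

```python
import socket


def send_by_copying(conn, path, chunk_size=65536):
    """Read the file into application memory, then write it out.

    Every byte passes through a buffer owned by the server application.
    Returns the number of bytes sent.
    """
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return sent
            conn.sendall(chunk)
            sent += len(chunk)


def send_with_sendfile(conn, path):
    """Name the file and the endpoint, and let the OS move the data.

    The application never holds the file contents in its own buffers.
    Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        return conn.sendfile(f)
```

In the second function the server issues no read call and sets aside no memory for the file contents, which are the savings the paragraph above describes.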

Linux has a sendfile implementation, but the Linux sendfile has several problems that developers must fix before Web servers (and other applications that can use the sendfile API) on Linux can achieve the same benefits that UNIX variants and NT obtain from their sendfile implementations. On NT, sendfile doesn't incur a copy operation if the file being sent is in the NT file system cache. In other words, the network software can send the data directly from the cache. But on Linux, sendfile copies the file data into buffers that sendfile hands to the networking code. This extra copy operation consumes CPU time and creates a larger memory footprint for the server application, both of which adversely affect performance. Another problem with Linux's sendfile is that it is non-reentrant. Thus, using Linux's sendfile leads to the serialized execution of part of a server application, which inhibits the application's ability to scale on a multiprocessor. A final problem with Linux's sendfile is that it doesn't let the system prepend a data buffer to the front of the file sendfile is sending. In the case of Web servers, this limitation necessitates another system call to send an HTTP header before the server can send the file.
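The header limitation shows up as a two-step pattern in server code. Here is a hedged Python sketch: the function name send_http_file and the bare-bones HTTP/1.0 header are assumptions made for illustration, not what any particular Web server sends.

```python
import os
import socket


def send_http_file(conn, path):
    """Send an HTTP response for `path` in two separate steps."""
    size = os.path.getsize(path)
    header = (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Length: {size}\r\n"
        "\r\n"
    ).encode("ascii")
    conn.sendall(header)      # step 1: a separate call for the header
    with open(path, "rb") as f:
        conn.sendfile(f)      # step 2: the file body
    return len(header) + size
```

A sendfile that accepts a buffer to place in front of the file would let the server collapse both steps into a single call.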

Linux Isn't There—Yet
The limitations I've cited are most of the major shortcomings in Linux's support for enterprise applications. However, other limitations might lurk beneath those I've pointed out. Despite the Linux community's claims to the contrary, Linux 2.2 is not ready for the enterprise or for multiprocessors. Linux is not engineered with enterprise computing in mind, nor has the OS been present in enterprise environments where administrators, programmers, and users can notice its limitations. Consequently, Linux's kernel threads are ineffective at supporting enterprise applications, and the Linux kernel is unable to scale applications on multiprocessors as well as other OSs can. Certainly over the next year or two, as Linux's momentum pushes it into the enterprise, Linux will face its shortcomings. When that happens, and Linux's developers address the OS's problems, UNIX variants and NT will feel a compelling threat to their enterprise dominance from this open-source OS.



Reader Comments
A lot of Linux advocates need to calm down, and I’m probably one of them! In Mark Russinovich’s “Linux and the Enterprise” (April), the author compared NT to Linux at the kernel level, then used his results to imagine which OS is better for his definition of enterprise computing. But NT and Linux aren’t just kernels. If they were, nobody would use them. The total combination of kernels, libraries, and applications is what makes customers use computers.
Perhaps if Mark had written the article for just the Linux kernel mailing list, he would have found a less zealous and more technically capable audience already familiar with the topics he discussed. Furthermore, any weaknesses he uncovered would be fixed immediately.
Seeing all the interest in Linux, why don’t you start another Linux magazine or at least add a regular column to Windows NT Magazine? In the real world, a lot of companies have both NT and Linux (or some other flavor of UNIX), so I imagine interoperability would be of great interest to your readers.
--V.C. in Alameda, CA

V.C. in Alameda, CA August 09, 1999


Inside the narrow arena in which Mark Russinovich compared Windows NT and Linux, his observations were correct. NT is technically and practically superior to Linux in many areas, but the converse is also true. NT has superior SMP capabilities and also superior tightly coupled clustering capabilities. However, NT is inferior in loosely coupled clustering. The dirt-cheap prices of PCs today threaten to drive thin clients, SMP servers, and tight clusters into a very small niche. If so, who wins?
Linux isn’t ready to take on NT in all areas, but Linux improves on a daily basis. Microsoft’s traditional approach to dealing with the competition won’t work with Linux: Microsoft can’t bully Linux out of the market, nor can Microsoft defeat it in a commercial sense. In all areas, NT must increase its technical lead or catch up to Linux. I’m not sure that Microsoft is up to the task. Microsoft needs to reinvent itself in ways that are counter to its corporate culture. Regardless, Linux isn’t going to supplant NT overnight. The fight will be a long one, and inevitably all of us will win.
--David H. Lynch Jr.

David H. Lynch Jr. August 09, 1999


 
 Windows IT Pro is a Division of Penton Media Inc.
 © 2009 Penton Media, Inc. Terms of Use | Privacy Statement