The popular open-source Web server extends its Windows capabilities
Web servers have become one of the fundamental building blocks of the business world's IT infrastructure. The two big names in today's Web server market are Microsoft IIS, which has long dominated the Windows Server market, and the Apache HTTP Server, which has been the favorite for other OS implementations, most notably UNIX.
In recent years, IIS has come under fire for various reasons, especially for security concerns. But what other viable Web server options have Windows shops had? Apache, which the Apache Group manages, is a powerful, sophisticated, and mature Web server that many regard as a crowning achievement of the open-source community. And many companies that use Apache in their products—for example, IBM, which uses Apache in its WebSphere product—actively contribute to and support Apache, so the Web server continues to grow and adapt to the fast-changing business environment. However, the product's use on Windows platforms has been limited—until now.
With the release of Apache 2.0, this popular, highly reliable and scalable Web server has increased its portability and performance on the Windows platform and improved extensibility across all platforms. The new version's improvements mean that you can use Apache's power to your advantage, even if your organization runs Windows 2000 or Windows NT.
Improved Portability and Performance
The Apache Group's original goals were security and stability. In Apache 1.3, the development team created a secure, reliable, and highly available server that could scale under heavy loads. That version has long been the Web server of choice on just about every non-Windows platform (e.g., Linux, OS/2, Sun Microsystems' Solaris) and even runs on Win2K, NT, and Windows 98. However, Apache 1.3's high level of scalability doesn't extend to the product's Windows port, so Apache hasn't been a true competitor of IIS in organizations that run Windows. The Apache Group decided a new design was necessary to deal with key concerns across all platforms, most notably code complexity (which would have made Apache increasingly difficult to improve and fix over time) and performance.
The Apache Group's efforts to support more platforms originally came in the form of conditional compilation, which means that only certain code is compiled into a program, depending on whether a stated condition (e.g., the OS in use is Solaris) is true. As the number of supported platforms expanded and variations among supported platforms increased, the Apache code base rapidly grew extremely complex and the use of conditional compilation quickly became a hindrance to development.
Performance was another factor that the Apache Group needed to deal with. Apache relied on some basic assumptions about how to treat CPUs, I/O, memory, and processes. This reliance wasn't a problem on most UNIX systems because Apache was developed on and primarily for that platform. However, the architectural model around which Apache was built didn't work well on non-UNIX OSs and even on some types of UNIX, so these assumptions led to scalability problems on those platforms.
The Apache Group decided to deal with both concerns in one fell swoop, building Apache 2.0 on an abstract layer and tuning that layer's implementation to the underlying platform. As a result, Apache 2.0 provides major improvements in both portability and performance.
One of the abstractions now built into Apache 2.0 is the Apache Portable Runtime (APR). As Figure 1 shows, the APR serves as an interface between the OS and the Apache HTTP Server and handles system services such as I/O, shared memory, and child-process management. The APR, however, doesn't affect the model that Apache uses when handling concurrent client connections, which was Apache's primary performance problem on Windows. Earlier Apache versions assume that the underlying OS can efficiently create child processes. But some OSs, such as Windows and IBM AIX, better support multithreading, so using child processes to handle client-network connections isn't an optimal solution on those platforms. Instead, Apache needed to use a thread to handle each connection. (Multithreading isn't as resilient to errors as using multiple processes. An errant thread can cause problems for other threads, whereas in a multiple-process design, each process is essentially independent. However, Windows servers use the multithreading model, so a certain, albeit minimal, trade-off of reliability for the sake of performance was necessary.)
The Apache Group's solution, Multi-Processing Modules (MPMs), is both simple in theory and elegant in implementation. MPMs create an abstraction between Apache and the method used to handle concurrent client connections. The idea is that the Apache Group (or a developer who wants to support a new platform) builds an MPM specific to a given architecture or feature specification (e.g., using child processes, multithreading, or the combination that best fits a particular user's needs). During the configuration process, the Apache user chooses the MPM that best suits the user's environment, although by default the configuration file that comes with each port of Apache uses the MPM specific to that platform. Apache then relies on the MPM to handle the low-level details of handling concurrent client connections. For Windows systems, Apache uses the winnt MPM. As Figure 2 shows, this MPM works by having the Apache server create a child process, which then creates and controls the necessary threads to service each client connection. (A parent process is necessary in this case to create a new child process if the existing child process dies.)
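To make the MPM idea more concrete, the following sketch shows how one httpd.conf might carry tuning blocks for several MPMs, with Apache applying only the block whose MPM is built into the running server. The prefork and worker blocks apply to UNIX builds, and every value shown is an illustrative assumption rather than a recommendation.

```apache
# Process-per-connection model (UNIX prefork MPM).
<IfModule prefork.c>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
</IfModule>

# Hybrid model: several processes, each with several threads (UNIX worker MPM).
<IfModule worker.c>
    StartServers          2
    MaxClients          150
    ThreadsPerChild      25
</IfModule>

# One child process that services every connection with threads (Windows winnt MPM).
<IfModule mpm_winnt.c>
    ThreadsPerChild     250
</IfModule>
```

On a Windows server, only the last block would take effect; the other two are skipped because their MPMs aren't part of the Windows build.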
Apache 1.3 was already extensible because of the server's reliance on modules, a type of Apache add-on component. A module is simply software that uses the open Apache Module API. Despite the power of the existing Module API, the Apache Group developed a new and improved API for module writers and a new framework from which that API operates. Although this change means that modules written for earlier Apache versions don't work with Apache 2.0, the change increases Apache modules' abilities in several fundamental ways, such as providing a powerful new filtering capability. During configuration, you can use this filtering capability to layer modules and create complex systems. Even better, filtering can simplify the way that modules interact with Apache. For example, under earlier Apache versions, the only way to add Secure Sockets Layer (SSL) support was to modify Apache itself. Under Apache 2.0, however, SSL is simply an add-on module. Table 1 lists several Apache 2.0 modules and their purposes. (To view a list of Apache modules available with the Apache installation, examine the LoadModule directives in httpd.conf and the associated module binaries in \program files\apache group\apache2\modules. To view a more comprehensive list, visit the Apache Web site at http://httpd.apache.org/docs-2.0/mod.)
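As a rough illustration of layering modules through filters, the sketch below chains two output filters so that a response passes through server-side includes processing first and compression second. It assumes that mod_include and mod_deflate are both present in your Apache build (not every binary package ships mod_deflate), and the directory path is hypothetical.

```apache
# Load the two filter-providing modules (assumed to exist in the modules directory).
LoadModule include_module modules/mod_include.so
LoadModule deflate_module modules/mod_deflate.so

# Hypothetical content directory; substitute your own document root.
<Directory "D:/WebRoot">
    # Permit server-side includes in this directory.
    Options +Includes
    # Filters run in the order listed: INCLUDES first, then DEFLATE.
    SetOutputFilter INCLUDES;DEFLATE
</Directory>
```

Neither module needs to know about the other; each simply processes the output that the previous filter hands to it.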
Note the absence of an Active Server Pages (ASP) module in Table 1. I intentionally left ASP out of the list because at the time of this writing, Apache doesn't offer full support for ASP. Fortunately, a project is in the works within the Apache community to create and maintain an ASP module that can compete with Sun ONE Active Server Pages (formerly Sun Chili!Soft ASP) for UNIX. Although this new ASP module, which is implemented through the Apache mod_perl module, doesn't offer the same breadth of features as IIS, the module is maturing slowly. For more information about the ASP module's development, you can visit http://www
As you can see, Apache 2.0 offers Windows users enough performance, flexibility, and power to rival IIS. The next step is to install Apache on a test system and see for yourself.
The first step in installing Apache is to download the most recent product software from http://www.apache.org/dist/httpd/binaries/win32. Notice that the Windows package doesn't include SSL support. If you want SSL support, and many sites do, you need to download OpenSSL (an open-source SSL library implementation) from http://www.openssl.org, then use Microsoft Visual C++ (VC++) to compile OpenSSL and mod_ssl. More information about this process is available at http://httpd.apache.org/docs-2.0/mod/mod_ssl.html.
To install the Apache software, first remove or disable any existing IIS installation, then click the downloaded Microsoft Installer package for Apache, agree to the Apache license, and enter your Web site's domain name, your Web server's name, and the Webmaster's email address. The Installer will provide some default suggestions, which you can change if necessary.
Next, you receive a prompt to choose the Typical or Custom installation. Little difference exists between these installations. The Custom installation lets you selectively install the required programming headers and libraries to compile new modules, so if you plan to compile modules later, you need to install these libraries. Otherwise, choose Typical.
Apache lets you use the default installation path for the Apache programs, or you can choose an alternate path. Note that you can later change the DocumentRoot path (known as the WebRoot path in IIS) in the Apache configuration file.
When you click Finish to complete the installation, the Installer creates the Apache2 service, starts the Apache server, and adds the Apache Control icon, which lets you stop and start Apache, to the System Tray. To test whether Apache is running on the Web server, view the URL http://localhost.
To configure Apache, modify the plaintext file httpd.conf in the \program files\apache group\apache2\conf directory. Although this file looks rather complicated, you don't need to concern yourself with most of the included directives (i.e., configuration commands). The file is ready to be used; at most, you might need to tweak a few settings for your environment.
The file's syntax is simple: It permits only one directive per line, and directive names aren't case sensitive, although some directives' arguments are. To comment out a directive, add the pound sign (#), which is the comment character in most UNIX configuration and script files, to the beginning of the line.
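A short fragment can illustrate these syntax rules. The ServerAdmin directive is used here only as a convenient example, and the email address is a placeholder.

```apache
# One directive per line: the directive name first, then its arguments.
ServerAdmin webmaster@example.com

# Directive names aren't case sensitive, so Apache would read the
# following line the same way if it were uncommented:
# serveradmin webmaster@example.com
```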
So what directives should you pay close attention to as a new Apache user? The most important directives are ServerName, DocumentRoot, LoadModule, and—especially important for Windows users—ThreadsPerChild and MaxRequestsPerChild. (For a complete listing of directives, visit http://httpd.apache.org/docs-2.0/mod/directives.html.)
ServerName. The ServerName directive specifies the Web server's DNS name. For example, if you've configured DNS so that your Apache Web server is named www.example.com, configure the ServerName directive as follows:
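Assuming the DNS name used in the example above, the directive would read along these lines:

```apache
# DNS name that the server uses to identify itself.
ServerName www.example.com
```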
DocumentRoot. The DocumentRoot directive specifies the root of the document tree visible to clients—WebRoot in IIS. Thus, if you use the D drive to store your Web site, configure the DocumentRoot directive as follows:
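For a site stored in a folder named WebRoot on the D drive, a minimal sketch might look like the following. The <Directory> block is an assumption about a typical setup: The stock configuration file usually carries such a block for the default document root, and it generally needs to point to the same path as DocumentRoot.

```apache
# Root of the document tree that clients can see.
DocumentRoot "D:/WebRoot"

# Access settings for the same directory (Apache 2.0 syntax).
<Directory "D:/WebRoot">
    Order allow,deny
    Allow from all
</Directory>
```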
The path statement in this directive might appear a bit odd. Apache supports Windows drive letters but requires the use of UNIX-style forward slashes to separate file-path components. Thus, the Windows-style path D:\WebRoot becomes D:/WebRoot. Also, be aware that the Apache2 service runs as SYSTEM, so the location you specify in the DocumentRoot directive needs Read access for the SYSTEM account. (Running as SYSTEM is the default, but you can alter that by changing the Apache service's Log On property.)
LoadModule. The LoadModule directive lets you load modules. (You might not work much with modules at first, but you need to know how to load them and how to stop them from loading.) The directive's syntax is as follows:
LoadModule module_identifier relative_path_to_module_file
where module_identifier is a string that identifies the module to the LoadModule directive and relative_path_to_module_file is the path to the module file, relative to the Apache installation directory. For example, to load the WebDAV module mod_dav, configure the directive as follows:
LoadModule dav_module modules/mod_dav.so
The Apache documentation specifies each module's identifier, but a good rule of thumb is that the identifier is the end of the module name (i.e., the portion following "mod_") followed by "_module." For example, the identifier dav_module corresponds with the module mod_dav. Note that Apache module filenames end with the .so extension, which Linux systems use to denote a shared library file (similar to the .dll extension in Windows systems). Apache uses the extension across platforms for consistency and to minimize conflicts within the Apache documentation.
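The naming rule, and the way to stop a module from loading, can be sketched as follows. The example assumes that the module files named here exist in the modules directory of your installation.

```apache
# Loaded: the identifier dav_module pairs with the file mod_dav.so.
LoadModule dav_module modules/mod_dav.so

# The same rule applied to mod_rewrite.
LoadModule rewrite_module modules/mod_rewrite.so

# Not loaded: the leading pound sign comments out the directive.
#LoadModule ssl_module modules/mod_ssl.so
```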
ThreadsPerChild and MaxRequestsPerChild. The ThreadsPerChild directive specifies how many threads a child process can create. (Each thread can service one connection.) If you set this value to 50, for example, your Web server can handle only 50 client connections at a time. Therefore, ensure that this value is set properly for your environment.
The MaxRequestsPerChild directive specifies how many connections a child process can handle before the Apache parent process kills the child process and creates a new child process. You can use this directive to limit a process's life and thus minimize memory-leak problems (when using an experimental module, for example). You can set the directive to 0 to turn it off so that the child process doesn't terminate automatically after a certain number of connections.
For example, to permit each child process to create as many as 250 threads (so that Apache can handle 250 connections simultaneously) and to prevent the parent process from killing and restarting the child process after a certain number of connections, configure the directives as Listing 1 shows. Note that in this example, the ThreadsPerChild and MaxRequestsPerChild directives are enclosed in an <IfModule> block, in which the <IfModule> directive specifies the winnt MPM. Some directives work only when you've loaded certain modules. To determine whether a module is loaded and to apply applicable directives if it is, Apache lets you use the <IfModule> directive to enclose the directives within a block, as Listing 1 shows. (If you'll always be using the winnt MPM, you can remove the <IfModule> directive to always specify ThreadsPerChild and MaxRequestsPerChild.) Note that you must append the .c extension to the module name. (You might wonder why <IfModule> uses .c and LoadModule uses .so when specifying modules. The <IfModule> directive uses the module name that module developers embed in each module file and that ends in .c, whereas LoadModule loads a file from the file system; this file ends in .so.)
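A block that matches this description might look like the following sketch, which is a reconstruction from the text above and not necessarily identical to Listing 1.

```apache
# Applied only when the winnt MPM is part of the running server.
<IfModule mpm_winnt.c>
    # Up to 250 threads, so up to 250 simultaneous connections.
    ThreadsPerChild     250
    # 0 means the child process is never recycled after a set number of connections.
    MaxRequestsPerChild 0
</IfModule>
```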
Other types of directives let you apply a scope to configuration parameters. For example, when you want to create a virtual host (as the sidebar "Using Apache to Create Virtual Hosts" explains), you can use the <VirtualHost> block to specify directives specific to that virtual host, as Listing 2 shows. (To find out more about applying scopes to a set of directives, visit http://httpd.apache.org/docs-2.0/sections.html.) You might also want to learn more about some of the more arcane but useful directives, such as the mod_rewrite module's RewriteRule directive, which lets you rewrite the URL that a client requests. I suggest that you adjust and test various values in the configuration file as you work with Apache. Apache has long had a reputation for incredible flexibility, so be sure to maximize this flexibility for your benefit.
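The following sketch shows how scoped directives might look for two name-based virtual hosts. It is an illustration, not necessarily identical to Listing 2; the host names, paths, and rewrite pattern are all hypothetical.

```apache
# Tell Apache 2.0 to select a virtual host by the requested host name.
NameVirtualHost *:80

<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot "D:/WebRoot"

    # Applied only if mod_rewrite is loaded.
    <IfModule mod_rewrite.c>
        RewriteEngine On
        # Redirect requests for /old/... to the same path under /new/...
        RewriteRule ^/old/(.*)$ /new/$1 [R=301,L]
    </IfModule>
</VirtualHost>

<VirtualHost *:80>
    ServerName intranet.example.com
    DocumentRoot "D:/IntranetRoot"
</VirtualHost>
```

Directives inside each <VirtualHost> block affect only requests for that host, so the rewrite rule in the first block has no effect on the second site.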
After you make any changes to the Apache configuration file, restart Apache so that the changes take effect. To do so, use the Apache Control icon or the tools provided in the Start menu's Apache program group.
Take the Advantage
Apache is an incredible tool that has proven to be both flexible and reliable, a difficult combination to achieve. And now that the product can scale on any supported platform, you can use this flexibility and reliability to your advantage under Win2K. Test Web sites that use static pages, then progress to CGI scripts and even server-side scripting such as PHP Hypertext Preprocessor (PHP). After you see the many advantages to running Apache on your Windows servers, you'll be hard pressed to find a reason not to use Apache in your production environment.
Corrections to this Article:
- In Dustin Puryear's article "Apache 2.0 on Win2K" (April 2003, http://winnetmag.com, InstantDoc ID 38288), the author says that when installing Apache software you need to "first remove or disable any existing IIS installation." Is this a necessary step, and can Apache and Microsoft IIS coexist?