Troubleshoot IIS problems

Last month, I introduced you to heap corruption and the PageHeap functionality of ntdll.dll. This month, I show you how to use Windows Debugger (WinDbg) and PageHeap to troubleshoot heap corruption. You set up PageHeap differently in Windows 2000 and Windows NT, and I address the setup in each OS separately.

Preliminaries
Be sure to set up heap troubleshooting for the first time on a test machine to familiarize yourself with the process. Some changes can adversely affect Windows if you don't make them properly. When you're finished troubleshooting the problem, be sure to reset the system to its previous state. The settings I show you how to enable use a lot of resources and affect computer performance.

Because of the random nature of heap corruption problems, I've written a simple .dll file called debug.dll that doesn't cause a crash by itself in a test environment. (If you use debug.dll in a production environment under load, however, it will probably cause a crash.) Rather, this .dll file shows you how the PageHeap functionality of ntdll.dll lets you track down a problem after you think that a program has heap corruption. (Note that these techniques let you debug any DLL or component on your server.)

Before you can begin test debugging, you need to run debug.dll. Download it from the Code Library on the IIS Administrator Web site (http://www.iisadministrator.com) to your Default Web Site's Scripts directory (or the directory you use to run scripts). Attach WinDbg to IIS. (For information about attaching WinDbg to IIS, see "Using the Windbg Debugging Tool," July 2001.) Now, run the program by opening a browser and typing the URL

http://localhost/scripts/debug.dll?overrun

You probably won't see anything unusual after entering this URL. But after you set up the heap-corruption tools, I'll instruct you to enter the URL again, and you'll see how the results differ. For now, select Break from the Debug menu to break into the program you're debugging. Then, select Debug, Stop Debugging. Click Yes when the Save base workspace information dialog box appears.

Ntdll.dll and Memory Allocation
The heap-corruption tools you're using are WinDbg and the PageHeap functionality in NT's memory manager routine, ntdll.dll. (Note that pageheap.exe is simply a tool that sets global flags to enable the PageHeap functionality in ntdll.dll.) When a request for memory comes in, ntdll.dll grants the requesting program as many bytes of memory as the program requests, usually beginning with the next available space following the previous request's allocation. When you enable PageHeap, ntdll.dll allocates memory on a much grander scale.

Ntdll.dll breaks memory down into pages. (A page of memory is 4KB, or 4096 bytes.) When a program requests 200 bytes of memory, then another 200 bytes, then another 500 bytes, all three of these requests fit into one page of memory with a fair amount of memory still available (3196 bytes, to be exact). However, with PageHeap enabled, each request receives its own page of memory plus an extra page. Ntdll.dll generates the pointer for the memory by starting at the end of the first page and counting backward by the number of bytes requested, as Figure 1 shows. Ntdll.dll marks the second page of memory as no access. Thus, ntdll.dll gives the program a boundary so that if the program tries to write even 1 byte past the amount of memory it requested, the program hits the second page and immediately generates an Access Violation error.
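The arithmetic behind this layout can be sketched in a few lines of C. This is only an illustration of the numbers in the paragraph above, not ntdll.dll's actual code: the function names are hypothetical, the 4096-byte page size is the one the text assumes, and the sketch ignores allocator bookkeeping and any alignment rounding that the real heap manager might apply.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4KB pages, as stated in the text. */
#define PAGE_SIZE 4096u

/* Bytes left in one page after packing the requests back to back
   (the normal case, with PageHeap disabled). */
size_t bytes_left_in_page(const size_t *requests, size_t count)
{
    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        used += requests[i];
    }
    assert(used <= PAGE_SIZE);
    return PAGE_SIZE - used;
}

/* Offset, from the start of the first page, of the pointer returned
   when the block is pushed against the end of that page. */
size_t pageheap_offset(size_t request)
{
    assert(request <= PAGE_SIZE);
    return PAGE_SIZE - request;
}

/* Nonzero if writing `length` bytes at the returned pointer would
   reach into the no-access page that follows. */
int write_hits_guard_page(size_t request, size_t length)
{
    return pageheap_offset(request) + length > PAGE_SIZE;
}
```

For the three requests in the text (200, 200, and 500 bytes), bytes_left_in_page returns 3196. For an 8-byte request, write_hits_guard_page reports that an 8-byte write stays inside the first page but a 9-byte write crosses into the no-access page.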

PageHeap for Win2K
Unlike NT, Win2K has the PageHeap functionality built in. To set up PageHeap on a Win2K test system, follow these steps:

  1. Increase the size of the memory pagefile to at least 512MB or double its previous size, whichever is larger. To increase the memory size, right-click My Computer, select Properties, then click the Advanced tab. Click Performance, then modify the size of the paging file.


  2. Run gflags.exe from the debugging directory you chose when you installed WinDbg.


  3. In the Global Flags dialog box, which Figure 2 shows, enter the name of the process that you want to monitor in the Image File Name text box. This process will be either inetinfo.exe (if you want to monitor in-process IIS) or dllhost.exe (if you want to monitor all out-of-process applications).


  4. Select the Image File Options option.


  5. Select the Enable heap tail checking, Enable heap free checking, Enable heap parameter checking, Enable heap validation on call, Disable heap coalesce on free, Enable page heap, and Enable heap tagging check boxes. Click Apply, then click OK. (Note that you must click Apply before you click OK.)


  6. Open a command prompt and type

  net stop iisadmin /y

  to stop IIS. Then, restart your IIS services. (You can also use the Iisreset command to stop and restart IIS.)

PageHeap for NT
You must perform a few extra steps to use PageHeap in NT. To do so, follow these steps:

  1. Increase the size of the memory pagefile to at least 512MB or double its previous size, whichever is larger. To increase the memory size, right-click My Computer, select Properties, then click Performance and change the paging file size.


  2. Run gflags.exe from the debugging directory you chose when you installed WinDbg.


  3. In the Global Flags dialog box, which Figure 2 shows, enter the name of the process that you want to monitor in the Image File Name text box. This process will be either inetinfo.exe (if you want to monitor in-process IIS) or mtx.exe (if you want to monitor all out-of-process applications).


  4. Select the Image File Options option.


  5. Select the Enable heap tail checking, Enable heap free checking, Enable heap parameter checking, Enable heap validation on call, Disable heap coalesce on free, Enable page heap, and Enable heap tagging check boxes.


  6. In the Image Debugger Options section, select the Debugger check box, then enter the path to your WinDbg executable with the -g parameter (e.g., C:\debuggers\windbg -g).


  7. Click Apply, then click OK.


  8. For debugging inetinfo.exe (or any other service), be sure to set the service to interact with the desktop by


  • opening the Control Panel Administrative Tools applet, then double-clicking Services
  • double-clicking the service name (For IIS, the name is IIS Admin Service.)
  • clicking the Logon tab, then selecting the Allow service to interact with Desktop check box

To debug mtx.exe, you need to change the package identity to run under interactive user by

  • opening Internet Service Manager (ISM)


  • expanding Microsoft Transaction Server, Computers, My Computer, Packages Installed


  • right-clicking the package you want to debug, then selecting Properties (You need to perform these steps for the system package, too.)


  • under Identity, choosing the Interactive User option


  9. Open a command prompt and type

  net stop iisadmin /y

  to stop IIS. Then, restart your IIS services. (You can also use the Iisreset command to stop and restart IIS.)

Setup Complete
Your system is now set up to troubleshoot heap corruption. In Win2K, reattach WinDbg to IIS. In NT, the debugger is automatically reattached when you start the service. Open a browser, and enter the URL

http://localhost/scripts/debug.dll?overrun

again. This time, you'll see an Access Violation error. Type

kb

in WinDbg's prompt window to perform a Stack Backtrace. A stack similar to the one that Figure 3 shows appears. Notice that the Access Violation error is in Simple!strcat, which is the C runtime library's string-concatenation routine. This routine was called by the Simple!OverrunRequest function, which is the function you want to examine.

To look at the code, open the simple.cpp file and go to the OverrunRequest function, which is near the end of the file. Notice that memory was allocated for a pointer called pszString. HeapAlloc is the API that allocated this memory. The last parameter I passed into HeapAlloc was the amount of memory that the API should allocate. In this example, I used the strlen function to generate that amount. Strlen determined that the length of the string Overrun2 (i.e., the string stored in the Buf variable) was 8 bytes. Therefore, I requested 8 bytes for pszString. At first glance, that amount appears to be enough memory for the string, but I should also have allocated a ninth byte for the null terminator. (For information about null terminators, see "Heap Corruption, Part 1," November 2001.) I didn't, so when the program copied the string into the memory allocated for pszString, the program wrote 1 byte past the boundary that ntdll.dll set. When you ran the URL before enabling PageHeap, nothing happened because ntdll.dll doesn't police the memory. However, after enabling PageHeap, the string copy generated an Access Violation error when the write occurred, thus letting you see the problem when it occurred rather than when it caused another problem, such as the divide-by-zero error in last month's example.
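The off-by-one sizing error is easy to see in a small, portable C sketch. This is not the code from simple.cpp: the function names are hypothetical, and malloc and memcpy stand in for the HeapAlloc and strcat calls the article describes so that the example compiles on any platform.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The size the example requested: the characters only. */
size_t undersized_request(const char *text)
{
    return strlen(text);
}

/* The size that is actually needed: the characters plus the
   1-byte null terminator. */
size_t correct_request(const char *text)
{
    return strlen(text) + 1;
}

/* Allocates a correctly sized buffer and copies the string,
   terminator included. The caller frees the result. */
char *copy_string(const char *text)
{
    assert(text != NULL);
    size_t size = correct_request(text);
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, text, size);
    return copy;
}
```

For the string Overrun2, undersized_request returns 8 and correct_request returns 9. That single missing byte is the write that lands on the no-access page when PageHeap is enabled.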

Debugging Corruption in the Production World
For this example, you used a debugger attached directly to the process. However, in the production world, such a setup isn't practical because servers are usually isolated and have strict requirements as to who can access them. To debug heap corruption in the production world on Win2K systems, you need to use ADPlus.vbs, which comes with the debugging tools you've already installed. (Creating a dump file on an NT system is a little more complex, and I don't cover it in this article.) For more information about ADPlus.vbs, see the Microsoft article "HOW TO: Use Autodump+ to Troubleshoot 'Hangs' and 'Crashes'" (http://support.microsoft.com/support/kb/articles/q286/3/50.asp).

ADPlus.vbs lets you generate a snapshot of your system and analyze it offline. When you've followed the steps I outlined previously to properly set up your server, open a command prompt, then enter the syntax

adplus.vbs -crash -iis -pageheap

to start ADPlus.vbs. The -pageheap option tells ADPlus.vbs to generate a full dump on a first-chance debug break exception (i.e., an Access Violation error in the case of heap corruption) instead of the usual minidump (i.e., a small and incomplete snapshot of the process you're monitoring). Enabling this option lets you catch heap corruption problems as soon as they occur.