Troubleshooting IIS problems
So far in this troubleshooting series, I've shown you how to examine the results of a crash, but I haven't yet shown you how to delve into what can cause the crash. This month, I show you how heap corruption can make pointers go bad, overwrite critical data, or cause loops and hang systems.
What Is Heap Corruption?
What is heap corruption? Simply put, heap corruption is the circumstance under which misbehaving code corrupts the data heap. (The data heap is a block of memory that the OS sets aside for an application to hold its data in.) To better understand this corruption, let's first revisit how a multithreaded OS and application work.
Windows is both cooperative and preemptive. To implement the cooperative part, applications use synchronization objects that the OS provides. (For more information about synchronization objects, see "Debugging IIS Deadlocks and Blockings," October 2001.) To implement the preemptive part, Windows uses a thread scheduler and a complex set of algorithms. (For in-depth information about thread execution times, see the "Thread Scheduling" section, Chapter 4, David A. Solomon, Inside Microsoft Windows NT, 2nd edition, Microsoft Press, 1998.) When you're working with heap corruption, you must understand both of these concepts.
Memory Allocation for Thread Use
Let's look at a situation in which two threads are running independently and one thread corrupts data that belongs to the other. Thread 1 is processing a request from a client, so that thread requests memory from the heap. Ntdll.dll is responsible for handling this memory allocation; it looks at the heap, determines the best location to give the thread, and passes back a pointer for the memory to Thread 1. The pseudocode that Figure 1 shows makes this request by calling ntdll.dll. Now Thread 1 has some memory it can use. Figure 2 shows the memory block in the heap.
When the time slice for Thread 1 is finished, Windows stops the thread's execution and determines that Thread 2 is next in line. Thread 2 starts and determines that it also needs memory from the heap, so it requests three pieces of memory that it will use to perform a math division routine. For the sake of this example, assume that Thread 2 stores its numbers as characters and converts them to numbers when it does the math. (Note that although this concept might seem strange, the practice is fairly common and has many uses and benefits.) Ntdll.dll looks at the heap and determines that the next available memory spot is at 15, so it starts giving out memory to Thread 2 from that spot. Figure 3 shows the pseudocode that calls ntdll.dll. Now Thread 2 has memory. Figure 4 shows the heap at this point.
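To make the "next available memory spot" idea concrete, here is a minimal sketch in C of a toy allocator that hands out memory front to back from one array. It is an assumption-laden model, not how ntdll.dll works: the real heap manager tracks free blocks and keeps its own bookkeeping data. The names toy_heap, next_free, and toy_alloc, and the 64-byte size, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HEAP_SIZE 64                 /* arbitrary size for the illustration */

static char   toy_heap[TOY_HEAP_SIZE];   /* stands in for the process heap */
static size_t next_free = 0;             /* offset of the next available spot */

/* Hand out `size` bytes starting at the next available spot.
   Returns a pointer into toy_heap, or NULL if not enough room is left. */
char *toy_alloc(size_t size)
{
    assert(size > 0);
    if (size > TOY_HEAP_SIZE - next_free) {
        return NULL;                     /* the toy heap is exhausted */
    }
    char *block = toy_heap + next_free;
    next_free += size;                   /* the following request starts here */
    return block;
}
```

In this model, a 15-byte request for Thread 1 returns the block at offset 0, and the very next request, whatever its size, starts at offset 15, which matches the layout the example describes.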
Thread 2 now decides to assign values based on the numbers that were passed in to two of the three character variables in the pseudocode in Figure 3:
a = number1 (converted to a character)
b = number2 (converted to a character)
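The article doesn't say how Thread 2 performs the conversion, so the following C sketch is only one plausible way to store a number as characters and turn it back into a number for the division. The helper names store_as_characters and load_as_number are hypothetical; snprintf and strtol are standard C library calls.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Store `number` as decimal characters in `dst`, null terminated.
   Returns 0 on success, -1 if `dst` is too small to hold the digits. */
int store_as_characters(char *dst, size_t dst_size, int number)
{
    assert(dst != NULL && dst_size > 0);
    int written = snprintf(dst, dst_size, "%d", number);
    return (written >= 0 && (size_t)written < dst_size) ? 0 : -1;
}

/* Convert the stored characters back to a number for the math. */
long load_as_number(const char *src)
{
    assert(src != NULL);
    return strtol(src, NULL, 10);
}
```

Note that the conversion back to a number depends entirely on the characters still being intact when the thread reads them.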
Figure 5 shows the heap after the variable assignment. Windows determines that Thread 2's time slice is finished, so Windows stops Thread 2's execution and lets Thread 1 start again. Thread 1 picks up where it left off. Because Thread 1 has the memory it requested, it starts copying the request string, "Our string here," into this memory.
String Storage in Memory
At this point, you must understand how the OS stores most strings in memory. You might have heard of null-terminated strings: a null-terminated string is a string in memory whose last character has an ASCII value of zero (referred to as a null character). Note that this character isn't the printed 0, which has an ASCII decimal value of 48. When Windows reads a string, it starts by reading the first character of the string (i.e., the character that the pointer points to), then reads each subsequent character until it finds a null character. Windows then knows to stop reading.
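The read-until-null rule is easy to see in a few lines of C. This is a minimal sketch that does the same job as the standard strlen function; the name count_until_null is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters before the terminating null, walking memory
   from the pointer onward the way a string-reading routine does. */
size_t count_until_null(const char *s)
{
    assert(s != NULL);
    size_t length = 0;
    while (s[length] != '\0') {   /* '\0' is the null character, value 0 */
        length++;
    }
    return length;
}
```

Nothing in the loop knows how much memory was allocated for the string. If the null character is missing, the loop simply keeps reading whatever bytes come next.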
Thread 1 copies the request string to the memory that X points to. This copy routine knows that a null-terminated string is involved and automatically tacks on a null character at the end of the string. The string is 15 characters long, and 15 bytes are allocated for it, so the string plus its null character occupies 16 bytes. Figure 6 shows the memory block after the copy.
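The byte counts can be checked with a small C model. This sketch assumes that Thread 1's block occupies offsets 0 through 14 and that Thread 2's first block starts at offset 15, as in the example. A single array stands in for the heap so that every write stays inside memory the program owns; in a real program, writing past the end of an allocated block is undefined behavior. The function name and the value passed in as a_value are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define TOY_HEAP_SIZE 32   /* arbitrary size for the illustration */
#define X_OFFSET       0   /* Thread 1's block: offsets 0 through 14 */
#define X_SIZE        15
#define A_OFFSET      15   /* Thread 2's first block starts at 15 */

/* Copy the request string into Thread 1's block inside `heap` and
   return the byte that ends up at the start of Thread 2's block. */
char copy_request_and_peek(char heap[TOY_HEAP_SIZE], char a_value)
{
    const char *request = "Our string here";   /* 15 characters */
    assert(strlen(request) == X_SIZE);

    heap[A_OFFSET] = a_value;           /* what Thread 2 stored earlier */
    strcpy(heap + X_OFFSET, request);   /* writes 15 characters plus '\0' */
    return heap[A_OFFSET];
}
```

Whatever character is passed in as a_value, the function returns the null character: the sixteenth byte that the copy writes lands at offset 15, just past the 15 bytes set aside for Thread 1.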
Thread 1 now parses the string, completes its work, then completes execution. Windows terminates the thread because the thread is finished. The memory is released, but it's not reset or overwritten. The memory is simply available for a new request from another thread. Note that only the original 15 bytes allocated for Thread 1 are available for reallocation.