In my previous article, I discussed various types of resource starvation attacks that an attacker might use to bring down your services. One type of attack that deserves a closer look is CPU starvation, where an attacker leverages your mistakes to cause your system to consume all available processing resources. It isnâ€™t enough to write robust code that doesnâ€™t fail, you must also write efficient code that consumes CPU and memory sparingly. The latest machines might have an astronomical amount of memory and fast processors, but thatâ€™s no excuse for being careless about how you use those resources. The example code in Listing 1 reinforces this need to be spartan while still getting the job done.
Size of Input vs. Operation Speed
Before we dive into the details of the code, letâ€™s look at how we typically describe the relationship between the size of input and the speed of an operation. If an operation always takes the same amount of time, it is order 0 (usually written as O(0)). If the amount of time grows linearly with the size of the input, it is order n (O(n)). Next, a functionâ€™s speed can vary with (n)log(n), and your worst case scenario is n2.
When you optimize your code, you need to understand what your library calls do at the code level. Source code to the C runtime libraries comes with Visual C++, and I strongly suggest that you look at some of your favorite functions to understand what they do. If in doubt, write a test application to perform an operation repeatedly and benchmark different approaches. For example,
snprintf( buf, sizeof( buf) -1 , "%s%s", prefix, ending );
might look elegant and be easy to read, but
strncat( buf, ending, sizeof( buf ) â€“ 1 â€“ strlen( prefix ) );
is actually much more efficient because the format parsing code underlying any of the printf() family of functions performs a lot of work. I prefer to use the first form when it isnâ€™t in performance-critical code because it's easy to maintain, but drop to the second form if the code needs to perform an operation often.
The RightWay() and the WrongWay()
The code in Listing 1 evaluates two functions: RightWay() and WrongWay(). Both functions take an input string, copy it to an output buffer, and replace any instances of duplicate \ characters with single \ characters. Note that each function is the same except for one line of code. The code first checks to see whether weâ€™ve encountered the end of the input string or if weâ€™ve reached the end of our output buffer. Next, the code checks to see whether this character and the next are backslashes, and if so, increments our pointer and continues. If this character and the next arenâ€™t both backslashes, the code writes them out to the buffer and continues.
The WrongWay() function uses strcpy() to write the characters to the buffer and makes the mistake of assuming that the code calling this function has allocated an output buffer the same size as the input buffer. As the size of the input string grows, strcpy() repeatedly writes the entire output buffer, just to deliver one character. This mistake makes WrongWay() an n2 function. The RightWay() function just writes one character at a time and does a lot less work in the process. It is also more robust because it writes the characters into the output buffer only until it fills the buffer.
Next, let's turn our attention to what's going on in main(). First, we allocate two buffers to hold both the input and output strings. Second, we need to build the test strings (e.g., c:\\a\\a\\â€¦\\a\\foo.txt). The for() loop that builds the test strings is also written to be efficientâ€”one sure sign of poor programming is the use of strcat() in a loop, which builds a very long test string. Finally, we use GetTickCount() to time the functions. I have to point out that a bug exists in this method: GetTickCount() returns the number of milliseconds because the system started and stores this information in a DWORD. If your system has been up for exactly 49 days, 17 hours, 2 minutes, 47.295 seconds, there will be a rollover error as you hit 232 milliseconds of uptime. When I use this function in production code, I always make sure to account for this error.
Take a look at the results in Table 1. As long as the input string is fairly short, you never notice that a problem exists. However, as soon as the input string exceeds 1000 characters, a difference becomes apparent. At 10,000 characters, it takes the WrongWay() function more than 200 times as long as the RightWay() function to do the same amount of work. Note that I performed these tests on an older 200MHz P6-200 system, so you will probably get much faster times when you try this yourself. I also tried running up to 100,000 characters, but it took so long that I didnâ€™t want to wait on it.
If the error I've demonstrated in the example code were present in a network service, several requests delivered simultaneously could cripple a server for an extended period. This example shows how just one line of code can make the difference between a service that stands up to attackers and one that will fail. The most important thing to remember is to plan not for what you expect to happen, but for the worst thing that can happen.