We’ve all heard of exploitable buffer overruns: programming errors that let attackers feed your application strange strings that cause it to suddenly do anything the attacker wants. This simple vulnerability accounts for most exploitable security issues, and even though security experts have known about the problem for a long time, new instances crop up all the time. String-handling errors cause most, though not all, of these overruns. (For more information about string handling, see my previous article, "Avoiding Buffer Overruns with String Safety.")
Although several explanations of buffer overruns exist on the Web (see the sidebar, "Buffer Overrun References"), most tend to be very technical and often show you how to exploit a particular application. Some of these articles come complete with code that lets attackers do something they consider useful. Instead of going into that much depth, I’m going to demonstrate the problem with an example program that you can step through to see how it works. This article isn't meant to teach you how to attack people, but instead to teach you how an attack works and what the consequences are of writing insecure code.
Consider the code in Listing 1. Start by looking at main(). Notice that the example uses an unsafe call, gets(), to fill a fixed-size buffer, which means that two overruns are possible in this application. I’ll leave the task of exploiting the second one as an exercise for you. I left in this unsafe call because I needed some odd characters in my input stream to get the example to work, and feeding a file to the application via stdin was easier than typing possibly unprintable characters at the command line (although a Perl script could have overcome this problem). In real code, you never want to use gets(), which has no way of knowing how large its destination is. fgets() is much safer and won’t overflow your buffer as long as you pass it the buffer’s true size.
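To make the gets()/fgets() contrast concrete, here is a minimal sketch of a bounded line reader. It is not part of Listing 1; the helper name read_line and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from Listing 1): reads one line from `in`
 * into `dst`, which holds `size` bytes.
 * Returns 0 on success, -1 on end-of-file or read error. */
int read_line(FILE *in, char *dst, size_t size)
{
    assert(in != NULL && dst != NULL && size > 0);

    /* fgets() stores at most size - 1 characters plus a terminating
     * null, so it cannot write past the end of dst. */
    if (fgets(dst, (int)size, in) == NULL) {
        dst[0] = '\0';
        return -1;
    }

    /* Strip the trailing newline if the whole line fit. */
    dst[strcspn(dst, "\n")] = '\0';
    return 0;
}
```

A call such as read_line(stdin, buf, sizeof buf) truncates an overlong line instead of overflowing buf; the unread remainder stays in the stream for the next call.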
The first function that main() calls is foo(), and you feed this function the string you obtained from the user. The foo() function takes this string and, without checking whether it will fit, tries to stuff it into the fixed-size buffer buf. Then main() calls bar(), and you're done.
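For comparison, the check that foo() skips can be sketched in a few lines. This is an illustrative alternative, not the code from Listing 1; the name copy_checked and the choice to reject oversized input (rather than truncate it) are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement for an unchecked string copy.
 * Copies src into dst only if it fits, terminating null included.
 * Returns 0 on success, -1 if src is too long (dst is left empty). */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';          /* refuse oversized input outright */
        return -1;
    }
    memcpy(dst, src, len + 1);  /* +1 copies the terminating null */
    return 0;
}
```

With a check like this in place, a caller that passes sizeof buf as dst_size gets an error code for an overlong string, and nothing beyond the buffer is ever written.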
If you’re following along, you can set up your compiler to illustrate the problem, assuming you’re using Visual C++ 6.0. If you’re using something else, you can still compile this code, but you’ll have to tweak the input file a little. Open a new console mode project, include overflow.c (above), and set the active configuration of the project to a release build rather than the debug build, which has traps for this sort of problem. Under the Project menu, choose Settings, and select the Link tab. Enable the Generate debug info option in the middle of the page. The debugging information that the compiler generates will be useful when we step through the assembly code. As an aside, I’m not a very good assembler programmer, but I suggest that all programmers at least learn to read assembly code because you can hit problems that only stepping through the code will reveal.
Now, create a file called normal, and put just the word "ok" into it. Save the file, and run your application from the command line. The output will look like the following:
bar runs with arg = bar ran from main
Now, make a file called test, and put the following characters into it:
This string uses ascending values so that you know exactly where the overflow rewrites the return address—a trick I learned from David Litchfield, whose work and security products you can find at http://www.cerberus-infosec.co.uk. He also has two excellent papers on buffer overruns posted at his site—see the references for details. Run your application again with the new file, and notice that the output this time is different:
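Now I have a pop-up window that tells me that the program could not read the instruction at 0x46454443, and asks me whether I want to debug or terminate the application. The value 0x46454443 is interesting because the individual bytes equate to the characters FEDC. The order looks reversed because x86 processors are little-endian: the bytes C, D, E, F sit in memory in that order, and the processor treats the first one as the least significant byte of the address. Now, go into your test file and delete everything after the "E" character, and run it again. You get the same output, but the error is now at 0x00454443, most likely because the null that terminates the shortened string now lands in the top byte. Notice that you've changed the address from which the program tries to fetch its next instruction.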
The next step is to determine just where you want the program to go. If you were an attacker, you’d be looking into where various system calls link in and whether you have enough room in the buffer to insert assembly instructions. Because this is a benign example, just make the program jump to the bar() function. To find out where the bar() function is in memory, set a breakpoint at the beginning of main() and start the program (F5). Visual C++ will immediately complain, telling you that it has disabled the breakpoint and that execution will start at the beginning of the application, which is what you want. Now that you've loaded your application in the debugger, you can see the assembly code, annotated with the debugging symbols. Scroll up until you find the _bar: label. In my build, the instruction you want is located at 0x00401030.
Stop the execution of the program, and open your test file using Visual C++. Be sure you tell the program to open the file as binary (this option is at the bottom of Visual C++’s File, Open dialog box). The target address 0x00401030 has to go into the file low byte first. Highlight the last letter on the binary side ("E" or 0x45), and change that letter to 0x40 (@). Now, change the "D" character to 0x10 (a nonprintable control character) and the "C" to 0x30 (0). Save the file, and run the application again. The output is now
bar runs with arg = Calling foo
The application blows up because of an illegal memory reference, but note that it has already run the code you wanted. If you were really going to exploit this application, you’d be concerned with manipulating the arguments passed and other technical details. Although this example illustrates the potential problems you can encounter with buffer overruns, you can use good coding practices to avoid these situations. Next time, we’ll examine the security implications of a simple POP3 server.