Welcome to the first in a series of articles about writing secure code for the Win32 API, specifically for Windows 2000 (Win2K) and Windows NT. Although some of the topics also pertain to Windows 9x, these systems don't have the rich set of security features provided by the Windows NT-based line of OSs, so some columns in this series won't apply.
In this column, I'll focus on writing secure code using C and C++ because I'm most familiar with these programming languages. In addition, Microsoft used C and C++ to write Windows 2000 and Windows NT, and you can most easily access some of the OSs' security features using these languages. Finally, these languages are also the ones developers most commonly use to code commercial software.
So why not discuss Java or Visual Basic (VB)? Both languages make writing secure code easier by bounds-checking arrays, and Java also provides automatic garbage collection. To some extent, my choice to focus on C and C++ is
a matter of personal preference: these languages provide the speed and the full access to
system calls that I rely on daily. Unfortunately, one of the reasons C and C++
executables are so fast is that these languages give you plenty of room to shoot
yourself in the foot. A friend of mine once said that if C lets programmers shoot
themselves in the foot, then C++ just gives them a machine gun! My experience is that
poorly written C++ code is often worse than poorly written C code. However, C++ provides
some convenient features and lets you write code that is better organized. I encourage
anyone entering a career as a developer to learn C++ thoroughly. This education is
essential to learning Distributed COM (DCOM), which developers are increasingly using to
interface with Win2K and NT.
Finding Resources
The discussions in this column assume that you have some
familiarity with basic programming techniques. If you're new to
programming in C or C++, I recommend that you start with a good reference. One of the best
titles available is A Book on C: Programming in C by Al Kelley and Ira Pohl (ISBN:
0201183994). Another good reference is C: A Reference Manual by Samuel Harbison and
Guy Steele (ISBN: 0133262243). Ira Pohl has also authored some excellent books on C++, and
if you're learning C++, make sure you spend some time getting to know the Standard
Template Library (STL). STL simplifies many difficult tasks once you get past the
learning curve. If you're already familiar with coding basics, a great reference that will
improve your coding skills is Writing Solid Code by Steve Maguire (ISBN:
1556155514). Even if you've been programming in C for 15 years, this book is worth
reading. Some of the best programmers I know have told me that they've learned from
this text, and it's essential to anyone starting out.
Laziness, Impatience, and Hubris
The first topic I want to cover in this series is that secure code
is really just solid code. The steps you take to make your code more secure are also going
to make your code more robust and reliable. However, in Programming Perl by Larry
Wall, Tom Christiansen, et al. (ISBN: 1565921496), the authors assert that the three
characteristics of a great programmer are laziness, impatience, and hubris. On the face of
it, this assertion seems odd; you would think we should be hardworking, patient, and
humble. Why would the authors say such a thing, and why would I agree with them? Let's
look at all three.
The first characteristic, laziness, makes you go to great effort
to reduce overall energy expenditure. Think of laziness in the long term: writing code
correctly the first time is easier than going back and patching that code, especially if
you've shipped a commercial product. One off-by-one error I made cost my company
dozens of hours because we had to ship a dot release to put out a fix. What's more,
our customers didn't appreciate paying for a product that wasn't working
properly. Be lazy: do it right the first time.
The second characteristic, impatience, is what makes you write
applications that are easy to use. An impatient programmer doesn't put up with
programs that are badly constructed, perform poorly, or are hard to use. The third characteristic,
hubris, is defined as excessive pride. It's what motivates you to write programs that
people will like and code that the programmers who inherit it once you move on won't
say nasty things about. Although all of this is true, some of the worst programmers
are the proudest of their work, and some of the best are quietly humble. Don't let
your desire to write the best code you can blind you to your capacity for mistakes.
Assume that you will make mistakes, and constantly ask yourself how things can go wrong.
If you're dealing with user input, or anything outside of the direct control of your
application, assume that the outside world is bent on your destruction. You're not
being paranoid; they really are out to get you. Get someone else to review your code
in depth, pick it apart, and point out seemingly inconsequential details.
Rules for Writing Solid Code
Some aspects of writing solid code from a security standpoint are
arcane, such as properly changing user context and setting permissions. However, most
real-life security problems stem from code that isn't robust and contains programming
mistakes. The most common security concern results from buffer overruns. Guarding against
these attacks isn't rocket science: exploiting a buffer overrun might take a fairly
skilled hacker, but writing code that prevents these attacks only takes a careful
programmer. Here are some rules I try to follow:
Write code that is easy to read. Use lots of white
space and comments. Code that is easy to read is easy to review, and if you have to come
back to it later, you'll appreciate the notes to yourself. If you have to work on
someone else's code, or someone else has to work on your code, then comments can make
the difference between making the correct changes and breaking something else. Make sure
that the logic flow is easy to follow and well documented. If you can't easily follow the
logic in a piece of code, the code will be more likely to do something unexpected. And
unexpected behavior is often a component of a security problem.
Fail gracefully. I once showed one of my programmers
a bug that passed a null pointer into one of his functions and, as a result, caused the
application to fail. His response was "That's never supposed to happen!"
Unfortunately, it did happen. A function should never cause the entire application to
fail, no matter what the input or how many layers are above the function that should be
validating the input for you. A related practice is to use assert() liberally. The assert()
macro, which is active only in a debug build, validates your assumptions by checking
whether a condition is true and halting the program if it is not. A nice way to use this macro is
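#include <assert.h>    /* assert() */
#include <windows.h>   /* FALSE, NULL */

int foo(char* arg)
{
    if(arg == NULL)
    {
        assert(FALSE);   /* debug build: stop here so the bad caller is visible */
        return 1;        /* release build: report an error to the caller */
    }
    .....
}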
Assert() is active only in a debug build, but because of the explicit check this function still
returns an error code gracefully in release mode. Further, if you're running a debug build, you'll land in the
debugger right where the problem first showed up. Just be aware that you should never use
structured exception handling as a substitute for writing solid code in the first place.
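To show how a caller might consume such an error code, here is a minimal, self-contained sketch of the same pattern. The names GetTextLength, IsShortText, kSuccess, and kErrorInvalidArgument, and the 16-character threshold, are hypothetical and chosen only for illustration; the sketch also uses the standard C++ assert(false) rather than the Win32 FALSE macro so that it compiles without windows.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical error codes, for illustration only.
const int kSuccess = 0;
const int kErrorInvalidArgument = 1;

// Stores the length of text in *length.
// Returns kSuccess, or kErrorInvalidArgument if either pointer is null.
int GetTextLength(const char* text, std::size_t* length)
{
    if (text == NULL || length == NULL)
    {
        assert(false);                 // debug build: stop at the first sign of trouble
        return kErrorInvalidArgument;  // release build: report the error, don't crash
    }
    *length = std::strlen(text);
    return kSuccess;
}

// The caller checks the return code instead of assuming success.
// The 16-character threshold is an arbitrary value for the example.
bool IsShortText(const char* text)
{
    std::size_t length = 0;
    if (GetTextLength(text, &length) != kSuccess)
    {
        return false;
    }
    return length < 16;
}
```

The point of the sketch is that the validation lives inside the function that dereferences the pointer, so a mistake in any layer above it produces an error code rather than a crash.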
Write simple functions. New programmers often make
the mistake of creating functions that do too many things, are too long, and are too
flexible. I've seen single functions that were longer than 1,000 lines and repeated
the same code multiple times within the function. You will also sometimes see this problem
when a programmer extends an existing piece of code over time (a phenomenon known as code
rot). A complex or long function is difficult to test, difficult to review, and more
likely to contain security problems. Remember, keep your functions simple.
Encapsulate properly. Encapsulation is one of the
properties of an object-oriented programming (OOP) language, but you can also use it with
other languages. Encapsulation hides internal implementation details and provides public
interfaces to an object. This property lets you make changes in one place, and as long as
you don't alter the external behavior, you don't have to change any of the code
where you use the public interfaces. For example, I may need to validate a certain piece
of data multiple times in many different places. The best way to accomplish this task is
to write a function that validates the data in one place. That way, if I ever need to
change the maximum size of an input string, I only need to change it in one place.
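As a rough sketch of that idea, assume the data being validated is a user name with a maximum length of 32 characters. Both the limit and the names kMaxUserNameLength and IsValidUserName are hypothetical, invented for this example.

```cpp
#include <cstddef>

// Hypothetical limit for illustration. Change it here and every caller follows.
const std::size_t kMaxUserNameLength = 32;

// Returns true only if name is non-null, non-empty, and null-terminated
// within kMaxUserNameLength characters.
bool IsValidUserName(const char* name)
{
    if (name == NULL)
    {
        return false;
    }
    // Look at no more than kMaxUserNameLength + 1 bytes, so an oversized
    // string is rejected without scanning all of it.
    for (std::size_t i = 0; i <= kMaxUserNameLength; ++i)
    {
        if (name[i] == '\0')
        {
            return i > 0;   // reject the empty string
        }
    }
    return false;           // no terminator within the limit: too long
}
```

Every place that accepts a user name calls IsValidUserName() rather than repeating its own length check, so the limit and the rules exist in exactly one place.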
If you use encapsulation properly, it can make writing robust code
in C++ easier than writing the same code in C. For example, a common problem is parsing an
input string, and the results may get passed down through several layers of functions. If
you use C to perform the parsing, you either need to pass an argument for both the buffer
and the length or check the length of the string on input and let the rest of the code
assume that the string is a certain length, a sloppy and perilous approach. Several
documented security bugs have come from similarly written code. If you use C++ to parse
the string, you can easily make a class that provides all the operations you need done to
the input string. You can then pass a reference (or a pointer) to the class down through
the functions, and any assumptions about sizing are all done in one place.
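One way such a class might look is sketched below. This is an illustration under stated assumptions, not a definitive design: the class name InputString, the 256-character limit kMaxInputLength, and the helper functions IsCommand and HandleRequest are all hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical limit for illustration; the only sizing assumption in the program.
const std::size_t kMaxInputLength = 256;

// Hypothetical wrapper: owns a copy of the input and enforces the size
// limit in exactly one place, the constructor.
class InputString
{
public:
    // A null pointer or an oversized string yields an empty, invalid
    // object instead of a crash or a silent truncation.
    explicit InputString(const char* raw)
        : valid_(false)
    {
        if (raw == NULL)
        {
            return;
        }
        std::size_t len = 0;
        while (len <= kMaxInputLength && raw[len] != '\0')
        {
            ++len;
        }
        if (len > kMaxInputLength)
        {
            return;             // too long: reject
        }
        text_.assign(raw, len);
        valid_ = true;
    }

    bool IsValid() const { return valid_; }
    std::size_t Length() const { return text_.size(); }

    // Returns the text before the first delimiter, or the whole
    // string if the delimiter does not occur.
    std::string FirstToken(char delimiter) const
    {
        return text_.substr(0, text_.find(delimiter));
    }

private:
    std::string text_;
    bool valid_;
};

// Lower layers take a reference and never need a separate length argument.
bool IsCommand(const InputString& input, const std::string& command)
{
    return input.IsValid() && input.FirstToken(' ') == command;
}

bool HandleRequest(const InputString& input)
{
    return IsCommand(input, "GET");
}
```

Because the functions below the constructor only ever see an InputString, none of them has to re-check the length or trust a length that some other layer computed.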
Stay tuned. Next time I'll get into the details of how a buffer overrun works, and how we can handle strings safely so that our code isn't the next topic of conversation on the security mailing lists!