Insights from my first Win2K Pro blue-screen experience

I admit it—I've been spoiled. Believe it or not, since I migrated my primary desktop workstation from the last release candidate (RC) of Windows 2000 Professional to the final release more than a year ago, I haven't experienced a single blue screen of death (aka a Stop error). Moreover, my Win2K Pro desktop system has been so rock solid and reliable that in the entire time I've been using Win2K, the system has never forced me to reboot my workstation. (However, I have chosen to do so on a few occasions either out of habit or based on the belief that doing so might restore lost system memory or improve system performance.) This situation has even become an ongoing joke at my office: One of my colleagues, who has been forever jaded by the continual reboot requirements he's experienced under Windows NT and Windows 9x, stops by my desk every so often to tease me and tell me it's about time to reboot my system. This nirvana-like state of computing has certainly been wonderful, but, alas, it couldn't last forever.

All Good Things Must End
For me, blue-screen day (B-Day) finally happened while I was busily engaged in my usual routine of simultaneously running about a dozen applications. On this occasion, I had just begun a CD-Recordable (CD-R) disc-burning session. Shortly after returning to my word processing application to do some writing, I encountered the Ghost of Crashes Past—the dreaded blue screen. I had enjoyed such flawless reliability for so long that a moment passed before I realized that yes, I was seeing an actual blue screen, and no, I wasn't going to have a chance to press Ctrl+S to save the paragraph I had just typed.

I snapped out of the mild state of shock I was experiencing, pulled myself together, and began recollecting the system recovery skills I had acquired during my NT days. First, as is always a good idea in these situations, I wrote down the offending error code of the Stop error message. In this case, the error was a familiar one—Stop: 0x0000000A IRQL_NOT_LESS_OR_EQUAL. However, even as I wrote down the details of the error message, I secretly believed that the incident was nothing more than karmic repayment for the gross negligence and overconfidence I had displayed in not rebooting my system more frequently. With this possibility in mind, I simply pressed the Reset button and waited for the system to successfully reboot. However, much to my chagrin, I was instead greeted with a second and identical blue-screen error, which appeared this time during the startup portion of the Win2K boot process. So much for my temporary affliction theory: Apparently, I had a full-fledged system disaster on my hands.

I immediately wrote down all the important information provided on this blue screen, including the aforementioned Stop error code and the various drivers loaded in memory at the time the system halted. As I reviewed the screen, I made a disturbing observation: Where there would usually be an offending driver listed to the right of the Stop error code and its four arguments, I instead found an empty blue space. This omission was bothersome because a system service or driver referenced here is an important clue and often proves helpful in tracking the offending service or driver. This fact is true even when the cause of the blue screen isn't the referenced driver but rather a related or conflicting service or driver. Because the blue screen didn't provide me with this information, I was on my own.
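If you transcribe the Stop line into a text file, a few lines of script can turn it into something easier to compare across crashes. The following is only a minimal sketch: it assumes the line has the general shape of a Stop code, an optional parenthesized list of arguments, and an optional symbolic name. The names parse_stop_line, describe_args, and StopRecord are hypothetical helpers, and the argument values in the usage example are made up for illustration. The labels for the four 0x0000000A arguments follow Microsoft's published description of that bug check.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class StopRecord(NamedTuple):
    code: int                 # Stop (bug-check) code, e.g. 0x0A
    args: Tuple[int, ...]     # arguments shown in parentheses, if any
    name: Optional[str]       # symbolic name, e.g. IRQL_NOT_LESS_OR_EQUAL


# Assumed layout: "STOP: <code> (<arg>,<arg>,<arg>,<arg>) <NAME>"
STOP_RE = re.compile(
    r"STOP:\s*(0x[0-9A-F]+)"      # the Stop code
    r"(?:\s*\(([^)]*)\))?"        # optional parenthesized argument list
    r"(?:\s+([A-Z0-9_]+))?",      # optional symbolic name
    re.IGNORECASE,
)

# Meaning of the four arguments for Stop 0x0000000A only.
IRQL_ARG_LABELS = (
    "memory address referenced",
    "IRQL at time of reference",
    "access type (0 = read, 1 = write)",
    "address of the code that made the reference",
)


def parse_stop_line(line: str) -> StopRecord:
    """Extract the code, arguments, and name from a transcribed Stop line."""
    match = STOP_RE.search(line)
    if match is None:
        raise ValueError(f"no Stop code found in: {line!r}")
    raw_args = match.group(2) or ""
    args = tuple(int(a, 16) for a in raw_args.replace(",", " ").split())
    return StopRecord(int(match.group(1), 16), args, match.group(3))


def describe_args(record: StopRecord) -> List[str]:
    """Label each argument; only Stop 0x0A gets specific labels here."""
    if record.code == 0x0A and len(record.args) == 4:
        labels = IRQL_ARG_LABELS
    else:
        labels = tuple(f"argument {i + 1}" for i in range(len(record.args)))
    return [f"{label}: 0x{value:08X}" for label, value in zip(labels, record.args)]


# Hypothetical transcription; the argument values are invented.
record = parse_stop_line(
    "*** STOP: 0x0000000A (0x00000004,0x00000002,0x00000001,0x80401234) "
    "IRQL_NOT_LESS_OR_EQUAL"
)
for entry in describe_args(record):
    print(entry)
```

The sketch deliberately ignores the driver name that sometimes appears to the right of the arguments, because the layout of that part of the screen varies.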

Calling Sherlock Holmes
As anyone who has ever had this experience can readily attest, staring at a system blue screen that doesn't contain a great deal of useful information is one of the more frustrating circumstances that a Win2K or NT user can experience. Without a fundamental knowledge about how and why blue screens occur and an arsenal of troubleshooting techniques, you can easily feel a bit helpless and desperate. Questions such as What does this error mean? What happened to cause this error? and What am I supposed to do with this blue-screen information? are common and reasonable in these situations. And believe me, many such questions were running through my mind at that moment. Luckily, NT 4.0 and Windows 3.x have put me through the wringer enough times that I've developed a decent array of troubleshooting and recovery skills. In addition, I was looking forward to putting some of my new Win2K-centric techniques and tools to the test (a sick impulse, I know).

The first step in my troubleshooting regimen was to determine what might have caused the problem. (After all, just moments before the crash, I had been happily computing.) I ruled out the possibility that a newly loaded service or driver was causing the conflict because I hadn't recently installed any new software or drivers on the system. After considering the alternative possibilities, I determined that the most likely culprits were an overheated CPU, disk corruption, or a hardware failure.

Discounting the overheated CPU theory was easy—I simply waited for the system to cool down, then restarted it. No luck there; I again faced the dreaded and obstinate blue screen. Strike one.

Recovery Console Chkdsk
My next theory, disk corruption, would be difficult to test unless I had a way to run a disk diagnosis and repair utility on the system hard disk. The system's hard disk is NTFS formatted, and I hadn't installed a parallel Win2K installation on the system, so I decided to access and test the disk through the Win2K Recovery Console (RC) and its Chkdsk command. Although you can launch the RC from the Win2K Setup process (whether initiated by CD-ROM, 3.5" disk, or Microsoft Remote Installation Services, known as RIS), I had previously used the Winnt32 /cmdcons command to set up a hard disk-based version of the RC on the system. (For more information about Win2K's RC, see "Related Articles in Previous Issues.") Therefore, I elected to use this method to launch the RC. After selecting the RC option from the Win2K OS boot-loader menu, I logged on to the primary Win2K installation at the RC's command-line prompt. Next, I ran the Chkdsk command to diagnose and, if necessary, repair the disk. Chkdsk found no discernible corruption on the volume and determined that no repairs were necessary. Strike two.

Hardware Hunting
My last course of action was to proceed based on my third theory: that I had experienced some type of hardware failure. This scenario was the one that I was most hoping to avoid because the potential causes were seemingly endless in number: Was the problem related to the motherboard, video card, SCSI bus, system memory, CPU, cache memory, or something else? I decided to start replacing system hardware on a component-by-component basis until the blue screen disappeared. Before I began this tedious process, I tried one last-ditch diagnostic effort: booting into Win2K's special Safe Mode through the F8 boot-time option to generate an ntbtlog.txt file.

Safe Mode, which is similar to the Windows Me and Win9x startup mode of the same name, starts the system with a minimal set of drivers and services. Win2K actually has several safe-mode options, one of which is called Safe Mode. With the exception of the VGA Mode Only option, all Win2K safe-mode options generate a diagnostic log file called ntbtlog.txt in the Win2K system folder (usually C:\winnt\ntbtlog.txt). This file logs the success or failure of each system component initialized during system startup. You can then use the file to determine which system component was the last to load before the blue-screen event. This information sometimes yields clues about the source of the problem.
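Scanning a long boot log by eye for the last component to load is tedious, so the lookup can be scripted. The sketch below is illustrative only. It assumes that each log entry is a single line beginning with either "Loaded driver" or "Did not load driver" followed by a path; check a real log before relying on that. The function name summarize_boot_log and the driver names in the sample are hypothetical. The function takes the log's text rather than a file path because the file's encoding can differ between systems, so reading it is left to you.

```python
from typing import List, NamedTuple, Optional


class BootLogSummary(NamedTuple):
    last_loaded: Optional[str]   # final component reported as loaded
    not_loaded: List[str]        # components reported as not loaded


# Assumed line prefixes; confirm them against an actual boot log.
LOADED = "Loaded driver "
NOT_LOADED = "Did not load driver "


def summarize_boot_log(text: str) -> BootLogSummary:
    """Return the last loaded component and every component that failed."""
    last_loaded = None
    not_loaded = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(NOT_LOADED):
            not_loaded.append(line[len(NOT_LOADED):].strip())
        elif line.startswith(LOADED):
            # Later entries overwrite earlier ones, leaving the final one.
            last_loaded = line[len(LOADED):].strip()
    return BootLogSummary(last_loaded, not_loaded)


# Hypothetical log contents, for illustration only.
sample = "\n".join([
    r"Loaded driver \WINNT\System32\ntoskrnl.exe",
    r"Loaded driver \WINNT\System32\hal.dll",
    r"Did not load driver \SystemRoot\System32\Drivers\example_a.sys",
    r"Loaded driver \SystemRoot\System32\Drivers\example_b.sys",
])

summary = summarize_boot_log(sample)
print("Last loaded:", summary.last_loaded)
print("Not loaded:", summary.not_loaded)
```

The last component to load isn't necessarily the culprit; treat it as a starting point for the investigation, not a verdict.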

To initiate this diagnostic mode, I pressed F8 when prompted during the boot process, and selected the Safe Mode option. I was surprised that this time the system briefly went as far as displaying the GUI before displaying the blue screen. I also noticed an encouraging development in this Safe Mode-generated blue screen: The blue screen now listed a driver name next to the Stop error code. I immediately recognized the driver, cdr4_2k.sys, as being related to my system's CD-R and CD-Rewritable (CD-RW) drive. This information, coupled with the recollection that I was burning a CD-R disc just before the original blue screen, led me to suspect that my CD-R and CD-RW drive was causing my problems. I removed the drive from the SCSI chain, and the system booted normally.

Safe Mode Savior
Although the blue screen from the standard-mode boot disappointingly yielded less-than-complete diagnostic information, I was happy that the Safe Mode option provided me with the clues I needed to track down the problem. Based on this experience, I recommend that when you experience a blue screen, especially one that provides limited information, you boot to Safe Mode first in case doing so yields any diagnostic clues about the Stop failure.

Related Articles in Previous Issues
You can obtain the following articles from Windows 2000 Magazine's Web site at http://www.win2000mag.com.

SEAN DAILY
Tricks & Traps, "Daily Answers," November 2000, InstantDoc ID 15744
"Mastering the Recovery Console," July 2000, InstantDoc ID 8918

JOHN D. RULEY
Windows 2000 Pro, "Key Recovery Console Commands," July 2000, InstantDoc ID 8957

MARK RUSSINOVICH
"Crash Dump Analysis," February 2001, InstantDoc ID 16425
NT Internals, "Inside the Blue Screen," December 1997, InstantDoc ID 301

Corrections to this Article:
  • In the article "Recovering from B-Day" (June 2001), all references to bootlog.txt should have read ntbtlog.txt, and the reference to c:\bootlog.txt should have read c:\winnt\ntbtlog.txt. We apologize for any inconvenience these errors might have caused.