How to avoid frustration

Ever since Windows NT 4.0 arrived, I've learned a lot about handling system crashes. This skill never seemed necessary under NT 3.x, but I guess that's progress. (For information about system crashes, see "Recovering from a Network Disaster," March 1997.)

The most frustrating NT failures are the ones that occur early in the boot process, when NT is showing only text screens. Once the GUI is up, you get a lot more descriptive information about failures. Let me explain the text part of an NT boot. This explanation has a bonus: the advice applies to boot problems on both NT Workstation and NT Server, because the basic low-level parts of Server and Workstation are identical.

The BIOS and the Boot
The initial part of an NT boot is the same on any Intel-based system (RISC systems are different). The system powers up, and the CPU vectors to address FFFF:0000, which is 16 bytes below 1024KB and is where all Intel-based computers have a BIOS ROM with boot instructions. The BIOS then scans the PC's memory between 640KB and 1024KB, looking for other BIOSs. If it finds any, the system lets them initialize. BIOSs on add-on cards initialize themselves before the main system BIOS does. (That's why you see the video BIOS display its copyright notice before the main system BIOS displays its own.)
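If the segment:offset notation is unfamiliar, this short Python sketch shows why FFFF:0000 lands 16 bytes below 1024KB. The function name is mine, chosen for illustration; the arithmetic (shift the segment left four bits, then add the offset) is the standard real-mode rule.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Return the physical address for a real-mode segment:offset pair."""
    # The segment value counts 16-byte paragraphs, so it is shifted left 4 bits.
    return (segment << 4) + offset


ONE_MEGABYTE = 1024 * 1024          # 1024KB, the top of real-mode memory
ADAPTER_AREA_START = 640 * 1024     # 640KB, where the BIOS scan range begins

reset_vector = real_mode_address(0xFFFF, 0x0000)

print(hex(reset_vector))                # 0xffff0
print(ONE_MEGABYTE - reset_vector)      # 16
print(hex(ADAPTER_AREA_START))          # 0xa0000
```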

Next, the BIOS loads the first sector of the first physical disk drive into memory. That first physical sector, the Master Boot Record (MBR), holds a partition table that describes the rest of the disk and a small program that loads the first sector of the active partition. Most PC hard disks have two kinds of partitions: primary DOS partitions and extended DOS partitions. Like many PC components, partitions make the most sense within the DOS context. Under DOS, most computers have a hard disk with one primary DOS partition and one extended DOS partition. The primary DOS partition has only one logical drive, drive C, and the system boots from it. The extended DOS partition can contain one or more logical drives. Originally, you couldn't create more than one primary DOS partition on a disk, but Windows 95 and NT support multiple primary DOS partitions. I don't recommend creating more than one primary DOS partition unless you have a special use for it. If you boot DOS 5.x or 6.x on such a disk, the system will see its C drive and the extended DOS partition, but not the other primary DOS partition.
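To make the MBR layout concrete, here is a minimal Python sketch that decodes the partition table from a 512-byte sector. It assumes the classic layout: four 16-byte entries starting at byte 446, an active flag of 0x80, and a 0x55AA signature in the last two bytes. The function names and the sample disk are invented for illustration, and the sketch works on a sector built in memory rather than reading a real drive.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # the partition table follows the MBR's boot program
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80

# A few common type codes; many others exist.
TYPE_NAMES = {0x01: "FAT12", 0x04: "FAT16 (<32MB)", 0x05: "Extended",
              0x06: "FAT16", 0x07: "NTFS/HPFS"}

# status byte, 3 CHS bytes (skipped), type byte, 3 CHS bytes (skipped),
# first sector (LBA), sector count
ENTRY_FORMAT = "<B3xB3xII"


def parse_mbr(sector: bytes) -> list:
    """Return the non-empty partition entries found in an MBR sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA signature")
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        status, ptype, first_lba, count = struct.unpack(
            ENTRY_FORMAT, sector[start:start + ENTRY_SIZE])
        if ptype == 0:          # type 0 marks an unused slot
            continue
        partitions.append({
            "slot": index + 1,
            "active": status == ACTIVE_FLAG,
            "type": ptype,
            "type_name": TYPE_NAMES.get(ptype, "unknown"),
            "first_lba": first_lba,
            "sectors": count,
        })
    return partitions


def make_entry(active: bool, ptype: int, first_lba: int, count: int) -> bytes:
    """Build one table entry for the demo (CHS fields left as zeros)."""
    return struct.pack(ENTRY_FORMAT, ACTIVE_FLAG if active else 0,
                       ptype, first_lba, count)


# Hypothetical disk: an active primary FAT16 partition plus an extended one.
demo = bytearray(SECTOR_SIZE)
demo[446:462] = make_entry(True, 0x06, 63, 1_000_000)
demo[462:478] = make_entry(False, 0x05, 1_000_063, 2_000_000)
demo[510:512] = b"\x55\xaa"

demo_partitions = parse_mbr(bytes(demo))
for partition in demo_partitions:
    print(partition)
```

The entry marked active is the one whose first sector the MBR program loads.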

The Boot Sector and Beyond
After the program in the MBR loads the first sector of the active partition (the boot sector), the code in that boot sector looks for and loads a few files in the C drive's root. Some people incorrectly use the term boot sector to refer to the MBR, so get clarification if you hear someone use the term.

"C drive?" you ask. "I thought NT could load on any drive." Even if you install NT on drive D, E, F, or whatever, you need a few basic, low-level programs to start the boot process. Those programs must sit on the C drive. The first program is the NT Loader (NTLDR). You know that NTLDR is running because the screen clears and displays OS Loader V4.00 at the top.

But what if the boot sector on your C drive is damaged? That situation can happen if you install some other operating system and that OS overwrites your NT boot sector. You can get NT to restore the boot sector: Just start the process to reinstall NT by booting from the three Setup floppies that come in the box. When the Welcome to Setup screen appears, don't press Enter to continue; press R to repair the configuration. The next screen offers you four choices: Inspect Registry files, Inspect startup environment, Verify Windows NT system files, and Inspect boot sector. Select Inspect boot sector, and continue. The repair routine rebuilds the boot sector for you.

If the boot sector is OK, by the way, but NTLDR isn't on the C drive, you'll see the message BOOT: Couldn't find NTLDR, followed by Please insert another disk. (Inserting another hard disk isn't, I'm sad to say, the answer.) You can restore NTLDR with the same repair procedure, except that you choose Verify Windows NT system files.

Next, if the boot sector is good and NTLDR exists, NTLDR looks for the file boot.ini in the C drive's root. boot.ini contains the information that NT uses to display the OS menu--the one that offers you the option to boot NT, NT with VGA, or your previous operating system. The file is plain ASCII text, so you can edit it with a regular text editor. But if you don't have that file, what does NTLDR do? NTLDR assumes that you want to load NT and that NT is in c:\winnt.
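For reference, here is a rough Python sketch that generates the text of a typical boot.ini. It assumes a machine with another OS on C and NT in \WINNT on the second partition of the first disk, which gives the ARC path multi(0)disk(0)rdisk(0)partition(2)\WINNT; your disk and partition numbers may well differ. The function names and menu labels are illustrative, not anything NT requires.

```python
def arc_path(rdisk: int, partition: int, directory: str = "WINNT") -> str:
    """Build an ARC-style path; partition numbers start at 1."""
    return f"multi(0)disk(0)rdisk({rdisk})partition({partition})\\{directory}"


def build_boot_ini(nt_path: str, other_os_label: str = "", timeout: int = 30) -> str:
    """Return boot.ini text with an NT entry, a VGA-mode entry, and an
    optional entry for a previous OS on drive C."""
    lines = [
        "[boot loader]",
        f"timeout={timeout}",            # seconds the OS menu waits
        f"default={nt_path}",            # entry chosen when the timer expires
        "[operating systems]",
        f'{nt_path}="Windows NT Workstation Version 4.00"',
        f'{nt_path}="Windows NT Workstation Version 4.00 [VGA mode]" /basevideo /sos',
    ]
    if other_os_label:
        lines.append(f'C:\\="{other_os_label}"')
    return "\r\n".join(lines) + "\r\n"   # DOS-style line endings


# Assumption: NT lives on the second partition of the first disk.
boot_ini_text = build_boot_ini(arc_path(0, 2), other_os_label="Microsoft Windows")
print(boot_ini_text)
```

The default= line is the part that matters most here: it tells NTLDR which partition and directory hold NT, so NTLDR doesn't have to guess.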

I got in trouble once when I accidentally deleted boot.ini from a computer with Win95 on the C drive and NT on the D drive. I was just cleaning up my root directory. (Ever heard that one before?) Because I had Win95 on the C drive, I would always be able to boot, right? Wrong. The lack of boot.ini made NTLDR try to load NT from C--where it wasn't--so I got the message, The file %systemroot%\system32\ntoskrnl.exe is either missing or corrupt. This message confused the heck out of me. How could cleaning up a few files in the root of my C drive cause ntoskrnl.exe to fail to load properly? Of course, my Win95 installation was safe on the C drive; I just couldn't get to it because NTLDR had no instructions explaining where to find it. All I had to do was boot from a floppy and put a newly constructed boot.ini on the C drive. (That easy solution is not what I did, unfortunately. Read on.)

Booting from a floppy would have worked only because the C drive was formatted as a FAT partition, which a DOS boot floppy can read. What if it hadn't been? Well, you can build a partial NT boot floppy that'll work even with NTFS drives (because of space restrictions, I'll take up boot floppies some other day). Or you can run the Repair utility and select Inspect startup environment to build a simple boot.ini that will get you back up and running. In my case, the problem happened late at night, so I demonstrated how truly stupid I can be by reinstalling all the software on my laptop instead of just rebuilding boot.ini. Argggh!

The Hardware in There
Once you choose NT, NTLDR needs to know what hardware you have. NT finds out what's in your computer with a program called ntdetect.com. Right below the OS menu, you'll see NTDETECT V4.0 Checking Hardware. This program sniffs out the hardware in your system and gives that information to NTLDR to build the part of the Registry in HKEY_LOCAL_MACHINE\Hardware.

If you already have NT running on a machine, NTDETECT will never give you any trouble. But I spend a fair amount of time installing NT on other people's machines, and I've seen NTDETECT lock up. If the message NTDETECT V4.0 Checking Hardware stays on the screen for more than a couple of minutes, your system is probably solidly locked up. I sometimes see that happen when I install NT on a PC rented for demonstrations in a class. PC rental company technicians sometimes install a SCSI host adapter with a built-in floppy controller. That floppy controller conflicts with the system's floppy controller, and the system locks up. If an NTDETECT lockup happens, some piece of hardware has confused NTDETECT and will confuse NT. How do you pinpoint the hardware? Simple: with the debugging version of NTDETECT.

On the NT CD-ROM, look for the directory \support\debug\i386. This directory has a file named ntdetect.chk. You'll need this file on Setup Disk 1. First, DISKCOPY Setup Disk 1 (don't do this operation on the original, please). Then erase the ntdetect.com on your copy of Setup Disk 1, copy ntdetect.chk to the disk, and rename it ntdetect.com. Now run NT Setup, and when ntdetect.com runs, it'll give you tons more information.
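If you'd rather script the file swap than type it by hand, the steps can be sketched in Python as follows. This is only an illustration: it assumes the CD-ROM and your copy of Setup Disk 1 are reachable as ordinary directories, the function name is mine, and the demo at the bottom uses temporary directories with dummy files in place of real media.

```python
import shutil
import tempfile
from pathlib import Path


def install_debug_ntdetect(cd_root: Path, floppy_root: Path) -> Path:
    """Replace ntdetect.com on a COPY of Setup Disk 1 with ntdetect.chk."""
    source = cd_root / "support" / "debug" / "i386" / "ntdetect.chk"
    target = floppy_root / "ntdetect.com"
    if not source.is_file():
        raise FileNotFoundError(f"debug NTDETECT not found: {source}")
    if target.exists():
        target.unlink()                  # erase the regular ntdetect.com
    shutil.copyfile(source, target)      # copy the .chk file under the new name
    return target


# Demo with stand-in directories and dummy file contents.
with tempfile.TemporaryDirectory() as tmp:
    cd = Path(tmp) / "cd"
    floppy = Path(tmp) / "floppy"
    debug_dir = cd / "support" / "debug" / "i386"
    debug_dir.mkdir(parents=True)
    floppy.mkdir()
    (debug_dir / "ntdetect.chk").write_bytes(b"debug build")
    (floppy / "ntdetect.com").write_bytes(b"regular build")

    installed = install_debug_ntdetect(cd, floppy)
    demo_contents = installed.read_bytes()

print(demo_contents)   # b'debug build'
```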

Once NTDETECT has created the hardware information, the system begins loading the OS. But I'm out of space, so I'll talk about that step next month.