Windows IT Pro is the leading independent community for IT professionals deploying Microsoft Windows server and client applications and technologies.
  
  
April 1999

Disk Failures Hit Hard



Avert unrecoverable failures

Last month was a bad month. I had to deal with two catastrophic hard disk failures and a couple of near misses in my primary desktop system, and I had to clean up the mess from each failure myself. The primary disadvantage of building your own computers is that you can't yell at someone else when something bad happens.

To begin, I'll describe the hard disk subsystem that I use, because it's a bit unusual for a desktop system. I'm a firm believer in SCSI devices, and I use only SCSI-based storage devices (i.e., other than the 3.5" drive). Thus, the hard disks, CD-ROM drives, and tape units that I use are SCSI. When the first hard disk failed, my system had an 8X CD-ROM, a 1GB hard disk, one 4.5GB hard disk, one 9GB hard disk, two 2.1GB hard disks in individual external cases, and one external DAT backup installed. These devices connected to an Adaptec AAA-133 three-channel caching UltraSCSI controller designed for servers. I really like this controller even though it's more expensive than a standard workstation SCSI controller. The AAA-133 offers good hardware RAID support, although I wasn't using RAID at the time, and comes with advanced software-management tools, including the Adaptec CI/O Array Management software and the Adaptec firmware SCSI support utilities. The AAA-133 is also a cost-effective multichannel product when you compare its cost to that of multiple less-expensive controllers, such as the ubiquitous Adaptec 2940. Vendors such as IBM, HP, and Dell include another member of the AAA-130 controller family onboard many high-end Windows NT workstations.

At the start, I had the 1GB boot disk, the two external hard disks, and the DAT backup on the first channel; the 4.5GB and 9GB hard disks on the second channel; and the CD-ROM on the third channel. I use a large, server-style case with a big power supply and extra fans for this desktop system, which sports dual-Pentium II 266MHz processors and 160MB of RAM. I had also accumulated some extra hard disks. You never know when you'll need a spare hard disk or extra storage capacity, and the prices some sites (e.g., http://www.onsale.com) listed were too good to pass up. I purchased a 1GB hard disk for $90, a pair of Micropolis 4.5GB Fast/Wide hard disks for $139 each, a 9.1GB Seagate Elite hard disk for $200, and for some reason—you really have to watch impulse buying—a 2.1GB Seagate SCA interface drive with a SCSI adapter for about $89.

The first sign of trouble was when one of my external hard disks started to chirp for no apparent reason. In 1 week, the sound went from the occasional cricket to a full-blown chorus, so I heeded the warning of a disk failure. Fortunately, the disk contained only installation files for applications (e.g., installers, .cab files). Preventing a full disk failure wasn't a big deal. I simply copied the contents of the disk to one of my servers, removed the disk, and created a network share mapped to the same drive letter to prevent confusion when applications look for their installation files. I did a low-level format of the hard disk, which returned it to the occasional-chirp state, and installed a copy of NT Workstation on the disk—just in case I needed it. I left the disk in the chain, but I turned it off. I averted failure number one.
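The copy-before-it-dies step above is worth verifying, because a failing disk can return bad reads without raising an error. The following is a minimal Python sketch of a copy followed by a checksum comparison. It is an illustration only, not the procedure used at the time, and the names `file_digest` and `copy_and_verify` and the example paths are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_verify(src: Path, dst: Path) -> List[str]:
    """Copy the tree at src into dst, then re-read both sides.

    Returns the relative paths of files that are missing from dst or whose
    contents differ from the source. An empty list means every file matched.
    """
    shutil.copytree(src, dst, dirs_exist_ok=True)  # requires Python 3.8+
    mismatched = []
    for src_file in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        if not dst_file.is_file() or file_digest(src_file) != file_digest(dst_file):
            mismatched.append(rel.as_posix())
    return mismatched


if __name__ == "__main__":
    # Hypothetical paths: a failing local drive and a share on a server.
    bad = copy_and_verify(Path("E:/"), Path("//server/installs"))
    print("all files verified" if not bad else f"re-copy needed: {bad}")
```

A matching checksum shows only that the copy agrees with what the source disk returned on the second read, so it complements a backup rather than replacing one.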

One morning 2 days later, I walked into my office and the system was down. The boot disk had failed. Although the SCSI controller recognized the disk, the system didn't boot. This hard disk was only the boot device, not the disk on which NT lived. Because this disk had the only FAT partition in the system, I spent half the day with the usual assortment of disk recovery tools and tried to recover the disk. I eventually gave up, replaced the disk with the 1GB hard disk I had on the shelf, and restored the contents from a tape backup. I didn't lose anything but a day's worth of work time and a bit of patience. I dealt with failure number two.

A couple of days passed, and I again found my system dead when I entered my office in the morning. This time, a message had appeared on the screen. The boot loader couldn't read the system files on the system disk. I booted to my backup NT installation and ran the chkdsk utility on the faulted disk. The procedure fixed a half-dozen errors on the disk, and the system then rebooted into a normal startup. I dodged that bullet, right?

No such luck. A week later, I got back from a meeting and the system disk was toast. The SCSI BIOS hung when it tried to detect the disk. It was about as dead as a disk can get. And to add insult to injury, this 4.5GB hard disk was the newest disk in the system (other than the recently replaced boot disk), with less than a year's worth of use. And to make matters worse, I didn't have a recent backup of this disk. I had backups of some of the important files, such as my outlook.pst file and the mail data files from Eudora. But my most recent disk backup was too old to reliably use with the upgraded applications and service packs that I had applied more recently. A week's worth of work on current projects was totally gone.

I tried every trick I knew to get the disk back up so that I could pull the data off, but to no avail. Recovery from this failure still isn't complete. I took nearly 4 days to get the system back up and running, restore the OS and applications, recover some of the data, and configure my system the way I like. I dealt with failure number three.

But the computer gremlins weren't finished with me. Minutes after getting the system buttoned up with a replaced 4.5GB hard disk, the computer began to make a screeching ball-bearing noise. After pulling the case apart again, I discovered that the noise came from my 9.1GB hard disk, which was also fairly new. I didn't want to deal with another disk failure, so I rebooted the system, copied the contents of that 9.1GB hard disk to a server, and replaced the disk with the Seagate Elite 9.1 hard disk. Although the screeching 9.1GB disk is in my dead pile, at least it didn't cause any extra problems. I averted hard disk failure number four.

Everything is working now, and the system has been stable for 3 weeks. I have one minor problem with the Micropolis 4.5GB hard disks: The disks don't consistently respond fast enough for the Adaptec controller. About half the time, I get a "Disk drive not ready" response during the BIOS check of the drives when I reboot. But after the system completes the boot, the Micropolis 4.5GB hard disks are always available to the OS and the Adaptec management system software. I hope the slow response during reboot is just a bug in the disk-drive firmware. (Micropolis went belly-up, and as a result, technical support is a bit difficult to find.)

I've learned the obvious lesson from these hard disk failures: Keep my backups current. But I also discovered how tough it is to find devices and software to regularly back up more than 20GB of desktop storage (somewhat out of the ordinary for a desktop system). Next month, I'll describe the backup strategy that I decided to use and how I implemented it. I'll keep my fingers crossed and hope that I don't have more hard disk failures until I have a complete backup strategy in place.
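One way to act on the "keep backups current" lesson is to check backup age automatically instead of discovering it after a failure. The Python sketch below is a minimal illustration under stated assumptions: backups are stored as files in one directory, and the function name `stale_backups`, the directory path, and the 7-day threshold are all hypothetical. It is not the backup strategy promised for next month's column.

```python
import time
from pathlib import Path
from typing import List, Optional


def stale_backups(backup_dir: Path, max_age_days: float,
                  now: Optional[float] = None) -> List[str]:
    """Return the sorted names of files in backup_dir older than max_age_days.

    Age is measured from each file's modification time. The now parameter
    (seconds since the epoch) defaults to the current time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        p.name
        for p in backup_dir.iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Hypothetical location and threshold; adjust both for a real system.
    old = stale_backups(Path("D:/backups"), max_age_days=7)
    if old:
        print("Backups older than 7 days:", ", ".join(old))
    else:
        print("All backups are current.")
```

Run from a scheduled task, a check like this turns a stale backup into a daily warning instead of a surprise on the day the disk dies.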




Reader Comments
I use Norton Ghost to make disk or partition images of my system.
It can also make compressed images, including images of NTFS partitions.
I can restore my system in about 5 minutes per partition. This product is really great; I highly recommend it.
I am now using the DOS version of Norton Ghost 2002.

Jose Paz April 02, 2003






 Windows IT Pro is a Division of Penton Media Inc.
 © 2009 Penton Media, Inc. Terms of Use | Privacy Statement