Enhanced Disk Storage for Windows NT
RAID, or Redundant Array of Inexpensive (or
Independent) Disks, can improve disk drive performance by spreading data across
multiple disks that are treated as one logical drive. A RAID subsystem can
enhance system performance, provide fault tolerance, simplify the process of
adding disk capacity, and make building extremely large disk volumes possible.
RAID has been around on UNIX and mainframe systems for many years, and the
technology was built into Windows NT from the start. With the flurry of activity
surrounding NT clusters and server scalability, RAID has recently gained new
exposure in the NT market (see the June 1997 issue for more information about
NT-based cluster solutions). But what is RAID? And how can it help you improve
your NT systems' performance and reliability?
Let's explore the answers to these questions with a detailed technical look
at RAID options for NT systems. I discuss the best RAID levels to use for
optimizing performance and fault tolerance, and provide some general guidelines
for choosing a RAID system. For an introduction to RAID, see
"RAID Levels."
Hardware and Software RAID
The two types of RAID are hardware RAID, in which the disk controller
performs the RAID functions, and software RAID, in which the operating system
performs RAID functions. NT 4.0 lets you use hardware- or software-based
solutions or combine the two to achieve the best performance and fault
tolerance.
Many vendors, including Adaptec, American Megatrends (AMI), Compaq, and
Mylex, provide hardware RAID solutions (disk controllers and array chassis) that
offer many of the RAID levels listed in "RAID Levels." RAID 0, 1, and
5 are the most common. As a rule, hardware-based RAID solutions are faster and
more reliable than software-based ones. They also offer a greater range of
configuration options. Of course, they're more expensive than using NT's
built-in RAID, but if you want the best performance, strongly consider including
hardware-based RAID in your overall system budget.
NT supports RAID functionality, offering software settings for RAID 0 in NT
Workstation and RAID 0, 1, and 5 in NT Server. The advantages of software RAID are
convenience (it is built into NT) and low cost. However, performing RAID
functions through the operating system instead of offloading them to a separate
controller can slow server performance.
As you can see in "RAID Levels," each level has different
performance characteristics, fault-tolerance capabilities, and drive usages.
Some levels offer excellent all-around performance, and others sacrifice this
performance to gain fault tolerance (for more information about how the
performance of RAID 0, 5, and 10 compare, see "Optimizing Exchange Server,"
November 1996). Because each level is suited to a particular environment, your
main challenge when choosing a RAID subsystem is to decide which RAID level to
use on your server under what conditions. Let's look at the tradeoffs of the
RAID levels most commonly used in NT systems and some tips for choosing the best
RAID for your system. (See "RAID Tips" for some RAID-optimization
hints. For information about other ways to improve disk subsystem
performance besides RAID, see "Pumping Up Your Server.")
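To make the "drive usages" tradeoff concrete, here is a minimal sketch of how much usable space the three levels NT Server supports in software leave you. It assumes identical drives and a simple two-drive mirror for RAID 1; the function name `usable_capacity_gb` and the 4GB drive size are illustrative choices, not anything NT exposes.

```python
def usable_capacity_gb(raid_level: int, num_drives: int, drive_gb: float) -> float:
    """Return usable capacity for a set of identical drives (illustrative helper)."""
    if raid_level == 0:
        # Striping: every drive holds data, nothing is spent on redundancy.
        if num_drives < 2:
            raise ValueError("striping needs at least two drives")
        return num_drives * drive_gb
    if raid_level == 1:
        # Mirroring: the second drive is a full copy of the first.
        if num_drives != 2:
            raise ValueError("this sketch models a simple two-drive mirror")
        return drive_gb
    if raid_level == 5:
        # Striping with parity: one drive's worth of space holds parity.
        if num_drives < 3:
            raise ValueError("striping with parity needs at least three drives")
        return (num_drives - 1) * drive_gb
    raise ValueError("this sketch covers only RAID 0, 1, and 5")


if __name__ == "__main__":
    # Hypothetical 4GB drives, chosen only for illustration.
    for level, drives in ((0, 4), (1, 2), (5, 4)):
        print(f"RAID {level} with {drives} drives:",
              usable_capacity_gb(level, drives, 4.0), "GB usable")
```

The arithmetic shows why RAID 0 looks attractive on cost alone: it is the only one of the three that gives you every gigabyte you pay for.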
Optimizing for Performance: RAID 0
Disk performance is a critical factor in server performance. Disk access is
much slower than memory access. Therefore, the faster your disk I/O, the faster
your server's response time. As a rule, RAID 0 (i.e., plain disk striping)
provides the fastest I/O and thus the best performance.
RAID 0, or normal striping, splits data blocks (chunks of data) across
multiple disks simultaneously. The group of disk drives containing the split
data is called a stripe set; the size of each data piece depends on how many
disks are in the stripe set. Striping means all drives are active for every I/O
transaction and that each drive in the stripe set does less work per
transaction. Less work means faster performance.
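The striping idea above can be sketched in a few lines. This is a simplified model that assumes data is dealt out round-robin one block at a time; real implementations stripe in larger units, and the names `locate_block` and `split_request` are hypothetical helpers, not NT functions.

```python
from typing import Dict, List, Tuple


def locate_block(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 1:
        raise ValueError("a stripe set needs at least one disk")
    if logical_block < 0:
        raise ValueError("block numbers start at zero")
    disk = logical_block % num_disks      # round-robin across the stripe set
    offset = logical_block // num_disks   # position within the chosen disk
    return disk, offset


def split_request(first_block: int, block_count: int,
                  num_disks: int) -> Dict[int, List[int]]:
    """Show which per-disk offsets a run of consecutive logical blocks touches."""
    per_disk: Dict[int, List[int]] = {d: [] for d in range(num_disks)}
    for block in range(first_block, first_block + block_count):
        disk, offset = locate_block(block, num_disks)
        per_disk[disk].append(offset)
    return per_disk


if __name__ == "__main__":
    # An 8-block request on a 4-disk stripe set: each disk handles 2 blocks.
    for disk, offsets in split_request(0, 8, 4).items():
        print(f"disk {disk}: offsets {offsets}")
```

In this model an eight-block request on a four-disk stripe set leaves each drive with only two blocks to transfer, which is the "less work per transaction" effect described above.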
You can immediately benefit from RAID 0 by using NT's Disk Administrator to
create stripe sets. This approach lets you create larger disk volumes under NTFS
(FAT has a 2GB partition limit) and improves disk I/O performance.
Software striping via Disk Administrator is useful for just about any
application, but with some cautions. First, software striping adds a small
amount of CPU overhead because NT has to calculate the striping instead
of just passing I/O requests to the disk controllers. However, with today's fast
CPUs, this overhead is not a problem: the processing takes a very small
percentage of the CPU's overall capacity, and the performance benefit of using
multiple drives outweighs the performance hit. Systems with older processors
(386, 486, or even slow Pentiums) may struggle, and you may need to
augment them with a hardware RAID controller, which offloads RAID calculations
from the system's main CPU or CPUs.
Second, be careful about where disks are located in the system. If you stripe
disks across two or more SCSI controllers (called controller multiplexing), you're
asking NT to calculate which data goes where in addition to figuring out the
striping, and you add processing overhead, system bus traffic, and processor
interrupts for handling multiple cards. Again, older systems may have trouble
handling this processing.
For the best performance, stripe disks on a single controller unless one SCSI
card cannot provide enough capacity. You can compensate
for the above problems by using a hardware RAID controller that has dedicated
circuitry for handling these calculations and multiple channels for enhancing
performance and adding capacity (a multichannel card uses only one interrupt).
The big drawback of RAID 0 is that it offers no fault tolerance: If one
drive in the stripe set dies, the entire volume is unrecoverable. Also, the
number of drives you use in a stripe set has a point of diminishing returns.
For example, the results explained in "Microsoft SQL Server 6.5
Scaleability" (January 1997) showed that six drives were the effective
limit for a Compaq ProLiant 5000 with a Smart 2/P Array Controller; more drives
improved performance only minimally. The gains taper off because the mechanics
of the situation catch up with you (as the number of drives in the stripe set
goes up, the block size goes down; if the block size drops below the stripe
width, the advantages of striping diminish). In addition, too many drives
saturate the SCSI channel. New controllers with faster hardware, such as Wide
SCSI-3 and Ultra-2, raise this limit because they can run at 40MBps or 80MBps
and use wider (16-bit) data paths. (See "RAID-Related Terms"
for definitions of the SCSI standards and other terms.) Another issue is that
more drives mean a greater probability of failure.
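The channel-saturation point can be illustrated with a rough model: total throughput grows with each drive you add until the shared SCSI channel becomes the bottleneck. The sketch below assumes every drive sustains the same transfer rate and ignores controller and CPU overhead; the 10MBps per-drive figure is a hypothetical value for illustration and is not taken from the benchmark cited above.

```python
def aggregate_throughput_mbps(num_drives: int, drive_mbps: float,
                              channel_mbps: float) -> float:
    """Estimate stripe-set throughput, capped by the shared channel's bandwidth."""
    if num_drives < 1:
        raise ValueError("need at least one drive")
    return min(num_drives * drive_mbps, channel_mbps)


if __name__ == "__main__":
    DRIVE_MBPS = 10.0  # hypothetical sustained rate per drive
    for channel_mbps in (40.0, 80.0):
        print(f"Channel limit: {channel_mbps} MBps")
        for drives in range(1, 11):
            total = aggregate_throughput_mbps(drives, DRIVE_MBPS, channel_mbps)
            print(f"  {drives:2d} drives -> {total:5.1f} MBps")
```

Under these assumptions, doubling the channel bandwidth doubles the number of drives you can add before the curve flattens, which is why faster controllers push the point of diminishing returns outward.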
Some experts recommend that you never use RAID 0 alone on a server.
However, the question is one of cost vs. performance, so RAID 0 with an
aggressive backup policy may be worthwhile.
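To weigh that tradeoff, it helps to put a number on the earlier point that more drives mean a greater chance of failure. Because a RAID 0 volume is lost when any one drive dies, the chance of losing the volume is the chance that at least one drive fails. The sketch below assumes drives fail independently and with the same probability; the 3 percent annual figure is a made-up value for illustration, not a measured failure rate.

```python
def stripe_set_failure_probability(drive_failure_prob: float,
                                   num_drives: int) -> float:
    """Probability that at least one drive in a RAID 0 stripe set fails."""
    if not 0.0 <= drive_failure_prob <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if num_drives < 1:
        raise ValueError("need at least one drive")
    # The volume survives only if every drive survives.
    return 1.0 - (1.0 - drive_failure_prob) ** num_drives


if __name__ == "__main__":
    ANNUAL_DRIVE_FAILURE = 0.03  # hypothetical per-drive probability
    for drives in (1, 2, 4, 6, 8):
        risk = stripe_set_failure_probability(ANNUAL_DRIVE_FAILURE, drives)
        print(f"{drives} drives: {risk:.1%} chance of losing the volume")
```

The risk climbs with every drive you add to the stripe set, so the more drives you stripe, the more an aggressive backup policy matters.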