Consultants often need to estimate how large a Microsoft Exchange Server 5.5 private store will grow. The formula I use is

Private Information Store = ((mailbox quota * number of mailboxes) /
   single-instance storage ratio) + deleted-items cache percentage

The mailbox quota is the amount of storage you allocate to each user. Typically, this figure ranges from 20MB to 50MB.

Exchange Server uses a single-instance storage model. In other words, if a message goes to 10 recipients on a server, one copy of the content resides in the database and the system creates pointers to that content in each recipient's mailbox. The single-instance storage ratio measures how effectively recipients share messages on a server; in my experience, this ratio ranges from 1.2 (very bad) to 2 (normal) to 3 or above (very good). Servers that host users who send many messages to external addresses (e.g., Internet users) will have lower ratios than servers on which most messages go to mailboxes on the same server. You can check your server's ratio in Performance Monitor by watching the MSExchangeIS Private\Single Instance Ratio counter.

Finally, the deleted-items cache percentage represents the share of the store that soft-deleted items occupy. The deleted-items cache is a feature that Exchange Server 5.5 introduced: the system retains soft-deleted items until their retention period expires, then permanently removes them. The percentage depends on the length of the retention period. You can estimate that the cache will occupy approximately 5 percent of the database for a 7-day retention period and as much as 10 percent for a 14-day retention period. Your mileage will vary, but these percentages are safe for planning purposes.

Now you can calculate how large a private store will be. For example, here's the calculation for a typical large Exchange server:

Private Information Store = ((50MB * 3,000 mailboxes) / 1.8) +
   5% = 87.5GB (rounded)
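
If you want to try different quotas, ratios, or retention periods, the same arithmetic is easy to script. The following is a minimal Python sketch, not a tool that ships with Exchange Server. The function and parameter names are my own, and it assumes that the deleted-items cache is added as a percentage on top of the shared mailbox data and that 1GB equals 1,000MB, which is what the 87.5GB figure implies.

```python
def private_store_size_mb(quota_mb, mailboxes, sis_ratio, deleted_cache_pct):
    """Estimate private store size in MB.

    quota_mb          -- mailbox quota per user, in MB
    mailboxes         -- number of mailboxes on the server
    sis_ratio         -- single-instance storage ratio (e.g., 1.2 to 3)
    deleted_cache_pct -- deleted-items cache allowance, as a percentage
                         added on top of the mailbox data (assumption)
    """
    if sis_ratio <= 0:
        raise ValueError("single-instance storage ratio must be positive")
    # Mailbox data after single-instance sharing is taken into account.
    shared_mb = (quota_mb * mailboxes) / sis_ratio
    # Add the deleted-items cache allowance on top.
    return shared_mb * (1 + deleted_cache_pct / 100)


# The example from the text: 50MB quota, 3,000 mailboxes, ratio 1.8, 5% cache.
size_mb = private_store_size_mb(50, 3000, 1.8, 5)
size_gb = size_mb / 1000  # assumes 1GB = 1,000MB, as in the worked example
print(f"Private Information Store = {size_gb:.1f}GB")
```

Rerunning the function with a ratio of 1.2 or a 10 percent cache allowance shows quickly how sensitive the estimate is to those two inputs.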

Believe it or not, an 87.5GB private store isn't especially large anymore. Many companies run servers that have private stores larger than 100GB, and I've heard of some stores approaching 200GB. A database that large requires much care and maintenance—and a complete backup regimen—to protect against hardware or software failure. Hardware and software are more reliable than ever before, but errors still occur. If you lose a 100GB database and you don't have a good backup plan, you'd better make sure you've sent your résumé to as many recruitment firms as possible.

Finally, you need to consider the time that complete online backups for such large databases require, and you need to make sure that the I/O subsystem can cope with the I/O load that large servers generate. A server equipped with multiple fast CPUs is useless if you can't retrieve the data quickly.
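
To get a rough feel for the backup window, you can divide the store size by the sustained throughput of your backup hardware. The Python sketch below does only that. The throughput figures are placeholders I picked for illustration, not measurements, so substitute the rate you actually observe on your own server.

```python
def backup_hours(store_gb, throughput_gb_per_hour):
    """Return the hours needed to back up a store at a sustained rate."""
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return store_gb / throughput_gb_per_hour


store_gb = 87.5  # the private store size from the worked example

# Hypothetical sustained backup rates in GB per hour (illustrative only).
for rate in (10, 25, 50):
    hours = backup_hours(store_gb, rate)
    print(f"{rate}GB/hour -> {hours:.1f} hours")
```

This simple division ignores verification passes, tape changes, and contention with user I/O, so treat the result as a lower bound rather than a schedule.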