Disk capacity, backup windows, and database overhead all play a role in setting mailbox quotas
|Server capabilities are not the only concerns when setting Microsoft Exchange Server 2007 mailbox quotas. Learn how disk capacity and performance, overhead, and backups affect mailbox quotas, and get formulas for calculating maximum database and mailbox sizes.|
Over the past few years, users have been demanding increasingly larger mailboxes. Compared with a couple of years ago, today's users send and receive larger messages and more rich attachments such as Microsoft Office documents, multi-megapixel images, and even video. Users also expect more space for their mailboxes because they've become accustomed to having gigabytes of storage space from free services such as Hotmail and Gmail. But before you increase mailbox quotas, it makes sense to take a close look at users' needs and evaluate the impact of large mailboxes on your organization.
When most administrators try to decide what the maximum mailbox size should be, they tend to look at things solely from the server’s perspective. Although the server’s capabilities need to be your primary concern, it's also important to take Outlook into account.
In the distant past, Exchange Server was less reliable than it is today, and configuring Outlook to dump messages into PST files for safekeeping was a common practice. At the time, a PST file had a maximum size of 2GB. If that limit was exceeded, you had to use the Pst2gb tool to reduce the file's size. In the process, some of the data was always deleted, and you had no way of knowing what data you'd lose.
Although Microsoft Office Outlook 2003 eliminated the 2GB limit and PST files are seldom used anymore, it's still important to consider the workstation’s capabilities. By default, Outlook 2007 and Outlook 2003 use Cached Exchange Mode, which places a copy of every message into an offline folders (.ost) file stored on the workstation’s hard drive. Microsoft recommends leaving Cached Exchange Mode enabled because it helps relieve some of the server’s disk I/O burden. I also recommend leaving it enabled because it gives you a degree of fault tolerance, as I explain in the sidebar "Cached Exchange Mode Fault Tolerance."
Because of Cached Exchange Mode's use of .ost files, Microsoft recommends matching the workstation’s hardware to the mailbox size. According to the Microsoft article "Microsoft Server Storage Design" (http://technet.microsoft.com/en-us/library/bb738147.aspx), a 5,400rpm hard drive is sufficient for cached mailboxes up to 1GB in size, and a 7,200rpm hard drive is acceptable for mailboxes of up to 2GB in size. The article suggests that if a mailbox exceeds 2GB, you should reduce the mailbox size or switch to online mode. However, your mileage can vary depending on the workstation’s hardware. My own personal mailbox, for example, is 3.7GB in size. I have a 7,200rpm hard drive, and I use Cached Exchange Mode without any problems.
Server Disk Capacity
With regard to Exchange Server itself, the server’s disk capacity is one consideration in deciding where to set the mailbox quota. You might assume that you can calculate a reasonable mailbox quota simply by dividing the amount of available server disk space by the number of mailboxes on your server. However, this method is simplistic and causes problems even if most mailboxes never reach their quota.
The most obvious reason this method doesn’t work is that it doesn't allow for new mailboxes. If you were to use this method to set mailbox quotas, you'd have to lower user quotas every time you add a new mailbox to the store, confusing users and causing problems for those whose mailboxes are approaching the quota.
Allotting all available disk space to mailboxes also doesn't leave space for database maintenance. Occasionally, you might find yourself having to run Eseutil to perform a database repair or an offline defragmentation. Eseutil makes a copy of the database, performs the necessary maintenance against the copy, and then replaces the original database with the modified copy. Consequently, the drive that contains the store must have enough free space to accommodate a complete copy of the store, plus another 10 to 20 percent of the database’s size to allow for process overhead.
Microsoft has lots of guidelines for determining the maximum size of a database on a volume. Assuming that only one database is stored on the volume and that the transaction logs are stored elsewhere, a good way to determine a safe maximum database size is to multiply the volume's total capacity by 0.75, then divide the result by 2. This approach allows enough free space for a copy of the database, plus 25 percent overhead. For example, if you have a 500GB volume, you'd calculate
(500GB * 0.75) / 2
to get 187.5GB as the largest database that the volume could safely accommodate. Remember to do your calculations on the volume size, not the drive size, because the file system consumes some space.
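This quick-and-dirty calculation is easy to script. Here's a short Python sketch of the 75-percent-then-halve rule (the function name is my own, not part of any Exchange tooling):

```python
def max_safe_db_size_gb(volume_capacity_gb):
    """Largest database a volume can safely hold: reserve 25 percent
    of the volume for overhead, then halve what remains so that
    Eseutil has room for a complete copy of the database."""
    return (volume_capacity_gb * 0.75) / 2

# A 500GB volume can safely hold a database of up to 187.5GB
print(max_safe_db_size_gb(500))  # 187.5
```

Remember to feed the function the usable volume size, not the raw drive size.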
In my opinion, disk performance is a far more important consideration than disk capacity. Sure, you need to keep your server from running out of disk space, but adding storage to a server is usually cheap and as simple as adding a disk to an existing storage pool.
Increasing disk performance isn’t as easy as increasing capacity, though, so you need to understand the performance implications from the beginning. One reason disk I/O performance is important is that you have to be able to back up your data. In the real world, backups are often the limiting factor when it comes to maximum mailbox size. I've seen countless examples of servers that had plenty of disk space and volumes fast enough to keep pace with user demands, but not fast enough to be backed up within the allotted time.
Imagine, for example, that you have an Exchange database that contains 1,000 2GB mailboxes. That would mean that you'd have roughly 2TB of data to back up each night. In this day and age, having 2TB of storage isn’t at all unrealistic, but backing up all that data can prove to be interesting. Even if you have high-end hardware and are able to back up 175GB per hour (roughly 48MB per second), it would take nearly 12 hours to back up the database—and that's a big problem if you have a four-hour backup window. So, even if you have sufficient disk space to give everyone a large mailbox, doing so might not be the best idea.
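The backup-window arithmetic is worth sketching out, because it's the calculation that most often kills large-mailbox plans. This Python snippet uses the hypothetical 175GB-per-hour throughput from the example above:

```python
def backup_hours(data_gb, throughput_gb_per_hour):
    """Hours needed to back up a given amount of data at a
    sustained backup throughput."""
    return data_gb / throughput_gb_per_hour

# 1,000 mailboxes at 2GB each is roughly 2,000GB of data
hours = backup_hours(1000 * 2, 175)
print(round(hours, 1))  # 11.4 -- nearly triple a four-hour backup window
```

Run the same calculation against your own measured backup throughput before you commit to a quota.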
In a situation like this, there are several ways that you could get around the problem of backing up all the data. I mention two such options in the sidebar "Backup Options for Large Exchange Databases." Other techniques can also be used to back up large databases, but my point is simply that if you're going to allow large mailboxes, you must plan ahead how you're going to manage your backups.
Database White Space
I've shown you a quick-and-dirty method of calculating the largest database that a volume can safely accommodate. But even if you know the maximum database size, you don't want to set disk quotas that could theoretically reach that size. For example, if your server has capacity for a 100GB database and you have 100 users, you shouldn't simply assign a 1GB quota per user.
After calculating the maximum database size and before setting quotas, you need to take database overhead into account. Every Exchange Server database contains white space, so the amount of space the database consumes on a volume will always be greater than the amount of data actually in the database.
How much space is lost to database white space? The answer depends on factors such as the number of users and the level of daily activity, but you can estimate the amount of white space in a database by multiplying the number of mailboxes by the average volume (in megabytes) of the messages sent and received in a day. For example, if each of your 100 users sends and receives an average of 15MB of email per day, white space would account for about 1.5GB of space.
Now things get interesting. Each night, Exchange Server performs several automated maintenance tasks against the database. One task involves freeing up white space from items that have been deleted, leaving empty pages. The defragmentation maintenance process then groups all the empty pages together.
However, this online defragmentation doesn't physically reduce the database’s size. Only an offline defragmentation (which usually isn’t recommended) physically removes the empty pages from the database. But even after an online defragmentation, the volume of white space in the database tends to stay about the same. The white space is freed up each night, but fills again the next day because servers tend to do about the same amount of work each day. Assuming that the maintenance process is completed each night and the server’s workload doesn’t drastically change, the amount of white space tends not to fluctuate much. However, if disk I/O speed isn't sufficient for the maintenance process to be completed, database white space tends to grow exponentially.
Another thing to keep in mind is that when a user deletes an item, that item isn't instantly marked as white space. Instead, the item goes into the database dumpster, which, by default, has a 14-day retention period. Retention of items in the dumpster adds a lot of overhead to a user’s mailbox. Suppose a user deletes an average of 15MB of data a day. Over two weeks, the dumpster will grow to 210MB, which adds about 21 percent overhead to a 1GB mailbox. But the same size dumpster accounts for only about 10.5 percent overhead in a 2GB mailbox. Obviously, looking at the dumpster as a percentage of individual mailbox size isn't very helpful.
A more realistic way of calculating dumpster overhead is to multiply the average volume of mail deleted each day by 14 to get the average volume deleted over two weeks, then multiply that result by the number of mailboxes on the server. This calculation tells you roughly how much disk space will be consumed by the dumpster. For example, if the average user deletes 15MB per day, that's 210MB per user every two weeks, or approximately 20.5GB for 100 users.
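The dumpster calculation is simple enough to script. This Python sketch (the function name is mine) uses the numbers from the example above:

```python
def dumpster_disk_gb(avg_deleted_mb_per_day, mailboxes, retention_days=14):
    """Approximate disk space consumed by the database dumpster:
    daily deletions times the retention period, across all mailboxes."""
    total_mb = avg_deleted_mb_per_day * retention_days * mailboxes
    return total_mb / 1024  # convert MB to GB

# 15MB deleted per user per day, 100 mailboxes, default 14-day retention
print(round(dumpster_disk_gb(15, 100), 1))  # 20.5
```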
What Size Quota?
The ultimate question is how large the quota should be. The first step is to figure out the maximum database size. I've given you a formula for determining how large a database can become, but you should also take Microsoft’s recommended maximums into account. One of Microsoft's recommendations is that an individual store should not exceed 100GB unless you use continuous replication. If you do, then Microsoft recommends a maximum size for an individual store of 200GB.
Next, use the following formulas to determine whether your proposed quota will allow your database to remain below the maximum size:
White Space = average daily volume in megabytes of incoming and outbound messages
Dumpster = (average daily volume in megabytes of messages deleted) * 14
Maximum Mailbox Size = Proposed Quota + White Space + Dumpster
Maximum Database Size = Maximum Mailbox Size * Anticipated Number of Mailboxes
Let's look at an example. Suppose you have 100 users who each send and receive about 15MB of data a day and who delete an average of 10MB of data every day. If you're considering a 2GB (2,048MB) quota, here's how the math works out:
White Space = 15MB
Dumpster = 10MB * 14 = 140MB
Maximum Mailbox Size = 2,048MB + 15MB + 140MB = 2,203MB
Maximum Database Size = 2,203MB * 100 mailboxes = 220,300MB, or roughly 215GB
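Putting the formulas together, here's a Python sketch of the whole calculation (the function names are my own; the inputs are the example's numbers):

```python
def max_mailbox_mb(quota_mb, daily_mail_mb, daily_deleted_mb,
                   retention_days=14):
    """Proposed quota plus estimated white space (one day's mail volume)
    and dumpster overhead (daily deletions times the retention period)."""
    white_space = daily_mail_mb
    dumpster = daily_deleted_mb * retention_days
    return quota_mb + white_space + dumpster

def max_database_mb(mailbox_mb, mailbox_count):
    """Maximum database size for a given per-mailbox footprint."""
    return mailbox_mb * mailbox_count

mailbox = max_mailbox_mb(2048, 15, 10)    # 2,203MB per mailbox
database = max_database_mb(mailbox, 100)  # 220,300MB
print(round(database / 1024, 1))          # 215.1 (GB)
```

Note that the result exceeds Microsoft's recommended 200GB ceiling for a store with continuous replication, so in this scenario you'd want to trim the quota or split users across databases.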
I've explained how to calculate the amount of physical space consumed by mailboxes and whether your hard disk will accommodate a given quota size, but disk performance is far more important than capacity. If your disk subsystem’s performance is subpar, ever-increasing white space can cause the database to swell and might make it difficult to back up the store within your backup window. Make sure you set up nightly maintenance to keep white space under control, and when setting quotas, take into consideration all the factors that play a role. Your system's performance and users' experiences will be better for it.