The Information Store is the heart of Microsoft Exchange Server. Without a healthy Store, you don't have a successful Exchange deployment, and users are unhappy because they can't access their mailboxes. Exchange 2000 Server introduced the concept of storage groups (SGs) and new Store components such as the streaming database, and Exchange Server 2003 carried them forward. Although the biggest change to the Store in Exchange 2007 is support for the Windows 64-bit platform, Microsoft has made other fundamental changes to the Store that you should also know about when planning your Exchange 2007 migration.

No SQL Server?
The biggest change that many expected to occur in Exchange 2007 was a transition to Microsoft SQL Server as the database engine for the Store. However, when the Exchange developers investigated the challenges of forcing a database engine designed for structured transactions to handle the work generated by Exchange and its clients, they decided they couldn't do it for Exchange 2007. This isn't altogether surprising when you consider the vastly different types of messages that flow through Exchange—everything from small 2KB messages to multimegabyte messages containing several attachments sent to large distribution lists. Also, users interact with Exchange in ways that affect the database. For instance, Microsoft Outlook users can click a heading in their Inbox to sort messages by that column. This action creates a new custom view in the database. If you consider that a server supports thousands of clients and each client can create many different views, you can sense why the transition to SQL Server could be such an engineering challenge.

The net result is that Exchange 2007 continues to use the Extensible Storage Engine (ESE), which Exchange has used since 1996. Of course, each release of Exchange has modified ESE in different ways, and Exchange 2007 is no different; its major change is the move from a 32-bit platform to a 64-bit platform.

What 64-Bit Means for Exchange
The move to 64-bit is a big plus for Exchange because it addresses some fundamental Store problems in Exchange 2003 and Exchange 2000. For example, as Microsoft releases new versions of the OS, Windows servers gradually use more kernel mode memory to accommodate drivers, such as those used by antivirus and antispam products, and to handle the demand for connections from clients. Five years ago, users typically connected with just a desktop client, so administrators could easily figure out how many concurrent clients a server had to handle based on the number of users whose mailboxes the server hosted. Now, the proliferation of mobile devices means that people use multiple ways to connect to their mailbox. Research in Motion's (RIM's) BlackBerry is popular with many Exchange organizations, but it requires the expense of an additional server infrastructure, so organizations often restrict the use of BlackBerry devices. Microsoft's introduction of server-side ActiveSync (which is less expensive because it uses the same server infrastructure) and the growing popularity of Windows Mobile devices, especially the upgraded functionality delivered by the combination of Exchange 2007 ActiveSync and Windows Mobile 6.0, mean that Exchange 2007 will need to support even more mobile devices in the future. Each concurrent client connection requires memory, so you can see how demand increases.

Virtual memory fragmentation has been a bugbear for Exchange for years, especially in clustered systems. Applications request memory from Windows, which allocates the memory in chunks. Some applications require contiguous chunks of virtual memory to perform operations; if enough contiguous memory isn't available, the application fails. For example, Exchange requires relatively large amounts of contiguous virtual memory to load an SG and mount its databases. When failures occur on a cluster, the cluster attempts to transfer the SGs from the failed node to the other nodes in the cluster, but if enough virtual memory isn't available, the cluster can't transfer the SG, and users lose access to their mailboxes.

The huge increase in available memory made possible by 64-bit Windows OSs relieves the memory fragmentation problems that Exchange 2003 and Exchange 2000 have while also letting Exchange cache much more data than before. The advantage of caching more data is that Exchange 2007 trades expensive disk I/O for memory, which addresses another major performance bottleneck that Exchange suffers on the 32-bit platform. Microsoft predicts that the net effect reduces the I/O operations per second (IOPS) generated by each user from around 1.0 to about 0.4 (your mileage will vary depending on the exact workload, CPU, and storage configuration). Reducing I/O demand lets you support more concurrent users, but it also requires you to equip Exchange 2007 servers with far more memory than you'd typically deploy with Exchange 2003. Exchange 2007 Mailbox servers with 8GB or more memory will be common, so you'll have to pay attention to the type and speed of the DIMMs that you specify for servers as you deploy Exchange 2007.
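To see what the predicted drop from 1.0 to 0.4 means for storage sizing, here is a minimal Python sketch of the arithmetic. The user count is a hypothetical figure chosen for illustration, and the function name is invented for this example; only the two per-user I/O rates come from the prediction quoted above.

```python
def required_iops(users, iops_per_user):
    """Aggregate disk I/O operations per second for a set of mailbox users."""
    return users * iops_per_user

# Hypothetical server population; the per-user rates are the ones quoted above.
USERS = 4000
before = required_iops(USERS, 1.0)   # 32-bit Exchange 2003 style workload
after = required_iops(USERS, 0.4)    # predicted Exchange 2007 workload

reduction_pct = round((1 - after / before) * 100)
print(f"Before: {before:.0f} IOPS, after: {after:.0f} IOPS, reduction: {reduction_pct}%")
```

The same disk subsystem that was sized for the old rate could, on this simple model, serve roughly two and a half times as many users, provided memory rather than disk becomes the resource you scale.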

Internally, Microsoft has made other changes to make the Store more efficient. Database pages are now 8KB instead of 4KB, which lets Exchange stuff more data into each page and so generate fewer I/O operations. The Exchange 2007 Store is smarter at write operations and groups transactions together so that single writes occur instead of multiple writes. Finally, the Store makes better use of memory to cache commonly accessed folders and information, such as calendars, to speed user performance.

The change to the database page size and other internal changes mean that you can't mount an Exchange 2003 database on an Exchange 2007 server and vice versa. Because of the complexities involved in an upgrade, Microsoft doesn't support upgrades from a server running 32-bit Windows and Exchange 2003 or Exchange 2000 to 64-bit Windows and Exchange 2007 (even on the same 64-bit-capable hardware), so you won't find a special mode of Eseutil or any other utility to upgrade a database. To move mailbox data from Exchange 2003 or Exchange 2000 to Exchange 2007, you'll have to use the Move Mailbox wizard or the Move-Mailbox cmdlet. Fortunately, you can now script mailbox moves by using Exchange Management Shell to automate these operations. For more information about moving mailboxes, see the Exchange & Outlook Pro VIP article "Exchange 2007: Life Without ExMerge?" January 9, 2007, InstantDoc ID 94629.

Although production Exchange 2007 servers can run only on 64-bit Windows, Microsoft has made a 32-bit version of Exchange 2007 available for use just on test servers. You can also deploy the Exchange 2007 management components, including Exchange Management Shell, on 32-bit Windows XP SP2 workstations (you'll have to install Windows PowerShell and the latest version of the Microsoft .NET Framework first). Support for Windows Vista workstations will be added in Exchange 2007 SP1, which is currently in beta and expected to be available by the end of 2007.

Maximum Databases
Every previous Exchange version has imposed a maximum database size on Standard Edition. Before Exchange 2003 SP2, the maximum database size was 16GB; SP2 increased this to 75GB, a limit that was still too small given the pace of growth in message volume and average message size. By comparison, Exchange 2003 Enterprise Edition supports database sizes that are limited only by available disk space. Some organizations run databases as large as 300GB, but the vast majority of Exchange databases are less than 50GB, largely because of the time required for backup and restore operations. Exchange 2007 doesn't restrict database sizes, so even with Standard Edition you can grow databases as large as you need to. However, Standard Edition is restricted to 5 mailbox databases and 5 SGs, whereas Enterprise Edition can support up to 50 databases and SGs.

You can still deploy up to five databases in a single SG, but Microsoft recommends that you deploy just one database per SG in Exchange 2007. This recommendation is partly to accommodate easier management, partly to provide better performance, and partly because you can only use log shipping to protect SGs that hold one database.
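The edition limits and the one-database-per-SG rule for log shipping lend themselves to a quick sanity check when you plan a server layout. The following Python sketch is one way to express those rules; the function name, the dictionary layout, and the sample SG names are all hypothetical, and the only facts taken from the article are the limits themselves (5 or 50 SGs and databases per edition, five databases per SG, one database per replicated SG).

```python
# Limits as described in the article: max SGs and max databases per edition.
EDITION_LIMITS = {"Standard": 5, "Enterprise": 50}
MAX_DBS_PER_SG = 5

def check_layout(edition, dbs_per_sg, replicated=frozenset()):
    """Return a list of problems for a planned layout.

    dbs_per_sg maps an SG name to its number of databases;
    replicated is the set of SG names you intend to protect with log shipping.
    """
    limit = EDITION_LIMITS[edition]
    problems = []
    if len(dbs_per_sg) > limit:
        problems.append(f"{len(dbs_per_sg)} SGs exceeds the {edition} limit of {limit}")
    total_dbs = sum(dbs_per_sg.values())
    if total_dbs > limit:
        problems.append(f"{total_dbs} databases exceeds the {edition} limit of {limit}")
    for sg, count in dbs_per_sg.items():
        if count > MAX_DBS_PER_SG:
            problems.append(f"{sg} holds {count} databases; the maximum is {MAX_DBS_PER_SG}")
        if sg in replicated and count != 1:
            problems.append(f"{sg} is marked for log shipping but holds {count} databases")
    return problems

# Hypothetical plan: two SGs, one of them replicated but holding two databases.
plan = {"SG1": 1, "SG2": 2}
for problem in check_layout("Standard", plan, replicated={"SG2"}):
    print(problem)
```

A check like this is only a planning aid; it models the published limits and says nothing about whether a given layout performs well.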

Transaction Logs and Replication
One of the more interesting changes that Microsoft made in Exchange 2007 is to reduce the size of transaction logs from 5MB to 1MB. Transaction logs capture details of every change made to databases in an SG; a busy server can generate tens of gigabytes of log data daily. The size change was made to accommodate log shipping, the mechanism introduced in Exchange 2007 to replicate databases to a location on the same server (local continuous replication, or LCR) or to another server in a Majority Node Set (MNS) cluster (cluster continuous replication, or CCR). Microsoft announced its intention to expand this functionality in Exchange 2007 SP1 to accommodate database replication to a standby server in a different data center (standby continuous replication, or SCR). Log shipping is an important competitive step for Microsoft because IBM Lotus Notes, the company's major competitor in the enterprise market, has supported database replication for many years. It's also a feature that customers have demanded because they don't want to buy third-party products, such as Double-Take, to get similar functionality.
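To put the smaller log size in perspective, this short Python sketch counts how many log files a server would generate per day at each size. The 20GB daily volume is a hypothetical figure in the "tens of gigabytes" range mentioned above, and the calculation assumes every log is filled completely before the next one is created, which real servers won't always do.

```python
MB = 1024 * 1024
GB = 1024 * MB

def logs_per_day(daily_log_bytes, log_size_bytes):
    """Number of log files needed to hold one day of log data (rounded up)."""
    return -(-daily_log_bytes // log_size_bytes)  # integer ceiling division

DAILY_LOG_VOLUME = 20 * GB  # hypothetical busy server

old_count = logs_per_day(DAILY_LOG_VOLUME, 5 * MB)  # Exchange 2003 log size
new_count = logs_per_day(DAILY_LOG_VOLUME, 1 * MB)  # Exchange 2007 log size
print(f"5MB logs: {old_count} files/day, 1MB logs: {new_count} files/day")
```

Five times as many files per day is one reason the log-naming scheme also had to change, as described later in this section.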

You can use log shipping only for SGs that hold a single database, which is an important consideration for system designers when planning a server's database layout. However, an Exchange 2007 Mailbox server can host up to 50 SGs, all of which can be replicated, so a great deal of flexibility exists. CCR can be used only for Mailbox servers, but you can deploy LCR on multirole servers such as those that support the Mailbox, Client Access, and Hub Transport roles. CCR and LCR don't support public folder databases directly, but Microsoft enables replication by supporting an SG that contains the only public folder database in the organization. If you have more than one public folder database, you can use regular public folder replication to ensure that multiple copies of public folder data exist within the organization.

When you enable an SG for database replication, Exchange "seeds" a passive copy of the database by copying the live database to the location where you want to host the copy (on the local server or another server). After seeding is complete, Exchange copies and validates transaction logs as the Store generates them, then replays the data in the logs to update the passive copy. If a failure occurs, you can replace the active copy with the passive copy of the database and continue operations. As storage vendors gain experience with Exchange 2007, it's likely that we'll see broader and deeper support for features such as database seeding and log shipping incorporated into third-party storage management products.

Reducing the size of the transaction log exposes Exchange to less risk of losing data due to incomplete file copies caused by a hardware failure. Clearly, you'll lose less data if Exchange can't copy a 1MB transaction log than if it can't copy a 5MB log. Exchange 2007 also includes a new feature called lost log resilience (LLR), which lets Exchange mount databases after a failure even if some data in transaction logs is missing (because of copy failures) and so can't be used to bring a database up to date. LLR runs on CCR Mailbox servers and is a major change for the Store; all previous Exchange versions require manual intervention by administrators if missing data causes the Store to refuse to mount a database. Typically, administrators have to run Eseutil to patch the database. Any manual operation is prone to error, and there are many horror stories in which administrators have run Eseutil with the wrong command switches or attempted to apply the wrong set of transaction logs and caused major problems for a database.

Log shipping doesn't happen free of charge; servers incur a performance penalty to copy and replay logs into the passive copy of the database. Read and write I/O operations occur as Exchange copies transaction logs from the active location to an "inspector" directory, where Exchange checksums each log to ensure its integrity before replaying its content into the passive database. Memory is also consumed because Exchange uses a separate ESE instance to replay the transactions. Overall, early experience indicates that you can expect a 20 percent overhead to accommodate these operations on a server that supports LCR. With CCR, the passive node incurs the performance penalty because it pulls logs from the active node and replays them into its copy of the database.
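The copy, inspect, and replay sequence can be illustrated with a toy model in Python. This is a sketch of the general idea only, not of how ESE or the replication service is implemented: the record layout, the function names, and the use of CRC32 as the checksum are all assumptions made for the example.

```python
import zlib

def make_log(generation, payload):
    """Build a toy log record with a CRC32 checksum over its payload."""
    return {"generation": generation, "payload": payload,
            "checksum": zlib.crc32(payload)}

def passes_inspection(log):
    """Recompute the checksum, as an inspector step would, and compare."""
    return zlib.crc32(log["payload"]) == log["checksum"]

def replay(logs, passive):
    """Replay verified logs in generation order into the passive copy.

    Stops at the first missing generation or failed checksum, so the
    passive copy never applies a log out of sequence.
    """
    for log in sorted(logs, key=lambda entry: entry["generation"]):
        if log["generation"] != passive["last_generation"] + 1:
            break  # gap in the log sequence
        if not passes_inspection(log):
            break  # damaged copy; do not replay it
        passive["records"].append(log["payload"])
        passive["last_generation"] = log["generation"]
    return passive["last_generation"]

logs = [make_log(1, b"create item"), make_log(2, b"move item"),
        make_log(3, b"delete item")]
logs[2]["payload"] = b"delete it"  # simulate a truncated file copy

passive = {"last_generation": 0, "records": []}
last = replay(logs, passive)
print(f"Passive copy is current through log generation {last}")
```

The point of the model is the ordering guarantee: a log that fails inspection blocks everything after it, which is why a single bad copy matters and why smaller logs limit the amount of data at stake.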

The transport dumpster, another feature introduced in Exchange 2007, further reduces the risk of losing data by caching, on Hub Transport servers, copies of messages that are destined for mailboxes in replicated databases. When a failure occurs, Exchange can resend messages from the cache to affected mailboxes. Exchange suppresses the messages if they already exist in the destination mailbox. The transport dumpster can't cache some transactions, such as draft messages, but between log shipping, LLR, and the transport dumpster, Exchange 2007 is more resilient to hardware failure than any previous release.
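The resend-and-suppress behavior can be sketched in a few lines of Python. This is a conceptual model only: identifying duplicates by a message ID is an assumption made for the example, and the function and variable names are hypothetical rather than anything exposed by Exchange.

```python
def resubmit(dumpster, mailbox_ids):
    """Redeliver cached messages, skipping any the mailbox already holds.

    dumpster is an ordered list of cached message IDs; mailbox_ids is the
    set of IDs present in the recovered mailbox. Returns the IDs delivered.
    """
    delivered = []
    for message_id in dumpster:
        if message_id in mailbox_ids:
            continue  # duplicate: suppress it
        mailbox_ids.add(message_id)
        delivered.append(message_id)
    return delivered

# Hypothetical state after failing over to a slightly stale passive copy.
dumpster = ["msg-101", "msg-102", "msg-103"]
mailbox = {"msg-100", "msg-101"}

recovered = resubmit(dumpster, mailbox)
print(f"Redelivered: {recovered}")
```

Running the resubmission a second time delivers nothing, which is the property that makes it safe to replay the whole cache rather than work out exactly which messages were lost.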

A minor but important change associated with the smaller log size is the change in transaction log naming. Exchange 2007 still uses the SG prefix to identify the SG that a log belongs to (e.g., all logs beginning with "E00" belong to the first SG on a server), but Microsoft increased the hexadecimal number used to identify an individual log from five to eight characters, so you end up with names such as E010000aa15.log. Microsoft says you can now create more than 4 billion transaction logs per SG before Exchange is forced to reuse a log name.
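The "more than 4 billion" figure is easy to check with a little arithmetic. The Python sketch below assumes a three-character SG prefix followed by an eight-digit hexadecimal sequence number, which is the split that matches both the sample name E010000aa15.log and the quoted capacity; the function name is invented for this example.

```python
def log_name(prefix, generation, digits=8):
    """Format a log file name from an SG prefix and a sequence number."""
    if not 0 < generation < 16 ** digits:
        raise ValueError("sequence number does not fit in the name")
    return f"{prefix}{generation:0{digits}x}.log"

sample = log_name("E01", 0xAA15)
capacity = 16 ** 8  # distinct values of an eight-digit hexadecimal number

print(sample)
print(f"{capacity:,} possible sequence numbers")
```

Sixteen to the eighth power is 4,294,967,296, which is where a claim of just over 4 billion log names per SG would come from under this assumption.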

Portability
Exchange 2007 databases are portable. In other words, you can take a database from a server and mount it on any other Exchange 2007 server in the same organization, something you couldn't do before. Portability helps with disaster recovery because it lets you quickly transfer databases from a failed server and mount them on another server, provided that you can still access the database files. The need to access the database files in order to move them is a good reason why shared storage is still an important component in Exchange disaster-recovery planning: it's obviously much easier to switch databases between servers in a SAN than it is if the databases are on direct-attached disks.

After the databases are mounted on the new server, you use the Move-Mailbox cmdlet with the -ConfigurationOnly switch to update the configuration data that's stored in Active Directory (AD) for the mailboxes that belong to the databases you just moved and redirect users to the server where the databases are now mounted. Portability is a welcome feature.

Deleted Items
Microsoft introduced the deleted items cache in Exchange Server 5.5 to let administrators avoid restoring databases to recover items deleted in error. The feature has been around for a while and is well understood; the only thing that has changed in Exchange 2007 is that the new default deleted-items retention period is 14 days (instead of 7 days). The deleted items retention period dictates when the Store permanently removes items from databases, so the change means that Exchange 2007 keeps deleted items for twice as long as Exchange 2003 does. This longer retention period means an increase in database sizes, possibly up to 10 percent depending on user behavior and the flow of message traffic in your organization.

The End of Streaming
Exchange 2000 introduced the streaming database (.stm) file as a companion to the regular Messaging API (MAPI) property database (.edb) file. .Stm files let Exchange store any information that arrived in raw Internet format, such as MIME-encoded messages and attachments, in a database purposely designed to support these formats instead of using MAPI properties in the .edb. Behind this decision was a belief that Internet formats would become much more prevalent than they have and that Exchange could avoid the performance hit of converting MAPI data when IMAP or POP clients such as Outlook Express fetched messages. Since Exchange 2000 appeared, server performance has increased dramatically, so the performance hit for format translation isn't all that important now. Outlook continues to be the most popular Exchange client by far, so MAPI remains the predominant format in use today. Microsoft therefore concluded that the .stm file was no longer needed and omitted it from Exchange 2007.

No More M Drive
The ability to map Exchange mailbox data as a DOS drive using the Server Message Block (SMB) protocol through the Exchange Installable File System (ExIFS) was another feature that received much hype when Microsoft launched Exchange 2000. On the surface, it seemed great that you could navigate through your mailbox as if you were moving through DOS folders.

Unfortunately, the feature turned out to be useless in production, and it even created a host of problems when administrators thought they could virus-scan messages and attachments through the M drive, or when they attempted to take file-level backups of Exchange data through the M drive! Of course, these backups were useless because they didn't contain all the necessary data (such as MAPI properties) that Exchange required, but no one discovered the problem until a server outage occurred and the backup was needed. Microsoft took the first step to eliminate the problem by hiding drive M by default in Exchange 2003 and now has completed the process by removing ExIFS from Exchange 2007.

Public Folders
Sometimes it seems Microsoft doesn't quite know what to do about public folders. At first, Microsoft deprecated their use in Exchange 2007 with an eye on eventually phasing out public folders completely in the next major release of Exchange. Although public folders aren't the most useful storage mechanism and have never realized the potential Microsoft promised when they appeared in Exchange 4.0, there's no doubt that there are millions of public folders in daily use across the Exchange installed base. Customer pushback and the harsh realization that there's no good migration path for public folder data or the applications that depend on public folders forced Microsoft to rethink its decision. The company's latest position is that it will support public folders until at least 2016.

Having nine years to think about what to do with your public folders is great, but don't expect to see much development around them in the future. You need to start developing an exit strategy. Microsoft will point you to SharePoint, and that's certainly one option, albeit one that requires a lot of manual effort because Microsoft has no automated migration utilities. Quest Software shipped Public Folder Migrator for SharePoint (http://www.quest.com/public-folder-migrator-for-sharepoint), and you can expect other companies to provide utilities over time.

Exchange 2007 includes no GUI to manage public folders, nor does Outlook Web Access (OWA) 2007 include a GUI to access public folders. However, Microsoft will fix these omissions in Exchange 2007 SP1, and you can keep Exchange System Manager around to manage public folders until SP1 appears. Alternatively, you can learn the PowerShell commands to manage public folders and forget about the GUI.

Enhancing the Heart of Exchange
Administrators should find the Exchange 2007 Store changes an improvement. The move to 64-bit Windows improves stability and performance, log shipping increases resilience, and some obsolete components are gone. You can manage the Store by using commands through Exchange Management Shell. There are some outstanding issues, such as the lack of support for public folders in the GUI and OWA, but Microsoft is working to fix these problems in SP1. Features such as log shipping will take time for administrators to learn how to deploy and use effectively, but they're a good step forward. Overall, Microsoft has done a nice job of enhancing the heart of Exchange 2007.