Email discovery—the process of fulfilling a legal request to provide archived email messages, typically as evidence in a civil or criminal court case—has become more important than ever. Compliance regulations, along with a tremendous growth in email traffic and a corresponding growing need for email storage, are forcing companies to scrutinize their electronic discovery (e-discovery) processes to ensure that they can produce specific archived messages on demand. Earlier this year, a cross-industry consortium called the Electronic Discovery Reference Model (EDRM) Project (http://www.edrm.net) published a work-in-process document that provides a standard for developing e-discovery products and services. The EDRM consists of various sections that describe requirements for different stages of the e-discovery process, as Figure 1 shows. Let's examine two of these sections, Identification and Records Management, and some ideas they provide Exchange administrators for implementing an e-discovery plan in an Exchange Server environment.
In a compliance investigation, everything hinges on your ability to produce evidence—for example, for a Freedom of Information Act (FOIA) request, a Securities and Exchange Commission (SEC) investigation, or a lawsuit. Your first step in producing such evidence is to identify individuals implicated in the request (custodians, in legal terms), along with any relevant concepts, timeframes, and company events of interest. Then you'll need to scope the underlying data that should be examined.
As an Exchange administrator, you can make identifying email-related evidence easier by establishing and maintaining both current and historical versions of the following Exchange inventories:
Mailbox inventories. Inventory and document all users who have mailboxes in your environment. To do so, you can use a third-party Exchange reporting tool or use Microsoft's CSVDE utility to export Active Directory (AD) user information to a comma-separated value (CSV) file. (For more information about CSVDE and its parameters, see http://www.microsoft.com/technet/prodtechnol/windowsserver2003/library/serverhelp/1050686f-3464-41af-b7e4016ab0c4db26.mspx.) Your inventory should include at least the following information for each mailbox user: display name, user account, organizational unit (OU), email address, Exchange server name, Information Store (IS), department, title, and city. This inventory will let you quickly look up the history or location of a mailbox for a given period of time, thereby reducing the effort (and guesswork) involved when you need to dig out old email messages by recovering mailbox files from backup media.
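To illustrate how dated inventory exports can answer "where was this mailbox on a given date?", here is a minimal Python sketch. It assumes you keep one CSV export per snapshot date; the column names (account, display_name, server, store) and the sample data are hypothetical and would need to match whatever your export actually produces.

```python
import csv
import io
from bisect import bisect_right
from datetime import date


def parse_snapshot(csv_text):
    """Index one exported inventory (CSV text) by lowercased account name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["account"].lower(): row for row in reader}


def mailbox_as_of(snapshots, account, when):
    """Return the inventory row for `account` from the most recent snapshot
    taken on or before `when`, or None if no such row exists."""
    dates = sorted(snapshots)
    idx = bisect_right(dates, when)
    if idx == 0:
        return None  # no snapshot is old enough
    return snapshots[dates[idx - 1]].get(account.lower())


# Hypothetical sample exports: the mailbox moved servers between snapshots.
JAN = """account,display_name,server,store
jdoe,Jane Doe,EXCH01,Mailbox Store 1
"""
JUN = """account,display_name,server,store
jdoe,Jane Doe,EXCH02,Mailbox Store 3
"""

snapshots = {
    date(2006, 1, 1): parse_snapshot(JAN),
    date(2006, 6, 1): parse_snapshot(JUN),
}

row = mailbox_as_of(snapshots, "jdoe", date(2006, 3, 15))
print(row["server"])  # EXCH01
```

A lookup like this tells you which server's backups to pull for a given period, which is the guesswork the historical inventory is meant to remove.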
PST inventories. End users will create personal folder files (PSTs) unless you've blocked this functionality. (For more information about disabling PSTs, see the Web-exclusive article "Dealing with .pst Files," November 2003, InstantDoc ID 40961.) Do you know where all the PSTs are in your environment? To find out, you can start by running the following Dir command—which generates a list of all PSTs and their owners—on your file server and saving the results to a text file:
Dir *.pst /s /q
Since the vast majority of PSTs are typically saved on local workstations, you'd need to get creative with logon scripts (e.g., write a logon script that runs on each workstation and sends a list of PSTs found to a central location for analysis) or use a systems management tool such as Microsoft Systems Management Server's (SMS's) inventory-collection feature to obtain a complete picture of all PSTs in your environment.
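As one possible shape for such a script, the following Python sketch walks a directory tree and writes a CSV list of every PST it finds. It is an illustration only: the function names and CSV columns are made up for this example, and unlike Dir /q it does not report file owners.

```python
import csv
import os
from datetime import datetime, timezone


def find_psts(root):
    """Yield (path, size_bytes, modified_utc) for each .pst file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pst"):
                path = os.path.join(dirpath, name)
                try:
                    info = os.stat(path)
                except OSError:
                    continue  # file vanished or access was denied; skip it
                modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
                yield path, info.st_size, modified.isoformat()


def write_inventory(root, out):
    """Write a CSV inventory of PSTs under root to the file-like object `out`.

    Returns the number of PSTs found."""
    writer = csv.writer(out)
    writer.writerow(["path", "size_bytes", "modified_utc"])
    count = 0
    for row in find_psts(root):
        writer.writerow(row)
        count += 1
    return count
```

A logon script could call write_inventory on each local drive and save the result to a central share, one file per workstation, for later analysis.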
Hardware inventories. Although you're probably doing this already, you need to compile an up-to-date inventory of all hardware (including wireless devices such as BlackBerry handhelds and PDAs) used by everyone in your environment. Many devices contain an email cache, so you might need to identify them quickly if they become of interest to an investigation. Numerous options exist for hardware-inventory tracking, ranging from manual tracking in a Microsoft Excel spreadsheet to asset-tracking software, such as the products that Table 1 lists.
Archival and Records Management
The terms archive and compliance are often incorrectly used to mean the same thing. Deploying an email archive is one of the most important compliance tasks you can perform, but doing so is by no means sufficient for achieving compliance. In its simplest form, an email archive is a repository for records. Most email-archive solutions available today include records-management functionality, which lets them store email data in a manner that's securable, readily retrievable, easily searchable, and auditable. Some key email-archiving products include Symantec's Veritas Enterprise Vault, Quest Software's Archive Manager (formerly AfterMail), ZANTAZ EAS, Open Text's LiveLink ECM, and HP StorageWorks Reference Information Storage System (RISS). (For more information about email-archiving solutions, see "Regulatory Compliance," September 2005, InstantDoc ID 46946.)
Purchasing an archive is analogous to buying a fireproof safe for your home. The safe is valuable only if it contains the records you need to preserve. For example, if you have to produce your home-ownership papers and they're in the safe, the discovery process will be relatively simple for you. If, however, you keep these and perhaps other important documents in other places, you could spend hours or even days sifting through the piles of paper in your office and home trying to find the documents you need. Thus, an archive's real value lies in how it simplifies and centralizes the storage of important documents. Merely having a safe or an archive isn't enough, though, if you have many papers or millions of records; in this case, you need sophisticated searching and other records-management functionality to accomplish discovery as quickly as possible.
Email is a dynamic and high-volume communications medium. In its December 2005 Worldwide Email Usage 2005-2009 Forecast, IDC estimates that business email worldwide in 2006 will exceed 3.5 exabytes annually (approximately 10,000 TB daily). No company can archive 100 percent of all its email records because email data is so dynamic. Both email data stored in your company's archive and data outside the archive are subject to e-discovery, as Figure 2 shows. As an Exchange administrator who must put together an e-discovery plan for your organization's email records, you face a dual challenge: move email data that must be archived to meet compliance requirements into a central archive where it can be more readily managed via archiving and records-management solutions, and manage data that remains outside the archive by using e-discovery tools. Email-archiving tools provide the infrastructure for long-term storage, indexing, and recordkeeping of email data, whereas e-discovery tools focus on search and retrieval of electronic documents on hard drives, servers, and other media throughout a given environment, with a view to producing evidence required for an investigation.
Data outside the archive can include email messages in PSTs on local machines or file servers, messages on mobile devices, email stored on backup media, and messages on any other device or medium that might hold them. Typically, Exchange administrators don't focus on identifying what data needs to be archived for compliance; rather, their job is to implement compliance policies that someone else in the organization decides. For example, your company's legal department might issue a compliance policy stating that "all email meeting x criteria needs to be archived," and IT then translates that policy into a technology implementation. Unless you're in a very small company, managing email records throughout their lifecycle from cradle to grave for compliance reasons can become extremely burdensome. Taking the following actions can make managing and identifying both archived and non-archived records for e-discovery easier.
Move to Exchange 2000 Server Service Pack 3 (SP3) or later. Exchange 2000 SP3 and later support envelope journaling, which allows Exchange journaling to capture distribution list (DL) expansion metadata. Microsoft first added envelope journaling in Exchange 2003 SP1 and has since back-ported the functionality to Exchange 2000 SP3. Before envelope journaling, if you sent an email message to a DL such as email@example.com, Exchange provided no way to view the journaled message and determine who was actually on the DL when it was expanded. With Exchange 2003 SP1's DL expansion support, envelope journaling retains DL membership with every journaled item. In this DL example, then, you could determine from the journal that Peter and Susan Pevensie were on the DL, but Edmund Pevensie was not. This capability can be an important factor in proving chain of custody (an accounting of email evidence from the time it was originally created through to its presentation as evidence) should such email ever be subpoenaed. (For more information about adding the envelope journaling capability to pre-Exchange 2003 SP1 versions, see the Microsoft article "An update rollup is available to enable the Envelope Journaling feature in Exchange 2000 Server" at http://support.microsoft.com/?kbid=834634.)
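The idea of keeping DL membership with each journaled item can be sketched as a small data model. The Python below is a conceptual illustration only: the class names, field names, and addresses are hypothetical, and it does not reproduce the actual format of an Exchange journal report.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class EnvelopeRecipient:
    address: str
    # Address of the DL this recipient was expanded from, if any.
    expanded_from: Optional[str] = None


@dataclass
class JournalRecord:
    message_id: str
    sender: str
    recipients: List[EnvelopeRecipient] = field(default_factory=list)


def members_at_send_time(record, dl_address):
    """List the addresses that received the message through the given DL."""
    dl = dl_address.lower()
    return sorted(
        r.address
        for r in record.recipients
        if r.expanded_from and r.expanded_from.lower() == dl
    )


def received_via_dl(record, person, dl_address):
    """True if `person` was on the DL when this message was expanded."""
    members = (a.lower() for a in members_at_send_time(record, dl_address))
    return person.lower() in members


# Hypothetical journaled item for a message sent to one DL.
DL = "dl@example.com"
record = JournalRecord(
    message_id="<1@example.com>",
    sender="sender@example.com",
    recipients=[
        EnvelopeRecipient("peter@example.com", expanded_from=DL),
        EnvelopeRecipient("susan@example.com", expanded_from=DL),
    ],
)

print(members_at_send_time(record, DL))
print(received_via_dl(record, "edmund@example.com", DL))  # False
```

Because the membership is stored with the item itself, the answer does not change if the DL is edited later, which is what makes it useful for chain-of-custody questions.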
Assemble an e-discovery toolkit. Your e-discovery toolkit should contain utilities and software products that enable discovery of data outside the archive. Include in the toolkit both PST-discovery tools and backup-discovery tools, such as those listed in Tables 2 and 3.
PST-discovery tools let you find and search all PSTs in your environment. One PST-discovery method is to search your drives for PSTs by using Windows Explorer's Search capability, copy the PSTs to a local machine, and then use a desktop search engine such as Microsoft Outlook's Lookout add-on, MSN Desktop Search, or Google Desktop to index and search them. This approach can be quite time-consuming. Another method is to use a specialized third-party tool for PST discovery and/or in-place PST search, such as any of the tools that Table 2 lists.
Backup-discovery tools that let you rapidly extract email messages from backup files can help you in two ways. First, they can eliminate the mix of Exchange recovery technologies—Exchange recovery servers, Recovery Storage Groups (RSGs), the ExMerge tool, Deleted Item Retention, and Deleted Mailbox Retention—that you typically need to juggle when trolling large numbers of backup tapes looking for specific email evidence. Second, if you have an archive deployed in house, such tools can help you extract relevant email records from backups and move those records into the archive, so that you don't need to recover those records from backup media in the future.
Consider the Data
When developing an e-discovery plan for your company, it's important to consider both email data inside your archive (if you have one) and email stored in other nonarchive locations. Your goal should be to increase the percentage of email records stored within the archive relative to that outside the archive. To do so, you'll need to gradually migrate data from your backups into your archive and shorten the amount of time email stays on your production Exchange servers prior to archival. By performing mailbox and hardware inventories and assembling the right mix of tools for archiving and retrieving email records, you'll be able to shape an e-discovery strategy that works for your company.