Implementing an email discovery,
compliance, archiving, and retention (DCAR) solution is similar to remodeling a house. You need a “blueprint” (plan) that provides detailed
instructions for the “remodeling”
(design, components, and implementation steps of the DCAR solution).
You also need to be able to translate
the plan into reality—as a contractor
would on a remodeling job—within
the budget and schedule you've allotted for the project. Like a remodeling,
implementing a DCAR solution might
disrupt your current messaging
scheme somewhat, but in the end it
will make your job as an Exchange
administrator easier and add functionality to your messaging system.
Fortunately, Exchange provides some
built-in features—message journaling, backup and restore APIs, and message and transport security—that
provide a framework for building a
DCAR solution. We'll examine those
features as well as some related technologies in Exchange, such as event
sinks, protocol logs, and message
tracking, and explore how each fits
into a DCAR strategy. And in the Web-exclusive sidebar “Third-Party Products in an Exchange DCAR Solution,”
we'll look at
some important DCAR functions that
you'll need to obtain through third-party products and what to consider
when choosing such products.
Message Journaling
You'll find a growing degree of confusion about the difference between
archiving and journaling. From a
high-level point of view, you can
successfully argue that little distinguishes the two methods; both extract messaging data from the messaging
system. However, you do need to
know the difference, not only to
implement your solution correctly but
also to evaluate whether third-party
components will meet your needs.
Archiving is the process of removing content from the messaging system for long-term storage in some
other system, usually some type of
database. Email messages are taken
from user mailboxes according to criteria, such as age. Archiving technology usually provides some kind of
Web or mailbox-based extension
mechanism that lets users continue to
access archived content as necessary.
Archiving features generally address the discovery, archiving, and retention components of DCAR. Archiving technology
is useful for mailbox management,
reducing storage requirements (by
consolidating and compressing multiple copies of the same data), and ensuring the preservation of corporate knowledge.
Journaling is the process of creating
or capturing copies of email messages
as they enter and traverse the messaging system, ensuring that those copies
are collected in central locations.
Together, the journal copies comprise
a searchable form of documentation
that administrators and auditors can
use to see the email messages that users
are sending and receiving. Journaling generally addresses the compliance
component of DCAR and is one of
the three common mechanisms for
moving messaging data into compliance and policy solutions. Journaling
usually doesn't provide any direct
benefits to end users, but it can be a
vital part of a complete DCAR solution.
Although Exchange has no built-in archiving functionality, it has
included basic journaling capabilities
since Exchange Server 5.5 Service Pack 1 (SP1). Over the years, Microsoft has increased journaling functionality in the various Exchange
releases, service packs, and occasional hotfixes. Today, Exchange
includes three types of journaling:
- simple journaling (also known as
message-only journaling)
- blind carbon copy (Bcc) journaling
- envelope journaling
All three types work on the same basic
principle: Almost every email message
that enters the Exchange organization
is examined to see whether it's bound
for a recipient configured for journaling. If it is, the first Exchange Server
categorizer through which it passes
creates another copy of the email message (in a process called bifurcation)
that's delivered to a specified journal
mailbox or public folder. The only
messages exempted from journaling
are system messages such as Active
Directory (AD) replication messages,
public folder replication messages,
and journal messages.
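The decision the categorizer makes can be sketched in a few lines of Python. This is a simplified illustration, not Exchange's actual logic: the Message class, the kind labels, and the mapping from journaled addresses to journal mailboxes are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Message kinds listed above as exempt from journaling (labels are hypothetical).
EXEMPT_KINDS = {"ad_replication", "public_folder_replication", "journal"}


@dataclass
class Message:
    sender: str
    recipients: list
    kind: str = "user"


def journal_destinations(message, journal_mailbox_for):
    """Return the journal mailboxes that should receive a bifurcated copy.

    journal_mailbox_for maps a journaled address to its journal mailbox;
    addresses that are absent are not configured for journaling.
    """
    if message.kind in EXEMPT_KINDS:
        return set()
    destinations = set()
    for address in message.recipients:
        mailbox = journal_mailbox_for.get(address)
        if mailbox is not None:
            destinations.add(mailbox)
    return destinations
```

A message addressed to one journaled and one non-journaled recipient yields a single journal destination; a journal message yields none, which is what keeps journal copies from being journaled again.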
Note: Although you can specify a
public folder as the journal destination, Microsoft recommends that you
specify a mailbox. Journal messages
delivered to public folders can't be
stamped with the full range of data
with which email messages delivered
to a mailbox can be stamped.
Although you have some control over
which recipients are configured for
journaling, you should be aware that
your ability to perform this configuration isn't very granular in any current
version of Exchange. For Exchange
Server 2003 and Exchange 2000
Server, journaling is enabled on a
per–message-store basis. All mailboxes in enabled message-store databases are journaled, and all journal
messages that mailboxes generate in
that database are sent to the same
journal mailbox (although you can
configure separate journal mailboxes
for each message-store database).
In Exchange 5.5, you can enable
journaling for an entire organization or on a per-site or per-server basis. Be
aware that Exchange will capture and
copy only email messages that are
transmitted. If someone edits a message in-mailbox, the change won't be
captured. I know of at least one lawsuit that involved lawyers being blindsided because they weren't aware that
email messages in their organization
had been changed to cover up evidence of wrongdoing. Opposing
counsel produced records of the original, unaltered messages. I'll review
the three types of journaling relative to
the goal of implementing a DCAR
solution.
Simple Journaling
Simple journaling has existed in
Exchange since Exchange 5.5 SP1.
When simple journaling is enabled,
the first Exchange categorizer to handle a given email message parses the
P2 header (the header information
contained within the actual message)
to determine whether the relevant
mailboxes are in databases with journaling enabled. For email messages
sent within the organization that use
Messaging API (MAPI), remote procedure call over HTTP Secure (RPC over
HTTPS), Microsoft Outlook Web
Access (OWA), or another form of
HTTP access, this server is the
sender's mailbox server. Otherwise,
the bridgehead server receives the
message through SMTP or the
Exchange Message Transfer Agent
(MTA) service. Journal copies of the
email message are then sent to all
relevant journal mailboxes. You control simple journaling through the
Mailbox Store Properties dialog box,
as Figure 1 shows.
Let's look at an example. Imagine an
Exchange organization with four
mailbox servers, EXCH01 through
EXCH04. Each mailbox server has
two mailbox stores, one for regular
users and one for journaling. Each
regular mailbox store is configured to
deliver to the journal mailbox in the
journal mailbox store on the same server. In addition, an SMTP bridgehead server handles all incoming and
outgoing SMTP traffic.
An external email message comes
into the organization addressed to
four recipients: Adam, Barbara, Charlie, and Denise. By chance, these four
recipients are homed on separate
mailbox servers. In addition to forwarding the email message to the
actual recipient mailboxes, the
bridgehead forwards it to the four
journal mailboxes, requiring extra
bandwidth, disk I/O, and CPU in the
process. That kind of traffic multiplier
can cause a significant performance
hit in organizations with geographically dispersed servers linked by low-bandwidth WAN connections or
organizations whose servers are
already running close to their peak
performance.
You might wonder why extra
bandwidth is required, given that the
SMTP stack in Exchange 2003 and
Exchange 2000 is supposed to send
only one copy of a message between
servers even when there are multiple
recipients on the destination server. Because the
journal copy of the message has extra properties stamped on it during the bifurcation process, the journal copy
technically counts as a separate message. Be aware of this behavior when you design your solution.
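To make the traffic multiplier concrete, here is a minimal Python sketch that counts the copies the entry server must send in the four-recipient example. It assumes one delivery copy per destination server and one separate journal copy per journal mailbox server, as described above; the function name and the BRIDGEHEAD label are hypothetical.

```python
def count_transfers(recipient_server, journal_server, entry_server):
    """Count message copies the entry server must send to other servers.

    recipient_server: recipient name -> mailbox server that homes it
    journal_server:   mailbox server -> server holding its journal mailbox
    Returns (delivery_copies, journal_copies).
    """
    home_servers = set(recipient_server.values())
    # One delivery copy per destination server, however many recipients it homes.
    delivery = {s for s in home_servers if s != entry_server}
    # The journal copy is a separate message, so it cannot share a delivery copy.
    journal = {journal_server[s] for s in home_servers if s in journal_server}
    journal.discard(entry_server)
    return len(delivery), len(journal)


recipients = {
    "Adam": "EXCH01",
    "Barbara": "EXCH02",
    "Charlie": "EXCH03",
    "Denise": "EXCH04",
}
# Each regular store journals to a mailbox on the same server.
journals = {server: server for server in recipients.values()}
print(count_transfers(recipients, journals, "BRIDGEHEAD"))
```

For this scenario the sketch reports four delivery copies and four journal copies, so journaling doubles the number of transfers leaving the bridgehead.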
Simple journaling has
some other limitations,
mainly because it uses
the P2 header information. Simple journaling
can't
- capture Bcc recipients. This limitation reduces
or eliminates journaling's usefulness; you
can't accurately track
email message
recipients.
- capture the results of any address
rewriting you might have
configured in your organization.
- uniformly expand distribution list
(DL) membership. This limitation
could leave you with a journaled
email message that contains the
DL name instead of the list of
members. How do you go about
proving the list's membership at
the time the email message passed
through the system? What if the list
constantly has members added
and removed? And consider how
this limitation can affect a large
organization with a complicated
AD replication topology.
The Exchange 5.5 version of simple
journaling has an additional flaw: The
journal copy captures display names
rather than actual email addresses. In
essence, you can't prove that an email
message actually went to a particular
recipient; you can guarantee only that
the message was sent to a recipient
who had that specific display name
configured at that specific time.
Bcc Journaling
Bcc journaling is really simple journaling on steroids. No additional UI is
exposed to turn on Bcc journaling
functionality. Instead, in the
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSExchangeTransport\Parameters registry subkey,
you need to specify JournalBCC
(REG_DWORD), with a value of 0x01
to enable and 0x00 to disable.
Restart the SMTP and Exchange
Information Store services to pick up
the change. When the Exchange store
detects this registry value on startup,
it enables capture of the Bcc recipients
on journaled messages and records
this information in the journal copy of
the captured messages.
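If you need to make this change on several servers, one option is to generate a .reg file and import it on each one. The following Python sketch only builds the file text; the subkey and value name come from the description above, while the function name is hypothetical. Treat it as an illustration and test it on a lab server before touching production.

```python
# Registry subkey and value name as given in the text above.
KEY_PATH = (
    r"HKEY_LOCAL_MACHINE\System\CurrentControlSet"
    r"\Services\MSExchangeTransport\Parameters"
)


def journal_bcc_reg_text(enabled):
    """Return .reg file text that sets JournalBCC to 1 (enable) or 0 (disable)."""
    value = 1 if enabled else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + KEY_PATH + "]",
        '"JournalBCC"=dword:{:08x}'.format(value),
        "",
    ]
    return "\r\n".join(lines)


print(journal_bcc_reg_text(True))
```

Importing the file doesn't restart anything, so the SMTP and Exchange Information Store services still need the restart described above.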
Caution: Make sure that all your
Exchange servers have the necessary
upgrades for the level of journaling
you use. If you use Bcc journaling on
one server, make sure you enable it on
every server that has message journaling enabled.
Microsoft included support for Bcc
journaling in the original release of
Exchange 2003, but the feature still
suffered from the DL-expansion problem. Exchange 2003 SP1 added support for DL expansion. If you have
Exchange 2000 servers in your organization, you can add both capabilities
by ensuring that you upgrade your
servers to Exchange 2000 SP3 and add
the hotfix described in the Microsoft
article “Bcc information is lost for
journaled messages in Exchange
2000” (http://support.microsoft.com/?kbid=810999). No upgrades or hotfixes can give you Bcc journaling functionality in Exchange 5.5.
Envelope Journaling
Microsoft introduced envelope journaling (aka advanced journaling) in
Exchange 2003 SP1. It's designed to
address the limitations of simple journaling and Bcc journaling by using the
P1 header data. Like Bcc journaling,
envelope journaling was retrofitted to
Exchange 2000 servers; they must run Exchange 2000 SP3 plus the post–SP3
Update Rollup. (For more information
about the rollup package, see the
Microsoft article “An update rollup is
available to enable the Envelope Journaling feature in Exchange 2000
Server” at http://support.microsoft.com/?kbid=834634.) Again, no
updates for Exchange 5.5 provide this
functionality.
In many respects, envelope journaling works much like simple journaling; it's still invoked by the first
categorizer to handle a message and
still configured on a per-store basis.
The big difference is that whereas
simple journaling looks at the P2
headers, envelope journaling looks at
the P1 headers. The latter approach
provides several benefits over simple
journaling or Bcc journaling. Envelope
journaling
- captures both display names and
actual SMTP addresses
- natively captures Bcc recipients
- allows full DL membership expansion (including hidden DL members)
- captures all mail-enabled recipient
objects: public folder, contacts,
alternate recipients, ad hoc recipients, and query-based DLs
- captures delivery reports, nondelivery reports (NDRs), read receipts,
and out-of-office messages
Instead of sending a bifurcated
copy of the original email message to
the journal mailbox, envelope journaling creates a separate journal
report message. It then makes an
exact copy of the original message as
an attachment to the journal report
message. This approach preserves the
original message exactly as it was
sent; it contains original headers, DLs
are intact, and Bcc recipient information isn't displayed in the original
message while still being available in
the journal report message.
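The report-plus-attachment structure can be illustrated with Python's standard email library. This sketch doesn't reproduce the format of a real Exchange journal report; the report body layout, the function name, and the example addresses are assumptions. It only shows the idea of keeping the original message untouched while recording envelope data alongside it.

```python
from email.message import EmailMessage


def build_journal_report(original, envelope_sender, envelope_recipients):
    """Wrap an unmodified message in a report that lists the envelope data."""
    report = EmailMessage()
    report["Subject"] = "Journal report: " + str(original.get("Subject", ""))
    lines = ["Sender: " + envelope_sender]
    lines += ["Recipient: " + r for r in envelope_recipients]
    report.set_content("\n".join(lines) + "\n")
    # Attaching a message object produces a message/rfc822 part.
    report.add_attachment(original)
    return report


original = EmailMessage()
original["From"] = "adam@example.com"
original["To"] = "barbara@example.com"
original["Subject"] = "Quarterly numbers"
original.set_content("See attached figures.\n")

report = build_journal_report(
    original,
    "adam@example.com",
    # charlie was a Bcc recipient, so he appears only in the envelope data.
    ["barbara@example.com", "charlie@example.com"],
)
```

The attached original still shows only the To recipient in its headers, while the report body lists every envelope recipient, including the Bcc recipient.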
It sounds great, but the downside is
that envelope journaling frequently
results in multiple copies of the journal report message (and attached original
message) being delivered to the journal mailbox. This behavior is an
expected consequence of using the P1
headers. The server that performs the
original categorization might not have all
the information it needs to completely
identify all message recipients. It generates an incomplete journal report
message with the best information it
has, then forwards copies to other
servers for them to process. The steps
of the process follow:
- When the originating store can't
perform DL expansion, it sends the
email message to an expansion server
for DL expansion.
- The DL-expansion server generates a new journal report message
that now contains the full DL membership recipient information.
- Each of the recipient stores generates an additional journal report
message to show that the original
message was, in fact, delivered.
The result? You now have multiple
copies of the journal report (and the
original, attached email message) in
the journal mailbox. All these copies
contain different subsets of the final
message recipients. They're all keyed
from the same message ID, which
means that whenever you audit message delivery, you must examine all
related journal-report messages.
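An audit tool therefore has to combine the related reports before it can answer who received a message. A minimal Python sketch of that merge follows, assuming the message ID and recipient list have already been extracted from each journal report; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def merge_journal_reports(reports):
    """Combine recipient subsets from all reports that share a message ID.

    reports: iterable of (message_id, recipients) pairs, one per journal report.
    Returns a dict mapping each message ID to a sorted list of all recipients.
    """
    merged = defaultdict(set)
    for message_id, recipients in reports:
        # Lowercase addresses so the same recipient isn't counted twice.
        merged[message_id].update(r.lower() for r in recipients)
    return {mid: sorted(addresses) for mid, addresses in merged.items()}


reports = [
    ("<msg1@example.com>", ["sales-dl@example.com"]),                    # originating server
    ("<msg1@example.com>", ["adam@example.com", "barbara@example.com"]),  # after DL expansion
    ("<msg1@example.com>", ["Adam@example.com"]),                        # recipient store
]
print(merge_journal_reports(reports))
```

Whether you keep the DL name alongside the expanded members, as this sketch does, depends on what your auditors expect to see.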
If your organization is dispersed
over multiple sites linked by low-bandwidth WAN connections, envelope journaling can have a significant
effect. The following steps show you
how to set up envelope journaling in
a way that mitigates potential negative
effects:
- Ensure that all your servers run at
least Exchange 2000 SP3 plus the
post–SP3 Update Rollup or Exchange
2003 SP1. You'll experience inconsistent behavior if you don't upgrade all
your servers.
- Download the Microsoft Exchange
Server Email Journaling Advanced
Configuration tool (exejcfg.exe) at http://www.microsoft.com/downloads/details.aspx?familyid=e7f73f10-7933-40f3-b07e-ebf38df3400d&displaylang=en.
- If you currently use Bcc journaling,
remove the registry subkey or set its
value to 0x00 (disabled) on all the
servers on which you have journaling
enabled. Restart the SMTP and
Exchange Information Store services
to activate the change.
- From the command prompt, use
the exejcfg.exe tool to enable envelope
journaling across the entire organization. This tool makes a simple change
to AD that tells all the Exchange
servers to change the heuristics they
use to perform journaling.
At this point, you don't need to
restart services. As your AD replication
takes place, your Exchange servers will
pick up the changed configuration
value and switch over to the new
behavior.
Backup and Restore APIs
If you're like many of your fellow
Exchange administrators, your
backup and restore plans for your
Exchange organization are often a
major driver of current operational
procedures. It might not seem that
your backup and restore process is
directly related to DCAR, but if you
stop and think for a moment, you'll
realize that it is. Here are a handful of
ways in which DCAR and backups are
related:
- Some regulations mandate the ability to restore backups for a period of
years. A viable disaster recovery
plan is essential for demonstrating
your organization's intent to comply.
- Unauthorized use of backup tapes
can be a vector for disclosure of
protected information. Poor control
over backup materials can be
another route to compliance
nightmares.
- Backups protect the data currently
in your mailboxes before it's
migrated to your archiving system. Most archiving systems aren't
designed to be able to inject content
back into the messaging system.
- Backups are the only way to recover
improperly deleted email messages
when your retention policies don't
work the way you expect them to,
especially if you don't detect the
malfunction right away.
If you use (or intend to use) some
form of message journaling, you'll
need to modify your backup and
restore plans. Journaling mailboxes
accumulate a large amount of traffic
quickly, which can affect your backup
and restore windows. If you follow
recommended practices and establish your Exchange journaling mailboxes in separate message stores,
you'll need to back up those stores.
Whatever form of backup and
restore you use, you must ensure that
it actually supports Exchange. Taking
a backup of the database files directly
from the file system (including several
snapshot solutions that storage vendors offer) is a bad idea for several reasons. Not only is it extremely difficult
for a solution that uses this approach
to ensure that your databases are
backed up in a consistent state, such
a solution does nothing to address the
growth of your Exchange database
transaction logs. A proper Exchange-aware backup uses established APIs to
perform the necessary log truncation
after a successful backup of the store,
which shortens restoration time
(fewer transactions must replay after
the files have been restored) and helps
conserve disk space.
If you need a snapshot-based
backup strategy and you run Exchange
2003 on Windows Server 2003, you
might be able to take advantage of
Microsoft Volume Shadow Copy Service
(VSS)–based backups. By using VSS,
the Windows OS provides a tested and
supported snapshot capability, letting
compatible backup systems take their
backup against a consistent shadow
copy of the database.
Sometimes, designing a complicated backup solution for Exchange
isn't the best answer. If Windows
Backup (aka NTBackup) is good
enough for Microsoft's production
servers, it's probably good enough for
yours. You can use NTBackup to create an on-disk backup of your
Exchange databases, then use your
enterprise backup software to transfer those files to other media. A disk-to-disk strategy removes your reliance
on slower, less reliable technologies,
and, in turn, helps your backup and
restore process run more quickly.
Don't gamble with your backups.
Actively test your backup and restore
plans by using duplicates of your production hardware and software so
that you know you can restore your
data when it counts. Ensure that your
backup methodology is Exchange
aware and that Microsoft supports it.
And don't forget that Exchange 2003
features—namely the Recovery Storage Group (RSG)—can make your
restoration processes run much more
smoothly.
Message and Transport Security
Message security encompasses two
main areas: message encryption (using
cryptography to protect the actual
message from inspection by unauthorized parties) and transport encryption
(using cryptography to protect discrete connections between components of the messaging system).
Message encryption. Message
security has clear implications for
your DCAR solution. In particular,
you need to consider the following
questions:
- If you use Secure MIME (S/MIME),
which Exchange supports, does
your archiving solution support it?
- Does your archiving solution
archive older certificates, so that
you can still view email messages
encrypted with them?
- How do you protect, back up,
and restore whatever public key infrastructure (PKI) you use with
S/MIME? (And although pretty
good privacy—PGP—isn't optimal
for DCAR, if you use it, ask yourself
how you'll protect, back up, and
restore your users' keyrings
encrypted with PGP.)
- Can your policy-compliance software handle encrypted email messages?
- Are you required to protect message integrity through every hop of
your network?
- Can attackers (whether internal or
external) eavesdrop on unencrypted transport links?
Exchange 2003 and Exchange 2000
come with strong support for
S/MIME; the Exchange 2003 version
of OWA extends this support to OWA
users. However, the practical considerations of deploying and managing
the requisite PKI, dealing with the
content-inspection challenges, and
archiving keys tend to make the use of
S/MIME unattractive for most organizations unless they're required to use
it (e.g., government Exchange deployments).
Transport encryption. Transport
encryption, on the other hand, is easy
with Exchange and Windows and
tends to mesh well with any third-party components of your DCAR solution. Exchange 2000 and later natively
support Secure Sockets Layer (SSL)
and Transport Layer Security (TLS) for
a variety of protocols; Windows 2000
and later provide built-in IPsec functionality. Don't rely on MAPI encryption to protect connections between
Outlook and Exchange; either deploy
IPsec policies or upgrade to Microsoft
Office Outlook 2003 and Exchange
2003 so that you can use RPC over
HTTPS.
In my experience, Microsoft Internet Security and Acceleration (ISA)
Server 2004 is one of the best investments you can make to help provide
a higher level of message security
between the Internet and your Exchange organization. Placing an ISA
server in your demilitarized zone
(DMZ) means never having to expose
your Exchange servers directly to
incoming Internet traffic and greatly
simplifies your firewall configuration.
Plus, ISA permits SSL bridging, so
that you can perform protocol-aware
proxying and filtering of SMTP and
HTTP connections while still providing transport encryption for every
connection.
Related Technologies
A variety of other Exchange technologies and features aren't directly related
to DCAR but still provide useful hooks
into your Exchange organization or
make deployment and troubleshooting easier to perform:
- Event sinks—Exchange event sinks
provide a powerful mechanism for
extending Exchange functionality.
Many DCAR components use this
feature to plug into your Exchange
servers and intercept email messages before they're passed off to
internal Exchange components.
Common uses include alternative
journaling implementations, content inspection, and disclaimer
injection.
- Protocol logs—Although protocol
logs are disabled by default, you
can easily turn on Exchange's powerful protocol-level logging on a
per–virtual-server basis. These logs
provide an accurate picture of all
the communications that transpire
through that virtual server, letting
you easily track down problems or
perform spot audits.
- Message tracking—Exchange's
message-tracking feature is disabled by default. When enabled on
all your Exchange servers, message
tracking lets you quickly trace the
passage of email messages through
your organization. Enabling message tracking takes a small amount
of overhead, but the ability to easily
find out where an email message
went astray more than makes up for the overhead, especially if you
need to troubleshoot your DCAR
implementation.
- Message hygiene—Exchange 2003,
in particular, includes some
impressive antispam features that
can help you reduce the level of
junk that makes it into your organization. The reduction in spam in
turn reduces the load on your
retention, archiving, and compliance components. Exchange also
provides a comprehensive antivirus
API that lets you stop worms,
viruses, and Trojan horses.
Completing the Solution
As you've seen, you can use
Exchange's built-in journaling, along
with Exchange 2003's support for VSS
and message and transport encryption plus related features such as message tracking, as the foundation of
your Exchange recovery and compliance solution. However, Exchange
doesn't provide certain other essential
DCAR functions, such as archiving
and PST management. To complete
your Exchange DCAR solution, you'll
want to look into third-party products
that can provide these capabilities.
EXCHANGE COMPLIANCE RESOURCES
E-discovery and compliance:
“Build an Email-Discovery Plan,”
InstantDoc ID 49896
“Regulatory Compliance,”
InstantDoc ID 46946
Email Compliance Requirements: Getting Started, and Preventing the IT Search Party: Be
Prepared for E-Discovery—on-demand Web seminars, http://www.windowsitpro.com/events
Exchange backup and recovery:
“6 Common Backup and Restore Mistakes,”
InstantDoc ID 49828
“Best Practices for Recovery Storage Groups and Exchange Server 2003,”
InstantDoc ID 48878
“How can I back up my Microsoft Exchange Server storage groups and databases?”
InstantDoc ID 41820
“Exchange Server 2003 data backup and Volume Shadow Copy Service,”
http://support.microsoft.com/?kbid=822896
Microsoft's in-house Exchange 2003 backup strategy: “Backup Process Used with Clustered
Exchange Server 2003 Servers at Microsoft,”
http://www.microsoft.com/technet/itsolutions/msit/operations/exchbkup.mspx
Exchange journaling:
“An Exchange 2003 Journaling Primer,” InstantDoc ID 45348
“Exchange 2003 Advanced Journaling,” InstantDoc ID 45644
“What message journaling options does Microsoft Exchange Server 2003 support?”
InstantDoc ID 93060
“Troubleshooting message journaling in Exchange Server 2003 and Exchange 2000 Server,” http://support.microsoft.com/?kbid=843105
Exchange's built-in antispam features:
“Get the Most from Exchange Antispam,” InstantDoc ID 93520
Exchange security:
“Messaging Security,” InstantDoc ID 93965
“Secure Email with S/MIME,” InstantDoc ID 49878
This article is adapted from Email
Discovery and Compliance, Chapter
5: Implementation, Part 2—Hardware
and Software (Windows IT Pro eBooks, 2006).