Microsoft's new mail client takes spam filtering seriously
Spam is a scourge that continues to be a major concern for systems administrators. Although governments in the United States and elsewhere are attempting to address the problem through legislation, spammers will probably circumvent this obstacle by simply moving their operations to locations outside the jurisdiction of those laws. As an illustration of just how bad a problem spam has become for major corporations, consider the HP bastion hosts deployed at the network perimeter to scan all messages arriving at hp.com. In 2002, these hosts dropped just 30 percent of messages because they could be immediately identified as suspicious, perhaps because they contained "well-known" content or virus-ridden attachments. Today, the same hosts drop 70 percent of messages, or some 21 million messages a month. The upsurge in spam activity accounts for the increase in dropped messages, and every company that hosts an email server is now a potential target for spammers, no matter which email server it runs.
Most large organizations deploy various server-based tools to block as much spam as possible before it gets to users. Bastion hosts can catch a lot of spam, but deployed in isolation, they can't keep up with the ever-changing techniques that spammers employ to mask their activities, so administrators often deploy a second line of defense in the form of antispam software that integrates with the email server.
Microsoft added a spam confidence level (SCL) Store property to Exchange Server 2003 that antispam software can update with a value that indicates whether the software thinks a message is spam. The Store and email clients can then suppress messages with high SCL values. Server-based antispam software often combines spam checking with antivirus protection for Exchange servers, but even with two lines of defense (i.e., the bastion host and server-based antispam software), some spam gets through. In the past, if Microsoft Outlook users wanted maximum spam protection, they had to install add-on products. Microsoft has incorporated a Junk E-mail Filter into Microsoft Office Outlook 2003 that you can also deploy with Exchange Server 5.5 or later. My experience is that Outlook 2003 can block most spam that comes along, but you still need to deploy multiple lines of protection if you really want to fight spam.
Detecting Junk Email
Spam-detection software relies on a mixture of techniques to identify unwanted messages, with different software products using different technique combinations. One technique is looking at originator addresses to block messages from well-known spammers that appear on Realtime Blackhole Lists (RBLs); another is examining message properties (such as the message subject) and the content to pick up keywords such as Viagra and porn. (You can implement similar checks by using Outlook rules, but doing so slows down processing considerably because rules aren't designed for this purpose. Antispam products typically supply dictionaries of common words or phrases and use compiled code to check message content against the dictionaries, so they can process messages much faster.)
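As a rough illustration of the dictionary technique described above, the following Python sketch checks a message's subject and body against a small word list. The two-word dictionary and the function name are hypothetical; commercial products use much larger, vendor-maintained dictionaries and compiled code.

```python
import re

# Hypothetical dictionary for illustration only; real products ship
# far larger lists of words and phrases.
SPAM_TERMS = frozenset({"viagra", "porn"})

# Split text into lowercase word tokens.
WORD_RE = re.compile(r"[a-z0-9']+")


def find_spam_terms(subject: str, body: str, terms=SPAM_TERMS) -> set[str]:
    """Return the dictionary terms that occur as whole words in the message."""
    words = set(WORD_RE.findall(f"{subject}\n{body}".lower()))
    return words & terms
```

Tokenizing the message once and intersecting it with a set keeps the check fast no matter how large the dictionary grows, which is part of the reason a dedicated checker outperforms a long chain of individual rules.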
Detection software also analyzes message structure for patterns typical of junk mail messages. For example, spam authors seem to feel compelled to add emphasis to their messages with a lot of exclamation points. If a spam tool's scoring system finds 20 exclamation points in a message, that message could be spam. (However, it could be from an enthusiastic member of your marketing department.) Detection software might also look for fingerprints of known spam messages. Antispam-tool vendors track known spam and analyze the message content to create a fingerprint (typically included in their tools' dictionaries) that the filters can use to recognize similar messages.
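The scoring idea can be sketched in a few lines of Python. The weight and threshold below are made-up values, not those of any real product; the point is that a single structural signal, such as 20 exclamation points, raises the score without condemning the message on its own.

```python
def exclamation_points_signal(text: str, limit: int = 20, weight: float = 2.0) -> float:
    """Contribute `weight` to the spam score when the text contains
    at least `limit` exclamation points (both values are hypothetical)."""
    return weight if text.count("!") >= limit else 0.0


def is_probable_spam(signals: list[float], threshold: float = 3.0) -> bool:
    """Flag a message only when the combined signals reach the threshold."""
    return sum(signals) >= threshold
```

With these values, the enthusiastic marketing message scores 2.0 and stays below the threshold of 3.0 unless another signal, such as a dictionary hit, also fires.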
Exchange 2003 includes upgraded connection-filtering features as well as the ability to block mail from anyone other than authorized senders. You can connect Exchange 2003's connection filters to an RBL subscription and perhaps avoid the need to purchase an additional antispam product for the server. This Exchange/RBL option is inexpensive, but you must keep your RBL subscription up-to-date to ensure that Exchange can recognize incoming email from newly registered spammers. In addition, if you have just one RBL subscription, you're relying on that RBL maintainer to keep up with new spamming techniques and indeed to resist Denial of Service (DoS) attacks on its own service. Subscribing to multiple RBLs lessens your risks, but you incur extra costs. Purchasing and deploying a commercial-quality antispam product is a lot easier, especially for any production server that supports more than a few hundred mailboxes.
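Many blocklists are queried over DNS: the checker reverses the octets of the connecting IP address, appends the list's zone name, and treats a successful lookup as a listing. The Python sketch below assumes that convention. The zone name rbl.example.org is a placeholder, not a real service, and the sketch says nothing about how Exchange 2003 implements its own lookups.

```python
import ipaddress
import socket


def rbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets plus the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # raises ValueError if invalid
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the blocklist zone resolves a record for this address.

    Any lookup failure (including a timeout or an unreachable list)
    is treated as "not listed" in this sketch.
    """
    try:
        socket.gethostbyname(rbl_query_name(ip, zone))
    except OSError:
        return False
    return True
```

Checking several lists is then a matter of calling is_listed once per zone, which also shows why a single unreachable or stale list weakens this line of defense.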
The latest spam suppression tools deploy analytical techniques to spot spam attacks early. Network probes monitor traffic that passes along the Internet in an attempt to detect traffic surges from a mail server; such surges could be the result of a spammer generating hundreds of thousands of messages that contain similar content. The probes use algorithms similar to those that generate a hash value for an electronically signed message to create a digital signature based on the message content, then store the signature in a database. The tools then check new messages against the digital signatures in the database to determine whether a message is spam. This kind of technique is available only in server-based software today, not in client software. For a list of server-based antispam products, see Buyer's Guide, "Enterprise Spam Filters," April 2003, http://www.winnetmag.com, InstantDoc ID 38277.
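A minimal Python sketch of the signature idea follows. It assumes SHA-256 as the hash, a simple whitespace and case normalization, and an in-memory counter as the database; the class name and the surge threshold are hypothetical. An exact hash matches only messages that are identical after normalization, so real products presumably use fuzzier signatures to catch slightly varied copies.

```python
import hashlib
import re
from collections import Counter


def content_signature(body: str) -> str:
    """Hash the message body after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SignatureDatabase:
    """Counts how often each signature has been seen in monitored traffic."""

    def __init__(self, surge_threshold: int = 1000):
        self.surge_threshold = surge_threshold  # hypothetical cutoff
        self.counts = Counter()

    def observe(self, body: str) -> None:
        """Record one sighting of a message body."""
        self.counts[content_signature(body)] += 1

    def is_known_spam(self, body: str) -> bool:
        """True if this content has been seen at least surge_threshold times."""
        return self.counts[content_signature(body)] >= self.surge_threshold
```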
All Outlook versions support rules processing to let you automate common tasks such as moving messages from a particular sender into a dedicated folder. Outlook 2002 and earlier versions attempt to use a set of standard rules to filter junk email messages, but the growing volume of spam and the more sophisticated techniques used by spammers to avoid detection have rendered the rules-based approach ineffective, and this approach is also slow. The Outlook 2003 Junk E-mail Filter doesn't use the old rules-based approach coupled with a static list of keywords and junk-email senders to detect junk mail. Instead, Outlook 2003 uses a combination of compiled code and a dictionary to detect spam, an approach that's the result of Microsoft Research's text analysis work. MAPILab, a small company that specializes in Outlook add-ons, recently performed an in-depth Outlook 2003 Junk E-mail Filter analysis that throws some light on the processing Outlook does behind the scenes. See http://www.mapilab.com/news/0042.html for more information. The dictionary is stored in \program files\microsoft office\office 11 dictionary\outlfltr.dat and is approximately 2MB. The dictionary's content and accuracy are crucial to the operation of the Junk E-mail Filter, and Microsoft has committed to issuing regular updates with the most recent information gathered about junk email messages. Microsoft issued the first update in December 2003, as described in the Microsoft article "Overview of the Outlook 2003 Junk E-mail Filter Update: December 16, 2003" (http://support.microsoft.com/?kbid=832333).
Note that because Outlook 2003's Junk E-mail Filter runs on the client, you can use the filter only if you configure Outlook 2003 in cached Exchange mode or connect to a server with POP3 or IMAP4 (protocols that always put messages in a local store for processing). You can also use the Junk E-mail Filter if you configure Outlook to download messages to a Personal Folders (.pst) file, but this kind of configuration is largely outdated by the advent of cached Exchange mode and is really useful only if you deploy Exchange with small mailbox quotas. Microsoft could have designed Outlook to connect to an Exchange mailbox in the traditional client-server manner and process messages online, but Outlook would need to fetch the message content from Exchange before the client could filter the messages. This approach would work for small messages, but the network communications overhead required to fetch messages for checking is excessive, so Outlook limits its processing to local content.
The rules in Outlook 2002 and earlier can't perform the sophisticated filtering that server-based or dedicated client-side antispam software can, but Outlook 2003's Junk E-mail Filter can suppress a high percentage of the spam that creeps through corporate defenses to penetrate your Inbox. Thus, you can view Outlook 2003's Junk E-mail Filter as another layer of defense against spam, much as you run a desktop antivirus tool to supplement the antivirus software you run on your servers. Outlook 2003's Junk E-mail Filter is one of many client-side antispam tools. For information about finding others, see the sidebar "Other Outlook Antispam Tools."
How the Junk E-mail Filter Works
If you opt for some level of junk email protection, Outlook 2003 begins to process new messages waiting in your Inbox as soon as it starts up and checks incoming messages as they arrive. If you don't want Outlook to look for junk mail, select Options from the Tools menu, click Junk E-mail, go to the Options tab, which Figure 1 shows, and select the No Automatic Filtering option. The default protection level is Low, meaning that Outlook will detect only obvious spam. Some users will prefer to start cautiously and leave the protection level set to Low. I set the protection level to High so that Outlook aggressively checks for spam and moves any message that seems to be spam into the Junk E-mail folder. Outlook automatically creates the Junk E-mail folder if it doesn't already exist, and you don't have the option to select another folder. The Options tab also lets you permanently delete junk mail immediately (the equivalent of using Shift+Delete to remove messages without putting them in the Deleted Items folder). I don't recommend this option unless you have a high degree of confidence in Outlook's spam filters, which you can gain by setting the filter to High and periodically checking your Junk E-mail folder to make sure Outlook isn't filtering out messages that you need to receive. The results I got from the High setting were so good that I now have Outlook permanently delete any spam that it detects.
You can't change the algorithm that Outlook 2003 uses to decide whether a message is spam, but you can help Outlook improve its level of accuracy by creating lists of safe senders and blocked senders. Safe senders are email addresses that you recognize and that you don't want Outlook to mark as spam senders. Blocked senders are the opposite: addresses or domains that you gather from spam messages that elude Outlook's filters. To block a sender, right-click a message and select Junk E-mail, Add Sender to Blocked Senders List. Outlook will add the sender's email address to its list of known spammers. You can add to the list anyone who doesn't appear in your Global Address List (GAL). To block messages from someone in your GAL, you can use a regular Outlook rule.
Outlook's Junk E-mail Filter performs the following actions:
As I mentioned earlier, you should check the messages in your Junk E-mail folder from time to time before deleting them, just in case. You might find that your selected protection level is too aggressive and that Outlook is filing legitimate messages in the Junk E-mail folder. If Outlook consistently filters out messages from a specific correspondent, you can add that sender to your Safe Senders List to take care of the problem.
The reasons for the Blocked Senders List and Safe Senders List are apparent; the purpose of the Safe Recipients List is less so. Essentially, a safe recipient is a distribution list (DL) or newsgroup from which you want to receive messages. You don't control the membership of the DL or newsgroup, but you assume that the administrator won't send spam to the members. You tell Outlook that you want the messages from the email address of the DL or newsgroup by designating the address as a safe recipient.
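To make the roles of the three lists concrete, here is a Python sketch of one way a client could consult them. The order of the checks (safe lists first, then the Blocked Senders List, then content filtering) is an assumption made for illustration, not a documented description of Outlook's behavior, and the function names are hypothetical.

```python
def matches_entry(address: str, entries: set[str]) -> bool:
    """True if the full address or its domain appears in the list.

    Entries are assumed to be stored in lowercase, either as a full
    address (user@example.com) or as a bare domain (example.com).
    """
    address = address.strip().lower()
    domain = address.rpartition("@")[2]
    return address in entries or domain in entries


def classify(sender, recipients, safe_senders, safe_recipients, blocked_senders) -> str:
    """Return 'inbox', 'junk', or 'filter' (hand off to content filtering)."""
    if matches_entry(sender, safe_senders):
        return "inbox"
    # A safe recipient is the address of a DL or newsgroup the message was sent to.
    if any(matches_entry(r, safe_recipients) for r in recipients):
        return "inbox"
    if matches_entry(sender, blocked_senders):
        return "junk"
    return "filter"
```

The safe-recipient check looks at the address a message was sent to rather than the address it came from, which is why a message from an unknown member of a trusted DL can still reach the Inbox.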
Building and Sharing Lists
After a while, you'll accumulate a list of blocked senders that you might want to share with others. You might also want to share lists of safe recipients or safe senders. You can export data from or import data into any list. Go to the Blocked Senders tab of the Junk E-mail Options dialog box, and click Export to File to generate a simple text file that you can manipulate with any text editor. You can append lists gathered from other users and share updated lists of known spammers in a central location so that anyone can import them into Outlook.
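If several users contribute exported lists, a short script can combine them before anyone imports the result. This Python sketch assumes the exported file holds one address or domain per line. It works on text that has already been read into memory, because the article doesn't specify the file's encoding, and the function name is hypothetical.

```python
def merge_sender_lists(*exports: str) -> list[str]:
    """Combine exported lists into one sorted list without duplicates.

    Each argument is the full text of one exported file, assumed to
    contain one entry per line. Entries are compared case-insensitively.
    """
    merged = set()
    for text in exports:
        for line in text.splitlines():
            entry = line.strip().lower()
            if entry:  # skip blank lines
                merged.add(entry)
    return sorted(merged)
```

Writing the returned entries back out, one per line, produces a file that can be placed in the central location for others to import.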
Note that you can also add complete domains to your Blocked Senders List to block any attempt to send you email from those domains. However, be careful not to be too enthusiastic about adding individuals or domains to the Blocked Senders List because long lists will slow down processing.
Even the best implementation of multilayer network protection against incoming spam will let some messages through. If you take the time to add the senders of any spam messages that get past the Junk E-mail Filter to the Blocked Senders List, Outlook's ability to recognize and block new spam will gradually improve. In my case, the Junk E-mail Filter intercepts most of the offending messages that make it past my company's bastion host and server-based spam filters, so very few spam messages actually reach my Inbox now.
In Outlook 2002 and earlier versions, rules can slow down message delivery, especially when they call for complicated processing such as the type necessary to detect spam. Outlook 2003 caches the lists it uses and implements the Junk E-mail Filter in compiled code, so performance is acceptable—I don't notice any delays with approximately 40 entries in my Blocked Senders List. Outlook 2003 doesn't perform junk mail filtering until it has fully downloaded the header and content of new messages, so if you're quick, you might see a message appear in the Inbox, then disappear after Outlook checks its content, decides that it's spam, and moves it into the Junk E-mail folder. Apart from that evidence, you shouldn't be aware that spam processing is going on.
The Exchange 2003 Connection
Outlook 2003 leverages some Exchange 2003 features. The client software stores the Safe Senders List and Blocked Senders List as well as its Junk E-mail Filter settings as properties of user mailboxes so that Outlook Web Access (OWA) 2003 can use the same data when it checks for spam. Figure 2 shows the Blocked Senders List for a mailbox as viewed through OWA 2003.
Antispam tools that run on Exchange 2003 servers can also exploit information about blocked senders, and Exchange honors the Blocked Senders List to direct email messages from those addresses into the Junk E-mail folder as soon as the messages arrive on the server (before Outlook ever gets involved). This feature means that clients don't need to incur the network and processing overhead to download and check the messages.
In addition, OWA 2003, like Outlook 2003, blocks external content to prevent spammers from confirming that email addresses to which they've sent messages are valid. The content might be a Web beacon, a small and often invisible graphic file that signals a spammer that a message has reached a real email address when a client downloads the file for display. (For more information about Web beacons, see "Spam Beacons," September 2003, http://www.winnetmag.com, InstantDoc ID 39501.) The golden rule here is to always avoid downloading anything when you're unsure of its source. Outlook 2003 and OWA 2003 will display the graphic content in messages from senders on your Safe Senders List. For example, I added Amazon.com to my Safe Senders List so that I could see details in order confirmations after I've bought something, and I added Unitedcomics.com so that I could see the Daily Dilbert cartoon that the company sends by email.
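To show what a client has to look for, the following Python sketch lists the images in an HTML message body that would be fetched from a remote server when the message is displayed. It uses Python's standard html.parser module and checks only img tags with http or https sources; the class and function names are hypothetical, and the sketch isn't a description of how Outlook or OWA detects external content.

```python
from html.parser import HTMLParser


class ExternalImageFinder(HTMLParser):
    """Collects the src of every <img> that points at a remote server."""

    def __init__(self):
        super().__init__()
        self.external_sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # Images embedded in the message itself (for example cid: links)
        # don't contact a remote server, so only http(s) URLs are reported.
        if src.lower().startswith(("http://", "https://")):
            self.external_sources.append(src)


def find_external_images(html_body: str) -> list[str]:
    """Return the remote image URLs referenced by an HTML message body."""
    finder = ExternalImageFinder()
    finder.feed(html_body)
    finder.close()
    return finder.external_sources
```

A client that blocks external content would decline to download any URL in the returned list unless the sender is on the Safe Senders List.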
Outlook won't suppress all the spam that arrives at your organization, so be prepared to deploy bastion hosts and server antispam software to stop as much spam as you can before it gets anywhere near your clients. But every weapon that comes along to fight spam is welcome, so I'm glad to have client-side blocking incorporated into Outlook 2003.