Viruses, Trojan horse programs, and spyware have all had a large, and for too many organizations a devastating, impact on business. Viruses constantly attack both servers and end-user systems. Unfortunately, most IT departments continue to lag behind in deploying solutions that properly protect users against these attacks. To make matters worse, spyware is still a relatively new concern for most organizations, and products that can detect and remove it have been introduced only in the past few years. Generally, server-level and network-level solutions to stop viruses, Trojan horses, and spyware rely on either reactive or predictive systems.

Reactive Systems
Malware detection is still very much in the domain of reactive solutions. With a reactive solution, malware is detected through a multistage process that depends upon the vendor. Figure 1 shows the steps of the process.

Reactive solutions generally can't detect a particular piece of malware until after it's released (although some solutions try to reduce this disadvantage by looking for patterns that hint at malware). After the malware is released, the antivirus vendor obtains a sample of it (e.g., from a customer or from a vendor-run honeypot). The vendor analyzes the malware and creates a signature or rule that will match the malware in email messages. The vendor adds the new signature or rule to its database for release to customers through either a push-initiated or a pull-initiated download. After the download, the antivirus product on the client can detect the new malware.
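The matching and update steps can be pictured with a minimal Python sketch. It assumes the simplest possible kind of signature, a SHA-256 digest of the whole file; real products use richer signatures and rules. The sample bytes, the names Example.Trojan.A and Example.Trojan.B, and the functions scan_attachment and apply_update are all invented for illustration.

```python
import hashlib
from typing import Dict, Optional

# Hypothetical signature database: SHA-256 digest of a known-bad file -> name.
# The sample bytes and malware names are placeholders, not real malware.
KNOWN_BAD_SAMPLE = b"pretend these bytes are a malicious attachment"
SIGNATURES: Dict[str, str] = {
    hashlib.sha256(KNOWN_BAD_SAMPLE).hexdigest(): "Example.Trojan.A",
}


def scan_attachment(data: bytes, signatures: Dict[str, str]) -> Optional[str]:
    """Return the malware name if the data matches a signature, else None."""
    return signatures.get(hashlib.sha256(data).hexdigest())


def apply_update(signatures: Dict[str, str], update: Dict[str, str]) -> None:
    """Merge a vendor update (received by push or pull) into the local database."""
    signatures.update(update)


# A new piece of malware goes undetected until the vendor's update arrives.
new_sample = b"a newly released piece of malware"
before = scan_attachment(new_sample, SIGNATURES)   # no signature yet
apply_update(SIGNATURES, {hashlib.sha256(new_sample).hexdigest(): "Example.Trojan.B"})
after = scan_attachment(new_sample, SIGNATURES)    # detected after the download
```

The lookup itself is a single dictionary access, which is why this style of scanning is quick once the signature exists.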

This process has both advantages and disadvantages. One advantage is that signature- and rule-based scanning for malware tends to be fast compared to other virus-detection methods (e.g., predictive/heuristic). Additionally, signature-based matching tends to be exceptionally accurate, which results in low false-positive and false-negative rates. Unfortunately, a key disadvantage of purely reactive systems is that they can't detect and clean new malware until they've added a new signature for that malware, thus creating a window of opportunity for hackers during which internal mail systems and end users are vulnerable.

Also, malware can easily be modified to prevent a match with the vendor's signature. Indeed, programs built for this purpose are readily available. For example, Trojan horses are often masked by the use of packers (programs that compress the Trojan horse into a version that still runs but no longer resembles the original). Packers can compress Trojan horses to different levels, meaning that a single Trojan horse might be released in several versions. Because readily available binder programs can combine a packed Trojan horse with a legitimate program into a single executable, Trojan horses are a popular and easy-to-produce method of attack.
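The effect of packing at different levels can be imitated with a short Python sketch. It is only an analogy: zlib stands in for an executable packer, the payload is harmless placeholder text, and the signature is assumed to be a SHA-256 digest of one packed variant. The name matches_signature is hypothetical.

```python
import hashlib
import zlib

# Harmless placeholder standing in for a Trojan horse binary.
payload = b"stand-in for a Trojan horse binary " * 20

# Assume the vendor built its signature from the variant packed at level 6.
signature = hashlib.sha256(zlib.compress(payload, 6)).hexdigest()


def matches_signature(packed: bytes) -> bool:
    """True if the packed bytes hash to the vendor's signature."""
    return hashlib.sha256(packed).hexdigest() == signature


# The same payload packed at three compression levels.
variants = {level: zlib.compress(payload, level) for level in (1, 6, 9)}

for level, packed in variants.items():
    # Every variant unpacks to the identical payload...
    assert zlib.decompress(packed) == payload
    # ...but only the level-6 variant matches the signature.
    print(level, matches_signature(packed))
```

Each variant carries the same payload, yet a signature taken from one of them misses the others, so the vendor would need a separate signature (or a different kind of rule) for each.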

Predictive Systems
Signature-based reactive systems can't sufficiently protect networks from malware. Although reactive systems are fast once a signature has been developed for a new piece of malware, the signature-creation process itself takes time. The antivirus vendor must obtain and analyze samples of new malware, create and release new signatures, and get the new signatures to the customer for application, so antivirus software is inherently slow to catch new malware. Analyzing a new piece of malware can be slow because it's often a manual process (i.e., a technician at the vendor must manually scan the malware, determine a pattern that will consistently match the malware, and create the signature).

Predictive systems provide an alternative to reactive systems. Instead of being based strictly on signatures, predictive systems use different types of analysis to determine whether malware exists in an email message, is attempting to initiate a remote connection, or is already on the network. Predictive systems are based on several types of solutions, including heuristic filters, behavioral analysis, and traffic analysis.

Heuristic filters. Based on artificial intelligence (AI) engines, heuristic filters use a combination of patterns and scores to determine whether a given program is malware. Heuristic filters are used in both spam and malware detection. They enable malware detection without an existing signature. Unfortunately, heuristic filters aren't foolproof, and they tend to have much higher false-negative and false-positive rates than purely reactive systems.
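The pattern-and-score idea can be sketched in a few lines of Python. This is a toy illustration, not how any particular product works: the three traits, their weights, and the threshold of 6 are assumptions chosen for the example, and score and is_suspicious are hypothetical names.

```python
from typing import Callable, List, Tuple

# Each rule: (label, test on filename and content, weight). All values are
# illustrative assumptions, not taken from a real filter.
Rule = Tuple[str, Callable[[str, bytes], bool], int]

RULES: List[Rule] = [
    ("executable header", lambda name, data: data.startswith(b"MZ"), 3),
    ("double extension",
     lambda name, data: name.count(".") >= 2
     and name.lower().endswith((".exe", ".scr")), 4),
    ("autorun registry key", lambda name, data: b"CurrentVersion\\Run" in data, 3),
]
THRESHOLD = 6  # assumed cutoff; lowering it trades false negatives for false positives


def score(name: str, data: bytes) -> Tuple[int, List[str]]:
    """Return the total score and the labels of every rule that fired."""
    hits = [(label, weight) for label, test, weight in RULES if test(name, data)]
    return sum(weight for _, weight in hits), [label for label, _ in hits]


def is_suspicious(name: str, data: bytes) -> bool:
    """Flag the attachment when its combined score reaches the threshold."""
    return score(name, data)[0] >= THRESHOLD
```

With these assumed weights, an attachment named invoice.pdf.exe that begins with an MZ header scores 7 and is flagged with no signature involved, whereas a plain setup.exe scores 3 and passes. The second case shows how the choice of threshold produces the false negatives and false positives mentioned above.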

Behavioral analysis. Behavioral analysis is an advanced method of malware detection. With behavioral analysis, any program that an email message contains is loaded into a sandbox environment in which a built-in analysis engine monitors and analyzes the email-borne program's behavior. This process lets a malware detection engine acquire a great deal of information about how a given program operates. The analysis engine can then use this information to make a reasonably accurate determination about whether the software is malware (e.g., spyware). Detectable signs of malware include attempts to silently install the software on the client and connections made to an Internet server after the software runs on the client system. However, as you might expect, behavioral analysis is resource-intensive and therefore expensive to implement, so it isn't feasible in every environment.
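The analysis step, as opposed to the sandbox itself, can be sketched as follows. The sketch assumes the sandbox has already recorded the program's actions as a list of events; the Event record, its fields, and the host tracker.example.net are hypothetical, and a real engine would observe far more than the two signs checked here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """One action recorded by a (hypothetical) sandbox monitor."""
    kind: str                    # e.g. "install", "connect", "file_read"
    detail: str                  # what was installed, contacted, or read
    user_prompted: bool = False  # True if the user was asked first


def suspicious_behaviors(trace: List[Event]) -> List[str]:
    """Report the two signs named above: silent installs and outbound connections."""
    findings = []
    for event in trace:
        if event.kind == "install" and not event.user_prompted:
            findings.append(f"silent install: {event.detail}")
        elif event.kind == "connect":
            findings.append(f"outbound connection: {event.detail}")
    return findings


# An invented trace from running an email-borne program in the sandbox.
trace = [
    Event("file_read", "C:\\Users\\test\\Documents"),
    Event("install", "toolbar.dll"),
    Event("connect", "tracker.example.net:80"),
]
findings = suspicious_behaviors(trace)
```

Here the trace yields two findings, a silent install and an outbound connection. How many findings justify blocking the message is a policy decision this sketch leaves open.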

Traffic analysis. Traffic analysis detects patterns in email traffic and notes changes from the usual patterns. For example, virus outbreaks tend to create a notable influx of incoming mail as external mail users are infected and their systems attempt to deliver malware to other systems. If an internal corporate network is compromised, a corresponding increase in outgoing email occurs.

Traffic analysis is difficult. Small environments might not have enough traffic for effective detection of anomalous traffic. Large environments can better support traffic analysis, but the analysis requires that a highly reliable and precise traffic monitoring and alerting system be in place.
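One simple way to note a change from the usual pattern is to compare the current message count with a baseline built from recent history. The Python sketch below flags a count that exceeds the historical mean by more than k standard deviations. This is one common statistical approach, assumed here for illustration; the function name is_anomalous, the default k of 3, and the hourly counts are all invented.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[int], current: int, k: float = 3.0) -> bool:
    """Flag `current` if it exceeds the historical mean by more than k deviations.

    `history` holds message counts for earlier intervals (e.g., per hour).
    """
    if len(history) < 2:
        # Too little traffic history to form a baseline.
        raise ValueError("need at least two historical samples")
    baseline = mean(history)
    # Floor the spread at 1.0 so a perfectly flat history doesn't flag
    # every tiny increase.
    spread = max(stdev(history), 1.0)
    return current > baseline + k * spread


# Invented hourly counts of outgoing messages.
outgoing_per_hour = [100, 110, 95, 105, 98, 102]
normal_hour = is_anomalous(outgoing_per_hour, 108)    # within the usual range
outbreak_hour = is_anomalous(outgoing_per_hour, 400)  # sudden surge
```

The ValueError branch reflects the small-environment problem: without enough samples there is no meaningful baseline to compare against.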

The Layered Defense
Because of the danger of viruses, Trojan horses, and spyware and because email is now the main attack vector, most organizations rely on multiple layers of defense. Those layers can include a packet-filtering firewall, an email firewall, and a demilitarized zone (DMZ) mail server, as Figure 2 shows.

The first layer of defense, and the layer that best protects the underlying network and provides a crucial level of protection for network-oriented applications, is the packet-filtering firewall. A packet-filtering firewall operates at the TCP/IP level: it understands protocols such as TCP and UDP and the ports they use. This type of firewall is configured to let only certain types of incoming packets through to specifically allowed ports on the internal hosts that the firewall protects. For example, a firewall might allow incoming packets on TCP port 25 on the DMZ mail server and TCP port 80 or TCP port 443 on the DMZ Web mail server.
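The allow-list logic of that example can be modeled in Python as follows. This is a conceptual sketch rather than a configuration for any real firewall: the host names dmz-mail and dmz-webmail and the function permits are hypothetical, and only the protocol, destination host, and destination port are considered.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    protocol: str
    dest_host: str
    dest_port: int


# Only the ports from the example above are open; everything else is dropped.
ALLOWED = {
    Rule("tcp", "dmz-mail", 25),      # SMTP to the DMZ mail server
    Rule("tcp", "dmz-webmail", 80),   # HTTP to the DMZ Web mail server
    Rule("tcp", "dmz-webmail", 443),  # HTTPS to the DMZ Web mail server
}


def permits(protocol: str, dest_host: str, dest_port: int) -> bool:
    """Default deny: a packet passes only if it matches an explicit rule."""
    return Rule(protocol.lower(), dest_host, dest_port) in ALLOWED
```

Under these rules, permits("tcp", "dmz-mail", 25) is true, while a Telnet attempt such as permits("tcp", "dmz-mail", 23) is rejected because no rule names that port.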

The second layer of defense is an email firewall, one example of an application-level firewall. This type of firewall works at a higher level in the protocol stack. It not only understands SMTP but can also scan mail envelopes and message content to detect spam, phishing attacks, and viruses. The email firewall is usually hardened against SMTP-based attacks (e.g., buffer-overflow attacks), so the DMZ mail server is less susceptible to such attacks. An email firewall protects email systems (i.e., computer systems that provide mail service) and also protects internal users from dangerous email messages in their mailboxes.

Note: Email firewalls must provide comprehensive antivirus capabilities to properly defend against both known and unknown viruses. As I've discussed, much antivirus software has been reactive. However, because of how quickly viruses now spread and because many viruses are polymorphic, a reactive approach is no longer enough. Antivirus software must also provide predictive scanning, meaning that it should be able to perform heuristic scanning to detect key characteristics that identify a virus rather than needing to know an exact signature. Reactive scanning still has a place in virus defense, but real-time defense against zero-day threats requires predictive scanning from antivirus software.

The third layer of defense is a well-configured DMZ mail server. This server accepts only mail destined for the domains that it owns—that is, for internal users. This approach prevents spammers from using this mail server for relaying spam. The DMZ mail server is also hardened so that attacks that jump to it from the email firewall (e.g., invalid input that the email firewall accepts and passes on to the DMZ mail server) don't compromise it.
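The relay restriction amounts to checking each recipient's domain against the domains the server owns. A minimal Python sketch follows, assuming the placeholder domains example.com and mail.example.com and a hypothetical function accepts_recipient; a production mail server would apply this check during the SMTP dialogue alongside many others.

```python
from typing import Set

# Domains this server accepts mail for (placeholders for illustration).
LOCAL_DOMAINS: Set[str] = {"example.com", "mail.example.com"}


def accepts_recipient(address: str, local_domains: Set[str] = LOCAL_DOMAINS) -> bool:
    """Accept a recipient only if its domain is one the server owns.

    Any other destination would make the server a relay, so it is refused.
    """
    local_part, at_sign, domain = address.rpartition("@")
    if not at_sign or not local_part or not domain:
        return False  # malformed address: nothing to deliver to
    # Domain names are case-insensitive and may carry a trailing dot.
    return domain.lower().rstrip(".") in local_domains
```

With this check, mail for alice@example.com is accepted, and mail addressed to any outside domain is refused rather than relayed.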

Finally, additional layers of defense can be beneficial, such as an Intrusion Detection System (IDS) and a separate DMZ Web mail server. (Because Web mail servers usually run complex Web applications, they often provide an avenue for an attack that can compromise internal systems.)