EDITOR'S NOTE: At press time, Fidelia released NetVigil 3.5. This new version includes support for nested service containers, which lets users group together a variety of network elements into a virtual business-service view. For information about the new version, visit the company's Web site.

Back in prehistoric times—that is, before the Internet—every IT organization had a built-in system for monitoring the availability of crucial services: the end users. Administrators could always rely on end users to flood the Help desk with calls as soon as the network hiccupped or stopped responding.

Times have changed, and these days many organizations have a significant number of computing resources that aren't designed to serve the end-user community. Instead, these resources serve business partners or the general public. More often than not, the front end to these outward-facing systems is Web based because the Web has become the de facto standard for businesses that provide applications to other businesses or the public.

Without end users constantly looking at the outward-facing systems, outages and performance problems are more likely to go unnoticed. Therefore, over the past several years, a new class of applications has developed. These applications can monitor complex Web infrastructures for availability and notify the appropriate people or systems when a problem occurs.

The first few generations of Web monitoring applications were often simplistic—usually nothing more than tools that would ping a device and check port 80 for a proper response. If the tools found any anomalies, they sent an alert. Over the past several years, these tools have improved significantly and can now simulate complex Web transactions, record and chart response times, listen for SNMP traps, provide availability statistics, document service level agreement (SLA) violations, escalate notification if mission-critical failures aren't resolved, and much more.
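To make that first-generation approach concrete, here is a minimal sketch of a port-80 check in Python. The function names and the rule that any 2xx or 3xx status counts as a "proper response" are assumptions made for illustration; none of the products reviewed here necessarily works this way.

```python
import http.client


def is_proper_response(status):
    """Assumption: any 2xx or 3xx HTTP status counts as a healthy reply."""
    return 200 <= status < 400


def check_http(host, port=80, path="/", timeout=5.0):
    """Probe one Web server once and return (ok, detail)."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request("GET", path)
            status = conn.getresponse().status
        finally:
            conn.close()
    except (OSError, http.client.HTTPException) as exc:
        # Refused connections, timeouts, and DNS failures all land here.
        return False, f"no response: {exc}"
    return is_proper_response(status), f"HTTP {status}"
```

A scheduler would call check_http at a fixed interval and send an alert whenever the first element of the result is False.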

Decision Factors
In the Web monitoring space, you'll find many applications from which to choose. With so many choices, how is an overworked network administrator to decide which product is the best? You need to consider several factors, the first of which is whether you want to manage your Web monitoring solution inhouse or outsource it. An outsourced monitoring solution typically has a quicker implementation time and, in the short term, is cheaper to implement. In addition, monitoring providers typically invest significant resources into making sure that their systems are highly available through redundant Internet connectivity and clustered, load-balanced servers. However, you typically have less flexibility with an outsourced solution than with an inhouse solution. And depending on the monitoring provider's monthly charge, an inhouse solution might be less expensive than an outsourced solution in the long run.

If you decide to manage your Web monitoring solution inhouse, you need to know what's most important to monitor in your Web environment so that you can purchase a package that best meets your organization's monitoring needs. Here are some points to consider:

  • Do you need transaction monitoring that's more advanced than simply checking a URL for an appropriate response? For example, do you need to test a series of steps in succession, such as completing an order form, confirming the transaction, and receiving an order number as a response? If so, you need a product that provides complex transaction monitoring.
  • Do you need to measure availability and responsiveness from multiple discrete points of presence? If so, you need a product that you can configure in a distributed architecture and that lets you deploy multiple probe points that report their findings to a centralized database.
  • What types of notification options do you need? For example, do you want the product to notify you through email or a pager? Do you need a product that can display SNMP traps in a management console? Do you need the ability to escalate notifications after an alert condition has existed for too long?
  • Basic reporting capabilities are standard in most Web monitoring software, but do you need more sophisticated reports? For example, do you need availability reports or responsiveness reports (e.g., a report showing how quickly a page is loaded or a transaction is processed)? Do you want the system to automatically create and distribute reports? Do you need reports that document SLA violations?
  • In addition to monitoring Web transactions and server or network availability, do you want to be able to monitor system-level health parameters, such as CPU, disk, and memory usage?
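Of the points above, notification escalation is the easiest to picture in code. The following Python sketch assumes a hypothetical three-step escalation ladder; the time limits and recipient names are invented for illustration and don't come from any of the products.

```python
from datetime import datetime, timedelta

# Hypothetical escalation ladder: (minimum age of the alert, who is notified).
ESCALATION_STEPS = [
    (timedelta(minutes=0), "on-call administrator"),
    (timedelta(minutes=30), "team lead"),
    (timedelta(hours=2), "IT manager"),
]


def recipients_for(alert_started, now, steps=ESCALATION_STEPS):
    """Return everyone who should have been notified by `now`."""
    age = now - alert_started
    return [who for minimum_age, who in steps if age >= minimum_age]


# Example: an alert that has been open for 45 minutes.
started = datetime(2003, 1, 1, 8, 0)
print(recipients_for(started, started + timedelta(minutes=45)))
```

The longer an alert condition persists, the further up the ladder the notification travels.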

The easy answer to these questions is to simply say, "Yes, I'd like the package to provide all these services and capabilities." With that in mind, let's take a look at a handful of today's leading products: RGE's IPSentry 4.5, Fidelia's NetVigil, Freshwater Software's SiteScope 7.6, Visualware's VisualPulse 3.0, and Ipswitch's WhatsUp Gold 8.0. All these products handle the basic functions—periodic polling, notifications, and reporting—but that's where the similarities end. Each package brings different strengths to the table.

IPSentry
For Web monitoring needs that extend beyond ping, port, and URL loads, IPSentry is a great package that includes some of today's more advanced monitoring capabilities, such as checking the contents of Web pages, monitoring service availability, and monitoring event logs. With 11 add-ons, IPSentry is easily expandable to meet new monitoring requirements as they arise.

Configuring IPSentry to monitor Microsoft IIS or another Web server is easy. First, you define the device you want to test by specifying its type (in this case, a network device) and giving it a unique name and description to identify it. Then, you specify the device's IP address or DNS name. Finally, you specify what type of test you want to conduct (i.e., which port you want to monitor). For the test, you can use any number of protocols, including Daytime Protocol, DNS, Echo Protocol, FTP, HTTP, HTTP Secure (HTTPS), Internet Control Message Protocol (ICMP—often referred to as ping), POP3, SMTP, and Telnet.
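The three configuration steps map naturally onto a small data structure. The sketch below is not IPSentry's configuration format; the class and field names are hypothetical. The port numbers are the standard well-known ports for the TCP-based protocols in the list (DNS also runs over UDP, and ICMP has no port at all).

```python
import socket
from dataclasses import dataclass

# Well-known ports for the TCP-based protocols named above.
# ICMP (ping) has no port, so it is left out of this table.
WELL_KNOWN_PORTS = {
    "echo": 7, "daytime": 13, "ftp": 21, "telnet": 23, "smtp": 25,
    "dns": 53, "http": 80, "pop3": 110, "https": 443,
}


@dataclass
class MonitoredDevice:
    name: str         # unique name that identifies the device
    description: str
    address: str      # IP address or DNS name
    protocol: str     # key into WELL_KNOWN_PORTS, case-insensitive

    @property
    def port(self):
        return WELL_KNOWN_PORTS[self.protocol.lower()]


def port_is_open(device, timeout=3.0):
    """Return True if a TCP connection to the device's port succeeds."""
    try:
        with socket.create_connection((device.address, device.port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, MonitoredDevice("web1", "IIS server", "192.0.2.10", "HTTP") resolves to port 80, and port_is_open then performs the actual availability test.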

In addition to checking for device availability, you can check the content that the device returns as a result of the check. For example, you can check for the 220 response that an SMTP server's logon banner typically returns or the </HTML> tag that a Web server returns when a Web page has closed properly. Content checking is especially useful for determining whether an intruder has defaced your Web site.
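Both content checks mentioned above reduce to simple string tests. Here is a minimal Python sketch with hypothetical function names; the defacement check assumes you know a marker string that must always appear on the page.

```python
def smtp_banner_ok(banner):
    """An SMTP greeting starts with reply code 220 ("220 " or "220-")."""
    return banner.startswith("220")


def page_closed_properly(body):
    """True if the page ends with a closing </html> tag, in any letter case."""
    return body.rstrip().lower().endswith("</html>")


def looks_defaced(body, expected_marker):
    """Assumption: a page that has lost a known marker string was altered."""
    return expected_marker not in body
```

A monitor would run these against the text returned by the port check and raise an alert when any of them fails.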

IPSentry can monitor the services on a Web server to make sure they're all running. IPSentry can also monitor other parameters that might affect system performance or availability. For example, you can configure IPSentry to monitor CPU usage, drive space, event-log entries, and temperature readings from temperature probes. When a problem arises, IPSentry can alert you by sending you an email message, paging you, creating a syslog event, sending an SNMP trap, or even sending an X10 power-line control message.

When IPSentry is running, you can quickly learn what the product is monitoring and how your system is performing by glancing at the Active Display Console. IPSentry logs the availability statistics of all the devices that it monitors and can automatically generate a report on a regular basis. By having IPSentry automatically generate the report into a directory that's available through your Web server, you can make these reports available inside your organization to anyone who is curious. Figure 1 shows a sample availability report.
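An availability figure of this kind is simply the share of polls that succeeded. The Python sketch below shows the arithmetic; the function names and the plain-text report layout are assumptions, not IPSentry's actual report format.

```python
def availability_percent(results):
    """results: one boolean per poll (True means the device answered)."""
    results = list(results)
    if not results:
        return None  # no polls yet, so availability is undefined
    return 100.0 * sum(results) / len(results)


def availability_report(poll_log):
    """poll_log maps a device name to its list of poll results."""
    lines = []
    for device in sorted(poll_log):
        pct = availability_percent(poll_log[device])
        shown = "n/a" if pct is None else f"{pct:.2f}%"
        lines.append(f"{device}: {shown}")
    return "\n".join(lines)
```

Writing the returned text to a file in a directory that your Web server publishes gives you the same kind of self-service report.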

The cost of IPSentry ranges from $99 for a single-server license for the base package to $8195 for an enterprise license for the IPSentry package bundle, which consists of the base package and all 11 add-ons. In addition, RGE has generously created an IPSentry LITE version that's free. This version is useful for small organizations that need to monitor only one or two devices. (I evaluated the full version for this article.)

NetVigil
According to Fidelia, NetVigil is designed for midsize to large enterprises and service providers whose businesses depend on 24 x 7 transaction reliability. Founded in 2000, Fidelia is the youngest company in this review. Despite the company's youthfulness, NetVigil is a surprisingly mature product.

Installing NetVigil was straightforward, except for one slight mishap. By the time I finally got around to reviewing NetVigil, the evaluation copy the company sent me had expired. On a Saturday at 8:00 p.m., I sent an email message to the support group—in less than an hour, I had a new license key file.

One unique advantage of NetVigil is its modular architecture. You can deploy NetVigil on one server or distribute it across many servers. The modular architecture provides for greater scalability of the package and the ability to deploy multiple testing servers (referred to as Data Gathering Elements) at disparate locations. If you want to measure your availability and performance from every continent, all you need to do is deploy a Data Gathering Element server to each location, then push your tests to each server.

Configuring NetVigil is easy. You simply enter a device's IP address or host name and let NetVigil automatically probe the device with pings, port scanning, and SNMP. Based on what NetVigil finds, it presents you with a list of tests that you can perform against the device. The first time I configured NetVigil, I was surprised at what it was able to find. I pointed NetVigil at a Compaq desktop. NetVigil found not only the device and the IIS service I had installed on it but also several Windows 2000 performance metrics that were accessible through SNMP, such as CPU usage, disk usage, and memory usage. NetVigil also recognized several performance statistics that my Compaq network adapter tracks and let me monitor those statistics.

When NetVigil is running, it constantly refreshes a status screen that shows the network availability, system health statistics (e.g., CPU usage), and application availability for any device that you've specified. To get more detailed information about a device, you just click that device.

For Web server monitoring, you can verify Web page loads and content, test Web transactions, and test regular expressions. In the event of a warning, critical-threshold violation, or failure, NetVigil can notify you through email, AOL Instant Messenger, MSN Messenger, or Yahoo! Messenger. The Instant Messaging (IM) approach is unique.

After collecting data, NetVigil stores it in a database for later reporting. Figure 2 shows a typical performance summary report. This report contains some interesting metrics, such as 95th and 98th percentile, mean, and standard deviation. Even more interesting are NetVigil's predictive capabilities for certain statistics (e.g., CPU usage). NetVigil can make an educated guess as to when you're likely to start crossing your warning and critical thresholds. Such predictions are useful for resource planning. Pricing for NetVigil starts at $10,000.
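The summary statistics and the threshold prediction can be illustrated with a few lines of Python. Fidelia doesn't document how NetVigil makes its predictions, so the straight-line (least-squares) trend used here is purely an assumption, and the function names are hypothetical.

```python
import statistics


def summarize(samples):
    """Mean, standard deviation, and 95th/98th percentiles of a metric."""
    cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return {
        "mean": statistics.fmean(samples),
        "stdev": statistics.stdev(samples),
        "p95": cuts[94],
        "p98": cuts[97],
    }


def predict_crossing(times, values, threshold):
    """Fit a least-squares line through (time, value) samples and return
    the time at which that line reaches `threshold`.
    Returns None if the trend is flat or falling."""
    t_mean = statistics.fmean(times)
    v_mean = statistics.fmean(values)
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    if sxx == 0:
        return None  # all samples share one timestamp; no trend to fit
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = v_mean - slope * t_mean
    return (threshold - intercept) / slope
```

For example, CPU usage that climbs 10 points a day from a starting value of 10 percent would be predicted to reach an 80 percent warning threshold on day 7. A result that falls before the latest sample means the trend line has already crossed the threshold.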

SiteScope
With an easy installation process and an extraordinarily wide range of tests you can conduct and applications you can monitor, SiteScope is a package well suited for organizations of any size. The breadth of SiteScope's monitoring capabilities is impressive. You can test IIS and other Web servers for URL page loads, contents, or a sequence of URL events (e.g., the sequence of loading a shopping cart, checking out, and receiving a payment confirmation). You can configure SiteScope to raise an alert if a Web page's content changes, then have SiteScope use the new content as the new baseline. You can also have SiteScope break down and time Web transactions by their component parts, such as DNS lookup, page load, and graphic-element load.
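The alert-then-rebaseline behavior is easy to sketch. The Python class below is a hypothetical illustration that compares SHA-256 hashes of the page content; how SiteScope actually detects changes isn't stated, so treat the hashing approach as an assumption.

```python
import hashlib


class ContentBaseline:
    """Remembers the hash of a page's last-seen content and flags changes."""

    def __init__(self):
        self._digest = None  # no baseline until the first check

    def check(self, body: bytes) -> bool:
        """Return True if the content changed since the previous check.
        The content just seen becomes the new baseline either way."""
        digest = hashlib.sha256(body).hexdigest()
        changed = self._digest is not None and digest != self._digest
        self._digest = digest
        return changed
```

Because the baseline moves after every check, a single change raises exactly one alert instead of alerting on every subsequent poll.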

SiteScope's monitoring capabilities don't stop with IIS. SiteScope can monitor a considerable number of other applications, including Active Server Pages (ASP), Apache, DNS, Citrix MetaFrame, IBM's WebSphere, ICMP, SMTP, and POP. You can configure SiteScope to immediately retest any device that reports a failure to determine whether the device is in a failed state or whether a temporary hiccup occurred. A Web-based interface lets you see the overall status of all the devices SiteScope is monitoring.
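The retest idea can be expressed in a few lines. In this Python sketch the function name, the retry count, and the delay are assumptions chosen for illustration; `check` stands for any probe that returns True when the device responds.

```python
import time


def confirmed_failure(check, retries=2, delay=1.0, sleep=time.sleep):
    """Run `check()`; after a failure, retest up to `retries` more times.
    Returns True only if every attempt fails, so a temporary hiccup
    that clears on a retest does not raise an alert."""
    for attempt in range(retries + 1):
        if check():
            return False
        if attempt < retries:
            sleep(delay)  # brief pause before the next retest
    return True
```

The trade-off is a short delay before a genuine outage is reported in exchange for fewer false alarms.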

With the ability to keep detailed performance and availability statistics comes the ability to create reports—and SiteScope excels in this area. One of my favorite features is SiteScope's ability to automatically create and email reports to anyone you designate. As Figure 3 shows, these reports are not only detailed but also good-looking. SiteScope's graphical report format is especially nice if you must document meeting an SLA for a customer or demonstrate your system's availability for your boss.

In addition to Web performance statistics, SiteScope can monitor system health parameters (e.g., CPU, disk, and memory usage) and any parameter available through SNMP. Thus, you can monitor your Web server's performance and your system's health and look for possible correlations when problems arise.

Pricing for SiteScope is based on a point system. Freshwater Software assigns points based on what you're monitoring. For example, the company assigns one point for using SiteScope to monitor one URL, one NT event log, CPU usage on one server, or one metric on a Microsoft SQL Server machine. Pricing starts at $2394 for a 25-point starter package.
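To estimate whether the starter package covers your environment, add up one point per monitored item. The Python helper below is a hypothetical sketch that covers only the four item types named above; other monitor types might carry different point values.

```python
STARTER_POINTS = 25  # size of the starter package


def points_needed(urls=0, event_logs=0, cpu_servers=0, sql_metrics=0):
    """One point each per URL, NT event log, server CPU monitor,
    and SQL Server metric."""
    return urls + event_logs + cpu_servers + sql_metrics


# Example: 10 URLs, 3 event logs, 3 CPU monitors, 4 SQL Server metrics.
needed = points_needed(urls=10, event_logs=3, cpu_servers=3, sql_metrics=4)
print(needed, needed <= STARTER_POINTS)
```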

VisualPulse
VisualPulse provides ping, HTTP, and TCP monitoring through a Web-based interface. It also provides realtime and historical reports on network latency, packet loss, and application availability.

Drawing on VisualPulse's strengths in trace-routing capabilities, you can configure VisualPulse to run trace routes to a monitored IP address based on threshold violations. This feature is handy because you can pinpoint a network latency problem as it occurs instead of later when the problem might no longer exist. For example, let's say you have a system at a Web-hosting provider's facility, which provides the bandwidth for your system. During certain times of the day, you've noticed that your site is sluggish, but you know your server isn't the problem. Given this problem's temporary and intermittent nature, pinpointing the cause is typically tricky. However, if VisualPulse is performing trace routes at the exact moment of a threshold violation, you can look at the trace-route data and determine which device was responsible for the problem. Perhaps one of the Web-hosting provider's Internet connections is over-subscribed, and when the route to your server takes that path, traffic naturally slows down.
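The key idea is to capture the route at the moment of the violation rather than afterward. The Python sketch below shells out to the operating system's own trace-route command (tracert on Windows, traceroute elsewhere, which must be installed); the function names are hypothetical and this is not how VisualPulse itself is implemented.

```python
import subprocess
import sys


def trace_route(address):
    """Run the platform's trace-route command and return its text output."""
    if sys.platform == "win32":
        command = ["tracert", address]
    else:
        command = ["traceroute", address]
    done = subprocess.run(command, capture_output=True, text=True, timeout=120)
    return done.stdout


def on_sample(address, latency_ms, threshold_ms, tracer=trace_route):
    """Capture a trace only when a latency sample violates the threshold."""
    if latency_ms > threshold_ms:
        return tracer(address)
    return None
```

Storing each captured trace with a timestamp lets you compare the routes taken during slow periods with those taken when the site is responsive.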

VisualPulse monitors devices by pinging them, checking for a response on a specified TCP port, and loading a URL if the monitored device is a Web server. You can set thresholds at two levels—warning and critical—for either latency or packet loss. As Figure 4 shows, you can monitor Web servers in addition to monitoring the system's overall health. VisualPulse can summarize the latency and availability data that it collects for all three data sources (i.e., ping, HTTP, and TCP) and provide performance reports.
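Two-level thresholds amount to a simple classification. In the Python sketch below, the function names and the sample threshold values are assumptions for illustration only.

```python
def classify(value, warning, critical):
    """Map a latency or packet-loss reading onto a status level."""
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"


def packet_loss_percent(sent, received):
    """Share of pings that received no reply, as a percentage."""
    return 100.0 * (sent - received) / sent


# Example: 8 replies to 10 pings, with thresholds of 10 and 50 percent loss.
loss = packet_loss_percent(10, 8)
print(loss, classify(loss, warning=10, critical=50))
```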

VisualPulse starts at $295 for a 10-element license (an element is a pingable IP device). The price tops out at $2495 for a 250-element license, which is the maximum number of elements that a single VisualPulse instance can monitor.

WhatsUp Gold
If a picture speaks a thousand words, WhatsUp Gold speaks volumes. With its unique ability to lay out monitored devices in a graphical format and link higher-level maps to lower-level maps, WhatsUp Gold provides a drill-down, graphical view of the health of devices in an organization.

With WhatsUp Gold's auto-discovery capability, defining the devices you want to monitor is easy. You simply instruct WhatsUp Gold to use an ICMP or SNMP scan for devices, and WhatsUp Gold learns what's on your network and automatically maps the results, as Figure 5 shows. All you need to do is a bit of visual rearranging to suit your tastes. WhatsUp Gold also automatically scans for available services (e.g., FTP, HTTP, POP, Telnet) on any device and adds any services that it finds to the device's properties.

WhatsUp Gold's auto-discovery capability is even more useful for intranet environments. You can instruct WhatsUp Gold to periodically rescan the network for new devices. WhatsUp Gold issues an alert for any new devices it finds and scans those devices for available services. This feature is great for networks that are in a constant state of flux.
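Detecting new devices on a rescan comes down to comparing the latest scan results with the devices you already know about. A minimal Python sketch, with a hypothetical function name and sample addresses:

```python
def new_devices(known, scanned):
    """Return addresses found by the latest scan that were not known before."""
    return sorted(set(scanned) - set(known))


# Example: one unfamiliar address turns up on the rescan.
known = {"10.0.0.1", "10.0.0.2"}
scanned = ["10.0.0.1", "10.0.0.2", "10.0.0.7"]
print(new_devices(known, scanned))
```

Each address returned would trigger an alert and a follow-up scan for available services.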

After you define a device and configure what you want monitored, WhatsUp Gold can notify you of problems by means of pager, beeper, email, telephone, or desktop alarm. WhatsUp Gold can report on and chart historical performance data and availability data. WhatsUp Gold has a flat-rate price of $795 without a service agreement and $1090 with a service agreement, which includes 1 year of upgrades and telephone technical support.

Many from Which to Choose
As you can see, you have many Web-monitoring packages from which to choose. All these packages provide a suite of services and capabilities. You need to determine which services and capabilities are most important to your organization, then pick the package that offers them.

Contact the Vendors
RGE * 317-745-3398

Fidelia * 609-452-2225

Freshwater Software * 303-443-2266

Visualware * 703-802-9006 or 866-847-9273

Ipswitch * 781-676-5700