The smartest MOM yet, MOM 2005 is packed with tools for pinpointing and resolving system problems
As you're probably aware, Microsoft released the latest version of Microsoft Operations Manager, its event-monitoring and performance-management product, in late August. The latest release, MOM 2005, builds upon the core competencies of its predecessor, MOM 2000, by adding capabilities to MOM's Management Packs, the snap-in components that application providers supply and that let MOM collect and analyze data to determine the health of an OS or application. (For example, the MOM Management Pack for Exchange Server 2003 raises Exchange 2003-specific alerts that help you resolve Exchange-related problems.) MOM 2005 adds health modeling and improved state-based monitoring features to Management Packs. In addition, MOM 2005 provides improved trending and analysis capabilities that work in conjunction with a data warehouse and Microsoft SQL Server 2000 Reporting Services. MOM 2005 also provides a new Operator Console UI that makes it easier for an administrator to quickly identify and resolve problems, either by resolving an alert's cause or by escalating the condition.
These enhancements significantly improve MOM's ability to boost an IT organization's service-monitoring and operations-reporting capabilities. With MOM 2005, Microsoft is also following through on its promise to improve the manageability of Windows; each server product group at Microsoft is creating MOM Management Packs for its software offering before a new product release. Let's take a tour through MOM 2005, so that you can get acquainted with its updated architecture and Management Packs, Operator Console, and other key new features. (For a quick look at a version of MOM 2005 that's geared to smaller sites, see the sidebar "MOM 2005 Workgroup Edition.")
A MOM 2005 implementation essentially consists of a management group (called a configuration group in MOM 2000). A management group is the set of key components for operating MOM: a database, one or more Management Servers (called Database Access Server Consolidator Agent Managers, or DCAMs, in MOM 2000), and MOM agents. A MOM agent is a service that's installed on each server you want to manage via MOM. Each MOM agent can report to multiple management groups. Here, we refer to servers running the MOM agents as managed servers. Ultimately, the data that the MOM agents collect is stored in the MOM database. The MOM database also stores alert data, configuration information, and Management Packs.
The Management Server maintains and manages the MOM agents. Agent maintenance involves installing and uninstalling agents. Agent management involves collecting computer attributes and receiving event, alert, and performance data from the agents. The Management Server uses computer attributes to determine the OS, services, and applications that are installed on each managed server. Based on the systems information it collects, the Management Server deploys monitoring rules that are relevant to each managed server. Computer-attribute discovery (i.e., collecting attributes from managed servers) occurs every 60 minutes by default, so that when an administrator removes a service such as DNS from the server, MOM also removes the rules that monitor DNS from the MOM agent on that server.
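The way discovered attributes drive rule deployment can be pictured with a minimal Python sketch. It isn't MOM code or a MOM API: the attribute names, the rule names, and the rules_for and rule_changes helpers are all hypothetical, and the sketch simply assumes that each discovered attribute maps to a fixed set of rules.

```python
# Illustrative model only; none of these names come from MOM itself.
RULES_BY_ATTRIBUTE = {
    "DNS": {"DNS service availability", "DNS zone transfer failures"},
    "IIS": {"W3SVC service availability"},
}


def rules_for(attributes):
    """Return the set of rules relevant to a server's discovered attributes."""
    rules = set()
    for attribute in attributes:
        rules |= RULES_BY_ATTRIBUTE.get(attribute, set())
    return rules


def rule_changes(current_rules, discovered_attributes):
    """Compare the rules an agent holds with what the latest discovery calls for."""
    wanted = rules_for(discovered_attributes)
    return {
        "deploy": wanted - current_rules,   # rules the agent is missing
        "remove": current_rules - wanted,   # rules that no longer apply
    }


# A server that ran DNS and IIS has DNS removed before the next discovery pass.
deployed = rules_for({"DNS", "IIS"})
changes = rule_changes(deployed, {"IIS"})
```

In this model, removing DNS from the server leaves the two hypothetical DNS rules in the "remove" set after the next discovery cycle, which mirrors the behavior described above.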
The Management Server receives event, alert, and performance data from the MOM agents and writes that information to the MOM database. Each agent also sends a heartbeat to the Management Server every 10 seconds by default, which the Management Server uses to determine whether the managed server is operational.
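To make the heartbeat idea concrete, here is a small Python sketch of how a server might decide that a managed server has gone quiet. It's an illustration, not MOM's implementation: the HeartbeatTracker class is hypothetical, and the tolerance of three missed heartbeats is an assumption we chose for the example rather than a MOM setting.

```python
HEARTBEAT_INTERVAL_SECONDS = 10  # default interval cited in this article
MISSED_BEATS_ALLOWED = 3         # assumed tolerance, not a documented MOM value


class HeartbeatTracker:
    """Remembers when each managed server last sent a heartbeat."""

    def __init__(self):
        self.last_seen = {}  # server name -> timestamp in seconds

    def record(self, server, timestamp):
        self.last_seen[server] = timestamp

    def is_operational(self, server, now):
        last = self.last_seen.get(server)
        if last is None:
            return False  # no heartbeat has ever been received
        allowed_silence = HEARTBEAT_INTERVAL_SECONDS * MISSED_BEATS_ALLOWED
        return (now - last) <= allowed_silence


tracker = HeartbeatTracker()
tracker.record("web01", 100.0)
still_up = tracker.is_operational("web01", 120.0)      # 20 seconds of silence
gone_quiet = not tracker.is_operational("web01", 131.0)  # 31 seconds of silence
```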
Microsoft designed the MOM architecture to scale from small-server implementations to enterprises containing thousands of servers. A small MOM installation consists of a management group with a server that houses the MOM database and acts as a Management Server. In contrast, a large installation could consist of multiple management groups, each having a dedicated server for the MOM database and multiple Management Servers in each management group.
One scenario that deviates from the architecture we just described is agentless monitoring. In this case, the Management Server itself monitors a managed server. This is possible because MOM also installs the MOM agent on the Management Server. Agentless monitoring doesn't support all the data sources that agent-based monitoring does, such as Windows Management Instrumentation (WMI) events and text logs. Agentless monitoring lets you monitor a server that can't have an agent installed on it. Typical situations that preclude the installation of a local agent include support for servers that run Windows NT Server 4.0 and warranty-violation or change-management problems, which prevent you from adding software to a server. Agentless monitoring doesn't circumvent the need for a MOM license for each managed server. From a resource perspective, agentless monitoring isn't any less resource intensive on the managed server than agent-based monitoring, and it increases the processing load on the Management Server because the Management Server must also function as the client agent. Microsoft recommends that no more than 60 servers per MOM implementation be managed with agentless monitoring.
Upgrade and Installation
Upgrading from MOM 2000 to MOM 2005 is straightforward and ensures that the time and effort you spent deploying a MOM 2000 infrastructure isn't wasted. The Management Server offers backward compatibility for MOM 2000 agents, so that you can perform agent upgrades incrementally. Also, MOM 2000 Management Packs are compatible with MOM 2005. The first part of a MOM installation is a prerequisite check to ensure that the server or servers on which you'll install MOM 2005 meet all mandatory software and hardware requirements before installation continues. MOM 2005 adheres to Microsoft product-installation guidelines in that it uses Windows Installer packages for installing both server and agent components. Thus, you can deploy MOM agents by using MOM's built-in remote-deployment capabilities or by distributing the Windows Installer packages through Group Policy Objects (GPOs), Microsoft Systems Management Server (SMS), or any other method that supports Windows Installer packages.
Management Pack Enhancements
Management Packs contain operational knowledge about how best to monitor an application. The operational knowledge facility of a Management Pack in MOM 2000 and later includes elements such as computer groups, rules, providers, scripts, views, reports, and product knowledge-base items that are combined into a single package that can fully monitor an application or a service. Microsoft has made Management Packs even more useful in MOM 2005 by adding capabilities for health modeling, rule and criteria overrides, state-based rules, and tasks.
You use rules to look for the presence of a specific condition, such as a particular event ID or performance-counter value, and to verify that an application is working correctly. For example, the Exchange 2003 Management Pack uses a script to test mail flow between Exchange servers. To monitor applications, rules use various data sources called providers. Provider types include event logs, performance counters, WMI events, Syslog messages (such as those in Linux and UNIX), and text log files. If you can't find a provider that can adequately monitor an application, you can write scripts or use the scripts that are included in some Management Packs to provide the capability you need.
When the MOM rules-processing facility detects an event, it performs one or more actions, such as running a script or raising an alert. When an alert is generated, one of its elements is product knowledge, which the Management Pack author provides. Additionally, you can add your own knowledge-base information, which is independent of the product knowledge, about how to resolve the alert so that if the alert condition occurs again, that customized information will be contained in the alert.
Health modeling, a new feature in MOM 2005 Management Packs, lets a MOM administrator work with an application role, server roles, and component roles within a particular Management Pack. An application role can span multiple server roles, and each server role can span multiple component roles. For example, a health model for an Internet Information Services (IIS)-based Web farm Management Pack could contain a Web application role, an IIS server role, and component roles for the FTP service (MSFtpsvc) and the World Wide Web Publishing service (W3SVC). The health model could specify that if MSFtpsvc stops, it won't cause the Web application to report an error state, but if W3SVC stops, the Web application will show an error state. The Management Pack might also specify that if fewer than 50 percent of the IIS servers in the Web farm are in an error state, the Web application appears operational, and if more than 50 percent are in an error state, the Web application appears nonoperational.
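The Web farm example can be sketched in a few lines of Python to show how state rolls up from component roles to server roles to the application role. This is a simplified model, not how a Management Pack is authored: the function names and the CRITICAL_COMPONENTS set are hypothetical, and because the example doesn't say what happens at exactly 50 percent, the sketch assumes the application still appears operational in that case.

```python
HEALTHY, ERROR = "healthy", "error"

# Hypothetical model of the IIS server role: only W3SVC is treated as critical,
# so a stopped MSFtpsvc doesn't put the server into an error state.
CRITICAL_COMPONENTS = {"W3SVC"}


def server_state(stopped_components):
    """Roll component-role state up to the server role."""
    if CRITICAL_COMPONENTS & set(stopped_components):
        return ERROR
    return HEALTHY


def application_state(server_states, error_fraction_limit=0.5):
    """Roll server-role state up to the application role.

    The application is in error only when MORE than the limit fraction of
    servers is in error; exactly 50 percent counts as operational here,
    which is an assumption of this sketch.
    """
    if not server_states:
        return HEALTHY
    errors = sum(1 for state in server_states if state == ERROR)
    if errors / len(server_states) > error_fraction_limit:
        return ERROR
    return HEALTHY


# Four-server farm: one server lost FTP only, one server lost W3SVC.
farm = [
    server_state(["MSFtpsvc"]),
    server_state(["W3SVC"]),
    server_state([]),
    server_state([]),
]
web_application = application_state(farm)
```

With one of four servers in error, the Web application in this sketch still reports a healthy state. It flips to an error state only when more than half the farm is affected.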
Rule and criteria overrides let you selectively disable a rule or change a rule's criteria on a computer or group of computers without disabling the rule or changing the criteria on all other managed computer targets. For example, you could set the rule called % Processor Time to alert you when % Processor Time exceeds 50 percent on a business-critical application server and set that rule to alert you when % Processor Time exceeds 90 percent on other servers.
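A rough way to picture overrides is as a per-computer lookup that falls back to the rule's default criteria. The Python sketch below models that idea under our own assumptions: the ThresholdRule class and the computer names are hypothetical, and MOM stores and evaluates overrides in its own way.

```python
from dataclasses import dataclass, field


@dataclass
class ThresholdRule:
    """A threshold rule with optional per-computer overrides (illustrative only)."""
    name: str
    default_threshold: float
    overrides: dict = field(default_factory=dict)   # computer -> threshold
    disabled_on: set = field(default_factory=set)   # computers where the rule is off

    def threshold_for(self, computer):
        return self.overrides.get(computer, self.default_threshold)

    def should_alert(self, computer, value):
        if computer in self.disabled_on:
            return False
        return value > self.threshold_for(computer)


rule = ThresholdRule("% Processor Time", default_threshold=90.0)
rule.overrides["critical-app-01"] = 50.0   # criteria override for one server
rule.disabled_on.add("test-lab-01")        # rule disabled for one server

alert_on_critical = rule.should_alert("critical-app-01", 60.0)  # above 50
alert_on_other = rule.should_alert("file-01", 60.0)             # below 90
```

The point of the design is that the default rule stays untouched. Only the named computers see different criteria.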
State-based rules let you quickly view the state of an alert to see whether the condition that generated the alert still exists. For example, suppose you create the % Processor Time rule and instruct MOM to generate an alert if % Processor Time exceeds 90 percent. Suppose, then, that % Processor Time has dropped below 90 percent the next time the state-based rule evaluates it. MOM sets the problem state of the alert that was generated when % Processor Time exceeded 90 percent to inactive. In contrast, when an alert condition fires in MOM 2000, you must investigate the alert's cause even if the condition that caused the alert has been resolved.
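The difference between a state-based rule and a fire-and-forget alert is easy to see in a short Python sketch. The StateBasedAlert class is hypothetical and only models the behavior described in the % Processor Time example. It isn't how MOM represents alerts internally.

```python
class StateBasedAlert:
    """Tracks whether the condition behind an alert still exists (illustrative only)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.problem_state = None  # None until the rule first raises an alert

    def evaluate(self, value):
        if value > self.threshold:
            self.problem_state = "active"
        elif self.problem_state == "active":
            # The value no longer exceeds the threshold, so the earlier
            # alert is marked inactive instead of staying open.
            self.problem_state = "inactive"
        return self.problem_state


cpu_alert = StateBasedAlert(threshold=90.0)
first_state = cpu_alert.evaluate(95.0)   # alert raised
second_state = cpu_alert.evaluate(70.0)  # condition has cleared
```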
MOM 2005 also introduces the concept of tasks in Management Packs. You predefine tasks within a Management Pack to run diagnostics or even take action to resolve a problem. Although MOM has always provided the ability for a rule to automatically respond to an event, tasks let you initiate an action to further diagnose or resolve a problem. For example, you receive an alert from a managed server. In response, you click the Ping task to check network connectivity. If the Ping succeeds, you run the System Information task to determine whether the OS is functioning properly. The idea behind tasks is to consolidate various troubleshooting tools in a central location to help operators and expedite problem resolution. In addition, Management Pack authors can create tasks that are specific to their applications to help you respond appropriately to an event.
Management Packs for Microsoft applications are included with MOM 2005. Management Packs for non-Microsoft products come from the vendors themselves: hardware vendors, such as Dell and HP, supply Management Packs that provide specialized monitoring of their hardware, and software vendors, such as Citrix Systems, supply Management Packs that monitor their own applications. Other vendors, such as eXc Software and NetIQ, offer Management Packs and agents for a variety of non-Microsoft applications and OSs.
A Better UI
The most visible improvement in MOM 2005 is the Operator Console, the new MOM UI that lets you more easily and quickly identify, understand, and resolve alerts. The Operator Console has a familiar look and feel and resembles the Microsoft Office Outlook 2003 UI. For example, you can display the data in the MOM database in several different views, just as you can display information in Outlook 2003. The Operator Console also lets you create your own views. Some of the default views are the Alert View, State View, and Diagram View.
You use the Alert View—which Figure 1 shows—to view, filter, and drill down into a list of alerts. You can quickly filter alerts by computer group so that you can conveniently view alerts about a set of related servers. You can also customize alert views by specific alert criteria such as owner, severity, status, and creation time. Clicking an alert in the top left pane displays alert properties, information about the alert, and how to resolve the problem.
You use the State View—which Figure 2 shows—to review a list of computers that MOM is monitoring and the configured server roles. Clicking a server role in the top middle pane displays the components that make up the role in the bottom middle pane. The State View lets you quickly identify a server that has a problem and identify the server role and component within the server role that's causing the problem.
You use the Diagram View to display a graphical view of computer groups or applications and the role relationships that exist between servers. The Active Directory (AD) and Exchange Server diagrams are particularly interesting because they give you a complete picture of each application's infrastructure. Figure 3 is a sample AD diagram, which shows four AD sites (NYC, Boston, Westcoast, and Europe) and how they connect for replication. The diagram also shows the domain controllers (DCs) within each site and the health of each DC. A green check mark means that the DC is healthy, a yellow triangle with an X or an exclamation point indicates an error or a warning state, and a red circle with an X means that the DC server has a critical error. The AD and Exchange Server Management Packs automatically create these diagrams directly from AD or Exchange Server, respectively. You can export diagrams created in the Diagram View to Microsoft Office Visio 2003.
The Operator Console also lets you put a managed server in Maintenance Mode. Maintenance Mode lets you configure changes to a managed server without notifying other operators of critical events that are related to configuration changes, such as services stopping or the server rebooting. While in Maintenance Mode, the server continues to send event, alert, and performance data to the Management Server, but alert data doesn't appear in the Operator Console and MOM doesn't send email messages or pages about the condition to operators.
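Maintenance Mode amounts to a filter between what the Management Server receives and what operators see. The Python sketch below models that filter under stated assumptions: the ManagementServerModel class and its attribute names are hypothetical, and the sketch doesn't claim to show where MOM actually stores suppressed alert data.

```python
class ManagementServerModel:
    """Toy model of alert handling with Maintenance Mode (not MOM code)."""

    def __init__(self):
        self.in_maintenance = set()  # managed servers in Maintenance Mode
        self.received = []           # everything the agents sent
        self.console_alerts = []     # what operators see in the console
        self.notifications = []      # email messages or pages sent

    def receive_alert(self, server, text):
        # Data keeps arriving from the agent whether or not the server
        # is in Maintenance Mode.
        self.received.append((server, text))
        if server in self.in_maintenance:
            return  # suppressed: no console entry, no notification
        self.console_alerts.append((server, text))
        self.notifications.append(f"ALERT {server}: {text}")


mom = ManagementServerModel()
mom.in_maintenance.add("web01")
mom.receive_alert("web01", "W3SVC stopped")  # expected during maintenance
mom.receive_alert("db01", "Disk space low")  # unrelated, still surfaces
```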
A Boost for Reporting and Analysis
MOM 2005 includes predefined reports that let you view the events and alerts in the MOM database and view performance-analysis and capacity-planning information. Management Packs provide application-specific reports, and you can also create custom reports.
MOM 2005 relies on SQL Server 2000 Reporting Services to provide its reporting capability. SQL Server 2000 Reporting Services is a full-featured reporting product that lets you view reports in various formats (e.g., Excel, HTML, PDF, Web page) and delivers reports in several ways (e.g., email, file share). You can view Web-based reports in real time (i.e., reports are automatically refreshed), or you can generate new reports on a predefined schedule.
Installing MOM 2005 reporting automates the creation of a separate data repository, called the System Center Data Warehouse. Creating a separate data repository for reporting lets you perform long-term capacity planning and trending in a separate data store while ensuring that the online MOM database remains as small and optimized as possible. Ideally, you should install System Center Data Warehouse on a server other than the MOM database server so that report generation doesn't affect the monitoring capabilities of the MOM infrastructure. To populate the System Center Data Warehouse, MOM replicates data from the MOM database server on a scheduled basis.
Microsoft designed MOM 2005 to monitor Microsoft platforms and applications. To provide support for monitoring non-Microsoft platforms and applications, MOM 2005 relies on Management Packs and product connectors that third-party vendors supply. Product connectors let third-party management products, such as HP OpenView and IBM Tivoli, exchange management data with MOM 2005 (i.e., bidirectional alert transfer).
Microsoft created the MOM Connector Framework (MCF) to help third-party systems management vendors connect their management products to MOM. MCF is a Web service that gives application providers a standards-based method to let their products access and interact with MOM data. MCF ships with a MOM 2000-to-MOM 2005 connector, and the MOM Software Development Kit (SDK) contains a generic connector that third-party application providers can use to better understand how to connect their products to MOM 2005. The MOM-to-MOM connector lets one MOM server (MOM server A) forward alerts to another MOM server (MOM server B). If the alert is marked as resolved on MOM server B, the alert state is sent back to MOM server A, which sets the alert state on MOM server A to resolved. The generic product connector is a Microsoft Visual Studio .NET 2003 solution that contains code for a sample connector and a simple trouble-ticket application that demonstrates how to forward and view MOM alerts in a separate system. The MOM Resource Kit also includes a MOM-to-Tivoli TEC (Tivoli Enterprise Console) connector and a MOM-to-HP OpenView connector.
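The forward-and-sync behavior of the MOM-to-MOM connector can be modeled in a few lines of Python. This sketch doesn't use the MCF Web service or any real connector API. The MomServerModel class, its methods, and the "new" and "resolved" state strings are hypothetical and serve only to show the round trip between server A and server B.

```python
class MomServerModel:
    """Toy model of two MOM servers exchanging alert state (not the MCF API)."""

    def __init__(self, name):
        self.name = name
        self.alerts = {}   # alert id -> state
        self.origin = {}   # alert id -> server that forwarded the alert here

    def raise_alert(self, alert_id):
        self.alerts[alert_id] = "new"

    def forward(self, alert_id, target):
        """Forward a local alert to another server and record where it came from."""
        target.alerts[alert_id] = self.alerts[alert_id]
        target.origin[alert_id] = self

    def resolve(self, alert_id):
        """Resolve an alert and send the new state back to the forwarding server."""
        self.alerts[alert_id] = "resolved"
        source = self.origin.get(alert_id)
        if source is not None:
            source.alerts[alert_id] = "resolved"


server_a = MomServerModel("A")
server_b = MomServerModel("B")
server_a.raise_alert("alert-1")
server_a.forward("alert-1", server_b)
server_b.resolve("alert-1")  # server A's copy is now resolved as well
```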
The Best MOM Ever
MOM 2005 builds on the solid foundation of its predecessor and adds features that are crucial for many IT shops but were missing from MOM 2000. Management Pack enhancements provide health-monitoring and state-management capabilities that aren't in MOM 2000, and SQL Server Reporting Services gives MOM 2005 a robust reporting engine. Microsoft also significantly improved MOM 2005 by providing an Operator Console that greatly simplifies the process of identifying and resolving problems. Our assessment is that MOM has definitely improved with age.