A useful script keeps Windows Services—instead of IT staff—up in the night
For as long as I can remember, our business has experienced occasional Windows OS service failures. Such failures were definitely more prevalent in the earlier days of Windows than now, but we still sometimes get a call from the Help desk reporting that a service isn’t working or we discover a stopped service during a routine check of a server’s health. A few months ago, someone asked me whether I could write a script that could monitor a service and send out an email notification if the service wasn’t running. I said I was pretty sure it could be done and that the script could probably restart the service while it was at it. Having this kind of automatic monitoring and alerting in place is certainly beneficial. The script I wrote helps keep our Windows Services up during the middle of the night instead of keeping me up in the middle of the night. Having more reliable services helps maintain customer satisfaction and, more importantly, helps keep our company from taking hits on our Service Level Agreements.
To fulfill our monitoring and alerting requirements, I wrote the script that Listing 1 shows. The ServiceMonitoringEmailNotificationScript.vbs script uses the Windows Management Instrumentation (WMI) Win32_Service class to monitor a Windows OS service and uses Collaboration Data Objects (CDO) to send email message notifications if the script finds that the service isn’t running. You can use this script to send email notifications to one or more individuals, pagers, or text-messaging systems that can accept text-based Internet email. The script attempts to restart the service and sends out a second email message indicating the status of the restart attempt. The script sends a final email message when the service is running again. In this article, I also touch on some basic file-checking and file-handling techniques and explain a couple of specialized functions that the script contains. The first of those functions, the Delay function, waits either for a given number of seconds or until the service starts, whichever occurs first. The second specialized function, the InterrogService function, is used to translate the return value of the restart attempt. Some of the return values include Success, Unknown Failure, and Service Disabled.
How to Use the Script
Initially, I designed this program to send out a message every 10 minutes while a service was down. Receiving these multiple notices can be helpful to the IT people who are working to get the service running again. However, in addition to notifying the regular domain administrators about outages, I needed to include several upper-management individuals in the notification, and for these people, multiple notifications could be annoying. So, to keep from sending repeated alerts while the service wasn’t running, I created a flag file after I sent out the first notifications. Every time the script runs, it checks for the existence of this flag file. If the service isn’t running and the flag file exists, the program terminates. If the flag file exists and the service is running, the script deletes the flag file and sends an email alert indicating that the service is running.
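The flag-file rules above boil down to a small decision table. Here is a minimal sketch of that table; the NextAction function and its return strings are hypothetical names used only for illustration and don't come from Listing 1.

```vbscript
' Hypothetical helper: decides what one run of the script should do,
' given whether the service is running and whether the flag file exists.
Function NextAction(serviceRunning, flagExists)
    If Not serviceRunning And flagExists Then
        NextAction = "quit"        ' outage was already reported; stay quiet
    ElseIf Not serviceRunning Then
        NextAction = "alert"       ' first detection: create flag, alert, try a restart
    ElseIf flagExists Then
        NextAction = "recovered"   ' delete flag and send the "running again" message
    Else
        NextAction = "none"        ' service is healthy and nothing is pending
    End If
End Function

WScript.Echo NextAction(False, False)   ' prints: alert
```

Removing the first branch is roughly what dropping the flag-file check amounts to if you want repeated alerts while the service is down.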
I recommend that you run this script as a scheduled task repeatedly throughout the day. By using Scheduled Tasks, you can set the script to run for specific durations at specific time intervals. For example, you could schedule the script to run Monday through Friday every 10 minutes between 7:00 A.M. and 6:00 P.M. If you decide you want or need repeated alerts while the service is down, simply remove the logic that checks for the flag file.
The script requirements are as follows:
- The script will run on Windows Server 2003, Windows XP, or Windows 2000. CDO, ADO, and WMI are integrated into these OSs.
- You must have the SMTP service running and available somewhere on your network, and SMTP must be capable of sending Internet email.
- You or the service account you choose to run this script must have sufficient privileges to access the SMTP service and the server it resides on. Note that it’s possible to pass usernames and passwords when you’re using CDO to send an email message, but this practice is strongly discouraged. Account names and passwords should never reside within a program or script such as this one. This script is designed for server administrators who have sufficient rights and privileges within their domain.
I’ve attempted to make this script as generic as possible. Because I’ve used variables instead of hard-coded values, you can easily monitor a different service, point to a different SMTP server, or change the email sender or recipient list. I’ve avoided hard-coding drive letters and folder names for the flag file as well, instead creating the flag file in the folder that the script is running out of.
Walking Through the Script
The ServiceMonitoringEmailNotificationScript.vbs script in Listing 1 starts by setting the values for the monitoring task’s most critical pieces: the service name, the emailto recipient, the SMTP server DNS name, the computer name, the emailfrom address, and the path variable for my flag file. First, the code specifies the service name variable, strService. In this example, I’m going to monitor the Windows Time Service, w32time, which Figure 1 shows. Next, the code assigns a valid email address as the value of the emailto variable. This address will more than likely be a pager email address if you’re monitoring production servers. Then, the code creates a WSHNetwork object for retrieving the name of the computer that the script is running on; the code assigns that name to the variable called ComputerName. Then, the code uses ComputerName as part of the emailfrom address. Using the computer name in the emailfrom address lets the email or pager recipient immediately know which server to look at and which service to check.
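As a rough sketch of that setup section (not a copy of Listing 1), the variables might be initialized as follows. The email addresses, the SMTP server name, and the objNetwork variable name are placeholders assumed for illustration.

```vbscript
Dim strService, emailto, smtpserver, objNetwork, ComputerName, emailfrom

strService = "w32time"               ' service name (not the display name)
emailto    = "oncall@example.com"    ' placeholder recipient or pager address
smtpserver = "smtp.example.com"      ' placeholder SMTP server DNS name

' WScript.Network exposes the name of the computer the script runs on.
Set objNetwork = CreateObject("WScript.Network")
ComputerName = objNetwork.ComputerName

' Putting the computer name in the sender address tells the recipient
' at a glance which server raised the alert.
emailfrom = ComputerName & "@example.com"

WScript.Echo "Monitoring " & strService & " on " & ComputerName
```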
Note that the service name might be different from the service display name you see when you look at a program such as Services.msc. For example, if you open Services.msc and double-click the displayed name of a service (or right-click it and choose Properties), you’ll see the actual service name listed first on the General tab.
The next two lines of code determine the path that the script file is running from. The code uses this path information later when creating the flag file, which will reside in the same folder as the script. Note the use of the InStrRev function that’s nested within the Left function. InStrRev returns the position of one string within another, searching from the end of the string instead of the beginning. For my situation, I needed to extract just the path without the filename, and the InStrRev function made finding the last backslash in a string a cinch.
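The Left and InStrRev combination can be sketched as a small function. FolderOf and scriptFolder are hypothetical names chosen for this example; Listing 1 may use different ones.

```vbscript
' Returns everything up to and including the last backslash of a full path.
' InStrRev finds the position of the last "\", and Left keeps that many characters.
Function FolderOf(fullPath)
    FolderOf = Left(fullPath, InStrRev(fullPath, "\"))
End Function

Dim scriptFolder
scriptFolder = FolderOf(WScript.ScriptFullName)   ' folder the running script lives in
WScript.Echo scriptFolder

WScript.Echo FolderOf("C:\Scripts\Monitor.vbs")   ' prints: C:\Scripts\
```

Because the trailing backslash is kept, a filename can be appended to the result directly.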
Moving along, the code next sets up the WMI object and queries the Win32_Service class for a specific service, in this case, Windows Time Service. Then, the code stores the results of the query as a collection, as the following lines show:
' objWMIService is an assumed variable name
Set objWMIService = _
GetObject("winmgmts:\\" & _
ComputerName & "\root\cimv2")
Set colItems = _
objWMIService.ExecQuery _
("Select * from Win32_Service" _
& " where Name='" & _
strService & "'",,48)
As long as no errors occur, the code now has all it needs to interrogate the service. The key service properties we’re concerned with are Started and State. Started is a Boolean data type, so a value of True indicates that the service is started and False indicates that it’s not started. State is a string data type and represents the state of the base service; its possible values are Stopped, Start Pending, Stop Pending, Running, Continue Pending, Pause Pending, Paused, and Unknown.
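To see the Started and State properties for yourself, a short sketch along these lines can help. DescribeService is a hypothetical helper, and the period passed as the computer name simply means the local machine.

```vbscript
' Hypothetical helper: returns a one-line summary of a service's Started and
' State properties, or an empty string if no service matches the name.
Function DescribeService(computerName, serviceName)
    Dim objWMIService, colItems, objItem
    DescribeService = ""
    Set objWMIService = GetObject("winmgmts:\\" & computerName & "\root\cimv2")
    Set colItems = objWMIService.ExecQuery( _
        "Select * from Win32_Service where Name='" & serviceName & "'", , 48)
    ' The query returns a collection, so it is walked with For Each
    ' even though at most one service matches.
    For Each objItem In colItems
        DescribeService = objItem.Name & ": Started=" & objItem.Started & _
            ", State=" & objItem.State
    Next
End Function

WScript.Echo DescribeService(".", "w32time")
```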
The methods the code uses are StartService and InterrogateService. StartService attempts to start the service, and InterrogateService requests an update on the state of the service. There are 25 possible return codes (0 through 24). I found the description table for these return codes on the Microsoft MSDN Web site (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wmisdk/wmi/interrogateservice_method_in_class_win32_service.asp) and wrote the InterrogService function so that I could convert the return code to a description and include that in the second email that alerts email recipients to the service restart result.
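A translation function of this kind can be sketched with a Select Case statement. The version below is an illustration rather than the function from Listing 1: its signature is assumed, and it covers only a handful of the 25 documented codes, with a fallback for the rest.

```vbscript
' Sketch of a return-code translator for Win32_Service methods.
' Only a few of the documented codes (0 through 24) are shown here.
Function InterrogService(returnCode)
    Select Case returnCode
        Case 0
            InterrogService = "Success"
        Case 2
            InterrogService = "Access Denied"
        Case 8
            InterrogService = "Unknown Failure"
        Case 10
            InterrogService = "Service Already Running"
        Case 14
            InterrogService = "Service Disabled"
        Case 15
            InterrogService = "Service Logon Failed"
        Case Else
            ' Fallback so an unlisted code still produces readable text.
            InterrogService = "Return code " & returnCode
    End Select
End Function

WScript.Echo InterrogService(14)   ' prints: Service Disabled
```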
At the beginning of the script, one line of code simply creates a FileSystemObject that the code uses to create, delete, or check for the existence of the flag file, and another line creates a CDO Message object. This section of the script is where the power of messaging comes into play and where the code configures and sends the email alerts. If you examine the code, you’ll notice that the CDO.Message class is the only class you’ll need to reference when you create a message object. Once you have the message object, you simply need to configure a few message properties and a few configuration field items, then use the Update and Send methods. Presto! You’ve got a powerful tool for sending email. We’ll examine the CDO code in more detail shortly.
After the CDO.Message object is created, the code begins evaluating the Windows Time Service collection. Even though the collection contains only one item (Windows Time Service), the code still needs to use a For Each...Next loop to evaluate an item in the collection. Immediately after the For Each...Next loop begins, the code creates a variable (FlagFile), which will be used to create the flag file if it’s needed. The flag file name is made up of four concatenated items: the path (which was created earlier), a tilde (~), the name of the service we’re monitoring, and a .txt extension.
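The flag file handling can be sketched with a FileSystemObject as shown below. Apart from FlagFile, the names here (FlagFileName, fso, scriptFolder, flagStream) are assumptions made for the example. The sketch creates the flag file and then removes it again, just to show both operations.

```vbscript
' Builds the flag file name from its four parts:
' folder path, a tilde, the service name, and a .txt extension.
Function FlagFileName(folderPath, serviceName)
    FlagFileName = folderPath & "~" & serviceName & ".txt"
End Function

Dim fso, scriptFolder, FlagFile, flagStream
Set fso = CreateObject("Scripting.FileSystemObject")
scriptFolder = Left(WScript.ScriptFullName, InStrRev(WScript.ScriptFullName, "\"))
FlagFile = FlagFileName(scriptFolder, "w32time")

' Service found down for the first time: create the empty flag file and close it.
If Not fso.FileExists(FlagFile) Then
    Set flagStream = fso.CreateTextFile(FlagFile, True)
    flagStream.Close
End If

' Service found running again: remove the flag file.
If fso.FileExists(FlagFile) Then
    fso.DeleteFile FlagFile
End If
```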
Checking the Service
Now it’s time to check and see how the service is doing. The code starts checking with this line:
(objItem.Started And _
The line loosely translates to “If Windows Time Service is not started OR the Service reports as being started but it’s not running.” You’re probably saying, “What?” I chose to add the OR condition only because I’ve seen situations in which a service was really hosed and appeared to be started but in fact wasn’t. Feel free to omit this second condition if you simply want to determine whether the service is started.
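The condition described above can be sketched as a small function that takes the two property values. NeedsAttention is a hypothetical name, and the expression is a reading of the description in the text rather than the exact line from Listing 1.

```vbscript
' True when the service is not started, or when it reports as started
' but its State is anything other than "Running".
Function NeedsAttention(started, state)
    NeedsAttention = (Not started) Or (started And state <> "Running")
End Function

If NeedsAttention(True, "Start Pending") Then
    WScript.Echo "Service reports started but isn't running"
End If
```

In the monitoring loop, the arguments would be objItem.Started and objItem.State. Dropping the second half of the expression leaves the simpler started-only check.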
If the service isn’t running, the script first determines whether the flag file exists. If it does, its existence implies the service is down and an email has already been sent. If the flag file doesn’t exist, the code creates it and closes it.
The ServiceMonitoringEmailNotificationScript.vbs script stores the state of the service to a variable named CurrentState. I want to retain the value of the current state before I attempt to start the service because I use that state in the first email message. After storing this value, the code attempts to start the service by using the StartService() method, and while this attempt is underway, the code composes and sends the service failure email message. (I’ll explain more about the email section of code shortly.) After the email message is configured and sent, the code calls a Delay function that waits up to 30 seconds for the service to start and exits the wait if the service starts before the 30 seconds is up. The Delay function acts like a Sleep function, but can “wake up” early if a second condition is met during the waiting period. So if the service starts in three seconds, that’s how long the delay will last.
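A wait-with-early-exit function like the one described can be sketched as a polling loop around WScript.Sleep. The signature, the one-second polling interval, and the Boolean return value are assumptions made for this example; the Delay function in Listing 1 may be written differently.

```vbscript
' Waits up to maxSeconds for the named service to be started and running.
' Returns True as soon as it is, or False once the time runs out.
Function Delay(maxSeconds, computerName, serviceName)
    Dim wmi, svc, waited
    Set wmi = GetObject("winmgmts:\\" & computerName & "\root\cimv2")
    waited = 0
    Delay = False
    Do
        ' Fetch a fresh copy of the service so Started and State are current.
        Set svc = wmi.Get("Win32_Service.Name='" & serviceName & "'")
        If svc.Started And svc.State = "Running" Then
            Delay = True
            Exit Do
        End If
        If waited >= maxSeconds Then Exit Do
        WScript.Sleep 1000        ' pause one second between checks
        waited = waited + 1
    Loop
End Function

' Example use after a restart attempt (commented out so nothing is restarted here):
'   rc = objItem.StartService()
'   If Delay(30, ".", "w32time") Then WScript.Echo "Service is running"
```

Polling once per second keeps the wait close to the actual startup time, which is the behavior the text describes: a service that starts in three seconds ends the wait after about three seconds.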
After the delay, the code is ready to send the second email message that lets you know whether the StartService result was successful. Take a look at the InterrogService function to see the list of possible return codes and descriptions. Because the code attempted to start the service, some of the service properties have likely changed, so the code needs to run the WMI service query again to get an updated collection. Then, the code checks to see whether the service is started and running and whether the flag file exists. If these conditions are true, you know that the service did have a problem but now it’s OK. Now that everything’s back to normal, the code can delete the flag file and send a final email message indicating that the service is up and running.
Now it’s time to check out the CDO code that sends the “All’s well” email message. As it turns out, CDO comes in several versions. The version that this script uses is called CDO for Win2K and is contained in the CDOSYS.dll module. CDO is integrated into Windows Server 2003, XP, and Win2K. Take a look at the section of code beginning at the line
objMessage.To = emailto
As you can see, the message object properties are pretty straightforward. You have a From property that specifies who the message is from, a To property that specifies who the message is being sent to, and a TextBody property that houses the plain text message you’re sending. The next lines contain the four Configuration.Fields items that describe the message configuration.
However odd these configuration item names might seem, they need to be set up as you see them in the code at callout A in Listing 1. Without going into pages of explanation, I’ll simply say that the string http://schemas.microsoft.com/cdo/configuration is a namespace prefix that makes each configuration item name a unique identifier; the string isn’t a reference to a Web site, as it might first appear. By setting the sendusing configuration item to 2, you’re indicating that you’re sending your message over the network to an SMTP server rather than through a local pickup directory. You need to set the smtpserver configuration item to the DNS name or IP address of your SMTP server. This information is hard-coded into the smtpserver variable at the beginning of the program. The smtpserverport configuration item is set to 25 because that’s the port that’s typically used for SMTP. Of course, you’d change this setting if you were using a different port. To apply the configuration settings, you must call the Update method after you’ve finished setting them. That’s it for the message configuration. Finally, the code can send the message on its way by using the Send method of the message object.
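Putting the message properties and configuration items together, a minimal sketch looks like the following. BuildAlert and cdoSchema are hypothetical names, the addresses and server name are placeholders, and only the three configuration items named in the text are set, so this isn't a copy of the code at callout A.

```vbscript
Const cdoSchema = "http://schemas.microsoft.com/cdo/configuration/"

' Builds a configured CDO message; the caller decides when to call Send.
Function BuildAlert(smtpserver, emailfrom, emailto, subject, body)
    Dim objMessage
    Set objMessage = CreateObject("CDO.Message")
    objMessage.From = emailfrom
    objMessage.To = emailto
    objMessage.Subject = subject
    objMessage.TextBody = body
    With objMessage.Configuration.Fields
        .Item(cdoSchema & "sendusing") = 2            ' send over the network
        .Item(cdoSchema & "smtpserver") = smtpserver  ' DNS name or IP address
        .Item(cdoSchema & "smtpserverport") = 25      ' usual SMTP port
        .Update                                       ' apply the settings
    End With
    Set BuildAlert = objMessage
End Function

Dim objAlert
Set objAlert = BuildAlert("smtp.example.com", "server01@example.com", _
    "oncall@example.com", "w32time is running", "The service is running again.")
' objAlert.Send   ' uncomment once the placeholders point at a real SMTP server
```

Separating the build step from Send makes it easy to inspect the message while testing without actually mailing anyone.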
Stay Tuned for the Next Script
I hope you find the ServiceMonitoringEmailNotificationScript.vbs script a useful addition to your scripting toolbox. I think my next iteration of this script will be to create an engine similar to this one, but instead of having one script for every service, I’m thinking of using a text file as input and devising the code so that I can monitor multiple services with a single script. I look forward to sharing that script with you in an upcoming article.