4 easy fixes that restore network connectivity
“The network is down!” At some point in your IT career, you’ll hear this phrase above the din of the office photocopiers. Rarely, if ever, does the actual physical network go down. However, some problems can make it appear to your users that the network is in fact down.
The following four network annoyances typically will make your users shout the above phrase and leave you scratching your head. As you’ll see, when it comes to networking, not everything is as it appears. Two of these situations give the illusion of a problem with the network, although the problem actually lies elsewhere. Let’s look at these network annoyances and how to deal with them.
You’re summoned to the cubicle of a user who lost network connectivity. Windows shows the connection enabled but the cable unplugged. With a sigh, you check the user’s Ethernet cable. It’s connected. You swap it for a different cable, but it still doesn’t work. In a fit of frustration, you move the cable in the wiring closet between the patch panel and the Ethernet switch to a different switch port. The connection is restored. What’s going on?
Many enterprise switches can automatically disable a port that appears to be misbehaving. A typical cause is a duplex mismatch. The reasoning is that a disabled port is likely to be noticed immediately, which prompts the administrator to correct the problem that caused the port to be disabled in the first place.
In Cisco parlance, this functionality is known as error-disable or errdisable. Other vendors might use different terminology.
The following five steps show how you can identify and fix this problem. I use examples based on Cisco CatOS, though similar functionality exists for Cisco IOS. (To learn more about enabling Cisco routers, see “9 Steps to Setting up a Cisco Router,” InstantDoc ID 98740.)
1. Physically identify the switch port that you believe to be error-disabled.
2. From enable mode, execute the command
3. If the port is in fact error-disabled, the status will be listed as errdisable.
4. Fix the problem that caused the port to become disabled. (Make sure to check your duplex configuration and cabling.)
5. To re-enable the port, from enable mode execute the command
You can also configure an error-disabled switch port to re-enable automatically after a specified period of time. Automatic re-enabling can apply to all causes of error-disabling or only to particular causes (for example, all causes except a duplex mismatch). To set this up, consult the documentation for the set errdisable-timeout command.
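To make the automatic re-enable behavior concrete, here is a minimal Python sketch of the decision a switch makes. It is a conceptual model only, not Cisco's implementation or API: the RecoveryPolicy class, the reenable_time function, the 300-second interval, and the cause names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class RecoveryPolicy:
    """Hypothetical model of an auto-recovery policy (not a vendor API)."""
    interval_seconds: int                # how long a port stays disabled
    recoverable_causes: FrozenSet[str]   # causes that may recover automatically


def reenable_time(policy: RecoveryPolicy, cause: str, disabled_at: int) -> Optional[int]:
    """Return when the port would come back, or None if it stays disabled
    until an administrator re-enables it by hand."""
    if cause not in policy.recoverable_causes:
        return None
    return disabled_at + policy.interval_seconds


# Illustrative cause names; recover from everything except a duplex mismatch.
ALL_CAUSES = frozenset({"duplex-mismatch", "bpdu-guard", "udld", "channel-misconfig"})
policy = RecoveryPolicy(
    interval_seconds=300,  # assumed value for the example
    recoverable_causes=ALL_CAUSES - {"duplex-mismatch"},
)

print(reenable_time(policy, "udld", disabled_at=1000))
print(reenable_time(policy, "duplex-mismatch", disabled_at=1000))
```

Excluding the duplex mismatch from automatic recovery reflects the reasoning above: a port that quietly comes back on its own is a problem that never gets noticed or fixed.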
Routing Between VLANs
Many IT pros already use VoIP or have setting it up on their ever-growing to-do list. VoIP vendors typically recommend creating a separate VLAN to keep voice traffic apart from data traffic. This separation primarily reduces the amount of broadcast traffic that VoIP phones must receive. It also makes it easier to enforce more stringent Quality of Service (QoS) parameters in an effort to make voice quality as good as or better than that of a standard non-VoIP telephone.
I often hear from folks who are in the midst of setting up this separate VLAN or who have just recently configured it. For some, it’s the first VLAN they’ve ever configured.
From the new voice VLAN, they try to ping a device whose IP address is in the data VLAN, because inter-VLAN connectivity is necessary in most cases for Internet access and for managing the VoIP infrastructure from the data VLAN. The ping fails. In fact, all communication between the two VLANs fails. The VLAN must not be set up correctly, right?
No. The problem here is simple forgetfulness of Networking 101 skills. Switching occurs at Layer 2 of the OSI model, but IP is at Layer 3. Each VLAN is typically its own IP subnet, so you can’t pass IP packets between VLANs without a device that can perform the routing function.
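The Layer 3 point can be checked with a few lines of Python using the standard ipaddress module. This is only a sketch: the needs_router function and the addressing plan (192.168.10.0/24 for the data VLAN, 192.168.20.0/24 for the voice VLAN) are assumptions made up for the example.

```python
import ipaddress


def needs_router(src_ip: str, dst_ip: str, prefix_len: int) -> bool:
    """Return True if dst_ip lies outside the sender's subnet, meaning the
    packet must be handed to a Layer 3 device rather than switched directly."""
    src_network = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    return ipaddress.ip_address(dst_ip) not in src_network


# Hypothetical addressing: data VLAN 192.168.10.0/24, voice VLAN 192.168.20.0/24.
print(needs_router("192.168.20.50", "192.168.10.25", 24))  # phone to data-VLAN host
print(needs_router("192.168.10.30", "192.168.10.25", 24))  # two data-VLAN hosts
```

The first call returns True: the phone and the data-VLAN host are in different subnets, so without something to route between them the ping has nowhere to go. The second returns False, which is why hosts inside a single VLAN keep talking to each other just fine.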
The first reaction I hear is “Oh no; now I need to buy a router. I didn’t budget for that in my VoIP plan.” Not to worry: A router isn’t always necessary. Here are some alternatives.
Upgrade. You might have one or more switches that can provide Layer-3 IP routing or that can be upgraded to support it. The higher you go on the switch vendor’s food chain, the more likely this is the case. If you’re investing in new switches that provide Power over Ethernet (PoE) for a VoIP deployment, it might be wise to spend the extra money up front to ensure you have a switch with Layer-3 functionality.
Use a small office/home office router. Using a SOHO router might not be the best idea from a scalability and support standpoint, but at least it gets you up and running.
Use a firewall. Either a hardware or software firewall works, as long as it has an unused Ethernet interface. Mark the interface as a connection to an additional internal or trusted network and set the appropriate routing and security policies.
Use RRAS. You can use RRAS, available in Windows Server, as long as the machine you set it up on has two Ethernet interfaces. This is an attractive option if you also need to provide infrastructure services such as DHCP and DNS to your voice VLAN.
How many times have you experienced this behavior: You type a URL into Internet Explorer’s (IE’s) address bar and press Enter. The site begins to load, but then you receive a dialog box with the message “Internet Explorer cannot open the Internet site <Web address>. Operation aborted.” When you try to load the page again, however, you succeed. I’ve encountered this behavior on Microsoft’s TechNet site many times but never thought much about it. When people in my office started reporting it happening on other sites, including Google’s home page, I decided to investigate.
Initially I thought it was a problem with the web proxy functionality of our ISA Server firewall. However, even after temporarily bypassing the firewall, I was still able to reproduce the problem. Then I thought that my TCP session was being reset at some point before the page finished loading. A network trace proved this wasn’t the cause.
Finally, I started to think that the problem might be with IE itself, so I dug around on the Internet and found this Microsoft article “BUG: Error message when you visit a Web page or interact with a Web application in Internet Explorer: ‘Operation aborted’” at support.microsoft.com/kb/927917.
This was my exact problem, but the cause (script code trying to modify particular container elements) didn’t explain why the problem wasn’t occurring continuously. After all, it’s unlikely that the TechNet site was modified within the two seconds that it takes to press F5. However, further examination indicated that the problem lies with IE’s parser: The exact order in which the page is loaded and parsed changes ever so slightly from refresh to refresh. This would explain how I could refresh a problem page 10 times and see the problem occur only twice.
I never found a workaround. The article states that the fix lies with the site author, who has to modify the code. Let’s hope Microsoft will fix this problem in IE 8.0.
A former colleague of mine contacted me in a panic: His network was going down around him. The symptoms he described included a user unable to access Exchange Server through Microsoft Office Outlook. Some mapped drives were also inaccessible. Strangely though, the Internet was accessible.
While he was troubleshooting the problem with the user, a user in the next cubicle experienced the same symptoms. Five minutes later, someone else across the office yelled, “I have the same problem!”
I asked my colleague whether he had checked the obvious: Were those servers actually experiencing a problem? No. Were there any recent software, hardware, or configuration changes? No. Was basic networking connectivity present and functioning? Yes.
After thinking for a few moments, I asked if he had checked the user accounts in Active Directory (AD) to see if the users having problems had somehow all locked themselves out at the same time. He had checked this, and they weren’t locked out. Finally, in a Eureka! moment, I asked him to check whether the users had recently received a prompt notifying them that their password would expire soon and asking whether they would like to change it.
He reported back that they had all been seeing the prompt for “a few days” and that today it said there was one day left. No one likes changing their password, so they had been clicking No for days. Today was no different. However, “one day” in this sense doesn’t mean “24 hours from now”; it really means “sometime in the next 24 hours.” My colleague directed the users to log off and log back on. Sure enough, all were forced to change their passwords and experienced no problems afterwards. Because these people arrive each day at approximately the same time, the validity period of their passwords was almost identical. We both now tell users that “one day” really means “today.”
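The arithmetic behind “one day really means today” can be sketched in Python. The 42-day maximum password age, the dates, and the round-up rule in days_left_shown are assumptions for illustration; the sketch doesn’t claim to reproduce how Windows actually computes the number in its prompt.

```python
import math
from datetime import datetime, timedelta

MAX_PASSWORD_AGE = timedelta(days=42)  # assumed policy value for the example


def days_left_shown(last_set: datetime, now: datetime) -> int:
    """Whole days remaining, rounded up (an assumed rounding rule), so any
    remainder under 24 hours is still reported as one day."""
    remaining = (last_set + MAX_PASSWORD_AGE) - now
    if remaining <= timedelta(0):
        return 0
    return math.ceil(remaining / timedelta(days=1))


# Hypothetical user: password last changed at 9:05 AM, logging on at 8:55 AM
# exactly 42 days later.
last_set = datetime(2008, 1, 7, 9, 5)
logon = datetime(2008, 2, 18, 8, 55)

print(days_left_shown(last_set, logon))       # what the prompt would report
print(last_set + MAX_PASSWORD_AGE - logon)    # time actually remaining
```

Under these assumptions the prompt reports 1 day while only 10 minutes remain. A user who clicks No at logon has an expired password before the first coffee break, and colleagues who changed their passwords at about the same time on the same morning follow within minutes.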
Michael Dragone is a contributing editor for Windows IT Pro and a systems engineer in New York. He is an MCDST and an MCSE: Messaging and remembers when Windows IT Pro was called Windows NT Magazine.