The Index column shows the index that AD used to perform the search. In the most egregious search shown, a search of the Schema naming context (NC), the search filter cn=*a* specifies a medial-string search (i.e., a search using a filter containing a wildcard that is not at the end of the string) on the Common Name (CN) attribute. The CN attribute is indexed by default in AD, but not for medial-string searches, so AD has to read every object in the Schema NC to determine whether it matches the filter.
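To make the distinction concrete, here is a minimal sketch that classifies the value part of a filter using the definition above (a wildcard anywhere other than the end of the string counts as medial). The function name is hypothetical and the logic is only an illustration; it isn't how AD itself parses filters.

```python
def classify_substring_filter(value: str) -> str:
    """Classify the value part of an LDAP filter such as cn=*a*.

    Illustrative only: follows the article's definition, in which any
    wildcard that is not at the end of the string makes the search medial.
    """
    if "*" not in value:
        return "equality"   # e.g. cn=admin
    if value == "*":
        return "presence"   # e.g. cn=*
    if "*" in value[:-1]:
        return "medial"     # e.g. cn=*a* or cn=*a
    return "initial"        # e.g. cn=a*


for example in ("admin", "a*", "*a*"):
    print(example, "->", classify_substring_filter(example))
```

A check like this can help you spot the filters in a client's search list that an ordinary index on the attribute won't satisfy.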
Looking at the other searches this client performed, you'll see they do the same thing as the schema search: They perform subtree-scoped medial searches on the CN attribute, causing AD to use either the ancestors index or distinguished name tag index (dnt_index), neither of which is good for performance. When you see a lot of searches that use the ancestors index or dnt_index, you should either modify the search filter to take advantage of the indexes that already exist in AD or create new indexes. For example, if you determine that cn=*a* is a legitimate search filter that should be optimized, you can add a medial search index on the CN attribute.
Adding an index to AD is simple, as the Microsoft article "Index an attribute in Active Directory" (http://go.microsoft.com/fwlink/?LinkId=46790) explains. Keep in mind that each index consumes additional disk space and will cause update operations to take longer. If you already have a large Directory Information Tree (DIT), that additional disk space could be substantial. Also, all objects in AD will be indexed, not just the objects in one container or NC, so consider the performance implications of a new index and carefully test it before making the change.
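Indexing is controlled by the searchFlags attribute on the attribute's schema object. As a rough sketch, assuming the commonly documented bit values (1 for a standard index and 32 for the tuple index that supports medial searches), the new value can be worked out as follows. The constant and function names are illustrative rather than part of any Microsoft API, and you should confirm the bit values against the schema documentation for your version of AD before changing anything.

```python
# Assumed searchFlags bit values; verify them for your AD version.
INDEX = 0x01        # standard index on the attribute
TUPLE_INDEX = 0x20  # tuple index, used for medial-string searches


def with_medial_index(search_flags: int) -> int:
    """Return search_flags with the standard and tuple index bits set.

    Existing bits are preserved, so any other behavior already configured
    on the attribute is left untouched.
    """
    return search_flags | INDEX | TUPLE_INDEX


# Hypothetical starting value that already has the standard index bit set.
print(with_medial_index(0x01))
```

Using a bitwise OR rather than overwriting the value matters here, because searchFlags carries several unrelated settings in the same integer.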
We've explained two of the three performance warnings we started with. Now let's look at the third: the NIC's long output queue.
NIC performance counters section. Go back to the Warnings section at the top of the report and click the hotlink for the NIC output queue length warning. SPA will navigate to the Network Interface performance counter table, which Figure 5 shows. Hovering your cursor over the flag that appears in the Mean column on the Output Queue Length counter displays a description of the performance warning.
You'll notice several interesting values in Figure 5. The Current Bandwidth counter shows 100 megabits, indicating that the 10/100 NIC is in fact running at 100Mbps. The Bytes Total/sec counter is 3.6MBps, which is well below the network's capacity. (I know that on the server that produced this report, all network traffic on the segment was either going to or coming from the DC.) Finally, we see that Bytes Sent/sec accounts for nearly all the traffic, which makes sense considering that the LDAP searches are retrieving a lot of data.
So why is the output queue so long? There are several possibilities:
- The NIC is running at or near capacity.
- Too few output buffers are configured for the NIC.
- Output queue processing is slow because of insufficient CPU horsepower.
Although the report shows that the average total throughput on the NIC is 3.6MBps, the throughput peaked at 10.8MBps, which is close to the theoretical maximum of a 100Mbps link (12.5MBps before protocol overhead). So it would seem that the NIC is occasionally saturated, which suggests that other components on the network segment might also be overloaded. Because the CPU is nearly maxed out, adding another NIC probably won't help. More analysis would be required to come up with a definitive answer, but the best strategy is probably to reduce the load on the server by optimizing the queries (or by adding an index) and, if necessary, either upgrading to a faster CPU or adding another DC and distributing the query load.
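Because the counters report bytes while link speed is quoted in bits, it's worth double-checking the units. The following sketch converts the figures from the report; the function names are illustrative. Note that 100Mbps works out to 12.5MBps of raw capacity, and usable throughput is lower once framing and protocol overhead are subtracted, so a 10.8MBps peak leaves very little headroom.

```python
def mbps_to_mbytes_per_sec(megabits_per_sec: float) -> float:
    """Convert a link speed in megabits/sec to megabytes/sec (8 bits per byte)."""
    return megabits_per_sec / 8


def utilization(mbytes_per_sec: float, link_mbps: float) -> float:
    """Fraction of raw link capacity used by the given throughput."""
    return mbytes_per_sec / mbps_to_mbytes_per_sec(link_mbps)


capacity = mbps_to_mbytes_per_sec(100)   # raw capacity of the 100Mbps link
average = utilization(3.6, 100)          # average Bytes Total/sec from the report
peak = utilization(10.8, 100)            # peak throughput from the report

print(f"capacity: {capacity} MBps")
print(f"average utilization: {average:.0%}")
print(f"peak utilization: {peak:.0%}")
```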
LDAP requests. There's one more anomaly we should investigate. In the Summary section of Figure 1, if you click the Top Client hotlink, SPA displays the Clients with the Most CPU Usage subsection of the LDAP Request section of the report. Sure enough, the client at 10.7.0.131 is the prime offender.
When you expand the entry for that IP address, SPA displays a summary of the operations the client performed and the percentage of CPU resources each operation consumed, as Figure 6 shows. We can see that the client at 10.7.0.131 used most of the CPU by issuing nonDSE searches (i.e., searches of some part of the directory tree). This usage is consistent with what we discovered earlier.
What's surprising, however, is the second row in the operation summary, which shows that a significant number of the client's searches failed with a Size Limit Exceeded error. These search failures accounted for another 12 percent of CPU utilization. So SPA has helped uncover at least one other problem with this client: It isn't using paged searches. Not using paged searches is a common application programming error and can cause AD to not return all the data the application is searching for.
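The idea behind a paged search is that the client asks for a limited number of entries at a time and passes back a cookie from the server to fetch the next batch, repeating until the cookie comes back empty. The sketch below simulates that loop against a stand-in directory so that it runs without a DC. FakeDirectory, SizeLimitExceeded, and paged_search are all hypothetical names, not a real LDAP or AD API; a real cookie is an opaque value rather than an integer offset, and a real server typically returns the first batch of entries along with the size-limit error rather than nothing at all.

```python
from typing import List, Optional, Tuple


class SizeLimitExceeded(Exception):
    """Raised by the stand-in directory when an unpaged result is too large."""


class FakeDirectory:
    """Stand-in for a directory server; NOT a real AD or LDAP API."""

    def __init__(self, entries: List[str], max_page_size: int = 1000):
        self.entries = entries
        self.max_page_size = max_page_size

    def search(self, page_size: Optional[int] = None,
               cookie: int = 0) -> Tuple[List[str], int]:
        if page_size is None:
            # Unpaged search: fails once the result set exceeds the limit.
            if len(self.entries) > self.max_page_size:
                raise SizeLimitExceeded("result set exceeds server limit")
            return list(self.entries), 0
        # Paged search: return one page plus a cookie for the next page.
        size = min(page_size, self.max_page_size)
        page = self.entries[cookie:cookie + size]
        next_cookie = cookie + size
        if next_cookie >= len(self.entries):
            next_cookie = 0  # an empty cookie means no more pages
        return page, next_cookie


def paged_search(directory: FakeDirectory, page_size: int = 500) -> List[str]:
    """Collect every entry by requesting pages until the cookie is empty."""
    results: List[str] = []
    cookie = 0
    while True:
        page, cookie = directory.search(page_size=page_size, cookie=cookie)
        results.extend(page)
        if not cookie:
            return results


directory = FakeDirectory([f"cn=user{i}" for i in range(2500)])
print(len(paged_search(directory)))
```

In this simulation, the same 2,500-entry result set that fails as a single unpaged search is retrieved in full when the client pages through it.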
What Have We Accomplished?
We started out with an overloaded DC, and after running the SPA Active Directory data collector group on the DC and generating a report, we identified three performance problems:
- Clients are issuing medial search queries on an attribute not indexed for medial searches.
- Clients are not using paged searches.
- The NIC (and possibly other system or network components) is overloaded.
Not bad for 15 minutes of work! You can also use SPA to collect and archive performance data on a scheduled basis, generating baseline data. In an upcoming article, I'll discuss how to set up SPA to collect data from multiple servers and generate the reports on a centralized reporting server.
SOLUTION SNAPSHOT
PROBLEM: AD performance has plummeted.
SOLUTION: Run SPA and use its easy-to-read performance reports and recommendations to troubleshoot the problem.
WHAT YOU NEED: The most recent version of Windows Server 2003 Performance Advisor (SPA); Windows Server 2003
DIFFICULTY: 3 out of 5
SOLUTION STEPS:
- Download and install SPA.
- Run SPA.
- Run a collection.
- Review the report.