How to Identify a Disk Performance Bottleneck Using the Microsoft Server Performance Advisor (SPA) Tool
Clint Huffman
Applies To
* Microsoft Server Performance Advisor (SPA)
* Performance Testing
* Performance Analysis
* Microsoft Windows Server 2003
Summary
This How-To shows how to use the Microsoft Server Performance Advisor (SPA) tool to identify which processes and files are causing a disk subsystem performance bottleneck on Windows Server 2003.
Contents
* Objectives
* Overview
* Download
* Summary of Steps
* Step 1. Run and Configure the Microsoft Server Performance Advisor (SPA) Tool
* Step 2. Capture Data
* Step 3. Compile the Report
* Step 4. Analyze the Report
* Conclusion
* Production Server Considerations
Objectives
In this module, you will learn to do the following:
* Identify a disk subsystem bottleneck
* Identify which processes are causing the highest disk usage
* Identify which files are causing the highest disk usage
* Determine the data pattern (read/write bytes and I/Os) of the disk usage
Overview
Microsoft Performance Monitor (perfmon) can gather performance counter data and Event Tracing for Windows (ETW) data, but the analysis must be done manually. This is where the Microsoft Server Performance Advisor (SPA) picks up. The SPA tool collects performance data in the same manner as Performance Monitor. In addition, it analyzes the data and generates a detailed report on its findings.
Here are the disk-related sections of the SPA report:
* Hot Files: Files Causing Most Disk I/O: This section of the report identifies the specific files that are causing the most disk I/O, the processes involved, and the read/write bytes and I/Os per second.
* Disk Breakdown: Disk Totals: This section of the report identifies the specific processes that are causing the most disk I/O on each physical disk.
In this How-To, we use the SPA tool on Windows Server 2003 to identify a disk subsystem bottleneck, identify which processes and files are causing the highest disk usage, and determine the data pattern (read/write bytes and I/Os) of the disk usage.
Download
You can download the Microsoft Server Performance Advisor (SPA) from the following location:
http://www.microsoft.com/downloads/details.aspx?FamilyID=09115420-8c9d-46b9-a9a5-9bffcd237da2&DisplayLang=en
Summary of Steps
Here is a summary of steps to be discussed in this article:
- Run and configure the Microsoft Server Performance Advisor (SPA) Tool.
- Capture data.
- Compile the report.
- Analyze the report.
Step 1. Run and Configure the Microsoft Server Performance Advisor (SPA) Tool
For the SPA tool to properly diagnose a performance problem, it must capture performance data from the computer while the problem is occurring. Prior to capturing data, run and configure the tool as follows:
* Run the SPA Tool - Run the SPA tool by clicking Start, All Programs, then Server Performance Advisor (at the root of All Programs). The SPA tool start page will appear, as shown in the figure below.
http://farm1.static.flickr.com/166/336100322_5369379328_o.jpg
If you wish to have a quick tour of the SPA tool, then click “Quick Tour”. Otherwise, locate the “System Overview” role.
* Open the “System Overview” Data Collector - Click View, then Scope Tree. The scope tree will show. Under Local Computer, Data Collectors and Reports, locate the System Overview role.
http://farm1.static.flickr.com/147/336100323_209a4c3baa_o.jpg
* Configure the “System Overview” Role
* Optional: Set the report generation to Manual - If this is a production server, then configure the report generation for the “System Overview” role to be manual. SPA’s data collection mechanisms have very low overhead and are designed to be run in production, but the report generation/compilation consumes significant resources and should be run on a separate, non-production server.
* In the scope tree, right-click the “System Overview” role, and then click Properties. The System Overview Properties data sheet shows.
* Set Generate to Manual.
http://farm1.static.flickr.com/133/336100321_1e1fef17f1_o.jpg
* Optional: Set the Data Collection Interval - Click the Schedule tab and set the Duration to the desired collection period in seconds. Keep in mind that SPA gathers a large amount of data quickly, so keep the collection period as short as possible.
http://farm1.static.flickr.com/155/336078918_368f18d5a8_o.jpg
* Click OK on the System Overview Properties window.
* Set the Disk Utilization Thresholds - Prior to using SPA v2.0 for disk analysis, it is necessary to set the disk utilization thresholds according to the I/Os per second that your physical disks are expected to sustain. The following steps show how to set the disk utilization thresholds:
* Click Edit, then select Rules.
* Locate the Disk Utilization thresholds and set them to the performance specifications of your locally attached physical disks.
* Scroll to the bottom and click Apply. This will persist the new threshold settings. Keep in mind, this change affects all of the data collectors in SPA.
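If the performance specification of a disk is not at hand, a rough figure for the threshold can be estimated from its data sheet. The following is only a minimal sketch of a common rule of thumb for a single rotating disk (one random I/O costs roughly an average seek plus half a revolution). The function name and the example values are assumptions for illustration and are not part of SPA. The estimate ignores caching, queuing, and transfer time, so vendor figures or measured values should be preferred when available.

```python
def estimate_disk_iops(avg_seek_ms: float, rpm: int) -> float:
    """Rule-of-thumb random I/Os per second for one rotating disk.

    avg_seek_ms: average seek time from the disk's data sheet (assumed input).
    rpm: spindle speed in revolutions per minute.
    """
    if avg_seek_ms < 0 or rpm <= 0:
        raise ValueError("avg_seek_ms must be >= 0 and rpm must be > 0")
    # Average rotational latency is the time for half a revolution.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    # Approximate service time of one random I/O, ignoring transfer time.
    service_time_ms = avg_seek_ms + rotational_latency_ms
    return 1000.0 / service_time_ms


# Hypothetical data sheet values, for illustration only.
print(round(estimate_disk_iops(avg_seek_ms=8.5, rpm=7200)))
print(round(estimate_disk_iops(avg_seek_ms=3.5, rpm=15000)))
```

A value estimated this way is a starting point for the Disk Utilization thresholds, not a replacement for the specifications of your own disks.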
Step 2. Capture Data
SPA must be configured to capture data when a performance problem is occurring.
* Start the System Overview Data Collectors
* Select the System Overview data collector, then click the green play button. Alternatively, you can click Record, then Start.
* Wait for SPA to Automatically Stop
* The SPA tool will automatically stop collecting data when the elapsed time equals the Duration setting of the data collector.
Step 3. Compile the Report
After SPA has finished collecting data, it automatically begins generating/compiling the report unless you set the report generation to manual. SPA has finished generating/compiling the report when an icon with a red clock appears under Reports.
http://farm1.static.flickr.com/157/336078916_ca685584a7_o.jpg
http://farm1.static.flickr.com/139/336078914_bfaa919d51_o.jpg
If you chose to generate the report manually, then follow these steps to compile the report on another server:
* Prior to capturing data, set the role report generation to “Manual”.
* After capturing data, move or copy the data to another server with SPA installed. Copy the data to the respective “data” directory specified during the installation of SPA. For example, if both servers are using default installations, then copy the data [need more work here]
* At a command prompt, change directory to the SPA installation directory, then type:
spacmd compile "System Overview"
* Follow the “Analyze the Report” section below.
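The copy and compile steps above could be scripted on the non-production server. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, all directory paths must be supplied by the caller (this How-To does not give the default locations), and the executable is assumed to be resolvable as spacmd inside the SPA installation directory. Only the command shown above, spacmd compile "System Overview", is taken from this article.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def copy_capture(source_dir: str, target_data_dir: str) -> Path:
    """Copy one captured data folder into the target server's SPA data directory."""
    source = Path(source_dir)
    destination = Path(target_data_dir) / source.name
    # copytree raises FileExistsError if the destination already exists,
    # which avoids silently overwriting an earlier capture.
    shutil.copytree(source, destination)
    return destination


def build_compile_command(spa_install_dir: str, role: str = "System Overview") -> List[str]:
    """Build the argument list for the compile command shown in the steps above."""
    # Assumption: the spacmd executable lives directly in the installation directory.
    return [str(Path(spa_install_dir) / "spacmd"), "compile", role]


def compile_report(spa_install_dir: str, role: str = "System Overview") -> int:
    """Run the compile command from the SPA installation directory."""
    completed = subprocess.run(
        build_compile_command(spa_install_dir, role),
        cwd=spa_install_dir,
        check=False,
    )
    return completed.returncode
```

A caller would first invoke copy_capture with the capture folder and the SPA data directory of the non-production server, then compile_report with that server's SPA installation directory.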
Step 4. Analyze the Report
Once the report is generated, we need to review the report to see what is causing our disk bottleneck.
* Locating the report: To review the report, click the icon with the red clock to see a list of the reports that the server has generated.
http://farm1.static.flickr.com/140/336078206_6559601034_o.jpg
http://farm1.static.flickr.com/159/336078207_c82913c5f9_o.jpg
The reports are listed by computer, year, month, day, and time corresponding to when the data was collected. Select the report that corresponds with when the performance problem occurred.
Note: The symbols in the Status column relate to weather forecasts. A cloudy symbol represents a server under distress while a sunny symbol represents a relatively idle server.
* Overview of the Report: After you select the report, it opens. The Summary section of the report shows us that NTBackup.exe is taking up 18% CPU and a file in the catalog.wci directory is using the most disk I/O.
http://farm1.static.flickr.com/135/336078208_54dc8cb489_o.jpg
The SPA tool will analyze the performance of the system. If it has a significant finding, then it will show its recommendations in the Performance Advice section.
Note: The Performance Advice section will only show if there are any significant findings.
Note: Shown below is only a sample of performance advice and does not apply to our analysis.
Next, the System Health section is an overview of the overall health of the four subsystems of the computer.
http://farm1.static.flickr.com/148/336078911_7cb55ec301_o.jpg
As you can see here in the System Health section, SPA has detected a disk performance bottleneck. Normally, 78 I/Os per second is not considered to be high usage for a fast, locally attached hard disk. In this case, we ran our tests on a slow, externally attached hard disk and adjusted SPA’s thresholds accordingly.
* Analyzing the Disk Subsystem Performance: In this section, we will look at the disk response times in more detail and discover which processes and files are involved.
* Analyzing Disk Response Times: To determine if the disk subsystem is responding poorly, we need to look at the response times of the disks. These details are in the System Monitor view of the report. Click the System Monitor icon at the top of the report.
http://farm1.static.flickr.com/165/336078205_72dd1a421e_o.jpg
* Clear the existing counters by clicking the icon in the upper left hand corner.
* Next, click the Plus sign button to add counters.
* Add all of the instances for the “PhysicalDisk\Avg. Disk sec/Read” and “PhysicalDisk\Avg. Disk sec/Write” counters. These counters report the average time, in seconds, that the disk took to complete each read or write.
http://farm1.static.flickr.com/152/336078200_cd552a8903_o.jpg
* The System Monitor will show the counter values. We are looking for times when the response times were greater than 15 ms (0.015 seconds). In the chart below, all values above the black line (15 ms) are considered long response times and indicate a bottleneck.
http://farm1.static.flickr.com/139/336078202_739fdf2c41_o.jpg
* Based on this data, we can conclude that the C: drive (thin red and thin green lines) has significant disk latency and is a performance bottleneck on the system.
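The same 15 ms check can be applied to counter samples outside the chart. The following is a minimal Python sketch, assuming the Avg. Disk sec/Read or Avg. Disk sec/Write values have already been exported as (timestamp, seconds) pairs. The function names and the sample values are hypothetical and are not output from the report in this article.

```python
from typing import Iterable, List, Tuple

# 15 ms, the response-time threshold used in this How-To.
LATENCY_THRESHOLD_SECONDS = 0.015

# One sample: (timestamp label, counter value in seconds).
Sample = Tuple[str, float]


def find_slow_samples(
    samples: Iterable[Sample],
    threshold: float = LATENCY_THRESHOLD_SECONDS,
) -> List[Sample]:
    """Return the samples whose response time is above the threshold."""
    return [(stamp, value) for stamp, value in samples if value > threshold]


def percent_slow(
    samples: Iterable[Sample],
    threshold: float = LATENCY_THRESHOLD_SECONDS,
) -> float:
    """Return the percentage of samples above the threshold (0.0 if none given)."""
    sample_list = list(samples)
    if not sample_list:
        return 0.0
    return 100.0 * len(find_slow_samples(sample_list, threshold)) / len(sample_list)


# Hypothetical samples, for illustration only.
example = [
    ("10:00:00", 0.004),
    ("10:00:15", 0.022),
    ("10:00:30", 0.015),
    ("10:00:45", 0.031),
]
print(find_slow_samples(example))
print(percent_slow(example))
```

A sample exactly at 0.015 seconds is not flagged, which matches the "greater than 15 ms" wording above.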
* Identify the files and processes consuming the most disk I/O: Now that we have identified a disk bottleneck, let’s see which processes and files are involved with the bottleneck.
* Navigate to the Disk, Disk Breakdown, Disk Totals section.
http://farm1.static.flickr.com/130/336100320_d67cb07712_o.jpg
In this section, we see a breakdown of each of the physical disks on the system and the processes that are most active on the disk. In this case, we see the cisvc.exe (Indexing Service) consuming the most I/O of physical disk 0 (C: drive).
http://farm1.static.flickr.com/129/336078919_7570c90d74_o.jpg
* Next, navigate to the Disk, Hot Files, “Files Causing Most Disk IOs” section.
http://farm1.static.flickr.com/136/336078920_fec1d7cb1d_o.jpg
In this section, we see a breakdown of the files consuming the most disk I/O. Each breakdown shows the respective processes involved with that file and its data patterns (Read/sec, Kb/Read, Writes/sec, and Kb/Write). In this case, the Indexing Service’s catalog files are causing the most I/O on the disk.
Note: The Summary section at the beginning of the report shows the file taking up the most I/O.
http://farm1.static.flickr.com/127/336100319_496094b574_o.jpg
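To compare files by their data pattern, the four values in each Hot Files row can be combined into total I/Os per second and total throughput. The following is a minimal Python sketch, assuming the rows have been typed in by hand. The class name, field names, and the two example rows are hypothetical and are not values from the report in this article.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class HotFileRow:
    """One row of per-file disk activity (field names are illustrative)."""
    file_name: str
    process: str
    reads_per_sec: float
    kb_per_read: float
    writes_per_sec: float
    kb_per_write: float

    @property
    def ios_per_sec(self) -> float:
        # Total I/O operations per second, reads plus writes.
        return self.reads_per_sec + self.writes_per_sec

    @property
    def kb_per_sec(self) -> float:
        # Throughput is the operation rate times the average transfer size.
        return (self.reads_per_sec * self.kb_per_read
                + self.writes_per_sec * self.kb_per_write)


def rank_by_ios(rows: Iterable[HotFileRow]) -> List[HotFileRow]:
    """Order rows from most to fewest I/Os per second."""
    return sorted(rows, key=lambda row: row.ios_per_sec, reverse=True)


# Hypothetical rows, for illustration only.
rows = [
    HotFileRow("file_a", "process_a.exe", 40.0, 4.0, 10.0, 8.0),
    HotFileRow("file_b", "process_b.exe", 5.0, 64.0, 2.0, 64.0),
]
for row in rank_by_ios(rows):
    print(row.file_name, row.ios_per_sec, row.kb_per_sec)
```

In this made-up data, file_a issues far more I/Os per second while file_b moves more data per second, so the two rankings can differ. Since the disk utilization thresholds in this How-To are expressed in I/Os per second, the sketch ranks by that figure.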
Conclusion
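The Microsoft Server Performance Advisor (SPA) tool shows which files and processes are causing the most disk I/O. In this How-To, it identified the Indexing Service (cisvc.exe) and its catalog files as the main source of I/O on the bottlenecked C: drive.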
Production Server Considerations
The SPA tool is designed and recommended for use in a production environment. However, its default setting is to compile the report immediately after the capture interval has ended. SPA’s data collection has very little overhead, which is ideal for a production environment, but its report compilation requires significant resources. Therefore, it is recommended to run the report compilation on a non-production server. The non-production server must have the SPA tool installed.