How To: Tune Your Server (Server Tuning)

Source: http://msdn.microsoft.com/library/en-us/dnpag/html/scalenetchapt17.asp
J.D. Meier, Srinath Vasireddy, Ashish Babbar, Rico Mariani, and Alex Mackman

How well your applications, services, and operating system use shared system-level resources has a direct impact on throughput, queuing, and response time. Tools such as System Monitor enable you to monitor resource usage. You should monitor the following shared system components at a minimum:
* CPU
* Memory
* Disk I/O
* Network I/O

CPU
Each application that runs on a server gets a time slice of the CPU. The CPU might be able to efficiently handle all of the processes running on the computer, or it might be overloaded. By examining processor activity and the activity of individual processes including thread creation, thread switching, context switching, and so on, you can gain good insight into processor workload and performance.
Metrics
The performance counters shown in Table 17.1 help you identify processor bottlenecks.
Table 17.1: Performance Counters Used to Identify CPU Bottlenecks
Area        Counter
Processor   % Processor Time
Processor   % Privileged Time
System      Processor Queue Length
System      Context Switches/sec

For more information about how to measure these counters, their thresholds, and their significance, see "Processor" in Chapter 15, "Measuring .NET Application Performance."
You can monitor an individual process or use _Total for all instances. High rates of processor activity might indicate an excessively busy processor. A long, sustained processor queue is a more certain indicator of a processor bottleneck. If a single processor in a multi-processor server is overloaded, this might indicate that you have a single-threaded application using just that single processor.

Note: Processor utilization depends on your system and application characteristics. The 75% threshold value given in "Bottlenecks" (following) is based on typical observations. Increase or decrease this threshold based on your system characteristics.

Bottlenecks
You might have a CPU bottleneck if you see the following:
* Processor\% Processor Time often exceeding the 75% threshold.
* A sustained queue length of 2 or more for a prolonged period, indicated by System\Processor Queue Length.
* Unusually high values for Processor\% Privileged Time or System\Context Switches/sec.
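
The first two checks in this list can be sketched in a few lines of Python. This is only an illustration: it assumes you have already exported counter samples from a System Monitor log into plain lists, and the function name and the `sustained_fraction` parameter (how many samples must exceed the threshold before the condition counts as "often" or "sustained") are hypothetical choices, not values from this guide.

```python
CPU_THRESHOLD_PERCENT = 75.0  # threshold from the list above; adjust for your system
QUEUE_THRESHOLD = 2           # sustained System\Processor Queue Length


def cpu_bottleneck_suspected(processor_time, queue_length, sustained_fraction=0.8):
    """Return True if either counter is over its threshold in most samples.

    processor_time: samples of Processor\\% Processor Time (percent)
    queue_length:   samples of System\\Processor Queue Length
    sustained_fraction: assumed share of samples that must be over threshold
    """
    if not processor_time or not queue_length:
        raise ValueError("both sample lists must be non-empty")
    busy = sum(1 for p in processor_time if p > CPU_THRESHOLD_PERCENT) / len(processor_time)
    queued = sum(1 for q in queue_length if q >= QUEUE_THRESHOLD) / len(queue_length)
    return busy >= sustained_fraction or queued >= sustained_fraction
```

A result of True is only a prompt to look further, for example at % Privileged Time and Context Switches/sec, which this sketch does not evaluate.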

If the value of % Processor Time is high, then queuing occurs, and in most scenarios the value of System\ Processor Queue Length will also be high. Figure 17.6 shows a sample System Monitor graph that indicates a high percentage of processor time and a high processor queue length.

{Insert figure: CH17 – System Monitor Graph.gif}
Figure 17.6
System Monitor graph showing high percentage of processor time and high processor queue length
The next step is to identify which process is causing the spike (or consuming processor time). Use Task Manager to identify which process is consuming high levels of CPU by looking at the CPU column on the Processes page. You can also determine this by monitoring Process\% Processor Time and selecting the processes you want to monitor. For example, from the System Monitor output shown in Figure 17.7, you can see that the ASP.NET worker process is consuming a majority of the processor time.

{Insert figure: CH17 - PerfMon - CPU Time.gif}
Figure 17.7
System Monitor output showing the ASP.NET worker process consuming over 98% of processor time
Tuning Options
Once you determine that your CPU is a bottleneck, you have several options:
* Add multiple processors if you have multi-threaded applications. Consider upgrading to a more powerful processor if your application is single-threaded.
* If you observe a high rate of context switching, consider reducing the thread count for your process before increasing the number of processors.
* Analyze and tune the application that is causing the high CPU utilization. You can dump the running process by using the ADPlus utility and analyze the cause by using WinDbg. These utilities are part of Debugging Tools for Windows. You can download these tools from http://www.microsoft.com/whdc/ddk/debugging/default.mspx.
* Analyze the instrumentation log generated by your application to isolate the subsystem that takes the most time to execute, and check whether it needs a code review rather than just deployment tuning.


Note: Although you can change the process priority level of an application by using Task Manager or from the command prompt, you should generally avoid doing so. For almost all cases, you should follow one of the recommendations in the previous list.

Memory
Memory consists of physical and virtual memory. You need to consider how much memory is allocated to your application. When you evaluate memory-related bottlenecks, consider unnecessary allocations, inefficient clean up, and inappropriate caching and state management mechanisms. To resolve memory-related bottlenecks, optimize your code to eliminate these issues and then tune the amount of memory allocated to your application. If you determine during tuning that memory contention and excessive paging are occurring, you may need to add more physical memory to the server.
Low memory leads to increased paging, where pages of your application's virtual address space are written to and read from disk. If paging becomes excessive, page thrashing occurs and intensive disk I/O decreases overall system performance.
Configuration Overview
Memory tuning consists of the following:
* Determine whether your application has a memory bottleneck. If it has, then add more memory.
* Tune the amount of memory allocated if you can control the allocation. For example, you can tune this for ASP.NET and SQL Server.
* Tune the page file size.

Metrics
The performance counters shown in Table 17.2 help you identify memory bottlenecks. You should log these counter values to log files over a 24-hour period before you form any conclusions.
Table 17.2: Performance Counters Used to Identify Memory Bottlenecks
Area     Counter
Memory   Available MBytes
Memory   Page Reads/sec
Memory   Pages/sec
Memory   Cache Bytes
Memory   Cache Faults/sec
Server   Pool Nonpaged Failures
Server   Pool Nonpaged Peak
Cache    MDL Read Hits %

For more information about how to measure these counters, their thresholds, and their significance, see "Memory" in "CLR and Managed Code" in Chapter 15, "Measuring .NET Application Performance."
Bottlenecks
A low value of Available MBytes indicates that your system is low on physical memory, caused either by system memory limitations or an application that is not releasing memory. Monitor each process object’s working set counter. If the working set remains high even when the process is not active, it might indicate that the process is not releasing memory. Use the CLR Profiler tool at this point to identify the source of any memory allocation problems. For more information, see "How To: Use CLR Profiler" in the "How To" section of this guide.
A high value of Pages/sec indicates that your application does not have sufficient memory. The average of Pages Input/sec divided by the average of Page Reads/sec gives the number of pages per disk read. This value should not generally exceed five pages per read. A value greater than five indicates that the system is spending too much time paging and requires more memory (assuming that the application has been optimized). The System Monitor graph shown in Figure 17.8 is symptomatic of insufficient memory.

{Insert figure: CH17 – Insufficient Memory.gif}
Figure 17.8
Insufficient memory
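
The pages-per-read ratio described above is simple arithmetic. The sketch below assumes the Pages Input/sec and Page Reads/sec samples have already been collected into lists; the function name is hypothetical.

```python
PAGES_PER_READ_LIMIT = 5.0  # guideline value from the text above


def pages_per_disk_read(pages_input_per_sec, page_reads_per_sec):
    """Average Pages Input/sec divided by average Page Reads/sec."""
    if not pages_input_per_sec or not page_reads_per_sec:
        raise ValueError("both sample lists must be non-empty")
    avg_pages = sum(pages_input_per_sec) / len(pages_input_per_sec)
    avg_reads = sum(page_reads_per_sec) / len(page_reads_per_sec)
    if avg_reads == 0:
        return 0.0  # no hard page reads were recorded in this window
    return avg_pages / avg_reads


# Example: averages of 150 pages/sec and 25 reads/sec give 6 pages per read,
# which is above the guideline of five.
ratio = pages_per_disk_read([120, 180], [20, 30])
print(ratio > PAGES_PER_READ_LIMIT)
```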
Tuning Options
If you determine that your application has memory issues, your options include adding more memory, stopping services that you do not require, and removing unnecessary protocols and drivers. Tuning considerations include:
* Deciding when to add memory
* Page file optimization

Deciding When to Add Memory
To determine the impact of excessive paging on disk activity, multiply the values of the PhysicalDisk\Avg. Disk sec/Transfer and Memory\Pages/sec counters. If the product of these counters exceeds 0.1, paging is taking more than 10 percent of disk access time. If this occurs over a long period, you probably need more memory. After upgrading your system’s memory, measure and monitor again.
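
As a worked example of this calculation, the following sketch multiplies the two counter values and compares the product with the 0.1 guideline. The function name and the sample values are illustrative assumptions.

```python
PAGING_SHARE_LIMIT = 0.1  # 10 percent of disk access time, per the text above


def paging_share_of_disk_time(avg_disk_sec_per_transfer, pages_per_sec):
    """Product of Avg. Disk sec/Transfer and Memory Pages/sec.

    The result approximates the fraction of each second that the disk
    spends servicing paging activity.
    """
    return avg_disk_sec_per_transfer * pages_per_sec


# Example: 8 ms per transfer and 20 pages/sec is roughly 0.16,
# that is, about 16 percent of disk access time spent on paging.
share = paging_share_of_disk_time(0.008, 20)
print(share > PAGING_SHARE_LIMIT)
```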
To save memory:
* Turn off services you do not use. Stopping services that you do not use regularly saves memory and improves system performance.
* Remove unnecessary protocols and drivers. Even idle protocols use space in the paged and nonpaged memory pools. Drivers also consume memory, so you should remove unnecessary ones.

Page File Optimization
You should optimize the page file to improve the virtual memory performance of your server. The combination of physical memory and the page file is called the virtual memory of the system. When the system does not have enough physical memory to execute a process, it uses the page file on disk as an extended memory source. This approach slows performance. To ensure an optimized page file:
* Increase the page file size on the system to 1.5 times the size of physical memory available, but only to a maximum of 4,095 MB. The page file needs to be at least the size of the physical memory to allow the memory to be written to the page file in the event of a system crash.
* Make sure that the page file is not fragmented on a given partition.
* Separate the data files and the page file onto different disks only if the disk is a bottleneck because of heavy I/O activity. Otherwise, these files should preferably be on the same physical drive and the same logical partition. This keeps the data files and the page file physically close to each other and avoids the time spent seeking between two different logical drives.
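
The sizing rule in the first item of this list (1.5 times physical memory, capped at 4,095 MB) can be written out as follows. The function names are hypothetical. Note that on servers with more than 4,095 MB of physical memory the cap is smaller than physical memory, so the second helper reports whether a page file of the suggested size could still hold a full copy of memory after a crash.

```python
PAGE_FILE_CAP_MB = 4095       # maximum size given in the list above
PAGE_FILE_MULTIPLIER = 1.5    # 1.5 times physical memory


def suggested_page_file_mb(physical_memory_mb):
    """Return the suggested page file size in MB for the given physical memory."""
    if physical_memory_mb <= 0:
        raise ValueError("physical memory must be positive")
    return min(int(physical_memory_mb * PAGE_FILE_MULTIPLIER), PAGE_FILE_CAP_MB)


def covers_physical_memory(physical_memory_mb):
    """True if the suggested size is at least the size of physical memory."""
    return suggested_page_file_mb(physical_memory_mb) >= physical_memory_mb


for ram_mb in (1024, 2048, 4096):
    print(ram_mb, suggested_page_file_mb(ram_mb), covers_physical_memory(ram_mb))
```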

To configure the page file size
1. Open Control Panel.
2. Double-click the System icon.
3. Select the Advanced tab.
4. Click Performance Options.
5. Click Change. The Virtual Memory dialog box appears (see Figure 17.9).

{Insert figure: CH17 – Virtual Memory.gif}
Figure 17.9
Virtual memory settings
6. Enter new values for Initial size and Maximum size. Click Set, and then click OK.

More Information
For more information about the location and partitioning of the page file, see Knowledge Base article 197379, "Configuring Page Files for Optimization and Recovery," at http://support.microsoft.com/default.aspx?scid=kb;en-us;197379.
Disk I/O
Disk I/O refers to the number of read and write operations performed by your application on a physical disk or multiple disks installed in your server. Common activities that can cause disk I/O-related bottlenecks include long-running file I/O operations, data encryption and decryption, reading unnecessary data from database tables, and a shortage of physical memory that leads to excessive paging activity. Slow hard disks are another factor to consider.
To resolve disk-related bottlenecks:
* Start by removing any redundant disk I/O operations in your application.
* Identify whether your system has a shortage of physical memory, and, if so, add more memory to avoid excessive paging.
* Identify whether you need to separate your data onto multiple disks.
* Consider upgrading to faster disks if you still have disk I/O bottlenecks after doing all of the above.

Configuration Overview
Microsoft Windows® 2000 retrieves programs and data from disk. The disk subsystem can be the most important aspect of I/O performance, but problems can be masked by other factors, such as lack of memory. Performance console disk counters are available within both the LogicalDisk and PhysicalDisk objects.
Metrics
The performance counters shown in Table 17.3 help you identify disk I/O bottlenecks.
Table 17.3: Performance Counters Used to Identify Disk I/O Bottlenecks
Area           Counter
PhysicalDisk   Avg. Disk Queue Length
PhysicalDisk   Avg. Disk Read Queue Length
PhysicalDisk   Avg. Disk Write Queue Length
PhysicalDisk   Avg. Disk sec/Read
PhysicalDisk   Avg. Disk sec/Transfer
PhysicalDisk   Disk Writes/sec

For more information about how to measure these counters, their thresholds, and their significance, see "Disk I/O" in Chapter 15, "Measuring .NET Application Performance."

Note: When attempting to analyze disk performance bottlenecks, you should always use physical disk counters. In Windows 2000, physical disk counters are enabled by default, but logical disk counters are disabled by default. If you use software RAID, you should enable logical disk counters by using the following command.
DISKPERF -YV


Tuning Options
If you determine that disk I/O is a bottleneck, you have a number of options:
* Defragment your disks. Use the Disk Defragmenter system tool.
* Use Diskpar.exe on Windows 2000 to reduce performance loss due to misaligned disk tracks and sectors. You can get Diskpar.exe from the Windows 2000 Resource Kit.
* Use stripe sets to process I/O requests concurrently over multiple disks. The type you use depends on your data-integrity requirements. If your applications are read-intensive and require fault tolerance, consider a RAID 5 volume. Use mirrored volumes for fault tolerance and good I/O performance overall. If you do not require fault tolerance, implement stripe sets for fast reading and writing and improved storage capacity. When stripe sets are used, disk utilization per disk should fall due to distribution of work across the volumes, and overall throughput should increase.
If you find that there is no increased throughput when scaling to additional disks in a stripe set, your system might be experiencing a bottleneck due to contention between disks for the disk adapter. You might need to add an adapter to better distribute the load.
* Place multiple drives on separate I/O buses, particularly if a disk has an I/O-intensive workload.
* Distribute workload among multiple drives. Windows Clustering and Distributed File System provide solutions for load balancing on different drives.
* Limit your use of file compression or encryption. File compression and encryption are I/O-intensive operations. You should only use them where absolutely necessary.
* Disable creation of short names. If you are not supporting MS-DOS or Windows 3.x clients, disable short names to improve performance. To disable short names, change the default value of the NtfsDisable8dot3NameCreation registry entry (in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem) to 1.
* Disable last access update. By default, NTFS updates the date and time stamp of the last access on directories whenever it traverses the directory. For a large NTFS volume, this update process can slow performance. To disable automatic updating, create a new REG_DWORD registry entry named NtfsDisableLastAccessUpdate in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem and set its value to 1.

Caution: Some applications, such as incremental backup utilities, rely on the NTFS update information and cease to function properly without it.

* Reserve appropriate space for the master file table. Add the NtfsMftZoneReservation entry to the registry as a REG_DWORD in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem. When you add this entry to the registry, the system reserves space on the volume for the master file table. Reserving space in this manner allows the master file table to grow optimally. If your NTFS volumes generally contain relatively few files that are large, set the value of this registry entry to 1 (the default). Typically you can use a value of 2 or 3 for moderate numbers of files, and use a value of 4 (the maximum) if your volumes tend to contain a relatively large number of files. However, make sure to test any settings greater than 2, because these greater values cause the system to reserve a much larger portion of the disk for the master file table.
* Use the most efficient disk systems available, including controller, I/O, cabling, and disk. Use intelligent drivers that support interrupt moderation or interrupt avoidance to alleviate the interrupt activity for the processor due to disk I/O.

* Check whether you are using the appropriate RAID configuration. Use RAID 10 (striping and mirroring) for best performance and fault tolerance. The tradeoff is that using RAID 10 is expensive. Avoid using RAID 5 (parity) when you have extensive write operations.
* Consider using database partitions. If you have a database bottleneck, consider using database partitions and mapping disks to specific tables and transaction logs. The primary purpose of partitions is to overcome disk bottlenecks for large tables. If you have a table with a large number of rows and you determine that it is the source of a bottleneck, consider using partitions. For SQL Server, you can use file groups to improve I/O performance. You can associate tables with file groups, and then associate the file groups with a specific hard disk. For information about file groups, see Chapter 14, "Improving SQL Server Performance."
* Consider splitting files across hard disks. If you are dealing with extensive file-related operations, consider splitting the files across a number of hard disks to spread the I/O load across multiple disks.
* Check the feasibility of caching in RAM any static data that is being frequently read.
* Consider increasing memory, if you have excessive page faults.
* Consider using a disk with a higher RPM or shifting to a Storage Area Network (SAN) device.
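
The three NTFS registry settings described in the tuning options above (NtfsDisable8dot3NameCreation, NtfsDisableLastAccessUpdate, and NtfsMftZoneReservation) can be captured in a .reg file so that the change is reviewable and repeatable. The sketch below only builds the file text; it does not touch the registry. Generating a .reg file is an approach assumed here for illustration and is not part of the original guidance, and the function name is hypothetical. Keep the earlier caution about last access updates in mind, and test any change on a non-production server first.

```python
FILESYSTEM_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem"


def build_ntfs_reg_text(disable_short_names=True, disable_last_access=True, mft_zone=1):
    """Return .reg file text for the three NTFS settings described above."""
    if mft_zone not in (1, 2, 3, 4):
        raise ValueError("NtfsMftZoneReservation must be between 1 and 4")
    values = {
        "NtfsDisable8dot3NameCreation": int(disable_short_names),
        "NtfsDisableLastAccessUpdate": int(disable_last_access),
        "NtfsMftZoneReservation": mft_zone,
    }
    lines = ["Windows Registry Editor Version 5.00", "", f"[{FILESYSTEM_KEY}]"]
    # REG_DWORD values are written as eight hexadecimal digits.
    lines += [f'"{name}"=dword:{value:08x}' for name, value in values.items()]
    return "\n".join(lines) + "\n"


print(build_ntfs_reg_text(mft_zone=2))
```

Saving the returned text to a file and importing it with the Registry Editor applies all three values under the FileSystem key in one step.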

Network I/O
Network I/O relates to the amount of data being sent and received by all of the interface cards in your server. Common activities that can cause network I/O-related bottlenecks include excessive numbers of remote calls, large amounts of data sent and received with each call, network bandwidth constraints, and all of the data being routed through a single network interface card (NIC).
To resolve network I/O bottlenecks:
* Reduce the number of remote calls and the amount of data sent across the network. Ensure that you do not exceed your bandwidth constraint levels.
* After you have optimized your code, determine whether you need to divide the traffic on the server among multiple NICs. You can divide traffic based on protocols used, or you can use separate NICs to communicate with separate network segments.
* Consider upgrading your NIC.

Configuration Overview
Monitor both front and back interfaces for indicators of possible bottlenecks. To monitor network-specific objects in Windows 2000, you need to install the Network Monitor Driver.
To install the Network Monitor Driver
1. In Control Panel, double-click Network and Dial-up Connections.
2. Select any connection.
3. On the File menu, click Properties.
4. On the General tab, click Install.
5. Click Protocol, and then click Add.
6. Click Network Monitor Driver, and then click OK.
7. Click Close.

Metrics
The performance counters shown in Table 17.4 help you identify network I/O bottlenecks.
Table 17.4: Performance Counters Used to Identify Network I/O Bottlenecks
Area                Counter
Network Interface   Bytes Total/sec
Network Interface   Bytes Received/sec
Network Interface   Bytes Sent/sec
Server              Bytes Total/sec
Protocol            Protocol_Object\Segments Received/sec
Protocol            Protocol_Object\Segments Sent/sec
Processor           % Interrupt Time


For more information about how to measure these counters, their thresholds, and their significance, see "Network I/O" in Chapter 15, "Measuring .NET Application Performance."
Bottleneck Identification
If the rate at which bytes are sent and received is greater than your connection bandwidth or the bandwidth your network adapter can handle, a network bandwidth bottleneck occurs. This rate is measured by Network Interface\Bytes Total/sec.
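
Because Bytes Total/sec is reported in bytes and link speeds are usually quoted in bits per second, the comparison needs a unit conversion. The following sketch shows the arithmetic; the function name and the sample values are illustrative assumptions.

```python
def network_utilization(bytes_total_per_sec, link_bits_per_sec):
    """Return Bytes Total/sec as a fraction of the link bandwidth."""
    if link_bits_per_sec <= 0:
        raise ValueError("link bandwidth must be positive")
    return (bytes_total_per_sec * 8) / link_bits_per_sec


# Example: 11,000,000 bytes/sec on a 100 Mbps adapter is 88,000,000 bits/sec,
# or roughly 88 percent of the nominal link bandwidth.
utilization = network_utilization(11_000_000, 100_000_000)
print(f"{utilization:.0%}")
```

Values that stay close to 1.0 suggest that the adapter or connection is near its bandwidth limit and that the tuning options that follow are worth considering.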
Tuning Options
If you determine that network I/O is a bottleneck, you have the following options:
* Distribute client connections across multiple network adapters. If your system communicates over Token Ring, Fiber Distributed Data Interface (FDDI), or switched Ethernet networks, attempt to balance network traffic by distributing client connections across multiple network adapters. When using multiple network adapters, make sure that the network adapters are distributed among the Peripheral Component Interconnect (PCI) buses. For example, if you have four network adapters with three PCI buses, one 64-bit and two 32-bit, allocate two network adapters to the 64-bit bus and one adapter to each 32-bit bus.
* Use adapters with the highest bandwidth available for best performance. Increasing bandwidth increases the number of transmissions that occur and in turn creates more work for your system, including more interrupts. Remove unused network adapters to reduce overhead.
* Use adapters that support task offloading capabilities including checksum offloading, IPSec offloading, and large send offloading.
* Use network adapters that batch interrupts by means of interrupt moderation. High rates of interrupts from network adapters can reduce performance. By using network adapters that batch interrupts by means of interrupt moderation, you can alleviate this performance problem, provided that the adapter driver supports this capability. Another option is to bind interrupts arising from network adapters to a particular processor.
* If your network uses multiple protocols, place each protocol on a different adapter. Make sure to use the most efficient protocols, especially ones that minimize broadcasts.
* Divide your network into multiple subnets or segments, attaching the server to each segment with a separate adapter. Doing so reduces congestion at the server by spreading server requests.


