Return to
HomePage
Overview: Performance Testing
Source: http://msdn.microsoft.com/library/en-us/dnpag/html/scalenetchapt16.asp
J.D. Meier, Srinath Vasireddy, Ashish Babbar, Rico Mariani, and Alex Mackman
Performance testing is used to verify that an application is able to perform under expected and peak load conditions, and that it can scale sufficiently to handle increased capacity. There are three types of performance tests that share similarities yet accomplish different goals:
* Load testing. Use load testing to verify application behavior under normal and peak load conditions. This allows you to verify that your application can meet your desired performance objectives; these performance objectives are often specified in a service level agreement. It enables you to measure response times, throughput rates, resource utilization levels, and to identify your application's breaking point, assuming that breaking point occurs below the peak load condition.
* Stress testing. Use stress testing to evaluate your application's behavior when it is pushed beyond the normal or peak load conditions. The goal of stress testing is to unearth application bugs that surface only under high load conditions. These can include such things as synchronization issues, race conditions, and memory leaks. Stress testing enables you to identify your application's weak points, and how it behaves under extreme load conditions.
* Capacity testing. Capacity testing is complementary to load testing and it determines your server's ultimate failure point, whereas load testing monitors results at various levels of load and traffic patterns. You perform capacity testing in conjunction with capacity planning. You use capacity planning to plan for future growth, such as an increased user base or increased volume of data. For example, to accommodate future loads you need to know how many additional resources (such as CPU, RAM, disk space, or network bandwidth) are necessary to support future usage levels. Capacity testing helps you identify a scaling strategy to determine whether you should scale up or scale out. For more information, refer to “How To: Perform Capacity Planning for .NET Applications” in the “How To” section of this guide.
This document demonstrates an approach to performance testing that is particularly effective when combined with the other principles in this guide.
Performance Testing
Performance testing is the process of identifying how an application responds to a specified set of conditions and input. Multiple individual performance test scenarios (suites, cases, scripts) are often needed to cover all of the conditions and/or input of interest. For testing purposes, if possible, the application should be hosted on a hardware infrastructure that is representative of the live environment. By examining your application's behavior under simulated load conditions, you identify whether your application is trending toward or away from its defined performance objectives.
Goals of Performance Testing
The main goal of performance testing is to identify how well your application performs in relation to your performance objectives. Some of the other goals of performance testing include the following:
* Identify bottlenecks and their causes.
* Optimize and tune the platform configuration (both the hardware and software) for maximum performance.
* Verify the reliability of your application under stress.
You may not be able to identify all the characteristics by running a single type of performance test. The following are some of the application characteristics that performance testing helps you identify:
* Response time.
* Throughput.
* Maximum concurrent users supported. For a definition of concurrent users, see “Testing Considerations,” later in this chapter.
* Resource utilization in terms of the amount of CPU, RAM, network I/O, and disk I/O resources your application consumes during the test.
* Behavior under various workload patterns including normal load conditions, excessive load conditions, and conditions in between.
* Application breaking point. The application breaking point means a condition where the application stops responding to requests. Some of the symptoms of breaking point include 503 errors with a “Server Too Busy” message, and errors in the application event log that indicate that the ASPNET worker process recycled because of potential deadlocks.
* Symptoms and causes of application failure under stress conditions.
* Weak points in your application.
* What is required to support a projected increase in load. For example, an increase in the number of users, amount of data, or application activity might cause an increase in load.
Performance Objectives
Most of the performance tests depend on a set of predefined, documented, and agreed-upon performance objectives. Knowing the objectives from the beginning helps make the testing process more efficient. You can evaluate your application’s performance by comparing it with your performance objectives.
You may run tests that are exploratory in nature to know more about the system without having any performance objective. But even these eventually serve as input to the tests that are conducted for evaluating performance against performance objectives.
Performance objectives often include the following:
* Response time or latency
* Throughput
* Resource utilization (CPU, network I/O, disk I/O, and memory)
* Workload
Response Time or Latency
Response time is the amount of time taken to respond to a request. You can measure response time at the server or client as follows:
* Latency measured at the server. This is the time taken by the server to complete the execution of a request. This does not include the client-to-server latency, which includes additional time for the request and response to cross the network.
* Latency measured at the client. The latency measured at the client includes the request queue, plus the time taken by the server to complete the execution of the request and the network latency. You can measure the latency in various ways. Two common approaches are time taken by the first byte to reach the client (time to first byte, TTFB), or the time taken by the last byte of the response to reach the client (time to last byte, TTLB). Generally, you should test this using various network bandwidths between the client and the server.
By measuring latency, you can gauge whether your application takes too long to respond to client requests.
Throughput
Throughput is the number of requests that can be served by your application per unit time. It can vary depending upon the load (number of users) and the type of user activity applied to the server. For example, downloading files requires higher throughput than browsing text-based Web pages. Throughput is usually measured in terms of requests per second. There are other units for measurement, such as transactions per second or orders per second.
Resource Utilization
Identify resource utilization costs in terms of server and network resources. The primary resources are:
* CPU
* Memory
* Disk I/O
* Network I/O
You can identify the resource cost on a per operation basis. Operations might include browsing a product catalog, adding items to a shopping cart, or placing an order. You can measure resource costs for a given user load, or you can average resource costs when the application is tested using a given workload profile.
A workload profile consists of an aggregate mix of users performing various operations. For example, for a load of 200 concurrent users (as defined below), the profile might indicate that 20 percent of users perform order placement, 30 percent add items to a shopping cart, while 50 percent browse the product catalog. This helps you identify and optimize areas that consume an unusually large proportion of server resources and response time.
Workload
In this chapter, we have defined the load on the application as simultaneous users or concurrent users.
Simultaneous users have active connections to the same Web site, whereas concurrent users hit the site at exactly the same moment. Concurrent access is likely to occur at infrequent intervals. Your site may have 100 to 150 concurrent users but 1,000 to 1,500 simultaneous users.
When load testing your application, you can simulate simultaneous users by including a random think time in your script such that not all the user threads from the load generator are firing requests at the same moment. This is useful to simulate real world situations.
However, if you want to stress your application, you probably want to use concurrent users. You can simulate concurrent users by removing the think time from your script.
For more information on workload modeling, see “How To: Model Workloads.
Return to
HomePage