

How To: Conduct Performance Testing


Applies To

* Performance Testing

Summary

This How To explores, at a high level, an approach to performance testing your applications and the systems that support them. Performance testing is typically done to help identify bottlenecks in a system, establish a baseline for future testing, and collect other performance-related data to help stakeholders make informed decisions about the overall quality of the application. In addition, the results from performance testing and analysis can help you estimate the hardware configuration required to support the application(s) when you go live in production.

Contents

* Objectives
* Overview
* Summary of Steps
* Step 1. Identify Desired Performance Characteristics
* Step 2. Identify Test Environment
* Step 3. Create Test Scripts
* Step 4. Identify Metrics
* Step 5. Create Performance Tests
* Step 6. Execute Tests
* Step 7. Analyze the Results, Report and Retest
* Resources

Objectives

* Become familiar with performance testing fundamentals.
* Learn a basic approach to performance testing Web Applications.
* Learn how to establish performance testing baselines.

Overview

At its simplest level, the basic approach to conducting performance testing can be viewed as a sequence of planning and preparation steps, followed by test execution and data analysis that are repeated until performance testing is deemed complete. Below, those steps are represented as a simple but powerful set of activities that are easy to apply on any performance testing project, be it the smallest unit-level performance test or a large-scale production simulation and capacity planning initiative. The power behind this approach is not that it anticipates and prescribes action for every possible circumstance. Rather, the power lies in the fact that the same overall thought process and activities apply as effectively to unexpected cases as to expected ones.

Performance testing can be thought of as a process of identifying how an application responds to a specified set of conditions and input. To accomplish this, multiple individual performance test scenarios (suites, cases, scripts) are often needed to cover the most important conditions and/or input of interest. To improve the accuracy of the output, the application should, if at all possible, be hosted on a hardware infrastructure that is separate from your production environment but still closely representative of it. By examining your application's behavior (the output) under simulated load conditions (the input), you can typically identify whether your application is trending toward or away from the desired performance characteristics.

The most common reasons for conducting performance testing can be summarized as follows:
* To compare the current performance characteristics of the application with the performance characteristics that equate to end-user satisfaction when using the application.
* To verify that the application exhibits the desired performance characteristics within the budgeted constraints of resource utilization. These performance characteristics may include several different parameters, such as the time it takes to complete a particular usage scenario (known as response time) or the number of simultaneous requests that can be supported for a particular operation at a given response time. The resource characteristics may be set with respect to server resources such as processor utilization, memory, disk I/O, and network I/O.
* To analyze the behavior of the Web Application at various load levels. The behavior is measured in metrics related to performance characteristics and other metrics that help to identify the bottlenecks in the application.
* To identify the bottlenecks in the Web Application. The bottlenecks can be caused by several issues such as memory leaks, slow response times, or contention under load.
* To determine the capacity of the application’s infrastructure and the future resources required to deliver acceptable application performance.
* To compare different system configurations to determine which one works best for the application and for the business.

Performance testing of Web Applications is frequently sub-categorized into several types of tests. Load tests and stress tests are two of the most common. Additionally, performance testing can add value at any point in the development life cycle. For example, performance unit testing frequently occurs very early in the life cycle, and endurance testing is generally saved for very late in the life cycle (for more, see Explained: Types of Performance Testing).

Input

Common input items for performance testing include:
* Desired performance characteristics.
* Application usage characteristics (scenarios).
* Workload characteristics.
* Metrics or characteristics of interest for each scenario.
* Performance test plans, techniques, tools and strategies.

Output

Common output items for performance testing include:
* Updated test plans
* Bottleneck suspects deserving additional analysis
* Current operating capacity
* Behavior and performance characteristics of the application at various load levels

Steps

Step 1. Identify Desired Performance Characteristics
Step 2. Identify Test Environment
Step 3. Create Test Scripts
Step 4. Identify Metrics
Step 5. Create Performance Tests
Step 6. Execute Tests
Step 7. Analyze the Results, Report and Retest

Step 1. Identify Desired Performance Characteristics

This step walks you through test planning and identifying the desired characteristics for the application. It involves the following steps.

Identify Desired Characteristics

The performance characteristics should be identified, or at least estimated, early in the application development life cycle. Write down the performance characteristics that equate to a successfully performing application to your users and stakeholders.
Characteristics that frequently correlate to a user’s or stakeholder’s satisfaction typically include:
* Response time. For example, the product catalog must be displayed in less than 3 seconds.
* Throughput. For example, the system must support 100 transactions per second.
* Resource utilization. For example, CPU utilization must not exceed 75%. Other important resources to consider when setting objectives are memory, disk I/O, and network I/O.

{See How To: Quantify End-User Response Time Goals and How To: Identify Performance Testing Objectives for more information about capturing and recording desired Performance Characteristics}

Step 2. Identify Test Environment

The degree of similarity between the hardware and network configuration of the application under test and the hardware and network configuration of the application as it will be in production is frequently a significant consideration when deciding what performance tests to conduct and what size loads to test. It is important to remember that it isn’t only the physical environment that impacts performance testing; the business or project environment will also flavor your performance testing.

In addition to the physical and business environments, consider the following when you identify your test environment:

* Identify Test Data Volumes: determine what kind of data is consumed by the application at each step of activity through the system. How many records move through the end-to-end transaction? How big are the queried result sets? How much unique data will you need to feed the automation to emulate real-world conditions?
* Identify Critical System Components: does the system have any known bottlenecks or weak points? Are there any integration points that are beyond your control for testing?

Identify Physical Environment

The key factor is to completely understand the similarities and differences between test and production environments. Some critical factors to consider are:

* Machine Configuration
* Machine Hardware (Processor, RAM etc)
* Overall Machine Setup (software installations etc).
* Network Architecture and the location of the end-users

Identify the Business Environment

Consider the following test project practices:
* Document Test Team Roles and Contacts – (e.g. the network guy, the database guy, the developer, the test scripters, the business analysts, the PM, the CIO)
* Document Risks to Failure of the Testing Project (e.g. if we can’t get a lab, or the tool doesn’t work, or the LOB folks don’t give us input)

Considerations

* Not many performance testers install, configure and administer the application under test, but it is beneficial for them to have access to the servers, the software and the administrators who do.
* The following items are pointers for configuring the load generation tool:
* Performance testing is frequently conducted on an isolated network segment to prevent disruption of other business operations. If that is not true for you, ensure that you have permission to generate load during certain hours over the network you are on.
* Get to know the IT staff. You will likely need their support to do things like monitor overall network traffic and configure your load generation tool to simulate a realistic number of IP addresses.
* Remember to figure out how to get load balancers to treat generated load like actual user load.
* Validate that firewalls, DNS, routing, etc. are treating generated load like natural load and that the test environment is treated similarly to production.
* Determine how much load you can generate before the load generator itself becomes a bottleneck.
* It is frequently appropriate to have systems administrators set up resource monitoring software, diagnostic tools, etc. on the servers hosting the application under test (AUT).

Step 3. Create Test Scripts

Typically in the process of identifying the desired performance characteristics of the application in Step 1, you should also have identified the key user scenarios for the web application. {See How To: Identify Key Scenarios (That Hasn’t Been Written Yet) for more information.} To create test scripts from those scenarios, do the following:

* Identify the activities involved in each of the scenarios. For example, the "Place an Order" scenario includes the following activities:
* Log on to the application.
* Browse a product catalog.
* Etc.
* For each activity, define the data inputs and outputs in a table:

|| Scenario Step || Data Inputs || Data Outputs ||
| Log on to the application | Username (unique); Password (matched to username) | |
| Browse a product catalog | Catalog Tree/Structure (static); User Type (weighted) | Product Description; Sku#; Catalog Page Title; Advertisement Category |

Once you have detailed the individual steps, create a test script to emulate the requests against the application for each of the key scenarios identified {for more information, see HowTo:blah}.
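
The data inputs in the table above (unique usernames, passwords matched to usernames, a weighted user type) usually have to be generated before scripting starts. The following is a minimal Python sketch of one way to do that. The function name, the naming scheme for usernames and passwords, and the user types and weights are all hypothetical and exist only for illustration.

```python
import random

def build_test_users(count, user_type_weights, seed=0):
    """Return `count` records with a unique username, a matching password,
    and a user type drawn according to the given weights."""
    rng = random.Random(seed)  # fixed seed so the same data set can be rebuilt
    types = list(user_type_weights)
    weights = [user_type_weights[t] for t in types]
    users = []
    for i in range(count):
        username = f"perfuser{i:05d}"   # unique by construction (hypothetical scheme)
        password = f"pw-{username}"     # matched to the username (hypothetical scheme)
        user_type = rng.choices(types, weights=weights, k=1)[0]
        users.append({"username": username,
                      "password": password,
                      "user_type": user_type})
    return users

# Hypothetical weighting: most virtual users are guests.
users = build_test_users(1000, {"guest": 70, "member": 25, "admin": 5})
```

Keeping the seed fixed makes it possible to regenerate the same data set when a test has to be repeated.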

Considerations


* If the request accepts parameters, ensure the parameter data is populated properly with random and/or unique data to avoid any server-side caching.
* Remember to account for user abandonment, if it applies to your application.
* It is useful to create the test script such that it can optionally execute multiple iterations without end, simplifying the ability to vary the test duration.
* If appropriate, set a delay for each iteration of the transaction. This delay serves as a control on the test/experiment.
* If the tool doesn’t do so automatically, you will likely want to add a wrapper around the requests in the test script to measure the request response time.
* Beware of allowing your tools to influence your test design. Better tests almost always result from designing tests on the assumption that they can be executed, and then adapting the test or the tool when that assumption proves false, than from avoiding particular tests because you assume you lack a tool to execute them.
* It is generally worth the time to make the script match your designed test over changing the designed test to save scripting time.
* Significant value can be gained from evaluating the data collected from the tests executed to test or validate script development.
* Ensure that your test design documentation (diagram, email, sketch, etc.) reflects what the script is actually doing.
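
Several of the considerations above (a wrapper that measures response time, a per-iteration delay, and the option to iterate without end) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fake_request` is a hypothetical stand-in for a real request to your application, and most load tools provide their own equivalents of `timed` and `run_script`.

```python
import time

def timed(request_fn, *args, **kwargs):
    """Call request_fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = request_fn(*args, **kwargs)
    return result, time.perf_counter() - start

def run_script(request_fn, iterations=None, think_time=0.0, max_seconds=None):
    """Run request_fn repeatedly and record one response time per iteration.

    iterations=None loops until max_seconds has elapsed; if both are None,
    the script runs without end and must be stopped externally.
    """
    timings = []
    started = time.monotonic()
    i = 0
    while iterations is None or i < iterations:
        if max_seconds is not None and time.monotonic() - started >= max_seconds:
            break
        _, elapsed = timed(request_fn, i)
        timings.append(elapsed)
        time.sleep(think_time)  # fixed delay between iterations
        i += 1
    return timings

def fake_request(iteration):
    """Hypothetical stand-in for a real request to the application."""
    time.sleep(0.001)
    return "ok"

timings = run_script(fake_request, iterations=5, think_time=0.0)
```

Because the duration is controlled by `iterations` or `max_seconds` rather than by the script body, the same script can serve both a short smoke test and a long run.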


Step 4. Identify Metrics

When identified and captured correctly, metrics provide information about how close your application is to your performance objectives. In addition, they can help you identify problem areas and bottlenecks within your application.

Using the desired performance characteristics identified in step 1, identify metrics to be captured which focus on measuring the performance and identifying the bottlenecks in the system.
When identifying metrics, use the performance objectives as the accepted level for each metric. These baseline values help you analyze your application's performance at varying load levels. Here is an example of metrics corresponding to the performance objectives identified in Step 1.

|| Metric || Accepted level ||
| Request execution time | Must not exceed 8 seconds |
| Throughput | 100 or more requests / second |
| % Processor Time | Must not exceed 75% |
| Memory available | 25% of total RAM |
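
Accepted levels like those in the table above can be recorded in a form that a script can check after each run. The sketch below is one possible way to do so in Python. The metric key names and the sample measurements are hypothetical, and it assumes that the memory entry means at least 25% of total RAM must remain available.

```python
import operator

# Each metric maps to (comparison, limit). Key names are hypothetical.
ACCEPTED_LEVELS = {
    "request_execution_time_s": (operator.le, 8.0),    # must not exceed 8 seconds
    "throughput_rps":           (operator.ge, 100.0),  # 100 or more requests/second
    "processor_time_pct":       (operator.le, 75.0),   # must not exceed 75%
    "memory_available_pct":     (operator.ge, 25.0),   # assumed: at least 25% of RAM free
}

def evaluate(measured):
    """Return {metric_name: True if the measured value is within its accepted level}."""
    return {name: compare(measured[name], limit)
            for name, (compare, limit) in ACCEPTED_LEVELS.items()}

# Made-up measurements from a single test run.
sample = {
    "request_execution_time_s": 6.2,
    "throughput_rps": 92.0,
    "processor_time_pct": 71.0,
    "memory_available_pct": 31.0,
}
results = evaluate(sample)
failed = sorted(name for name, ok in results.items() if not ok)
```

In this made-up run only the throughput metric falls outside its accepted level, so `failed` contains that one name.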

Considerations

* It may sound like common sense, but validate that all of the machines from which resource data will be collected have their system clocks synchronized. Finding out after the test is complete that the time stamps on the collected data do not line up will cost you significant time to adjust for, or cause you to dispose of the data entirely and repeat the tests after synchronizing the system clocks.
* Involve the developers and administrators both in determining which metrics are likely to add value and in deciding the best method to integrate the capture of those metrics into the test.
* Collecting metrics frequently produces very large volumes of data. It is tempting to reduce the amount of data through averaging. Use caution when using this or other data reduction techniques. It is quite common for the most valuable data to be lost when reducing data.
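
The caution about averaging can be illustrated with a small Python sketch. The sample data is invented, and the nearest-rank percentile used here is just one of several common percentile definitions.

```python
import statistics

def summarize(samples):
    """Summarize response times (seconds) without relying on the mean alone."""
    ordered = sorted(samples)

    def percentile(p):
        # Nearest-rank percentile for an integer p; integer arithmetic
        # avoids floating-point rounding in the rank calculation.
        rank = max(1, (p * len(ordered) + 99) // 100)
        return ordered[rank - 1]

    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": percentile(95),
        "max": ordered[-1],
    }

# Invented sample: 90 fast responses and 10 very slow ones.
samples = [1.0] * 90 + [12.0] * 10
summary = summarize(samples)
```

With this invented sample the mean is about 2.1 seconds, which looks comfortable, while the 95th percentile is 12 seconds. One request in ten is far slower than the average suggests, and that is exactly the kind of information averaging removes.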

Step 5. Create Performance Test

The details of creating an executable performance test are extremely tool specific. Regardless of the tool that you are using, creating a performance test typically involves taking a single instance of your test script (or virtual user) and gradually adding more instances and/or more scripts over time, thereby increasing the load on the component or system.

To know how many instances of which script are necessary to accomplish the objectives of your test, you need to identify a workload that appropriately represents the usage scenario related to the objective.

Identifying a Workload to combine User Scenarios

A workload profile consists of an aggregate mix of users performing various operations. Identify the workload associated with each of the identified key user scenarios. The following steps show how to identify the workload.
* Identify the distribution / ratio of work - For each key scenario, identify the distribution / ratio of work. The distribution is based on the number of users executing each scenario, which in turn depends on the purpose of your application.
* Identify the peak user loads - Identify the maximum expected concurrent users for the web application. Using the work distribution for each scenario, calculate the % user load per key scenario.
* Identify the user loads under a variety of conditions - Identify the maximum expected concurrent users for the web application at normal and peak hours. Using the work distribution for each scenario, calculate the % user load per key scenario at normal and peak hours.
For more information about how to create a workload model for your application, see "How to - Model Workload for Web Application" at <<Add url>>
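
The arithmetic behind these steps is simple enough to sketch in Python. The scenario names, the percentages, and the normal and peak user counts below are all hypothetical values chosen for illustration.

```python
def users_per_scenario(total_users, distribution_pct):
    """Split a concurrent-user count across scenarios by percentage of work."""
    if sum(distribution_pct.values()) != 100:
        raise ValueError("scenario percentages must add up to 100")
    # Rounding can leave the total off by a user or two for awkward splits.
    return {scenario: round(total_users * pct / 100)
            for scenario, pct in distribution_pct.items()}

# Hypothetical work distribution for three key scenarios.
distribution_pct = {"Browse catalog": 50, "Search": 30, "Place an Order": 20}

normal = users_per_scenario(200, distribution_pct)  # assumed normal-hours load
peak = users_per_scenario(500, distribution_pct)    # assumed peak-hours load
```

With these assumed numbers, the peak workload is 250 users browsing, 150 searching and 100 placing orders, and the normal workload is 100, 60 and 40 respectively.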

Creating Performance Test

Once you have identified the workload for each of the user scenarios to be tested, follow these steps to create a Performance Test.

* Create a Performance Test that starts with a single instance of the test script corresponding to each user scenario to be tested.
* Gradually add more instances over time, increasing the load for the user scenario to the maximum workload identified in the step above. It is important to allow sufficient time between each increase in the number of users, so that the system has time to stabilize before the next set of users begins executing the test case.
* Measure the resource utilization on the server(s): CPU, Memory, Disk and Network at a minimum.
* If you can, set thresholds in your testing tool according to your Performance Test Objectives. For example, the resource utilization thresholds can be set as follows:
* Processor\% Processor Time: 75 percent
* Memory\Available MBytes: 25 percent of total physical RAM
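
The gradual increase in load described above is often expressed as a step schedule: a list of points in time and the number of active users at each. The following Python sketch builds such a schedule. The step size, hold time and maximum user count are hypothetical, and real load tools usually have their own way of configuring the same idea.

```python
def step_ramp(max_users, step_users, hold_seconds, start_users=1):
    """Return a list of (start_time_s, active_users) steps.

    Each step is held for hold_seconds so the system can stabilize
    before more users are added.
    """
    if step_users <= 0:
        raise ValueError("step_users must be positive")
    schedule = []
    users = start_users
    t = 0
    while users < max_users:
        schedule.append((t, users))
        users = min(users + step_users, max_users)
        t += hold_seconds
    schedule.append((t, max_users))  # final step at the identified maximum
    return schedule

# Hypothetical values: start with 1 user, add 25 every 5 minutes, up to 100.
schedule = step_ramp(max_users=100, step_users=25, hold_seconds=300)
```

For these values the schedule is 1 user at 0 s, 26 at 300 s, 51 at 600 s, 76 at 900 s and 100 at 1200 s.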

Step 6. Execute Tests

Once the previous steps have been completed to some appropriate degree for a test you wish to execute, do the following:
* Validate that the test environment matches the configuration that you were expecting and/or designed your test for.
* Ensure that the test and the test environment are correctly configured for metrics collection.
* First execute a quick smoke test to make sure the test script and remote performance counters are working correctly.
* Reset the system (unless your scenario is to do otherwise) and start a formal test run execution.

Considerations

* If at all possible, execute every test twice. If the results aren't very similar, do it again... Try to determine why the different one is different.
* Observe your test during execution and pay close attention to "That seems odd" feelings... they are usually right, or at least valuable.
* No matter how far in advance a test is scheduled, give the team a 30 and 5 minute warning before launching the test (or starting the day's testing) and tell them whenever you're not going to be executing for more than 1 hour in succession.
* Don't process data, write reports or draw diagrams on your load generation machine while generating load. This could corrupt the data.
* Turn off active virus scanning on load generation machines during load generation, etc. to minimize the likelihood of unintentionally corrupting the data.
* Use the system manually during test execution so that you can later compare your observations with the results data.
* Remember to simulate ramp-up and cool-down periods appropriately.
* Don't throw away the first iteration due to script compilation, etc. Measure it separately so you know what the first user after a system-wide reboot can expect.
* Test execution is never really done, but eventually you will reach a point of diminishing returns on a particular test. When you stop gaining valuable information, change your test.
* If neither you nor the development team can figure out the cause of an issue in twice as long as it took the test to execute, it may be more efficient to eliminate one or more variables/potential causes and try again.
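
The advice to execute every test twice raises the question of what "very similar" means. One possible, and deliberately simple, interpretation is sketched below in Python: two runs count as similar when their median response times differ by no more than a chosen tolerance. The 10% tolerance and the sample timings are assumptions for illustration, not recommended values.

```python
import statistics

def runs_are_similar(run_a, run_b, tolerance=0.10):
    """Return True when the median response times of two runs differ
    by no more than `tolerance`, relative to the larger median."""
    median_a = statistics.median(run_a)
    median_b = statistics.median(run_b)
    baseline = max(median_a, median_b)
    if baseline == 0:
        return True  # both medians are zero, nothing to compare
    return abs(median_a - median_b) / baseline <= tolerance

# Invented response times (seconds) from three runs of the same test.
first_run = [1.0, 1.1, 0.9, 1.2, 1.0]
second_run = [1.05, 1.0, 1.1, 0.95, 1.0]
third_run = [1.6, 1.5, 1.7, 1.4, 1.6]

similar = runs_are_similar(first_run, second_run)
different = runs_are_similar(first_run, third_run)
```

A check like this only flags that two runs disagree. Determining why the different one is different still requires looking at the data and the environment.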

Step 7. Analyze the Results, Report and Retest

The following are important points to consider while analyzing the data.
* Analyze the captured data and compare the results against the metric's accepted level to determine whether the performance of the application being tested shows a trend toward or away from the performance objectives.
* If all the metric values are within accepted limits and none of the thresholds set have been violated then the tests have passed and you are done testing that particular scenario on that particular configuration.
* If the Test fails, a diagnosis and tuning activity is generally warranted {See some other HowTo}
* After fixing the bottleneck, repeat the process from Step 4 onward until the tests pass.

Note: If required, capture additional metrics in the subsequent test cycles. For example, suppose that during the first iteration of load tests, the process shows a marked increase in memory consumption, indicating a possible memory leak. In the subsequent iterations, additional memory counters related to generations can be captured to study the memory allocation pattern for the application.

Considerations

* Share results and raw data both broadly and immediately.
* Talk to the consumers of the data. Validate that the test did what it was meant to do and that the data means what you think it means... adjust fire immediately, before you forget what this test was all about.
* Filter out the fluff (if 87 of the 100 pages tested responded under their target, take them off the graph... they're just clutter).
* Report in pictures, supporting paragraphs, and strong but factual language (e.g., "The home page fails to achieve target response time for 75% of the users").
* Report in user and business language, not simply technical data (not “The CPU utilization is hovering at 85%” - rather, “The application server is not powerful enough to support the target load as the application is currently deployed/operating”.)
* Use current results to set priority for next test.
* After each test, tell the team what you expect the next two tests to be, so they can provide input on what they’d like done next while you are executing the current test.
* Always have supporting data handy and deliver it in the appendix.
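
Filtering out the fluff before reporting can be done with a few lines of code. The Python sketch below keeps only the pages that missed their target and orders them by how badly they missed it. The page names, measured times and targets are invented for illustration.

```python
def pages_missing_target(results):
    """Return (page, measured_s, target_s) for pages over target,
    ordered from the largest overrun to the smallest."""
    misses = [(page, measured, target)
              for page, (measured, target) in results.items()
              if measured > target]
    return sorted(misses, key=lambda row: row[1] - row[2], reverse=True)

# Invented results: page -> (measured response time, target), in seconds.
results = {
    "home": (4.2, 3.0),
    "catalog": (2.1, 3.0),
    "checkout": (9.5, 8.0),
    "search": (1.4, 3.0),
}

for page, measured, target in pages_missing_target(results):
    print(f"{page}: {measured:.1f} s against a target of {target:.1f} s")
```

Only the pages that need attention remain in the output, which keeps graphs and summaries focused. The full data set should still be kept as supporting data.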

Resources

<<TBD>>