I have always had a problem with performance testing (aka load testing). I am quite adept at recording web tests and collecting counters, but I still find it troublesome to interpret the counters that I gather from load tests. The problem is that I need a plan that produces relevant results, from which I can conclude with confidence: "This system is a juggernaut! It can withstand more hits than Google.com." If you are like me and you have trouble planning your load tests, then maybe this post will help you. It describes the plan I like to follow and the conclusions I can draw from it.
Special thanks go out to Jeff Dunmall, our co-CEO, whose article "Real-World Load Testing Tips to Avoid Bottlenecks When Your Web App Goes Live" got me to write something more about load testing here, and to Randar Puust, who uses this plan (and thus now I use it too).
There are a number of facets to load testing:
- setting up a load testing rig and setting up your environment in Visual Studio
- recording the web tests
- planning the load tests
- collecting performance counters and figuring out what the recorded results mean
Without a plan, running the load tests and reading the results will be meaningless. A performance test plan will keep you on track throughout the last two facets mentioned above. The results from a test plan will allow you to determine:
- the maximum number of users that can hit the site before it becomes unresponsive
- whether the system can recover from overload
- whether there are memory leaks in the code
A performance test plan consists of four types of tests that need to be run in the following order:
1. Performance Tests
2. Load Tests
3. Stress Tests
4. Endurance Tests
The performance tests are the benchmark tests. By this point you should have recorded web tests that cover the most-used parts of your web application. Now you will create a load test that gathers and runs all of the web tests that you have recorded. Run the load test with settings similar to these:
- load: low end of the expected number of users for the system. I like to start with 5 or 10.
- time: 10 minutes
- think time: 5 seconds
- warmup: 30 seconds to 1 minute is more than enough
A load test like this should not stress the web server; it should be low-impact. The goal is to have a benchmark: the same test is run again later in this plan, and its results are compared against this first run. Collect and save the result counters from the web server and from the client. Web server counters to collect are "CPU %" and "Used Memory". Client counters to collect are "Response time per page" and "Number of requests per second". Other counters you might like to save are "pages that take the longest time to load", "number of errors", and "network usage"; these are useful for neat reports that you might want to compile.
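As a rough illustration, the benchmark settings and the saved counters could be captured like this. It is a minimal Python sketch, not anything provided by Visual Studio: the names `LoadTestSettings`, `BASELINE`, and `summarize`, the counter keys, and the sample numbers are all made up for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class LoadTestSettings:
    """Hypothetical container for the settings listed above."""
    users: int
    duration_s: int
    think_time_s: int
    warmup_s: int


# Benchmark run: low load, 10 minutes, 5 s think time, 1 minute warmup.
BASELINE = LoadTestSettings(users=10, duration_s=10 * 60, think_time_s=5, warmup_s=60)


def summarize(samples):
    """Average each counter over the samples collected during one run.

    `samples` is a non-empty list of dicts that all share the same keys.
    """
    return {name: mean(s[name] for s in samples) for name in samples[0]}


# Made-up counter samples, only to show the shape of the data.
samples = [
    {"cpu_pct": 10.0, "used_memory_mb": 800.0, "response_time_s": 0.5, "requests_per_s": 4.0},
    {"cpu_pct": 14.0, "used_memory_mb": 820.0, "response_time_s": 0.25, "requests_per_s": 6.0},
]
baseline_result = summarize(samples)
```

Saving one summary per run keeps the later comparisons simple: every subsequent test in the plan produces a result of the same shape.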
The load tests stress the web application and try to find its breaking point. The goal of these tests is to find the lucky number X: the number of users concurrently hitting the system at which it becomes unresponsive or its response times become unacceptable. For this part of the plan you will need to run a number of load tests. In each case, increase the number of users and collect the counters:
- load: 10 users, 15 users, 20 users, 50 users... until the response times get too large, or server CPU % gets dangerously high.
- time, think time, warmup: same as for the performance test. Basically we are running the performance test from step 1 but with more and more users.
By the end of this step you will know the maximum number of users that can hit the system.
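Picking X out of the series of runs can be sketched as follows. This is only an illustration in Python: the function name `find_breaking_point`, the thresholds (5 seconds, 90% CPU), and the run results are assumptions, and what counts as "unacceptable" depends on your application.

```python
def find_breaking_point(runs, max_response_s=5.0, max_cpu_pct=90.0):
    """Return the smallest user count whose run exceeded a threshold.

    `runs` is a list of (users, avg_response_s, cpu_pct) tuples, one per
    load test. Returns None if no run crossed either threshold, which
    means the load should be increased further.
    """
    for users, avg_response_s, cpu_pct in sorted(runs):
        if avg_response_s > max_response_s or cpu_pct > max_cpu_pct:
            return users
    return None


# Made-up results for the 10, 15, 20 and 50 user runs.
runs = [
    (10, 0.4, 20.0),
    (15, 0.5, 30.0),
    (20, 0.9, 45.0),
    (50, 6.2, 97.0),
]
x = find_breaking_point(runs)
```

With these sample numbers the 50-user run is the first to cross a threshold, so narrowing the steps between 20 and 50 users would give a more precise X.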
Stress tests check that the web application can recover from errors induced by overload.
- load: X + 20% (where X is the magic number from step 2 which causes the system to be unstable)
- time: 1 minute
By the end of this test the system should be unresponsive. If it is not, increase the overload beyond 20%. One way to check that it is unresponsive is to confirm that response times are suddenly far worse than the results in step 2 and that CPU % usage on the server is close to 100%.
Immediately after this test finishes, run the performance test from step 1 again. If it runs fine and the results are comparable to the original run from step 1, then your system passes! It can safely recover from overload.
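The stress load and the "comparable to step 1" check can be sketched like this. It is a hedged Python example: `stress_users`, `recovered`, the counter names, and the 10% tolerance are assumptions chosen for illustration, since the post does not define how close "comparable" has to be.

```python
import math


def stress_users(x, overload_pct=20):
    """Number of users for the stress test: X plus overload_pct percent, rounded up."""
    return math.ceil(x * (100 + overload_pct) / 100)


def recovered(baseline, after_stress, tolerance=0.10):
    """True if every counter after the stress test is at most
    `tolerance` (as a fraction) above its baseline value.

    Both arguments map counter names to averages where lower is better,
    for example response time and CPU %.
    """
    return all(
        after_stress[name] <= value * (1 + tolerance)
        for name, value in baseline.items()
    )


# Made-up numbers: the step 1 benchmark and two possible reruns after the stress test.
baseline = {"response_time_s": 0.40, "cpu_pct": 20.0}
healthy_rerun = {"response_time_s": 0.42, "cpu_pct": 21.0}
degraded_rerun = {"response_time_s": 0.90, "cpu_pct": 21.0}
```

Here `stress_users(50)` gives 60 users, `recovered(baseline, healthy_rerun)` is true, and `recovered(baseline, degraded_rerun)` is false because the response time more than doubled.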
The endurance tests are unrelated to the other three tests. These tests are run over a long period of time, so that many users hit the site. If there are memory leaks, then over the course of the endurance test the free memory on the web server should decrease dramatically.
- load: X - 20% (where X is the magic number from step 2 which causes the system to be unstable). We use minus 20% because we don't want to overload the system.
- time: 12 hours
Remember to record the "Free" or "Used Memory" counter on the web server. The goal at the end of this step is to make sure that free memory on the web server has not decreased dramatically (equivalently, that used memory has not grown dramatically), which suggests there are no memory leaks in your code. If that is the case with your results, then tip of the hat to you!
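One way to turn "has not decreased dramatically" into a number is to fit a straight line through the memory samples and look at its slope. The Python sketch below does this with a plain least-squares fit on the "Used Memory" counter. The function names, the 10 MB per hour threshold, and the sample data are assumptions for illustration, and a steady upward trend is only a hint of a leak, not proof.

```python
def memory_trend_mb_per_hour(samples):
    """Least-squares slope of used memory over time.

    `samples` is a list of (hours_elapsed, used_memory_mb) pairs with
    at least two distinct timestamps.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    return covariance / variance


def looks_like_leak(samples, max_growth_mb_per_hour=10.0):
    """True if used memory grows faster than the (assumed) threshold."""
    return memory_trend_mb_per_hour(samples) > max_growth_mb_per_hour


# Made-up hourly samples over a 12 hour endurance run.
leaking = [(hour, 500.0 + 40.0 * hour) for hour in range(13)]
steady = [(hour, 500.0 + (hour % 2)) for hour in range(13)]
```

In this example `leaking` grows by about 40 MB per hour and is flagged, while `steady` only wobbles around 500 MB and is not.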
At the end of these four types of tests you should have a better idea of where your system stands in terms of the load it can handle. You will know:
- the maximum number of users it can handle before performance is unacceptable
- whether it can recover after an overload
- whether there are any memory leaks