Performance testing is the art and science of generating load or stress against a system or application environment. By replicating real-world situations, as well as the extremes to which the platform might be used, developers can gauge the infrastructure’s ability to support and sustain that load.
Did you ever do an experiment where you had to build a container that let you throw an egg off a building and have it reach the ground safely? This type of experiment is actually similar to the idea of performance testing. You come to the table with a prototype and then throw it off the building to find out if it works. Once you’ve found what doesn’t work, you redesign and reinforce those weak spots to make sure that egg stays safe and sound when it’s time to submit for a grade—in this case, customer and end user experience.
This testing, to make sure that systems and applications function properly, comes with a price tag. When companies have already gone over budget on developing a digital product, the idea of spending more to find out what's wrong may not be appealing. Some companies, when their budgets start to tighten, try to save money on performance testing with a bad idea: reducing the number of concurrent virtual users being tested. Instead of testing what happens when a hundred thousand people try to access their portal at once, they decide that only ten thousand are in the budget.
Then, to compensate for having fewer concurrent virtual users, test designers reduce the think time, increasing each user's velocity to 'trick' the system into thinking there are more users than there actually are.
The less that virtual user behavior mimics real user behavior, the more abstract the comparison becomes. This marginalizes the value of the test by decreasing the accuracy of the performance analysis. How do real people use systems? They need time to think—from reading the text on the screen before clicking “log in” to typing in their personal information, people don’t just click through a system in a few short seconds.
So, when a business wants to test their new app with a smaller number of virtual users, they compensate by also cutting corners on the think time. They believe that making the actions happen faster still adequately tests their system.
What happens when you reduce the number of virtual users?
In a nutshell, using this strategy leads to skewed data and an unrealistic testing environment. Let’s take a look.
Remember the business from earlier? Imagine they need to test the impact of 93,000 transactions per hour total, to recreate the conditions that were seen during a recent peak load event.
Calculations have indicated to their quality assurance team that with the load test scenarios and scripts, they need 1500 concurrent virtual users to hit the target number of transactions. This works out to effectively 62 transactions per hour per user. This means that each virtual user must iterate—go through a single loop of the load test script or action—62 times per hour, or slightly faster than one iteration per minute.
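The arithmetic above can be sketched in a few lines (the figures come from the scenario; the variable names are ours):

```python
# Target load observed during the peak event
target_tph = 93_000     # total transactions per hour
virtual_users = 1_500   # concurrent virtual users sized by the QA team

# Work each virtual user must do to hit the target
tph_per_user = target_tph / virtual_users     # 62 transactions per hour per user
seconds_per_iteration = 3600 / tph_per_user   # ~58 seconds per iteration

print(f"{tph_per_user:.0f} transactions/hour/user")
print(f"{seconds_per_iteration:.1f} seconds per iteration")
```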
However, when the company received the licensing cost for this scenario, it exceeded the budget by a significant margin. They realized they could only spend enough to license 500 virtual users. So, their quality assurance team was asked to find a way to configure the test to generate the same load (93,000 total transactions per hour) with 500 virtual users instead of 1500.
This means that each virtual user would have to do three times the work, or put another way, complete each iteration in one third the time of the 1500 virtual user model. Remember, the original number was 62 transactions per hour per user to hit our target. In this decreased virtual user model, that number triples to 186 transactions per hour per user. Said another way, each iteration would need to complete in less than 20 seconds. Decreasing think time to reach that pace creates a larger issue.
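Rerunning the same arithmetic for the reduced license count shows where the "less than 20 seconds" figure comes from (again, a sketch using the scenario's numbers):

```python
target_tph = 93_000   # same total transactions per hour
virtual_users = 500   # all the budget allows

# Tripled per-user workload and the resulting iteration budget
tph_per_user = target_tph / virtual_users     # 186 transactions/hour/user, triple the original 62
seconds_per_iteration = 3600 / tph_per_user   # ~19.4 seconds, under 20 seconds per iteration

print(f"{tph_per_user:.0f} transactions/hour/user -> {seconds_per_iteration:.2f}s per iteration")
```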
Why does it matter if you reduce think time to compensate?
Reducing think time may seem like a brilliant solution at first. With lots of virtual users, each one may be "sitting around doing nothing" for much of the time, so the total iteration (the time from the beginning to the end of the interaction) can be fairly lengthy. The idea is that fewer virtual users who do more things quicker, with a shorter iteration, could simulate the same effect as having lots of virtual users, essentially making your server think there are more users than there actually are.
However, server response times are the key variable to hitting these numbers. With a reduced number of virtual users, if an iteration takes longer than 20 seconds (with or without think time), we will be low on our total transaction count. We are also still constrained by server capacity and optimal response times. That user who is “sitting around doing nothing” is still taking up space on the server in a way that matters for real-world performance. Failing to account for them in the test means the test results aren’t fully demonstrating what the team wants to learn.
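To make that constraint concrete: with 500 users, even zero think time cannot rescue the target once responses alone take longer than about 19 seconds. A minimal sketch (the response-time values are hypothetical; only the user count and target come from the scenario):

```python
users = 500
target_tph = 93_000
max_iteration_s = users * 3600 / target_tph   # ~19.35s budget per iteration

for response_s in (5, 15, 19, 25):  # hypothetical server response times
    # Best case: spend whatever is left of the budget thinking, never less than zero
    think_s = max(0.0, max_iteration_s - response_s)
    actual_tph = users * 3600 / (response_s + think_s)
    print(f"response={response_s}s  achievable={actual_tph:,.0f} TPH")
```

Once the response time eats the whole iteration budget, the achievable throughput falls below the 93,000 target no matter how the think time is tuned.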
What if you tried it anyway?
We did! For the sake of the experiment, we assumed that it was possible to run an iteration in as little as 20 seconds. It should be noted, however, that your system may not be able to do this even under ideal, baseline conditions, which would render the entire experiment invalid.
We tested this hypothesis, that we can reduce concurrent virtual users to one third of the ideal number by reducing think times, and quickly ran into the question of what happens when server response times degrade over time. Remember, we chose think time as our throttling mechanism to try to control throughput (the number of transactions produced over time during the test), but our actual throughput is ultimately throttled by server response times that we cannot control.
When we changed the think times and dropped target concurrent virtual users from 1500 to 500, we began to see an unexpected impact on our total transaction throughput. When the servers ran faster than expected, we overshot our transaction volumes by a wide margin; inversely, when server response times slowed down, we greatly undershot our transaction volume target.
What this showed us is that think time is not a reliable variable to compensate for lowered concurrency of distinct virtual users. Transaction response times have a greater impact on system throughput than think time does in this model.
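A toy model illustrates why. In a simple closed-loop test, each user's iteration time is roughly response time plus think time, so hourly throughput is users * 3600 / (response + think). The 5-second baseline response time below is an assumption; only the user counts and the 93,000 TPH target come from the scenario:

```python
def throughput(users, response_s, think_s):
    """Total transactions per hour for a simple closed-loop model."""
    return users * 3600 / (response_s + think_s)

# Both configurations are tuned to hit ~93,000 TPH at an assumed 5s response time
baseline = 5.0
full  = dict(users=1500, think_s=3600 / 62 - baseline)    # ~53s of think time
small = dict(users=500,  think_s=3600 / 186 - baseline)   # ~14.4s of think time

for label, cfg in [("1500 users", full), ("500 users", small)]:
    ok   = throughput(cfg["users"], baseline, cfg["think_s"])
    slow = throughput(cfg["users"], baseline * 2, cfg["think_s"])  # responses degrade to 10s
    print(f"{label}: {ok:,.0f} TPH at baseline, {slow:,.0f} TPH when responses double")
```

In this sketch the same 5-second slowdown costs the 1500-user model about 8% of its throughput, but the 500-user model over 20%, because think time no longer dominates the iteration and can no longer absorb the fluctuation.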
So, what have we learned?
We’ve discovered exactly why you shouldn’t use reduced think time to cut corners in performance testing. There are some cases where think time can be used to effectively throttle load, such as stress testing — the goal being to empirically observe what the servers can do under heavy load — but that is very different from trying to recreate an observed load condition or hit a specific set of KPI requirements.
Think time provides more than just a simulation of human behavior; it is a buffer that cushions your tests when server stress fluctuates. If you remove or dramatically reduce your think times to chase throughput, you will eventually reach a point where you have nowhere to go but to add concurrent virtual users, so why not just do that right from the start?
How can I find performance testing that I can rely on?
Here at Foulk Consulting, we are committed to helping with the testing methods and metrics that give you a clear look at how your systems and applications will respond to real-world use. We don’t cut corners here, and our industry-proven performance engineering experts can help you with anything from ad-hoc consulting to mature performance engineering processes to regular performance validation. We’re here to give you the power to identify and solve your performance problems using the best technology and processes available. Contact us today to share your story and learn about the solutions we can provide.