It depends on the type of software you are benchmarking.
Example 1
Let's imagine that the piece of software is a mathematical computation which is, moreover, pretty straightforward. By straightforward, I mean that:
- It will take exactly N seconds to compute a result for a given input on a given piece of hardware, using the CPU at 100%, with memory increasing linearly from M₁ to M₂ during the process.
- It will take 0.52 × N and 0.29 × N seconds when executed in parallel on 2 and 4 cores respectively.
- It will take N/d + 25 seconds when executed on d machines in parallel, with 2 ≤ d ≤ 200. The 25 seconds here are essentially the time it takes for the map-reduce controller to distribute the load, then collect and combine the results.
It wouldn't be very hard to figure out how many machines you need to provision to do the computation in a given amount of time. The only issue is that it works only for a given range of d. Add too many servers and things would get ugly; eventually, adding more machines would make the process slower.
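For instance, here is a minimal sketch of that math, assuming the N/d + 25 model above and a target duration you choose yourself (the function name and the fixed 25-second overhead are just illustrative):

```python
import math

def machines_needed(n_seconds: float, target_seconds: float,
                    overhead: float = 25.0, d_min: int = 2, d_max: int = 200) -> int:
    """Smallest d such that n_seconds / d + overhead <= target_seconds,
    within the range where the model above is assumed to hold."""
    if target_seconds <= overhead:
        raise ValueError("Target is below the fixed map-reduce overhead; no d can satisfy it.")
    d = max(math.ceil(n_seconds / (target_seconds - overhead)), d_min)
    if d > d_max:
        raise ValueError("The model only holds up to d_max machines; target not reachable.")
    return d

# Example: a 3600-second job that must finish within 600 seconds
print(machines_needed(3600, 600))  # -> 7, since 3600 / 7 + 25 ≈ 539 s
```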
Example 2
Now let's imagine an ordinary web application. Say, an e-commerce website. You host it on a farm of ten servers and, for a week, collect all the metrics you consider relevant. It looks pretty solid, although you notice that it's not very linear: your servers idle most of the time at 4 a.m., but there's quite a lot of activity at 8 p.m. You also get much more activity on Wednesday evenings and on Saturdays.
You do the math and try to forecast the usage for the next week.
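That math is typically something as naive as the following sketch: take last week's measurements per weekday and hour, add some headroom, and call it a forecast. The tuple format and the headroom factor here are assumptions, not anything a real metrics system hands you as-is.

```python
from collections import defaultdict
from statistics import mean

def naive_weekly_forecast(samples, headroom=1.3):
    """samples: iterable of (weekday, hour, requests_per_second) tuples
    collected over the observed week. Returns {(weekday, hour): forecast_rps},
    i.e. last week's average for each slot, padded with some headroom."""
    per_slot = defaultdict(list)
    for weekday, hour, rps in samples:
        per_slot[(weekday, hour)].append(rps)
    return {slot: headroom * mean(values) for slot, values in per_slot.items()}

# With such a forecast and a known per-server capacity, the "required servers"
# for each hour is just ceil(forecast_rps / capacity_per_server).
```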
And then, things start to break apart quickly.
- All your servers hit 100% at 3 a.m. on Tuesday. That's right: there was a DoS attack. Luckily, it stopped before the next morning.
- On Thursday, you got a lot of activity. The guys from marketing didn't tell you? They spammed all the potential customers. I mean, sent promotional emails.
- On Friday, 100% again. Not because of another DoS attack; rather, five of the ten servers installed an automatic update and got stuck in an infinite reboot loop. A purely theoretical case, absolutely not referring to CrowdStrike here.
- On Saturday, server activity stays in the 0–1% range. Actually, a new release of the website introduced a regression: every customer sees an HTTP 404 for every requested page (and yes, serving an HTTP 404 from cache doesn't take much resources).
If you're in this situation, you can't just have a mathematical model that predicts how the application will behave with a given number of servers, because there is a huge number of parameters to consider, and some of them are not even known in advance. You usually cannot predict a DoS attack, a CrowdStrike-style update bug, or the fact that you'll suddenly hit the front page of a popular social network as the greatest website to buy the stuff you sell.
What you can have is:
- Active monitoring that tells you: “hey, you'll have a load problem right now, and by right now, I mean... in a few seconds.”
- A way to react to this monitoring by scaling up (and then possibly scaling down to save money). You do that either by keeping enough spare servers sitting in your data center, waiting to be turned on, or by delegating this task to a cloud provider, whichever is cheaper. A minimal sketch of such a reactive scaling rule is shown below.
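Here is what that reaction could look like, with made-up thresholds and with get_average_cpu, get_server_count and set_server_count standing in for whatever monitoring and provisioning APIs you actually have:

```python
import time

# Hypothetical thresholds; the real values depend on the application
# and on how long your provider takes to provision a server.
SCALE_UP_CPU = 0.75      # add capacity above 75% average CPU
SCALE_DOWN_CPU = 0.30    # remove capacity below 30% average CPU
MIN_SERVERS, MAX_SERVERS = 2, 50

def autoscale_loop(get_average_cpu, get_server_count, set_server_count,
                   check_every_seconds=30):
    """The three callables are placeholders for the monitoring and
    provisioning APIs of whatever stack is actually in use."""
    while True:
        cpu = get_average_cpu()
        servers = get_server_count()
        if cpu > SCALE_UP_CPU and servers < MAX_SERVERS:
            set_server_count(servers + 1)   # scale up: spare server or cloud instance
        elif cpu < SCALE_DOWN_CPU and servers > MIN_SERVERS:
            set_server_count(servers - 1)   # scale down to save money
        time.sleep(check_every_seconds)
```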