Today, I’m going to write about an equation. I’ll try to make it easy to follow, but it’s still stats and graphs. Stay with me; I’m convinced it will be worth your while, because in my opinion it’s the most important equation in cloud computing. It’s what drives your market, your customers, and your burn rate.
If you build a traditional data center platform for your application, you worry about three variables: the amount of traffic to your site, your capacity to handle that traffic, and the user experience your visitors get, such as latency. The equation looks like this:
User experience = Traffic / Capacity
As traffic increases, user experience gets worse and delay goes up. This is because each visit to your site consumes resources on your infrastructure, and some users wind up waiting for the app to respond. Networks get full; databases encounter record locking; message queues back up; and so on. Ultimately, some of your visitors have a lousy experience.
On-demand computing platforms fundamentally change how you deal with this, because as far as you’re concerned, they have infinite capacity.
As traffic grows, average delay may only go up a little, but those averages are hiding something. To really understand how miserable your visitors are, you need to look at the worst subset of them — for example, how slow the application was for the slowest five percent of visitors. This is called the 95th percentile.
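To make the 95th percentile concrete, here’s a minimal sketch of computing it from raw per-request timings, using the simple nearest-rank method. The sample numbers are made up for illustration:

```python
# Sketch: 95th-percentile latency from raw per-request timings (seconds).
# Assumes you already collect one latency sample per request.

def percentile(samples, pct):
    """Return the pct-th percentile using the nearest-rank method."""
    ordered = sorted(samples)
    rank = max(1, round(pct / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]

# Ten illustrative request latencies: most are fast, a few are awful.
latencies = [0.2, 0.3, 0.25, 1.1, 0.4, 7.2, 0.35, 0.5, 3.9, 0.45]

p95 = percentile(latencies, 95)  # the experience of your worst-off visitors
```

Note how the average of these samples would look tolerable while the 95th percentile exposes the multi-second waits; that’s exactly why averages hide visitor misery.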
This graph shows an example of the relationship between traffic (requests per second) and user experience you control (host latency) for a typical web application.
What this means is that at around 180 hits per second, it’s taking 7 seconds for your servers to get a page ready for the worst-suffering 5 percent of your visitors. They’re having a bad experience, and they’re less likely to purchase, enroll, or return to your site.
As the person running the app, you can’t control traffic (hopefully, marketing is making that grow nicely). And while you can tweak your application or add acceleration tools to help with user experience, eventually, the only thing you can do is add capacity.
At some point, you look at this data (if you’re lucky enough to have it) and decide it’s time to add more machines. When you do this, you’re effectively “sliding” one of those curves to the right.
Now your new “worst 5%” curve looks better than your old one:
Before the change, at 180 hits per second, five percent of visitors had a host latency of 7 seconds. After the change, that’s down to 1.5 seconds — much better. Looking back at the equation User experience = Traffic / Capacity, you added capacity, and user experience got better.
Now let’s think about the cloud. One of the things cloud computing promises (indeed, its core promise) is elastic computing.* This means that capacity gets added for you.
If you use a “true” cloud, where you don’t see the individual machines, that capacity scaling happens without you knowing about it. On the other hand, if you’re using an instance-based cloud platform like EC2, you may be spinning up new machines yourself with a few keystrokes, but eventually, you’ll use some kind of automation tool to create new instances on demand.
How will that tool or cloud know how to create new instances? Simple. It’ll look at user experience. Sure, it may consider CPU load, or network usage, or free memory at first. But those are ultimately proxies for the only thing that matters: User experience. When the application’s response time gets too slow, it’s time to add capacity.
If you’re paying attention to that equation, this should terrify you. Because capacity is infinite.
The equation is the same, but this time, you’re solving for user experience. Let’s say you tell the cloud, “when the server takes more than a second to serve something, add a new machine.” Everyone gets a nice, fast visit to your site.
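That “more than a second, add a machine” rule can be sketched as a naive control loop. All the names and the threshold here are hypothetical, but the logic is the point: nothing in it ever says stop.

```python
# Naive autoscaling sketch of the rule above: when 95th-percentile host
# latency exceeds a threshold, add a machine. Names are hypothetical.

LATENCY_THRESHOLD = 1.0  # seconds

def scale_decision(p95_latency, current_machines):
    """Return the new machine count for one evaluation cycle."""
    if p95_latency > LATENCY_THRESHOLD:
        return current_machines + 1  # add capacity -- with no upper bound
    return current_machines

machines = 4
machines = scale_decision(7.0, machines)  # slow visit observed: scale out
```

Run this loop every few minutes during a traffic spike and the machine count only climbs; the rule optimizes user experience with no notion of cost.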
And you get a really big bill at the end of the month, because you added 20 new machines without knowing it.
For those in the managed hosting world, this is nothing new. A recent Huffington Post article looked at the surprising cost of pay-as-you-go bandwidth and storage (with, it should be noted, a happy ending). CDNs that bill by contract tiers have the same effect.
But in the past, you’ve been the one to decide what gets deployed in a data center. You controlled capacity; user experience was the variable. Now, the cloud controls capacity. The only knob you have is user experience.
What this means is that autonomic provisioning and elastic computing systems will need to provide tools for their customers to adjust not only target customer experience, but also target spending. And because clouds are in the business of making money, they’re unlikely to offer spending controls willingly to platform users.
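Here’s one way such a spending control might look, as a sketch: the same latency-driven loop, but with a hard budget ceiling derived from a price per machine-hour. Every price, threshold, and name below is an illustrative assumption, not any provider’s actual API:

```python
# Sketch of a latency-driven scaler with a spending cap.
# Prices, thresholds, and names are illustrative assumptions.

HOURLY_COST_PER_MACHINE = 0.10  # dollars per machine-hour (hypothetical)
MONTHLY_BUDGET = 500.00         # dollars you are willing to burn
HOURS_PER_MONTH = 730

# The most machines you can run all month without exceeding the budget.
MAX_MACHINES = int(MONTHLY_BUDGET / (HOURLY_COST_PER_MACHINE * HOURS_PER_MONTH))

def scale_with_budget(p95_latency, machines, threshold=1.0):
    """Add a machine only while the projected monthly bill stays in budget."""
    if p95_latency > threshold and machines < MAX_MACHINES:
        return machines + 1
    return machines  # cap reached: latency, not spend, absorbs the overload
```

The design choice is the trade-off in the last line: once the cap is hit, user experience degrades instead of the bill growing. That knob, where the pain lands, is exactly what platforms would need to hand their customers.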
Which is why the relationship between traffic (your market), user experience (your customers) and capacity (your burn rate) is the most important equation in the cloud.
* Elastic computing is a term I first heard from Duncan Hill in the summer of 2001, and I remember it well. Duncan was building a company called Thinkdynamics (later sold to IBM) that automated server provisioning. It spun up new machines when needed, and we were talking because Coradiant was very good at knowing when user experience was bad enough to need it.