Web Performance, Part V: Baseline Your Data

Up to this point, the series has focused on the mundane world of calculating statistical values in order to represent your Web performance data in some meaningful way. Now we step into the more exciting (I lead a sheltered life) world of analyzing the data to make some sense from it.

When companies sign up with a Web performance company, it has been my experience that the first thing that they want to do is get in there and push all the buttons and bounce on the seats. This usually involves setting up a million different measurements, then establishing alerting thresholds for every single one of them, each treated as critically important and emailed to the pagers of the entire IT team around the clock.

While interesting, it is also a great way for people to begin to actually ignore the data because:

  1. It’s not telling them what they need to know
  2. It’s telling them stuff when they don’t need to know it.

When I speak to a company for the first time, I often ask what their key online business processes are. I usually get either stunned silence or “I don’t know” as a response. Seriously, what has been purchased is a tool, some new gadget that will supposedly make life better; but no thought has been put into how to deploy and make use of the data coming in.

I have the luxury of being able to concentrate on one set of data all the time. In most environments, the flow of data from systems, network devices, e-mail updates, patches, business data simply becomes noise to be ignored until someone starts complaining that something is wrong. Web performance data becomes another data flow to react to, not act on.

So how do you begin to corral the beast of Web performance data? Start with the simplest question: what do we NEED to measure?

If you talk to IT, Marketing and Business Management, they will likely come up with three key areas that need to be measured:

  1. Search
  2. Authentication
  3. Shopping Cart

Technology folks say, “But that doesn’t cover the true complexity of our relational, P2P, AJAX-powered, social media, tagging Web 2.0 site.”

Who cares! The three items listed above pay the bills and keep the lights on. If one of these isn’t working, you fix it now, or you go home.

Now, we have three primary targets. We’re set to start setting up alerts, and stuff, right?

Nope. You don’t have enough information yet.

[Figure: measurement after the first day]

This is your measurement after the first day. This gives you enough information to do all those bright and shiny things that you’ve heard your new Web performance tool can do, doesn’t it?

[Figure: the same measurement after 4 days]

Here’s the same measurement after 4 days. Subtle but important changes have occurred. The most important of these is that the first day of data happened to be gathered on a Friday night. Most sites would agree that performance on a Friday night is far different from what you would find on a Monday morning. On Monday morning, this site shows a noticeable upward shift in performance.

And what do you do when your performance looks like this?

[Figure: long-term performance]

Baselining is the ability to predict the performance of your site under normal circumstances on an ongoing basis. This is based on the knowledge that comes from understanding how the site has performed in the past, as well as how it has behaved under abnormal conditions. Until you can predict how your site should behave, you cannot begin to understand why it behaves the way it does.
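To make the idea a little more concrete, here is a minimal sketch of one way a baseline could be built. It is my own illustration, not what any particular measurement tool does: it assumes you can export measurements as (timestamp, seconds) pairs, it uses the median of each weekday-and-hour slot as the "normal" value, and the function names, the sample data and the 50% tolerance are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

def build_baseline(samples):
    """Group response times by (weekday, hour) and return the median of each slot."""
    slots = defaultdict(list)
    for timestamp, seconds in samples:
        slots[(timestamp.weekday(), timestamp.hour)].append(seconds)
    return {slot: median(values) for slot, values in slots.items()}

def deviates(baseline, timestamp, seconds, tolerance=0.5):
    """True if a measurement is more than `tolerance` (a fraction) above its slot's baseline."""
    expected = baseline.get((timestamp.weekday(), timestamp.hour))
    if expected is None:
        return False  # no history for this slot yet, so nothing to compare against
    return seconds > expected * (1 + tolerance)

# Hypothetical measurements: (time of measurement, response time in seconds)
history = [
    (datetime(2006, 8, 25, 22, 0), 0.80),   # Friday night
    (datetime(2006, 8, 25, 22, 30), 0.90),
    (datetime(2006, 8, 28, 9, 0), 1.40),    # Monday morning
    (datetime(2006, 8, 28, 9, 30), 1.60),
]
baseline = build_baseline(history)

# The same 1.4-second result is judged against a different "normal" in each slot.
print(deviates(baseline, datetime(2006, 9, 1, 22, 10), 1.4))  # a later Friday night
print(deviates(baseline, datetime(2006, 9, 4, 9, 15), 1.4))   # a later Monday morning
```

With this made-up history, 1.4 seconds is flagged on a Friday night but not on a Monday morning, which is the point of waiting until you have more than one day of data before setting alerts.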

Focusing on the three key transaction paths or business processes listed above helps you and your team wrap your head around what the site is doing right now. Once a baseline for the site’s performance exists, then you can begin to benchmark the performance of your site by comparing it to others doing the same business process.

Web Performance, Part IV: Finding The Frequency

In the last article, I discussed the aggregated statistics used most frequently to describe a population of performance data.

[Figure: aggregated statistics for the sample population]

The pros and cons of each of these aggregated values have been examined, but now we come to the largest single flaw: these values attempt to assign a single value to describe an entire population of numbers.

The only way to describe a population of numbers is to do one of two things: Display every single datapoint in the population against the time it occurred, producing a scatter plot; or display the population as a statistical distribution.

The most common type of statistical distribution used in Web performance data is the Frequency Distribution. This type of display breaks the population down into measurements of a certain value range, then graphs the results by comparing the number of results in each value container.
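As a rough sketch of what that bucketing involves, the snippet below counts measurements per value range and prints a text histogram. The function name, the bucket width of 0.5 seconds and the response times are hypothetical, chosen only to illustrate the idea; empty buckets are simply left out.

```python
from collections import Counter

def frequency_distribution(values, bucket_width=0.5):
    """Count how many measurements fall into each bucket of `bucket_width` seconds."""
    counts = Counter(int(v // bucket_width) for v in values)
    return {
        (index * bucket_width, (index + 1) * bucket_width): counts[index]
        for index in sorted(counts)
    }

# Hypothetical response times in seconds, including a heavy tail
response_times = [0.9, 0.95, 1.0, 1.0, 1.05, 1.2, 1.2, 1.25, 3.4, 40.0]

distribution = frequency_distribution(response_times)
for (low, high), count in distribution.items():
    print(f"{low:5.2f} - {high:5.2f} s | {'#' * count} ({count})")
```

Each key is the lower and upper bound of a value container, and each count is the number of results that landed in it.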

So, taking the same population data used in the aggregated data above, the frequency distribution looks like this.

[Figure: frequency distribution of the same population]

This gives a deeper insight into the whole population, by displaying the whole range of measurements, including the heavy tail that occurs in many Web performance result sets. Please note that a statistical heavy tail is essentially the same as Chris Anderson’s long tail, but in statistical analysis, a heavy tail represents a non-normally distributed data set, and skews the aggregated values you try to produce from the population.

As was noted in the aggregated values, the ‘average’ performance likely falls between 0.88 and 1.04 seconds. Now, when you take these values and compare them to the frequency distribution, these values make sense, as the largest concentration of measurement values falls into this range.

However, the 85th Percentile for this population is at 1.20 seconds, where there is a large secondary bulge in the frequency distribution. After that, there are measurements that trickle out into the 40-second range.

As can be seen, a single aggregated number cannot represent all of the characteristics in a population of measurements. They are good representations, but that’s all they are.

So, to wrap up this flurry of a visit through the world of statistical analysis and Web performance data, always remember the old adage: Lies, Damn Lies, and Statistics.

In the next article, I will discuss the concept of performance baselining, and how this is the basis for Web performance evolution.

Web Performance, Part III: Moving Beyond Average

In the previous article in this series, I talked about the fallacy of ‘average’ performance. Now that this has been dismissed, what do I propose to replace it with? There are three aggregated values that can be used to better represent Web performance data:

  • Median
  • Geometric Mean
  • 85th Percentile

The links take you to articles that better explain the math behind each of these statistics. The focus here is why you would choose to use them rather than Arithmetic Mean.

The Median is the central point in any population of data. It is equal to the calculated value of the 50th Percentile, and is the point with half of the population above it and half below. So, in a large population of data, it can provide a good estimation of where the center, or average, performance value is, regardless of the outliers at either end of the scale.

Geometric Mean is, well, a nasty calculation that I prefer to allow programmatic functions to handle for me. The advantage that it has over the Arithmetic Mean is that it is influenced less by the outliers, producing a value that is always lower than or equal to the Arithmetic Mean. In the case of Web performance data, with populations of any size, the Geometric Mean is always lower than the Arithmetic Mean.

The 85th Percentile is the level below which 85% of the population of data lies. Now, some people use the 90th or the 95th, but I tend to cut Web sites more slack by granting them a pass on 15% of the measurement population.
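For anyone who wants to see the three values side by side, here is a small sketch using Python's standard statistics module (geometric_mean needs Python 3.8 or later). The response times are made up, and the percentile helper uses the simple nearest-rank method, which is only one of several ways to calculate a percentile; your measurement tool may use a different one.

```python
import math
from statistics import fmean, geometric_mean, median

def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical response times in seconds, with two slow outliers
response_times = [0.8, 0.9, 0.9, 1.0, 1.0, 1.1, 1.2, 1.3, 4.0, 12.0]

summary = {
    "arithmetic_mean": fmean(response_times),
    "median": median(response_times),
    "geometric_mean": geometric_mean(response_times),
    "p85": percentile(response_times, 85),
}
for name, value in summary.items():
    print(f"{name:16s} {value:.2f} s")
```

In this made-up population the two outliers pull the Arithmetic Mean well above both the Median and the Geometric Mean.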

So, what do these values look like?

[Figure: aggregated statistics for the sample population]

These aggregated performance values are extracted from the same data population. Immediately, some things become clear. The Arithmetic Mean is higher than the Median and the Geometric Mean, by more than 0.1 seconds. The 85th Percentile is 1.19 seconds and indicates that 85% of all measurements in this data set are below this value.

Things that are bad to see:

  • An Arithmetic Mean that is substantially higher than the Geometric Mean and the Median
  • An 85th Percentile that is more than double the Geometric Mean

Both cases indicate that there is a high number of large values in the measurement population, and that the site is exhibiting consistency issues, a topic for a later article in this series.
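The two warning signs above are easy to check programmatically. The sketch below is one possible way to do it, under assumptions of my own: "substantially higher" is not defined precisely, so the 25% cutoff (mean_gap) is an arbitrary placeholder, the percentile uses the nearest-rank method, and the function names and data are hypothetical.

```python
import math
from statistics import fmean, geometric_mean, median

def percentile(values, pct):
    """Nearest-rank percentile of a list of measurements."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def consistency_warnings(values, mean_gap=0.25):
    """Return which of the two warning signs this population shows.

    mean_gap is an assumed cutoff for "substantially higher": the Arithmetic
    Mean must exceed both other values by this fraction (25% by default).
    """
    mean = fmean(values)
    geo = geometric_mean(values)
    warnings = []
    if mean > (1 + mean_gap) * max(geo, median(values)):
        warnings.append("arithmetic mean well above geometric mean and median")
    if percentile(values, 85) > 2 * geo:
        warnings.append("85th percentile more than double the geometric mean")
    return warnings

# Hypothetical populations: one with slow outliers, one consistent
noisy = [0.8, 0.9, 0.9, 1.0, 1.0, 1.1, 1.2, 1.3, 4.0, 12.0]
steady = [0.9, 1.0, 1.0, 1.0, 1.1]

print(consistency_warnings(noisy))
print(consistency_warnings(steady))
```

The noisy population trips both checks, while the steady one trips neither.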

In all, these three metrics provide a good quick hit, a representative single number that you can present in a meeting to say how the site is performing. But they all suffer from the same flaw: you cannot represent the entire population with a single number.

The next article will discuss Frequency Distributions, and their value in the Web performance analysis field.

Web Performance, Part II: What are you calling average?

For a decade, the holy grail of Web performance has been a low average performance time. Every company wants to have the lowest time, in some kind of chest-thumping, testosterone-pumped battle for supremacy.

Well, I am here to tell you that the numbers you have been using for the last decade have been lying. Well, lying is perhaps too strong a term. Deeply misleading is perhaps the more accurate way to describe the way that an average describes a population of results.

Now before you call your Web performance monitoring and measurement firms and tear a strip off them, let’s look at the facts. The numbers that everyone has been holding up as the gospel truth have been averages, or, more correctly, Arithmetic Means. We all learned these in elementary school: the sum of X values divided by X produces a value that approximates the average value for the entire population of X values.

Where could this go wrong in Web performance?

We wandered off course in a couple of fundamental ways. The first is based on the basic assumption of Arithmetic Mean calculations, that the population of data used is more or less Normally Distributed.

Well folks, Web performance data is not normally distributed. Some people are more stringent than I am, but my running assumption is that in a population of measurements, up to 15% are noise resulting from “stuff happens on the Internet”. This outer edge of noise, or outliers, can have a profound skewing effect on the Arithmetic Mean for that population.

“So what?”, most of you are saying. Here’s the kicker: As a result of this skew, the Arithmetic Mean usually produces a Web performance number that is higher than the real average of performance.
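A quick worked example shows the size of the effect. The numbers are invented for illustration: 20 measurements, of which 3 (15%, matching my running assumption for noise) are slow outliers.

```python
from statistics import fmean, median

typical = [1.0] * 17            # hypothetical: 17 measurements at 1.0 second
outliers = [10.0, 20.0, 30.0]   # hypothetical: 3 slow measurements (15% of 20)
population = typical + outliers

mean_all = fmean(population)    # (17 * 1.0 + 60.0) / 20 = 3.85 seconds
mean_typical = fmean(typical)   # 1.0 second without the noise
median_all = median(population) # 1.0 second, unaffected by the outliers

print(f"mean with outliers:    {mean_all:.2f} s")
print(f"mean without outliers: {mean_typical:.2f} s")
print(f"median with outliers:  {median_all:.2f} s")
```

Three noisy results are enough to make the Arithmetic Mean report nearly four times the value that 85% of the measurements actually returned.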

So why do we use it? Simple: Relational databases are really good at producing Arithmetic Means, and lousy at producing other statistical values. Short of writing your own complex function, which on most database systems equates to higher compute times, the only way to produce more accurate statistical measures is to extract the entire population of results and produce the result in external software.

If you are building an enterprise-class Web performance measurement reporting interface, and you want to calculate other statistical measures, you had better have deep pockets and a lot of spare computing cycles, because these multi-million row calculations will drain resources very quickly.

So, for most people, the Arithmetic Mean is the be all and end all of Web performance metrics. In the next part of this series, I will discuss how you can break free of this madness and produce values that are truer representations of average performance.

Web Performance, Part I: Fundamentals

If you ask 15 different people what the phrase Web performance means to them, you will get 30 different answers. Like all things in this technological age, the definition is in the eye of the beholder. To the Marketing person, it is delivering content to the correct audience in a manner that converts visitors into customers. To the business leader, it is the ability of a Web site to deliver on a certain revenue goal, while managing costs and creating shareholder/investor value.

For IT audiences, the mere mention of the phrase will spark a debate that would frighten the UN Security Council. Is it the Network? The Web server? The designers? The application? What is making the Web site slow?

So, what is Web performance? It is everything mentioned above, and more. Working in this industry for nine years, I have heard all facets of the debate. And all of the above positions will appear in every organization with a Web site to varying degrees.

In this ongoing series, I will examine various facets of Web performance, from the statistical measures used to truly analyze Web performance data, to the concepts that drive the evolution of a company from “Hey, we really need to know how fast our Web page loads” to “We need to accurately correlate the performance of our site to traffic volumes and revenue generation”.

Defining Web performance is much harder than it seems. Its simplest metrics are tied into the basic concepts of speed and success rate (availability). These concepts have been around a very long time, and are understood all the way up to the highest levels of any organization.

However, this very simple state is one that very few companies manage to evolve away from. It is the lowest common denominator in Web performance, and only provides a mere scraping of the data that is available within every company.

As a company evolves and matures in its view toward Web performance, the focus shifts away from the basic data, and begins to focus on the more abstract concepts of reliability and consistency. These force organizations to step away from the aggregated and simplistic approach of speed and availability, to a place where the user experience component of performance is factored into the equation.

After tackling consistency and reliability, the final step is toward performance optimization. This is a holistic approach to Web performance, a place where speed and availability data are only one component of an integrated whole. Companies at this stratum usually generate their own performance dashboards, correlating disparate data sources in a way that provides a clear and concise view not only of the performance of their Web site, but also of the health of their entire online business.

During this series, I will refer to data and information very frequently. In today’s world, even after nearly a decade of using Web performance tools and services, most firms only rely on data. All that matters is that the measurements arrive.

The smartest companies move to the next level and take that data and turn it into information, ideas that can shape the way that they design their Web site, service their customers, and view themselves against the entire population of Internet businesses.

This series will not be a technical HOWTO on making your site faster. I cover a lot of that ground in another of my Web sites. It will also not be data heavy; again, I point you to another of my Web sites if you want only the numbers.

What this series will do is lead you through the minefield of Web performance ideas, so that when you are asked what you think Web performance is, you can present the person asking the question with a clear, concise answer.

The next article in this series will focus on Web performance measures: why and when you use them, and how to present them to a non-technical audience.

GrabPERF: Compression Performance Study, Early Results

I have been running the GrabPERF Compression and Performance study for less than a week, but I thought that I should share some of the initial results with everyone.

GrabPERF Compression Study -- Initial Results -- Aug 28 2006

As you can see above, the byte transmission savings gained by some sites are pretty astounding. Google News sends pages with a median weight of near 31,000 bytes when compressed; but when compression is disabled on the client, this jumps to over 139,000 bytes.
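To put a number on that saving, here is the arithmetic as a small sketch. It uses the rounded figures quoted above (31,000 and 139,000 bytes), so the result is approximate, and the function name is just for illustration.

```python
def byte_savings(compressed_bytes, uncompressed_bytes):
    """Fraction of bytes saved by sending the compressed page."""
    return 1 - compressed_bytes / uncompressed_bytes

# Rounded median page weights for Google News quoted above
savings = byte_savings(31_000, 139_000)
print(f"{savings:.1%}")  # prints 77.7%
```

In other words, the compressed page is a little under a quarter of the size of the uncompressed one.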

What is interesting is that the performance gains don’t look truly significant. However, the compressed pages are faster, and have the added benefit of costing the site less, as bandwidth costs count by the byte (I know it’s more complicated than that, but for now, let’s assume a fantasy world).

I will continue to monitor the results and will close the measurements after 14 days and write up a final report.


Never Eat Alone: The Introvert’s Review

I sat down and finally read my copy of Never Eat Alone, by Keith Ferrazzi. Well, I agonizingly got my way through 80% of the book before I threw it across the room in disgust.

What a load of crap.

There might be a message in the book somewhere. But the book is mostly about Mr. Ferrazzi’s preening ego and self-importance.

He obviously thinks that all the world’s ills can be solved by reaching out your hand and saying, “Hey, I’m important and you need to know me!”.

Keith, get over yourself and your tale of the American dream. Focus on the facts. I don’t need to listen to a celebrity gossip story every 5 pages. In fact, your approach turned me off.

Obviously, if you took the time to get to know introverts, which I am, you would find that doing something is more important than who you know. We are people who don’t care who you know; we want to know what you have done.

Introverts have very tight, very small networks. But if you REALLY need to get something done, you usually end up working with an introvert.

I can truly say that I lost my money on this book. Maybe if he took the time to explore how the other half of the population works, he would find that his approach is seen as vacuous and disingenuous.

It’s not the number you know; it’s how well you know them.

Spend a month focusing on those people who are truly close to you. Then you will never eat alone.


Fisher Space Pen and Rite in the Rain: Bonus from a friendly supply sergeant

Alan at MREater hit the jackpot when he gave a supply sergeant a lift.

As a bonus he got a Fisher Space Pen. And of course…

It’s the perfect companion to the “Rite in the Rain” All-Weather Field Book the soldier also gave to me. Nothing like a friendly supply sergeant. The Field Book has paper “created to shed water and enhance the written image.” Hot damn. That book is a good piece of gear, the kind of thing you wonder how you ever lived without.

Based on his description, he got a tan Tactical Field Book.

I need to find a supply sergeant to give a lift to!


When I regained consciousness…

The title is a play on an old Royal Canadian Air Farce skit.

I have been having an adverse reaction to a new medication, which the doctor asked me to stop taking today. This reaction involves an itchy rash slowly spreading all over my skin.

Tonight I took 1.5 teaspoons of Benadryl to see if that would help.

Well, that’s 3 hours of my life I won’t get back from the sleep goddess.