
Web Performance Concepts Series – Revisited

Two years ago I created a series of five blog articles, aimed at both business and technical readers, with the goal of explaining the basic statistical concepts and methods I use when analyzing Web performance data in my role as a Web performance consultant.

Most of these ideas were core to my thinking when I developed GrabPERF in 2005-2006. I determined that it was vital that people not only receive Web performance measurement data for their site, but that they receive it in a way that informs and shapes the business and technical decisions they make on a daily basis.

While I come from a strong technical background, it is critical to be able to present the data that I work with in a manner that can be useful to all components of an organization, from the IT and technology leaders who shape the infrastructure and design of a site, to the marketing and business leaders who set out the goals for the organization and interact with customers, vendors and investors.

Data that helps bridge the almost religious divide separating the business and technical sides of most organizations is crucial to delivering a comprehensive Web performance solution.

These articles form the core of an ongoing discussion focused on the pitfalls of Web performance analysis, and on how to learn from and avoid the errors others have already discovered.

The series went over like a lead balloon, and this left me puzzled. While the basic information in the articles was technical and focused on the role that simple statistics play in shaping Web performance technology and business decisions, they formed the core of what I saw as an ongoing discussion that companies need to have to ensure the whole organization moves in a single direction, with a single purpose.

I have decided to reintroduce this series, dredging it from the forgotten archives of this blog, to remind business and IT teams of the importance of the Web performance data they use every day. It also serves as a guide to interpreting the numbers that arise from all the measurement methodologies that companies use, and a map for extracting the most critical information from a raging sea of data.

The five articles are:

  1. Web Performance, Part I: Fundamentals
  2. Web Performance, Part II: What are you calling average?
  3. Web Performance, Part III: Moving Beyond Average
  4. Web Performance, Part IV: Finding The Frequency
  5. Web Performance, Part V: Baseline Your Data

I look forward to your comments and questions on these topics.

Web Performance: Optimizing Page Load Time

Aaron Hopkins posted an article detailing all of the Web performance goodness that I have been advocating for a number of years.

To summarize:

  • Use server-side compression
  • Set your static objects to be cacheable in browser and proxy caches
  • Use keep-alives / persistent connections
  • Turn your browsers’ HTTP pipelining feature on
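As a rough illustration of a few of these checks (this is my own sketch, not something from Aaron's article), the following Python snippet inspects a page's response headers for compression and caching hints; the host name is only a placeholder.

    import http.client

    def check_page(host, path="/"):
        """Fetch a page and report the headers relevant to the checklist above."""
        conn = http.client.HTTPSConnection(host, timeout=10)
        conn.request("GET", path, headers={"Accept-Encoding": "gzip"})
        response = conn.getresponse()
        body = response.read()  # drain the body so the connection could be reused
        print("Status:           ", response.status)
        print("Bytes on the wire:", len(body))
        print("Content-Encoding: ", response.getheader("Content-Encoding", "none"))
        print("Cache-Control:    ", response.getheader("Cache-Control", "not set"))
        print("Expires:          ", response.getheader("Expires", "not set"))
        print("Connection:       ", response.getheader("Connection",
                                                        "keep-alive (HTTP/1.1 default)"))
        conn.close()

    check_page("www.example.com")

A real audit would repeat this for every object on the page, not just the base HTML document, but even this quick check shows whether compression and explicit caching are switched on.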

These ideas are not new, and neither are the findings in his study. As someone who has worked in the Web performance field for nearly a decade, I consider these old hat. However, it’s always nice to have someone new inject some life back into the discussion.

Web Performance, Part VII: Reliability and Consistency

In this series, the focus has been on the basic Web performance concepts, the ones that have dominated the performance management field for the last decade. It’s now time to step beyond these measures and examine two equally important concepts, ones that allow a company to analyze its Web performance outside the constraints of raw speed and availability.

Reliability is often confused with availability when it is used in a Web performance context. Reliability, as a measurement and analysis concept, goes far beyond the binary 0 or 1 that the term availability limits us to, and places availability in the context of how it affects the whole business.

Typical measures used in reliability include:

  • Minutes of outage
  • Number of failed measurements
  • Core business hours

Reliability is, by its very nature, a more complex way to examine the successful delivery of content to customers. It forces the business side of a company to define which times of day and days of the week affect the bottom line most, and forces the technology side of the business to account not simply for server uptime, but also for exact measures of when and why customers could not reach the site.

This approach almost always leads to the creation of a whole new metric, one that is uniquely tied to the expectations and demands of the business it was developed in. It may also force organizations to focus on key components of their online business, if a trend of repeated outages appears in only a few components of the Web site.
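As a purely hypothetical sketch of what such a custom metric might look like, the following Python fragment weights failed measurements more heavily when they occur during core business hours; the hours, weights, and sample data are invented for illustration.

    from datetime import datetime

    CORE_HOURS = range(8, 18)   # 08:00-17:59 local time, Monday-Friday (assumed)
    CORE_WEIGHT = 3.0           # a failure during core hours "costs" 3x as much
    OFF_WEIGHT = 1.0

    def reliability_score(measurements):
        """measurements: list of (timestamp, succeeded) tuples."""
        weighted_failures = 0.0
        weighted_total = 0.0
        for timestamp, succeeded in measurements:
            is_core = timestamp.weekday() < 5 and timestamp.hour in CORE_HOURS
            weight = CORE_WEIGHT if is_core else OFF_WEIGHT
            weighted_total += weight
            if not succeeded:
                weighted_failures += weight
        return 100.0 * (1.0 - weighted_failures / weighted_total)

    sample = [
        (datetime(2007, 5, 14, 10, 5), True),    # Monday, core hours
        (datetime(2007, 5, 14, 10, 10), False),  # failed check during core hours
        (datetime(2007, 5, 12, 2, 15), False),   # Saturday night, off hours
        (datetime(2007, 5, 12, 2, 20), True),
    ]
    print(f"Weighted reliability: {reliability_score(sample):.2f}%")

The point is not the specific weights, but that the business and technology sides agree on them together.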

Consistency is uniquely paired with Reliability, in that it extends the concept of performance beyond simple aggregates and considers what the performance experience is like for the customer on each visit. Can a customer say that the site always responds the same way, or do you hear that sometimes your site is slow and unusable? Why is the performance of your site inconsistent?

A simple way to think of consistency is the old standby of the Standard Deviation. It describes how tightly the population of measurements clusters around the Arithmetic Mean. Its value depends on the number of measurements in the population, as well as on the properties of the individual measurements themselves.

Standard Deviation has a number of flaws, but it provides a simple way to define consistency: a large standard deviation indicates a high degree of inconsistency within the measurement population, whereas a small standard deviation indicates a higher degree of consistency.
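A minimal sketch of this idea, assuming a plain list of response times in seconds (the sample values and the 0.25 threshold are arbitrary illustrations, not industry standards):

    import statistics

    def consistency_report(response_times):
        mean = statistics.mean(response_times)
        stdev = statistics.stdev(response_times)
        cv = stdev / mean  # coefficient of variation: spread relative to the mean
        verdict = "consistent" if cv < 0.25 else "inconsistent"
        return mean, stdev, cv, verdict

    times = [0.91, 0.88, 1.02, 0.95, 3.40, 0.97, 0.90, 1.10]  # seconds
    mean, stdev, cv, verdict = consistency_report(times)
    print(f"mean={mean:.2f}s stdev={stdev:.2f}s cv={cv:.2f} -> {verdict}")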

The metric that is produced for consistency differs from the reliability metric in that it will always be measured in seconds or milliseconds. But the same insight may arise from consistency, that certain components of the Web site contribute more to the inconsistency of a Web transaction. Isolating these elements outside the context of the entire business process gives organizations the information they need to eliminate these issues more quickly.

Companies that have found that simple performance and availability metrics constrain their ability to accurately describe the performance of their Web site need to examine ways to integrate a formula for calculating Reliability, and a measure of Consistency into their performance management regime.

Web Performance, Part VI: Benchmarking Your Site

In the last article in this series, the concept of baselining your measurements was discussed. This is vital, in order for you and your organization to be able to identify the particular performance patterns associated with your site.

Now that’s under control, you’re done, right?

Not a chance. Remember that your site is not the only Web site your customers visit. So, how are you doing against all of those other sites?

Let’s take a simple example of the performance for one week for one of the search firms. This is simply an example; I am just too lazy to change the names to protect the innocent.

[Figure one_search-7day: seven-day performance trend for a single search firm]

Doesn’t look too bad. An easily understood pattern of slower performance during peak business hours appears in the data, presenting a predictable pattern which would serve as a great baseline for any firm. However, this baseline lacks context. If anyone tries to use a graph like this, the next question you should ask is “So what?”.

What makes a graph like this interesting but useless? That’s easy: A baseline graph is only the first step in the information process. A graph of your performance tells you how your site is doing. There is, however, no indication of whether this performance trend is good or bad.

[Figure four_search-7day: seven-day performance comparison of the same firm against competitors]

Examining the performance of the same firm within a competitive and comparative context, the baseline performance still appears predictable, but not as good as it could be. The graph shows that most of the other firms in the same vertical, performing the identical search, over the same period of time, and from the same measurement locations, do not show the same daytime pattern of performance degradation.

The context provided by benchmarking now becomes a critical factor. By putting the site side by side with other sites delivering the same service, an organization can question the traditional belief that the site is doing well simply because its behavior can be predicted.

A simple benchmark such as the one above forces a company to ask hard questions, and should lead to reflection and re-examination of what the predictable baseline really means. A benchmark result should always lead a company to ask whether its performance is good enough and, if it wants to get better, what that will take.

Benchmarking relies on the idea of a business process. The old approach to benchmarks only considered firms within the narrowly defined scope of their industry vertical; another approach considers company homepages without any context or reliable comparative structure in place to compensate for the differences between pages and sites.

It is not difficult to define a benchmark that allows for the comparison of a major bank, a major retailer, a major social networking site, and a major online mail provider. By clearly defining a business process that these sites share (in this case, let’s take the user-authentication process), you can compare companies across industry verticals.
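As a purely illustrative sketch (the site names and timing values are invented), this is one way such a cross-vertical comparison could be aggregated, using a geometric mean per site for the shared authentication step:

    from collections import defaultdict
    from statistics import geometric_mean

    # (site, login-step response time in seconds); values are invented
    measurements = [
        ("big-bank.example", 2.1), ("big-bank.example", 2.4),
        ("big-retailer.example", 1.6), ("big-retailer.example", 1.8),
        ("social-site.example", 1.1), ("social-site.example", 1.3),
        ("webmail.example", 0.9), ("webmail.example", 1.0),
    ]

    by_site = defaultdict(list)
    for site, seconds in measurements:
        by_site[site].append(seconds)

    # rank the sites on the same business process, regardless of industry
    for site, values in sorted(by_site.items(), key=lambda kv: geometric_mean(kv[1])):
        print(f"{site:25s} {geometric_mean(values):.2f}s")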

This cross-discipline comparison is crucial. Your customers do this with your site every single day. They visit your site, and tens, maybe hundreds, of other sites every week. They don’t limit their comparison to sites in the same industry vertical; they perform cross-vertical business process critiques intuitively, and then share these results with others anecdotally.

In many cases, a cross-vertical performance comparison cannot be performed head-to-head on speed alone, as there are too many variables and differences. Luckily for the Web performance field, speed is only one metric that can be used for comparison. By stretching Web site performance analysis beyond speed, sites with vastly different business processes and industries can be compared in a way that treats them all equally. The decade-long focus on speed and raw performance has pushed these other metrics aside.

Having a fast site is good. But that’s not all there is to Web performance. If you were to compare the state of Web performance benchmarking to the car-buying public, the industry has been stuck in the role of a power-hungry, horsepower-obsessed teenage boy for too long. Just as your automobile needs and requirements evolve (ok, maybe this doesn’t apply to everyone), so do your Web performance requirements.

Web Performance, Part V: Baseline Your Data

Up to this point, the series has focused on the mundane world of calculating statistical values in order to represent your Web performance data in some meaningful way. Now we step into the more exciting (I lead a sheltered life) world of analyzing the data to make some sense of it.

When companies sign up with a Web performance company, it has been my experience that the first thing they want to do is get in there, push all the buttons, and bounce on the seats. This usually involves setting up a million different measurements, flagging every single one of them as critically important, and emailing alerts to the pagers of the entire IT team around the clock.

While interesting, this is also a great way for people to begin to ignore the data, because:

  1. It’s not telling them what they need to know
  2. It’s telling them stuff when they don’t need to know it.

When I speak to a company for the first time, I often ask what their key online business processes are. I usually get either stunned silence or “I don’t know” as a response. Seriously, what has been purchased is a tool, some new gadget that will supposedly make life better; but no thought has been put into how to deploy it and make use of the data coming in.

I have the luxury of being able to concentrate on one set of data all the time. In most environments, the flow of data from systems, network devices, e-mail updates, patches, and business data simply becomes noise to be ignored until someone starts complaining that something is wrong. Web performance data becomes another data flow to react to, not act on.

So how do you begin to corral the beast of Web performance data? Start with the simplest question: what do we NEED to measure?

If you talk to IT, Marketing, and Business Management, they will likely come up with three key areas that need to be measured:

  1. Search
  2. Authentication
  3. Shopping Cart

Technology folks will say: “But that doesn’t cover the true complexity of our relational, P2P, AJAX-powered, social media, tagging Web 2.0 site.”

Who cares! The three items listed above pay the bills and keep the lights on. If one of these isn’t working, you fix it now, or you go home.

Now we have three primary targets. We’re ready to start setting up alerts and stuff, right?

Nope. You don’t have enough information yet.

[Figure 1stday: the measurement after the first day of data collection]

This is your measurement after the first day. This gives you enough information to do all those bright and shiny things that you’ve heard your new Web performance tool can do, doesn’t it?

[Figure 4day: the same measurement after four days of data collection]

Here’s the same measurement after four days. Subtle but important changes have occurred. The most important of these is that the first day of data gathering happened to be a Friday night. Most sites would agree that performance on a Friday night is far different from what you would find on a Monday morning, and Monday morning shows a noticeable upward shift in this site’s performance.

And what do you do when your performance looks like this?

[Figure long-term: the same measurement over a longer period]

Baselining is the ability to predict the performance of your site under normal circumstances on an ongoing basis. This prediction is based on the knowledge that comes from understanding how the site has performed in the past, as well as how it has behaved under abnormal conditions. Until you can predict how your site should behave, you cannot begin to understand why it behaves the way it does.

Focusing on the three key transaction paths or business processes listed above helps you and your team wrap your head around what the site is doing right now. Once a baseline for the site’s performance exists, then you can begin to benchmark the performance of your site by comparing it to others doing the same business process.
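One simple way to build such a baseline, sketched here under the assumption that measurements arrive as timestamped response times, is to bucket the history by hour of the week and treat the median of each bucket as the expected value; the data and field names below are illustrative only.

    from collections import defaultdict
    from datetime import datetime, timedelta
    from statistics import median

    def build_baseline(measurements):
        """measurements: list of (timestamp, response_seconds) tuples."""
        buckets = defaultdict(list)
        for timestamp, seconds in measurements:
            hour_of_week = timestamp.weekday() * 24 + timestamp.hour
            buckets[hour_of_week].append(seconds)
        # the expected value for each hour of the week is the median of that bucket
        return {hour: median(values) for hour, values in buckets.items()}

    def deviation_from_baseline(baseline, timestamp, seconds):
        """How far a new measurement sits above or below its expected value."""
        hour_of_week = timestamp.weekday() * 24 + timestamp.hour
        expected = baseline.get(hour_of_week)
        return None if expected is None else seconds - expected

    # toy example: two weeks of hourly measurements, slower during weekday daytime
    history = []
    start = datetime(2007, 5, 7)
    for hour in range(14 * 24):
        ts = start + timedelta(hours=hour)
        daytime = ts.weekday() < 5 and 9 <= ts.hour < 18
        history.append((ts, 1.4 if daytime else 0.9))

    baseline = build_baseline(history)
    print(deviation_from_baseline(baseline, datetime(2007, 5, 21, 10), 2.5))  # about +1.1s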

Web Performance, Part IV: Finding The Frequency

In the last article, I discussed the aggregated statistics used most frequently to describe a population of performance data.

[Figure stats-articles: aggregated statistics for the sample data population]

The pros and cons of each of these aggregated values have been examined, but now we come to the largest single flaw: these values attempt to assign a single number to describe an entire population of measurements.

The only way to describe a population of numbers is to do one of two things: Display every single datapoint in the population against the time it occurred, producing a scatter plot; or display the population as a statistical distribution.

The most common type of statistical distribution used with Web performance data is the Frequency Distribution. This type of display breaks the population down into bins covering a certain range of values, then graphs the number of measurements that fall into each bin.
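A minimal sketch of how such a distribution can be computed, assuming fixed-width bins and a plain list of response times (both the bin width and the sample values are invented for illustration):

    from collections import Counter

    def frequency_distribution(response_times, bin_width=0.2):
        """Count how many measurements fall into each fixed-width bin."""
        bins = Counter()
        for seconds in response_times:
            bins[round(seconds // bin_width * bin_width, 2)] += 1
        return dict(sorted(bins.items()))

    times = [0.82, 0.91, 0.95, 1.01, 1.04, 1.18, 1.22, 1.19, 3.7, 41.0]
    for bin_start, count in frequency_distribution(times).items():
        print(f"{bin_start:5.1f}s to {bin_start + 0.2:4.1f}s : {'#' * count}")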

So, taking the same population data used in the aggregated data above, the frequency distribution looks like this.
[Figure stats-articles-frequency: frequency distribution of the sample data population]
This gives a deeper insight into the whole population by displaying the entire range of measurements, including the heavy tail that occurs in many Web performance result sets. Please note that a statistical heavy tail is essentially the same shape as Chris Anderson’s long tail, but in statistical analysis a heavy tail indicates a non-normally distributed data set, and it skews the aggregated values you try to produce from the population.

As noted with the aggregated values, the ‘average’ performance likely falls between 0.88 and 1.04 seconds. When you compare these values to the frequency distribution, they make sense, as the largest concentration of measurement values falls into this range.

However, the 85th Percentile for this population is at 1.20 seconds, where there is a large secondary bulge in the frequency distribution. After that, there are measurements that trickle out into the 40-second range.

As can be seen, a single aggregated number cannot represent all of the characteristics of a population of measurements. Such numbers are useful summaries, but that is all they are.

So, to wrap up this whirlwind visit through the world of statistical analysis and Web performance data, always remember the old adage: Lies, Damn Lies, and Statistics.

In the next article, I will discuss the concept of performance baselining, and how it forms the basis for Web performance evolution.

Web Performance, Part III: Moving Beyond Average

In the previous article in this series, I talked about the fallacy of ‘average’ performance. Now that this has been dismissed, what do I propose to replace it with? There are three aggregated values that can be used to better represent Web performance data:

  • The Median
  • The Geometric Mean
  • The 85th Percentile

The links take you to articles that better explain the math behind each of these statistics. The focus here is on why you would choose to use them rather than the Arithmetic Mean.

The Median is the central point in any population of data. It is equal to the calculated value of the 50th Percentile, and is the point above and below which half of the population lies. So, in a large population of data, it provides a good estimate of where the central or ‘average’ performance value sits, regardless of the outliers at either end of the scale.

The Geometric Mean is, well, a nasty calculation that I prefer to let programmatic functions handle for me. The advantage it has over the Arithmetic Mean is that it is influenced less by outliers, producing a value that is always lower than or equal to the Arithmetic Mean. In the case of Web performance data, with populations of any size, the Geometric Mean is always lower than the Arithmetic Mean.

The 85th Percentile is the level below which 85% of the population of data lies. Now, some people use the 90th or the 95th, but I tend to cut Web sites more slack by granting them a pass on 15% of the measurement population.

So, what do these values look like?

[Figure stats-articles: Arithmetic Mean, Median, Geometric Mean, and 85th Percentile for the sample data population]

These aggregated performance values are extracted from the same data population. Immediately, some things become clear. The Arithmetic Mean is higher than the Median and the Geometric Mean, by more than 0.1 seconds. The 85th Percentile is 1.19 seconds and indicates that 85% of all measurements in this data set are below this value.
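For readers who want to reproduce these aggregates on their own data, here is a minimal Python sketch; the sample values are invented and are not the data set behind the figures in this article.

    from statistics import median, geometric_mean, quantiles

    times = [0.85, 0.90, 0.92, 0.95, 0.98, 1.01, 1.05, 1.12, 1.20, 2.60, 5.40]

    arithmetic_mean = sum(times) / len(times)
    # quantiles(..., n=100) returns the 1st through 99th percentile cut points
    p85 = quantiles(times, n=100)[84]

    print(f"Arithmetic mean: {arithmetic_mean:.2f}s")
    print(f"Median:          {median(times):.2f}s")
    print(f"Geometric mean:  {geometric_mean(times):.2f}s")
    print(f"85th percentile: {p85:.2f}s")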

Things that are bad to see:

  • An Arithmetic Mean that is substantially higher than the Geometric Mean and the Median
  • An 85th Percentile that is more than double the Geometric Mean

In either case, it indicates that there is a high number of large values in the measurement population, and that the site is exhibiting consistency issues, a topic for a later article in this series.

In all, these three metrics provide a good quick hit, a representative single number that you can present in a meeting to say how the site is performing. But they all suffer from the same flaw: you cannot represent an entire population with a single number.

The next article will discuss Frequency Distributions, and their value in the Web performance analysis field.

Web Performance, Part I: Fundamentals

If you ask 15 different people what the phrase Web performance means to them, you will get 30 different answers. Like all things in this technological age, the definition is in the eye of the beholder. To the Marketing person, it is delivering content to the correct audience in a manner that converts visitors into customers. To the business leader, it is the ability of a Web site to deliver on a certain revenue goal, while managing costs and creating shareholder/investor value.

For IT audiences, the mere mention of the phrase will spark a debate that would frighten the UN Security Council. Is it the Network? The Web server? The designers? The application? What is making the Web site slow?

So, what is Web performance? It is everything mentioned above, and more. In my nine years working in this industry, I have heard all facets of the debate, and all of the above positions appear, to varying degrees, in every organization with a Web site.

In this ongoing series, I will examine various facets of Web performance, from the statistical measures used to truly analyze Web performance data, to the concepts that drive the evolution of a company from “Hey, we really need to know how fast our Web page loads” to “We need to accurately correlate the performance of our site to traffic volumes and revenue generation”.

Defining Web performance is much harder than it seems. Its simplest metrics are tied to the basic concepts of speed and success rate (availability). These concepts have been around a very long time, and are understood all the way up to the highest levels of any organization.

However, this very simple state is one that very few companies manage to evolve away from. It is the lowest common denominator in Web performance, and only provides a mere scraping of the data that is available within every company.

As a company evolves and matures in its view toward Web performance, the focus shifts away from the basic data toward the more abstract concepts of reliability and consistency. These force organizations to step away from the aggregated and simplistic approach of speed and availability, to a place where the user experience component of performance is factored into the equation.

After tackling consistency and reliability, the final step is performance optimization. This is a holistic approach to Web performance, one in which speed and availability data are only one component of an integrated whole. Companies at this stage are usually generating their own performance dashboards, combining disparate data sources in a way that provides a clear and concise view not only of the performance of their Web site, but also of the health of their entire online business.

During this series, I will refer to data and information very frequently. In today’s world, even after nearly a decade of using Web performance tools and services, most firms rely only on data. All that matters is that the measurements arrive.

The smartest companies move to the next level and turn that data into information, ideas that can shape the way they design their Web site, service their customers, and view themselves against the entire population of Internet businesses.

This series will not be a technical HOWTO on making your site faster. I cover a lot of that ground in another of my Web sites.

What this series will do is lead you through the minefield of Web performance ideas, so that when you are asked what you think Web performance is, you can present the person asking the question with a clear, concise answer.

The next article in this series will focus on Web performance measures: why and when you use them, and how to present them to a non-technical audience.

Is Web 2.0 Suffocating the Internet?

At my job, I get involved in trying to solve a lot of hairball problems that seem obscure and bizarre. It’s the nature of what I do.

Over the last three weeks, some issues that we had been investigating as independent performance-related trends merged into a single meta-issue. I can’t go into the details right now, but what is clear to me (and some of the folks I work with are slowly starting to subscribe to this view) is that the background noise of Web 2.0 services and traffic has started to drown out, and in some cases overwhelm, the traditional Internet traffic.

Most of the time, you can discount my hare-brained theories. But this one is backed by some really unusual trends that we found yesterday in the publicly available statistics from the Public Exchange points.

I am no network expert, but I am noticing a VERY large upward trend in the volume of traffic going into and out of these locations around the world. And these are simply the public peering exchanges; it would be interesting to see what the traffic statistics at some of the Tier 1 and Tier 2 private peering locations, and at some of the larger co-location facilities, look like.

Now to my theory.

The background noise generated by the explosion of Web 2.0 (i.e., “Always Online”) applications (RSS aggregators, update pings, email checkers, weather updates, AdSense stats, etc.) is starting to have a significant impact on the overall performance of the Internet as a whole.

Some of the coal-mine canaries, organizations that have extreme sensitivity to changes in overall Internet performance, are starting to notice this. Are there other anecdotal/quantitative results that people can point to? Have people trended their performance/traffic data over the last 1 to 2 years?

I may be blowing smoke, but I think that we may be quietly approaching an inflection point in the Internet’s capacity, one that sheer bandwidth itself cannot overcome. In many respects, this is a result of the commercial aspects of the Internet being attached to a notoriously inefficient application-level protocol, built on top of a best-effort delivery mechanism.

The problems with HTTP are coming back to haunt us, especially in the area of optimization. About two years ago, I attended a dinner run by an analyst firm where this subject was discussed. I wasn’t as sensitive to strategic topics then as I am now, but I can see that the issues raised at that dinner have come to pass.

How are we going to deal with this? We can start with the easy stuff.

  • Persistent Connections
  • HTTP Compression
  • Explicit Caching
  • Minimize Bytes
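To give a feel for why the compression and byte-minimization items matter, here is a small Python illustration of how much a repetitive HTML payload shrinks under gzip; the sample markup is invented.

    import gzip

    html = ("<html><body>" + "<div class='row'>search result item</div>" * 500
            + "</body></html>").encode("utf-8")

    compressed = gzip.compress(html)
    savings = 100.0 * (1 - len(compressed) / len(html))
    print(f"original: {len(html)} bytes, gzipped: {len(compressed)} bytes "
          f"({savings:.1f}% smaller)")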

The hard stuff comes after: how do we fix the underlying network? What application is going to replace HTTP?

Comments? Questions?

GrabPERF: Search Index Weekly Results (Aug 29-Sep 4, 2005)

The weekly GrabPERF Search Index Results are in. Sorry for the delay.

Week of August 29, 2005 – September 4, 2005

TEST                  RESULT (sec)  SUCCESS (%)  ATTEMPTS
--------------------  ------------  -----------  --------
PubSub - Search          0.4132426        99.95      5561
Google - Search          0.5546451       100.00      5570
MSN - Search             0.7807107        99.87      5572
Yahoo - Search           0.7996602        99.98      5571
eBay - Search            0.9371296       100.00      5571
Feedster - Search        1.1738754        99.96      5569
Newsgator - Search       1.2168921        99.96      5569
BlogLines - Search       1.2857559        99.71      5571
BestBuy.com - Search     1.4136253        99.98      5572
Blogdigger - Search      1.8896126        99.74      5462
BENCHMARK RESULTS        1.9096960        99.79     75419
Amazon - Search          1.9795655        99.84      3123
Technorati - Search      2.7727073        99.60      5566
IceRocket - Search       5.0256308        99.43      5571
Blogpulse - Search       6.5206247        98.98      5571

These results are based on data gathered from three remote measurement locations in North America. Each location takes a measurement approximately once every five minutes.

The measurements are for the base HTML document only. No images or referenced files are included.
