Category: Web Performance

Web Performance Concepts Series – Revisited

Two years ago I created a series of five blog articles, aimed at both business and technical readers, with the goal of explaining the basic statistical concepts and methods I use when analyzing Web performance data in my role as a Web performance consultant.

Most of these ideas were core to my thinking when I developed GrabPERF in 2005-2006, when I determined that it was vital that people not only receive Web performance measurement data for their site, but receive it in a way that informs and shapes the business and technical decisions they make on a daily basis.

While I come from a strong technical background, it is critical to be able to present the data that I work with in a manner that can be useful to all components of an organization, from the IT and technology leaders who shape the infrastructure and design of a site, to the marketing and business leaders who set out the goals for the organization and interact with customers, vendors and investors.

Providing data that helps bridge the almost religious dichotomy that divides most organizations is crucial to delivering a comprehensive Web performance solution.

These articles form the core of an ongoing series of discussions focused on the pitfalls of Web performance analysis, and on how to learn from and avoid the errors others have already discovered.

The series went over like a lead balloon, which left me puzzled. The information in the articles was technical, focused on the role that simple statistics play in shaping Web performance technology and business decisions, but I saw it as the core of an ongoing discussion that every organization needs to have to ensure it moves in a single direction, with a single purpose.

I have decided to reintroduce this series, dredging it from the forgotten archives of this blog, to remind business and IT teams of the importance of the Web performance data they use every day. It also serves as a guide to interpreting the numbers that arise from all the measurement methodologies that companies use, and a map for extracting the most critical information from the raging sea of data.

The five articles are:

  1. Web Performance, Part I: Fundamentals
  2. Web Performance, Part II: What are you calling average?
  3. Web Performance, Part III: Moving Beyond Average
  4. Web Performance, Part IV: Finding The Frequency
  5. Web Performance, Part V: Baseline Your Data

I look forward to your comments and questions on these topics.

The Dichotomy of the Web: Andy King's Website Optimization

The Web is a many-splendored thing, with a very split personality. One side is driven to find ways to make the most money possible, while the other is driven to implement cool technology in an effective and efficient manner (most of the time).

Andy King, in Website Optimization (O’Reilly, 2008), tries to address these two competing forces in a way that both can understand. This is important because, as we all know from our own lives, most of the time these two competing parts of the same whole are both right; they just don’t understand the other side.

I have seen this trend repeated throughout my nine years in the Web performance industry, five of them as a consultant: companies torn asunder, viewing the Business v. Technology interaction as a Cold War, one that occasionally flares up in odd places that serve as proxies between the two.

Website Optimization appears at first glance to be torn asunder by this conflict. With half devoted to optimizing the site for business and the other to performance and design optimization, there will be a cry from the competing factions that half of this book is a useless waste of time.

These are the organizations and individuals who will always be fighting to succeed in this industry. These are the people and companies who don’t understand that success in both areas is critical to succeeding in a highly competitive Web world.

The first half of the book is dedicated to the optimization of a Web site, any Web site, to serve a well-defined business purpose. Terms such as SEO, PPC, and CRO (search engine optimization, pay-per-click, and conversion rate optimization) can curdle the blood of any hardcore techie, but they are what drive the design and business purpose of a Web site. Without a way to get people to a site and have them use it to do business or complete the tasks they need to, there is no need for a technological infrastructure to support it.

Conversely, a business with lofty goals and a strategy that will change the marketplace will not get a chance to succeed if the site is slow, the pages are large, and the design makes cat barf look good. Concepts such as HTTP compression, file concatenation, caching, and JS/CSS placement drive this side of the personality, along with a number of application and networking considerations that are just too far down the rat hole to consider in a book with as broad a scope as this one.

On the surface, many people will put this book down because it isn’t business or techie enough for them. Those who do buy it will show that they have a grasp of the wider perspective, the one that drives all successful sites to stand tall in a sea of similarity.

See the Website Optimization book companion site for more information, chapter summaries and two sample chapters.

GrabPERF: Yahoo issues today

Netcraft noted that Yahoo encountered a bit of a headache today. So I fired up my handy-dandy little performance system and had a look.

[GrabPERF chart: Yahoo performance issues, July 6, 2007]

Although this may have been a big event for an organization and infrastructure the size of Yahoo’s, in my experience it was a “stuff happens on the Internet” sort of thing.

Move along people; there’s nothing to see. It is not the apocalyptic event that Netcraft is making it out to be. Google burps and barfs all the time, and everyone grumbles. But there is no need to run in circles and scream and shout.

Yeesh!

Dear Apache Software Foundation: FIX THE MSIE SSL KEEPALIVE SETTINGS!

Dear Apache Software Foundation, and the developers of the Apache Web server:

I would like to thank you for developing a great product. I rely on it daily to host my own sites, and a large number of people on the Internet seem to share my love of this software.

However, you seem intent on maintaining a simple flaw in your logic that continues to make me crazy. I am a Web performance analyst, and at least once a week I sigh and shake my head when I stoop to using Microsoft Internet Explorer (MSIE) to visit secure sites.

It seems that in your SSL configurations, you continue to assume that ALL versions of MSIE can’t handle persistent connections under SSL/TLS.
Is this true? Is a bug initially caught in MSIE 5.x (5.0??) still valid for MSIE 6.0/7.0?

The short answer is: I don’t know.

It seems that no one on the Apache server team has bothered to go back and see if the current versions of MSIE — we are trying to track down the last three people who still use MSIE 5.x and help them — still share this problem.

In the meantime, can you change your SSL exclusion RegEx to something more relevant for 2007?

Current RegEx:

SetEnvIf User-Agent ".*MSIE.*" \
	nokeepalive ssl-unclean-shutdown \
	downgrade-1.0 force-response-1.0

Relevant, updated RegEx:

SetEnvIf User-Agent ".*MSIE [1-5].*" \
	nokeepalive ssl-unclean-shutdown \
	downgrade-1.0 force-response-1.0

SetEnvIf User-Agent ".*MSIE [6-9].*" \
	ssl-unclean-shutdown
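
For what it’s worth, BrowserMatch is just shorthand for SetEnvIf User-Agent, so the same change can be written a little more compactly. This is only a sketch of the equivalent directives, not something I have tested against every MSIE variant:

BrowserMatch "MSIE [1-5]" \
	nokeepalive ssl-unclean-shutdown \
	downgrade-1.0 force-response-1.0

BrowserMatch "MSIE [6-9]" ssl-unclean-shutdown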

Please? PLEASE? It’s so easy…and would solve so many performance problems…

Please?

Thank you.

Web Performance: Optimizing Page Load Time

Aaron Hopkins posted an article detailing all of the Web performance goodness that I have been advocating for a number of years.

To summarize:

  • Use server-side compression
  • Set your static objects to be cacheable in browser and proxy caches
  • Use keep-alives / persistent connections
  • Turn your browsers’ HTTP pipelining feature on
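
On the server side, the first three items boil down to a few lines of configuration. Here is a minimal sketch for Apache 2.x, assuming mod_deflate and mod_expires are loaded; the content types and cache lifetimes are illustrative, not recommendations, and pipelining is a browser setting, so there is nothing to configure on the server for it:

 # Compress text responses on the fly (mod_deflate)
 AddOutputFilterByType DEFLATE text/html text/css application/x-javascript

 # Let browser and proxy caches hold static objects for a week (mod_expires)
 ExpiresActive On
 ExpiresByType image/gif "access plus 7 days"
 ExpiresByType image/png "access plus 7 days"
 ExpiresByType text/css  "access plus 7 days"

 # Reuse TCP connections instead of opening a new one per object
 KeepAlive On
 MaxKeepAliveRequests 100
 KeepAliveTimeout 15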

These ideas are not new, and neither are the findings in his study. To someone who has worked in the Web performance field for nearly a decade, these are old hat. However, it’s always nice to have someone new inject some life back into the discussion.

Performance Improvement From Caching and Compression

This paper is an extension of the work done for another article that highlighted the performance benefits of retrieving uncompressed and compressed objects directly from the origin server. I wanted to add a proxy server into the stream and determine if proxy servers helped improve the performance of object downloads, and by how much.

Using the same series of objects in the original compression article[1], the curl tests were re-run three times:

  1. Directly from the origin server
  2. Through the proxy server, to load the files into cache
  3. Through the proxy server, to avoid retrieving files from the origin.[2]

This series of three tests was repeated twice: once for the uncompressed files, and then for the compressed objects.[3]
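
For the curious, each pass was timed with curl. The exact commands aren’t recorded here, but they were along these lines; the hostnames and proxy address below are placeholders, not the actual test environment:

 # Pass 1: fetch directly from the origin server
 curl -s -o /dev/null -w "%{time_total}\n" http://origin.example.com/doc.html

 # Pass 2: fetch through the proxy, loading the object into the proxy cache
 curl -s -o /dev/null -w "%{time_total}\n" -x proxy.example.com:3128 http://origin.example.com/doc.html

 # Pass 3: fetch through the proxy again, now served from the proxy cache
 curl -s -o /dev/null -w "%{time_total}\n" -x proxy.example.com:3128 http://origin.example.com/doc.html

 # For the compressed runs, add: -H "Accept-Encoding: gzip"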
As can be seen clearly in the plots below, compression caused Web page download times to improve greatly when the objects were retrieved directly from the source. However, the performance difference between compressed and uncompressed data all but disappears when retrieving objects from a proxy server on a corporate LAN.

[Plots: download time vs. object size for uncompressed and compressed pages]

Instead of the linear growth between object size and download time seen in both of the retrieval tests that used the origin server (Source and Proxy Load data), the Proxy Draw data clearly shows the benefits that accrue when a proxy server is added to a network to assist with serving HTTP traffic.

MEAN DOWNLOAD TIME (seconds)

Uncompressed Pages
	No Proxy      0.256
	Proxy Load    0.254
	Proxy Draw    0.110

Compressed Pages
	No Proxy      0.181
	Proxy Load    0.140
	Proxy Draw    0.104

The data above shows just how much of an improvement a local proxy server, explicit caching directives, and compression can add to a Web site. For sites that force a great number of requests to be returned directly to the origin server, compression will be of great help in reducing bandwidth costs and improving performance. However, by allowing pages to be cached in local proxy servers, the difference between compressed and uncompressed pages vanishes.

Conclusion

Compression is a very good start when attempting to optimize performance. Adding explicit caching messages to server responses, which allow proxy servers to serve cached data to clients on remote local LANs, can improve performance to an even greater extent than compression can. The two should be used together to improve the overall performance of Web sites.
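
As a concrete illustration, the kind of explicit caching response header used in these tests (see footnote 2) can be emitted by Apache with mod_expires or mod_headers. A minimal sketch, with the file pattern and lifetime chosen only for the example:

 # mod_expires: emit Expires and Cache-Control: max-age headers automatically
 ExpiresActive On
 ExpiresByType text/html "access plus 1 hour"

 # or set the header directly with mod_headers
 <FilesMatch "\.html$">
 	Header set Cache-Control "max-age=3600"
 </FilesMatch>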


[1] The test set was made up of the 1952 HTML files located in the top directory of the Linux Documentation Project HTML archive.

[2] All of the pages in these tests returned the following server response header, indicating their cacheability:

Cache-Control: max-age=3600

[3] A note on the compressed files: all compression was performed dynamically by mod_gzip for Apache/1.3.27.

mod_gzip Compile Instructions

The last time I attempted to compile mod_gzip into Apache, I found that the instructions for doing so were not documented clearly on the project page. After a couple of failed attempts, I finally found the instructions buried at the end of the ChangeLog document.

I present the instructions here to preserve your sanity.

Before you can actually get mod_gzip to work, you have to uncomment it in the httpd.conf file module list (Apache 1.3.x) or add it to the module list (Apache 2.0.x).
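
For reference, the relevant httpd.conf lines look roughly like the following; the module filename and path are assumptions that depend on where your build installs mod_gzip:

 # Apache 1.3.x: load the DSO and add it to the module list
 LoadModule gzip_module libexec/mod_gzip.so
 AddModule  mod_gzip.c

 # Apache 2.0.x: a single LoadModule line is enough
 LoadModule gzip_module modules/mod_gzip.so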


Now there are two ways to build mod_gzip: statically compiled into Apache, or as a DSO file for mod_so. If you want to compile it statically into Apache, just copy the source into the Apache src/modules directory, into a subdirectory named ‘gzip’. You can then activate it via a parameter of the configure script.

 ./configure --activate-module=src/modules/gzip/mod_gzip.a
 make
 make install

This will build a new Apache with mod_gzip statically built in.

The DSO version is much easier to build.

 make APXS=/path/to/apxs
 make install APXS=/path/to/apxs
 /path/to/apachectl graceful

The apxs script is normally located inside the bin directory of Apache.
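
Once the module is built and loaded, compression still has to be switched on and told what to compress. The directive names below come from the mod_gzip documentation; the include and exclude rules are just an illustrative starting point, not a tuned configuration:

 mod_gzip_on                 Yes
 mod_gzip_dechunk            Yes
 mod_gzip_minimum_file_size  500
 mod_gzip_item_include       file \.html$
 mod_gzip_item_include       mime ^text/.*
 mod_gzip_item_exclude       file \.(gif|jpe?g|png)$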
