Month: September 2008

Thoughts on Web Performance at the Browser

Last week, lost in the preternatural shriek that emerged from the Web community around the release of Google Chrome, John Resig published a thoughtful post on resource usage in the browser. In it, he states that the Process Manager in Chrome will change how people see Web performance. In his words:

The blame of bad performance or memory consumption no longer lies with the browser but with the site.

Coming to the discussion from the realm of Web performance measurement, I realize that the firms I have worked with and for have not done a good job of analyzing this, and, in the name of science, have tried to eliminate the variability of Web page processing from the equation.
The company I currently work for has realized that this is a gap and has released a product that measures the performance of a page in the browser.
But all of this misses the point, and goes to one of the reasons why I gave up on Chrome on my older, personal-use computer: Chrome exposes the individual load that a page places on a Web browser.
Resig highlights that browsers that make use of shared resources shift the blame for poor performance onto the browser and away from the design of the page. Technologies that modern designers lean on (Flash, AJAX, etc.) all require substantially greater resource consumption in a browser. Chrome, for good or ill, exposes this load to the user by instantiating a separate, sand-boxed process for each tab, clearly indicating which page is the culprit.
It will be interesting to see whether designers take note of this, or ignore it in pursuit of the latest shiny toy that gets released. While designers assume that all visitors run cutting-edge machines, I can show them that a laptop that is still plenty useful locks up completely when their page is handled in isolation.

Browser Wars II: Why I returned to Firefox

Since the release of Google Chrome on September 2, I have been using it as my day-to-day browser. Spending up to 80% of my computer time in a browser means that this was a decision that affected a huge portion of my online experience.

I can say that I put Chrome through its paces on a wide variety of sites, from the simple to the extremely content-rich, from the mainstream to the questionable.
This morning I migrated back to Firefox, albeit the latest Minefield/Firefox 3.1alpha.

The reasons listed below are mine. Switching back is a personal decision and everyone is likely to have their own reasons to do it, or to stay.

Advertising

I mentioned a few times during my initial use of Chrome that I was having to get used to the re-appearance of advertising in my browsing experience [here and here]. From their early release as extensions to Firefox, I have used AdBlock and AdBlock Plus to remove the annoyance and distraction of online ads from my browsing experience.

When I moved to Chrome, I had to accept that I would see ads. I mean, we were dealing with a browser distributed by one of the largest online advertising agencies. It could only be expected that they were not going to allow people to block ads out of the gate, if ever.

As the week progressed, I realized that I was finding the ads to be a distraction from my browsing experience. Ads impede my ability to find the information I need quickly.

Older Machines

My primary machine for online experiences at home is a Latitude D610. This is a 3-4 year-old laptop, with a single core. It still has far more computing power than most people actually need to enjoy the Web.

While cruising with Chrome, I found that Flash locked up the entire machine on a very regular basis, making it unusable. This doesn’t happen on my much more powerful Latitude D630, provided by my work. However, as I have a personal laptop, I am not going to use my work computer for my personal stuff, especially at home.

I cannot have a browser that locks up a machine when I simply close a tab. It appears that the vaunted QA division at Google overlooked the fact that people don’t all run the latest and greatest machines in the real world.

Auto-Complete

I am completely reliant on form auto-completes. Firefox has been doing this for me for a long time, and it is very handy to simply start typing and have Firefox say “Hey! This form element is called email. Here are some of the other things you have put into form elements called email.”

If you can build something as complex as the OmniBox, surely you can add form auto-completes.

The OmniBox

I hate it. I really do. I like having my search and addresses separate. I also like an address bar that remembers complete URLs (including those pesky parameters!), rather than simply the top-level domain name.

It is a cool idea, but it needs some refining, and some customer-satisfaction focus groups.

I Don’t Use Desktop-replacing Web Applications

I do almost all of my real work in desktop-installed applications. I have not made the migration to Web applications. I may in the future. But until then, I do not need a completely clean browsing experience. I mentioned that the battle between Chrome and Firefox will come down to the Container v. the Desktop – a web application container, or a desktop-replacing Web experience application.
In the last 48 hours, I have fallen back into the Web-desktop camp.

Summary

In the future, I will continue to use Chrome to see how newer builds advance, and how it evolves as more people begin dictating the features that should be available to it.

For my personal use, Chrome takes away too much from, and injects too much noise into, my daily Web experience for me to continue using it as my default browser. To quote more than a few skeptics of Chrome when it was released: “It’s just another browser”.

DNS: Without it, your site does not exist

In my presentations and consultations on Web performance, I emphasize the importance of a correctly configured DNS system with the phrase: “If people can’t resolve your hostname, your site is dead in the water”.

Yesterday, it appears that the large anti-virus and security firm Sophos discovered this lesson the hard way.

Of course hindsight is perfect, so I won’t dwell for too long on this single incident. The lesson to be learned here is that DNS is complex and critical, yet is sometimes overlooked when considering the core issues of Web performance and end-user experience.

This complexity means that if an organization is not comfortable managing its own DNS, or wants to broaden and deepen its DNS infrastructure, there are a large number of firms that will assist with this process, firms whose entire business is built on managing large-scale DNS implementations for organizations.
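
Even for teams that keep DNS in-house, a very small check catches the most basic failure mode: a hostname that no longer resolves. The following is a minimal sketch, not a production monitoring tool, and the hostname is a placeholder; it uses PHP's built-in DNS functions to confirm that an A record still exists.

<?php
// Minimal DNS health check. Illustrative sketch only; $hostname is a placeholder.
$hostname = "www.example.com";

if (!checkdnsrr($hostname, "A")) {
    // No A record means visitors cannot resolve the site at all
    echo "ALERT: $hostname does not resolve. The site is dead in the water.\n";
} else {
    // Show each address and TTL the resolver handed back
    foreach (dns_get_record($hostname, DNS_A) as $rec) {
        echo "$hostname -> {$rec['ip']} (TTL {$rec['ttl']})\n";
    }
}
?>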

DNS is critical. Never take it for granted.

Joost: A change to the program

In April 2007, I tried out the Joost desktop client.  [More on Joost here and here]
I was underwhelmed by the performance, and by the fact that the application completely maxed out my dual core CPU, my 2G of RAM, and my high-speed home broadband. I do remember thinking at the time that it seemed weird to have a Desktop Client in the first place. Well, as Om Malik reports this morning, it seems that I was not alone.
After this week’s hoopla over Chrome, moving in the direction of the browser seems like a wise thing to do. But I definitely hear far more buzz over Hulu than I do for Joost on the intertubes.

Update

Michael Arrington and TechCrunch weigh in on the discussion.

Web Performance, Part IX: Curse of the Single Metric

While this post is aimed at Web performance, the curse of the single metric affects our everyday lives in ways that we have become oblivious to.

When you listen to a business report, the stock market indices are aggregated metrics used to represent the performance of a set group of stocks.

When you read about economic indicators, these values are the aggregated representations of complex populations of data, collected from around the country, or the world.

Sport scores are the final tally of an event, but they may not always represent how well each team performed during the match.

The problem with single metrics lies in their simplicity. When a single metric is created, it usually attempts to factor in all of the possible and relevant data to produce an aggregated value that can represent a whole population of results.
These single metrics are then portrayed as a complete representation of this complex calculation, and they are usually presented in such a way that their compelling simplicity is accepted as the truth, rather than as a representation of a truth.

In the area of Web performance, organizations have fallen prey to this need for the compelling single metric: the need to represent a very complex process in terms that can be quickly absorbed and understood by as large a group of people as possible.

The single metrics most commonly found in the Web performance management field are performance (end-to-end response time of the tested business process) and availability (success rate of the tested business process). These numbers are then merged and transformed by data from a number of sources (external measurements, hit counts, conversions, internal server metrics, packet loss), and this information is bubbled up in an organization. By the time senior management and decision-makers receive the Web performance results, they are likely several steps removed from the raw measurement data.

An executive will tell you that information is a blessing, but only when it speeds, rather than hinders, the decision-making process. A Web performance consultant (such as myself) will tell you that basing your decisions on a single metric that has been created out of a complex population of data is madness.

So, where does the middle ground lie between the data wonks and the senior leaders? The rest of this post is dedicated to introducing a few of the metrics that, taken together as a small set, give senior leaders better information to work from when deciding what to do next.

A great place to start this process is to examine the percentile distribution of measurement results. Percentiles are familiar to anyone who has children. After a visit to the pediatrician, someone will likely state that “My son/daughter is in the XXth percentile of his/her age group for height/weight/tantrums/etc”. This means that XX% of the population of children that age, as recorded by pediatricians, have values at or below your child’s value for that same metric.

Percentiles are great for a population of results like Web performance measurement data. Using only a small set of values, anyone can quickly see how many visitors to a site could be experiencing poor performance.

If at the median (50th percentile), the measured business process is 3.0 seconds, this means that 50% of all of the measurements looked at are being completed in 3.0 seconds or less.

If the executive then looks up to the 90th percentile and sees that it’s at 16.0 seconds, it can be quickly determined that something very bad has happened to affect the response times collected for the 40% of the population between these two points. Immediately, everyone knows that for some reason, an unacceptable number of visitors are likely experiencing degraded and unpredictable performance when they visit the site.
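
For illustration, a nearest-rank percentile calculation over a set of response-time measurements takes very little code. This is a minimal sketch of my own; the sample values are invented so that the median and 90th percentile mirror the 3.0-second and 16.0-second figures described above.

<?php
// Nearest-rank percentile over an array of response times (seconds).
// Illustrative sketch; the sample data below is invented.
function percentile(array $times, $pct) {
    sort($times);
    $idx = (int) ceil(($pct / 100) * count($times)) - 1;
    return $times[max(0, $idx)];
}

$measurements = array(2.1, 2.4, 2.8, 2.9, 3.0, 3.3, 3.9, 5.5, 16.0, 21.0);
printf("Median (50th): %.1f s\n", percentile($measurements, 50));    // 3.0 s
printf("90th percentile: %.1f s\n", percentile($measurements, 90));  // 16.0 s
?>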

A suggestion for enhancing averages with percentiles is to use the 90th percentile value as a trim ceiling for the average. Then side-by-side comparisons of the untrimmed and trimmed averages can be compared. For sites with a larger number of response time outliers, the average will decrease dramatically when it is trimmed, while sites with more consistent measurement results will find their average response time is similar with and without the trimmed data.
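
A minimal sketch of that side-by-side comparison, again using invented sample data, might look like the following. The single outlier above the 90th-percentile ceiling is enough to pull the untrimmed average well away from the trimmed one.

<?php
// Compare the raw average with an average trimmed at the 90th-percentile
// ceiling. Illustrative sketch only; the sample data is invented.
function trimmed_average(array $times) {
    sort($times);
    $ceiling = $times[(int) ceil(0.9 * count($times)) - 1];   // 90th-percentile value
    $kept = array();
    foreach ($times as $t) {
        if ($t <= $ceiling) {
            $kept[] = $t;
        }
    }
    return array_sum($kept) / count($kept);
}

$measurements = array(2.1, 2.4, 2.8, 2.9, 3.0, 3.3, 3.9, 5.5, 16.0, 21.0);
printf("Untrimmed average: %.1f s\n", array_sum($measurements) / count($measurements));
printf("Trimmed average:   %.1f s\n", trimmed_average($measurements));
?>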

It is also critical to examine the application’s response times and success rates throughout defined business cycles. A single response time or success rate value masks

  • variations by time of day
  • variations by day of week
  • variations by month
  • variations caused by advertising and marketing

An average is just an average. If response times at peak business hours are 5.0 seconds slower than the average, then the average is meaningless: business is being lost to poor performance that the focus on the single metric has hidden. A simple breakdown by business cycle, as sketched below, makes this visible.
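
The sketch below is my own illustration, assuming a hypothetical measurements table with a date timestamp and a resp_time column (both names are placeholders, not any particular product’s schema). It reports the average response time and sample count for each hour of the day over the past week, instead of one number for the whole period.

<?php
// Hourly response-time breakdown: a minimal sketch against a hypothetical
// "measurements" table (date TIMESTAMP, resp_time DOUBLE). Assumes a MySQL
// connection has already been opened with mysql_connect().
$query = "SELECT
              DATE_FORMAT(date, '%H') AS hour_of_day,
              AVG(resp_time)          AS avg_resp,
              COUNT(*)                AS samples
          FROM measurements
          WHERE date >= DATE_SUB(NOW(), INTERVAL 7 DAY)
          GROUP BY hour_of_day
          ORDER BY hour_of_day";
$result = mysql_query($query) or die("Hourly breakdown query failed");
while ($row = mysql_fetch_array($result)) {
    printf("%s:00  avg %.1f s  (%d samples)\n",
           $row['hour_of_day'], $row['avg_resp'], $row['samples']);
}
?>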

All of the metrics discussed above have also fallen prey to their own curse of the single metric: they aggregate the response time of the entire business process into a single value. The process of purchasing items online breaks down into discrete steps, and different parts of this process likely take longer than others. And one step beyond the discrete steps are the objects and data that appear to the customer during these steps.

It is critical to isolate the performance for each step of the process to find the bottlenecks to performance. Then the components in those steps that cause the greatest response time or success rate degradation must be identified and targeted for performance improvement initiatives. If there are one or two poorly performing steps in a business process, focusing performance improvement efforts on these is critical, otherwise precious resources are being wasted in trying to fix parts of the application that are working well.

In summary, a single metric provides a false sense of confidence, the sense that the application can be counted on to deliver response times and success rates that are nearly the same as those simple, single metrics.

The average provides a middle ground, a line that marks the approximate mid-point of the measurement population. There are measurements above and below this average, and you have to plan around the peaks and valleys, not the open plains. It is critical never to fall victim to the attractive charms that come with the curse of the single metric.

GrabPERF: State of the System

This is actually a short post to write, as the state of the GrabPERF system is currently very healthy. There was an eight-hour outage in early August 2008, but that was a fiber connectivity issue, not a system issue.
Over the history of the service, we have been steadily increasing the number of measurements we take each day.

[Chart: GrabPERF measurements per day]

The large leap occurred when a very large number of tests were added to the system on a single day. But based on this data, the system is gathering more than 900,000 measurements every day.
Thanks to all of the people who volunteer their machines and bandwidth to support this effort!

Chrome v. Firefox – The Container and The Desktop

The last two days of using Chrome have had me thinking about the purpose of the Web browser in today’s world. I’ve talked about how Chrome and Firefox have changed how we see browsers, treating them as interactive windows into our daily life, rather than the uncontrolled end of an information firehose.
These applications, which on the surface seem to serve the same purpose, have taken very different paths to this point. Much has been made about Firefox growing out of the ashes of Netscape, while Chrome is the Web re-imagined.
It’s not just that.
Firefox, through the use of extensions and helper applications, has grown to become a Desktop replacement. Back when Windows for Workgroups was the primary end-user OS (and it wasn’t even an OS), Norton Desktop arrived to provide all of the tools that didn’t ship with the OS. It extended and improved on what was there, and made WFW a better place.
Firefox serves that purpose in the browser world. With its massive collections of extensions, it adds the ability to customize and modify the Web workspace. These extensions even allow the incoming content to be modified and reformatted in unique ways to suit the preferences of each individual. These features allowed the person using Firefox to feel in control, empowered.
You look at the Firefox installs of the tech elite, and no two installed versions will be configured in the same way. Firefox extends the browser into an aggregator of Web data and information customization.
But it does it at the Desktop.
Chrome is a simple container. There is (currently) no way to customize the look and feel, extend the capabilities, or modify the incoming or outgoing content. It is a simple shell designed to perform two key functions: search for content and interact with Web applications.
There are, of course, the hidden geeky functions that they have built into the app. But those don’t change what its core function is: request, receive, and render Web pages as quickly and efficiently as possible. Unlike Firefox’s approach, which places the app at the center of the Web, Chrome places the Web itself at the center.
There is no right or wrong approach. As with all things in this complicated world we are in, it depends. It depends on what you are trying to accomplish and how you want to get there.
The conflict that I see appearing over the next few months is not between IE and Firefox and Safari and Opera and Chrome. It is a conflict over what the people want from an application that they use all the time. Do they want a Web desktop or a Web container?

Chrome and Advertising – Google's Plan

Since I downloaded and started using Chrome yesterday, I have had to rediscover the world of online advertising. Using Firefox and Adblock Plus for nearly three years has shielded me from their existence for the most part.
Stephen Noble, in a post on the Forrester Blog for Interactive Marketing Professionals, seems to discover that Chrome will be a source for injecting greater personalization and targeting into the online advertising market.
This is the key reason Chrome exists, right now.
While there may be discussions about the online platform and hosted applications, only a small percentage of Internet users rely on hosted desktop-like applications, excluding email, in their daily work and life.
However, Google’s biggest money-making ventures are advertising and search. With control of AdSense and DoubleClick, there is no doubt that Google controls a vast majority of the targeted and contextual advertising market, around the world.
One of the greatest threats to this money-making is a lack of control of the platform through which ads are delivered. There is talk of IE8 blocking ads (well, non-Microsoft ads anyway), and one of the more popular extensions for Firefox is Adblock Plus. While Safari doesn’t have this ability natively built in, it can be supported by any number of applications that, in the name of Internet security, filter and block online advertisers using end-user proxies.
This threat to Google’s core revenue source was not ignored in the development of Chrome. One of the options is the use of DNS pre-fetching. Now I haven’t thrown up a packet sniffer, but what’s to prevent part of the pre-fetching algorithm from going beyond DNS for certain content and pre-fetching the whole object, so that the ads load really fast and are seen as less intrusive?
Ok, so I am noted for having a paranoid streak.
However, using the fastest rendering engine and a rocket-ship fast Javascript VM is not only good for the new generation of online Web applications, but plays right into the hands of improved ad-delivery.
So, while Chrome is being hailed as the first Web application environment, it is very much a contextual Web advertising environment as well.
It’s how it was built.

Hit Tracking with PHP and MySQL

Recently there was an outage at a hit-tracking vendor I was using to track the hits on my externally hosted blog, leaving me with a gap in my visitor data several hours long. While this was an inconvenience for me, I realized that this could be a mission-critical failure for an online business reliant on this data.
To resolve this, I used the PHP HTTP environment variables and the built-in function for converting IP addresses to IP numbers to create my own hit-tracker. It is a rudimentary tracking tool, but it provides me with the basic information I need to track visitors.
To begin, I wrote a simple PHP script to insert tracking data into a MySQL database. How do you do that? You use the GD features in PHP to serve a tiny image (a tracking pixel), and insert the request data into the database along the way.


<?php
header("Content-type: image/png");
// dbconnect_logger.php opens the MySQL connection and sets $link
include("dbconnect_logger.php");

$logtime = date("YmdHis");
// Store the client IP address as an unsigned integer
$ipquery = sprintf("%u", ip2long($_SERVER['REMOTE_ADDR']));
// Escape the client-supplied headers before building the INSERT
$uagent  = mysql_real_escape_string(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '-');
$referer = mysql_real_escape_string(isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '');

$query2 = "INSERT INTO logger.blog_log VALUES
           ($logtime, $ipquery, '$uagent', '$referer')";
mysql_query($query2) or die("Log Insert Failed");
mysql_close($link);

// Return a 1x1 image so the request behaves like any other image on the page
$im = @ImageCreate(1, 1)
    or die("Cannot Initialize new GD image stream");
$background_color = ImageColorAllocate($im, 224, 234, 234);
$text_color = ImageColorAllocate($im, 233, 14, 91);
// imageline($im, $x1, $y1, $x2, $y2, $color);
imageline($im, 0, 0, 1, 2, $text_color);
imageline($im, 1, 0, 0, 2, $text_color);
ImagePng($im);
ImageDestroy($im);
?>
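
To actually collect data, this script is referenced from each page as a tiny image (for example, an img tag whose src points at the tracker URL), so that every page view triggers a logged request.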

Next, I created the database table.


DROP TABLE IF EXISTS `blog_log`;
CREATE TABLE `blog_log` (
  `date` timestamp NOT NULL default '0000-00-00 00:00:00',
  `ip_num` double NOT NULL default '0',
  `uagent` varchar(200) default NULL,
  `visited_page` varchar(200) NOT NULL default '',
  UNIQUE KEY `date` (`date`,`ip_num`,`visited_page`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

It’s done. I can now log any request I want using this embedded tracker.
Data should begin flowing to your database immediately. This sample snippet of code will allow you to pull data for a selected day and list each individual hit.


<?php
// dbconnect_logger.php opens the MySQL connection and sets $link
include("dbconnect_logger.php");
$YMD = date("Ymd");   // day to report on, in YYYYMMDD form

$query1 = "SELECT
                bl.ip_num,
                DATE_FORMAT(bl.date,'%d/%b/%Y %H:%i:%s') AS NEW_DATE,
                bl.uagent,
                bl.visited_page
        FROM blog_log bl
        WHERE
                DATE_FORMAT(bl.date,'%Y%m%d') = '$YMD'
                AND uagent NOT REGEXP '(.*bot.*|.*crawl.*|.*spider.*|^-$|.*slurp.*|.*walker.*|.*lwp.*|.*teoma.*|.*aggregator.*|.*reader.*|.*libwww.*)'
        ORDER BY bl.date ASC";
$result1 = mysql_query($query1) or die("Log Select Failed");

print "<table border=\"1\">\n";
print "<tr><td>IP</td><td>DATE</td><td>USER-AGENT</td><td>PAGE VIEWED</td></tr>\n";
while ($row = mysql_fetch_array($result1)) {
        // Convert the stored IP number back to a dotted-quad address
        $visitor = long2ip($row['ip_num']);
        print "<tr><td>$visitor</td><td nowrap>{$row['NEW_DATE']}</td><td nowrap>{$row['uagent']}</td><td>";
        if ($row['visited_page'] == "") {
                print " --- </td></tr>\n";
        } else {
                print "<a href=\"{$row['visited_page']}\" target=\"_blank\">{$row['visited_page']}</a></td></tr>\n";
        }
}
print "</table>\n";
mysql_close($link);
?>

And that’s it. A few lines of code and you’re done. With a little tweaking, you can integrate the IP number data with a number of Geographic IP databases available for purchase to track by country and ISP, and using graphics applications for PHP, you can add graphs.
For my own purposes, this is an extension of the Geographic IP database I created a number of years ago. That application extracts IP address information from the five IP registrars and inserts it into a database. Using the log data collected by the tracking bug above and the lookup capabilities of the Geographic IP database, I can quickly track which countries and ISPs drive the most visitors to my site, and use this for general interest purposes, as well as to isolate any malicious visitors to the site.
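
As a rough illustration of that last step, a country-level report can come from a single join. The sketch below is mine and assumes a hypothetical geo_ip table with ip_start, ip_end, and country columns; the real Geographic IP database schema will differ.

<?php
// Visits per country: a minimal sketch joining the blog_log table above to a
// hypothetical geo_ip lookup table (columns: ip_start, ip_end, country).
include("dbconnect_logger.php");

$query = "SELECT
              g.country,
              COUNT(*) AS visits
          FROM blog_log bl, geo_ip g
          WHERE bl.ip_num BETWEEN g.ip_start AND g.ip_end
          GROUP BY g.country
          ORDER BY visits DESC
          LIMIT 20";
$result = mysql_query($query) or die("GeoIP report query failed");
while ($row = mysql_fetch_array($result)) {
    print "{$row['country']}: {$row['visits']}\n";
}
mysql_close($link);
?>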

Browsers: The Window and The Firehose

Three years ago, in a post on this blog, I stated that I thought that the browser was becoming less important as more data moved into streams of data through RSS and aggregated feeds, as well as a raft of other consumer-oriented Web services.
This position was based on the assumption that the endpoint, in the form of installed applications, would continue to serve as the focus for user interactions, that these applications would be the points where data was accumulated and processed by users. This could be best described as the firehose: the end-user desktop would sit at the end of a never-ending flood of data being pushed to it.
Firefox and Chrome have changed all of that.
The browser has, instead, become the window through which we view and manipulate our data. It’s now ok, completely acceptable in fact, to use online applications as replacements for installed applications, stripping away a profit engine that has fed so many organizations over the years.
The endpoint has been shown to be the access point to our applications, to our data. Data is no longer brought down and stored locally: it is stored remotely and manipulated like a marionette from afar.
While Chrome and Firefox are not perfect, they serve as powerful reminders of what the Web is, and why the browser exists. The browser is not the end of a flood of incoming data; it is the window through which we see our online world.
While some complain that there is still an endless stream of data, we control and manipulate it. It doesn’t flood us.
