Category: Uncategorized

  • GrabPERF: State of the System

    This is actually a short post to write, as the state of the GrabPERF system is currently very healthy. There was an eight-hour outage in early August 2008, but that was a fiber connectivity issue, not a system issue.
    Over the history of the service, we have been steadily increasing the number of measurements we take each day.

    [Figure: GrabPERF measurements per day]

    The large leap occurred when a very large number of tests were added to the system on a single day. But based on this data, the system is gathering more than 900,000 measurements every day.
    Thanks to all of the people who volunteer their machines and bandwidth to support this effort!

  • Chrome v. Firefox – The Container and The Desktop

    The last two days of using Chrome have had me thinking about the purpose of the Web browser in today’s world. I’ve talked about how Chrome and Firefox have changed how we see browsers, treating them as interactive windows into our daily life, rather than the uncontrolled end of an information firehose.
    These applications, which on the surface seem to serve the same purpose, have taken very different paths to this point. Much has been made about Firefox growing out of the ashes of Netscape, while Chrome is the Web re-imagined.
    It’s not just that.
    Firefox, through the use of extensions and helper applications, has grown to become a Desktop replacement. Back when Windows for Workgroups was the primary end-user OS (and it wasn’t even an OS), Norton Desktop arrived to provide all of the tools that didn’t ship with the OS. It extended and improved on what was there, and made WFW a better place.
    Firefox serves that purpose in the browser world. With its massive collection of extensions, it adds the ability to customize and modify the Web workspace. These extensions even allow the incoming content to be modified and reformatted in unique ways to suit the preferences of each individual. These features allow the person using Firefox to feel in control, empowered.
    You look at the Firefox installs of the tech elite, and no two installed versions will be configured in the same way. Firefox extends the browser into an aggregator of Web data and information customization.
    But it does it at the Desktop.
    Chrome is a simple container. There is (currently) no way to customize the look and feel, extend the capabilities, or modify the incoming or outgoing content. It is a simple shell designed to perform two key functions: search for content and interact with Web applications.
    There are, of course, the hidden geeky functions that they have built into the app. But those don’t change what its core function is: request, receive, and render Web pages as quickly and efficiently as possible. Unlike Firefox’s approach, which places the app at the center of the Web, Chrome places the Web at the center of the Web.
    There is no right or wrong approach. As with all things in this complicated world we are in, it depends. It depends on what you are trying to accomplish and how you want to get there.
    The conflict that I see appearing over the next few months is not between IE and Firefox and Safari and Opera and Chrome. It is a conflict over what the people want from an application that they use all the time. Do they want a Web desktop or a Web container?

  • Chrome and Advertising – Google's Plan

    Since I downloaded and started using Chrome yesterday, I have had to rediscover the world of online advertising. Using Firefox and Adblock Plus for nearly three years has shielded me from its existence for the most part.
    Stephen Noble, in a post on the Forrester Blog for Interactive Marketing Professionals, seems to discover that Chrome will be a source for injecting greater personalization and targeting into the online advertising market.
    This is the key reason Chrome exists, right now.
    While there may be discussions about the online platform and hosted applications, only a small percentage of Internet users rely on hosted desktop-like applications, excluding email, in their daily work and life.
    However, Google’s biggest money-making ventures are advertising and search. With control of AdSense and DoubleClick, there is no doubt that Google controls a vast majority of the targeted and contextual advertising market, around the world.
    One of the greatest threats to this money-making is a lack of control of the platform through which ads are delivered. There is talk of IE8 blocking ads (well, non-Microsoft ads anyway), and one of the more popular extensions for Firefox is Adblock Plus. While Safari doesn’t have this ability natively built in, it can be supported by any number of applications that, in the name of Internet security, filter and block online advertisers using end-user proxies.
    This threat to Google’s core revenue source was not ignored in the development of Chrome. One of the options is the use of DNS pre-fetching. Now I haven’t thrown up a packet sniffer, but what’s to prevent a part of the pre-fetching algorithm from going beyond DNS for certain content and pre-fetching the whole object, so that the ads load really fast and, in that way, are seen as less intrusive?
    Ok, so I am noted for having a paranoid streak.
    However, using the fastest rendering engine and a rocket-ship fast Javascript VM is not only good for the new generation of online Web applications, but plays right into the hands of improved ad-delivery.
    So, while Chrome is being hailed as the first Web application environment, it is very much a contextual Web advertising environment as well.
    It’s how it was built.

  • Hit Tracking with PHP and MySQL

    Recently there was an outage at a hit-tracking vendor I was using to track the hits on my externally hosted blog, leaving me with a gap in my visitor data several hours long. While this was an inconvenience for me, I realized that this could be a mission-critical failure for an online business reliant on this data.
    To resolve this, I used the PHP HTTP environment variables and the built-in function for converting IP addresses to IP numbers to create my own hit-tracker. It is a rudimentary tracking tool, but it provides me with the basic information I need to track visitors.
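    As a minimal sketch of the IP-number conversion mentioned above: PHP’s built-in ip2long() and long2ip() do the actual work. The wrapper names ip_to_number() and number_to_ip() are hypothetical and not part of the tracker itself, and a 64-bit PHP build is assumed.

```php
<?php
// Hypothetical helpers (not part of the original tracker) wrapping PHP's
// built-in ip2long() and long2ip(). Assumes a 64-bit PHP build.

// Convert a dotted-quad IPv4 address to its unsigned number, as a string.
// Returns null when the input is not a valid IPv4 address.
function ip_to_number(string $ip): ?string
{
    $long = ip2long($ip);
    if ($long === false) {
        return null;
    }
    // %u keeps the value non-negative even where ip2long() is signed.
    return sprintf('%u', $long);
}

// Convert a stored IP number back to dotted-quad form.
function number_to_ip(string $number): string
{
    return long2ip((int) $number);
}

echo ip_to_number('192.168.1.1'), "\n";  // prints 3232235777
echo number_to_ip('3232235777'), "\n";   // prints 192.168.1.1
```

    Storing the address as a number rather than as text makes range comparisons possible later, which is what a geographic IP lookup relies on.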
    To begin, I wrote a simple PHP script to insert tracking data into a MySQL database. How do you do that? You use the GD features in PHP to draw an image, and insert the data into the database as the image is requested.


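    <?php
    // Tracking bug: log the request, then return a 1x1 PNG.
    // Assumes dbconnect_logger.php sets $link to an open mysqli connection.
    include("dbconnect_logger.php");
    $logtime = date("YmdHis");
    // %u stores the address as an unsigned number.
    $ipquery = sprintf("%u", ip2long($_SERVER['REMOTE_ADDR']));
    $uagent  = $_SERVER['HTTP_USER_AGENT'] ?? '';
    // The referrer of the image request is the page that embeds the tracker.
    $referer = $_SERVER['HTTP_REFERER'] ?? '';
    // A prepared statement keeps header values from being read as SQL.
    $stmt = mysqli_prepare($link, "INSERT INTO logger.blog_log VALUES (?, ?, ?, ?)")
        or die("Log Insert Failed");
    mysqli_stmt_bind_param($stmt, "ssss", $logtime, $ipquery, $uagent, $referer);
    mysqli_stmt_execute($stmt) or die("Log Insert Failed");
    mysqli_close($link);
    $im = @imagecreate(1, 1)
        or die("Cannot Initialize new GD image stream");
    $background_color = imagecolorallocate($im, 224, 234, 234);
    $text_color = imagecolorallocate($im, 233, 14, 91);
    // imageline($im, $x1, $y1, $x2, $y2, $text_color);
    imageline($im, 0, 0, 1, 2, $text_color);
    imageline($im, 1, 0, 0, 2, $text_color);
    header("Content-type: image/png");
    imagepng($im);
    ?>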

    Next, I created the database table.


    DROP TABLE IF EXISTS `blog_log`;
    CREATE TABLE `blog_log` (
      `date` timestamp NOT NULL default '0000-00-00 00:00:00',
      `ip_num` double NOT NULL default '0',
      `uagent` varchar(200) default NULL,
      `visited_page` varchar(200) NOT NULL default '',
      UNIQUE KEY `date` (`date`,`ip_num`,`visited_page`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

    It’s done. I can now log any request I want using this embedded tracker.
    Data should begin flowing to your database immediately. This sample snippet of code will allow you to pull data for a selected day and list each individual hit.


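    <?php
    // Assumes dbconnect_logger.php sets $link to an open mysqli connection
    // and that $YMD holds the day to report, formatted as YYYYMMDD.
    include("dbconnect_logger.php");
    $query1 = "SELECT
                    bl.ip_num,
                    DATE_FORMAT(bl.date,'%d/%b/%Y %H:%i:%s') AS NEW_DATE,
                    bl.uagent,
                    bl.visited_page
            FROM blog_log bl
            WHERE
                    DATE_FORMAT(bl.date,'%Y%m%d') = ?
                    AND bl.uagent NOT REGEXP '(.*bot.*|.*crawl.*|.*spider.*|^-$|.*slurp.*|.*walker.*|.*lwp.*|.*teoma.*|.*aggregator.*|.*reader.*|.*libwww.*)'
            ORDER BY bl.date ASC";
    $stmt = mysqli_prepare($link, $query1) or die("Log Query Failed");
    mysqli_stmt_bind_param($stmt, "s", $YMD);
    mysqli_stmt_execute($stmt) or die("Log Query Failed");
    $result1 = mysqli_stmt_get_result($stmt);
    print "<table border=\"1\">\n";
    print "<tr><td>IP</td><td>DATE</td><td>USER-AGENT</td><td>PAGE VIEWED</td></tr>\n";
    while ($row = mysqli_fetch_assoc($result1)) {
        // Turn the stored IP number back into dotted-quad form.
        $visitor = long2ip((int) $row['ip_num']);
        // Escape logged header values before echoing them into the page.
        $uagent = htmlspecialchars((string) $row['uagent']);
        $page = htmlspecialchars($row['visited_page']);
        print "<tr><td>$visitor</td><td nowrap>{$row['NEW_DATE']}</td><td nowrap>$uagent</td><td>";
        if ($row['visited_page'] == "") {
            print " --- </td></tr>\n";
        } else {
            print "<a href=\"$page\" target=\"_blank\">$page</a></td></tr>\n";
        }
    }
    print "</table>\n";
    mysqli_close($link);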
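
    The REGEXP filter in the query above can also be applied on the PHP side, for example to skip automated visitors before they are ever logged. This is only a sketch: the function name is_automated_agent() is hypothetical, and the case-insensitive flag is an assumption made to match how MySQL’s REGEXP treats non-binary strings.

```php
<?php
// Hypothetical helper mirroring the REGEXP filter in the SQL query.
// The /i flag is an assumption: MySQL's REGEXP is case-insensitive for
// non-binary strings, so the PHP pattern is made case-insensitive too.
function is_automated_agent(?string $uagent): bool
{
    if ($uagent === null) {
        // The query's NOT REGEXP test also drops rows with a NULL user agent.
        return true;
    }
    // Same keywords as the query; '^-$' matches an agent logged as "-".
    $pattern = '/bot|crawl|spider|^-$|slurp|walker|lwp|teoma|aggregator|reader|libwww/i';
    return preg_match($pattern, $uagent) === 1;
}
```

    Filtering at report time, as the query does, keeps the raw data intact; filtering at log time keeps the table smaller. Which one fits depends on whether the crawler traffic is ever of interest.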

    And that’s it. A few lines of code and you’re done. With a little tweaking, you can integrate the IP number data with one of the Geographic IP databases available for purchase to track visitors by country and ISP, and using graphics applications for PHP, you can add graphs.
    For my own purposes, this is an extension of the Geographic IP database I created a number of years ago. This application extracts IP address information from the five IP registrars, and inserts it into a database. Using the log data collected by the tracking bug above and the lookup capabilities of the Geographic IP database, I can quickly track which countries and ISPs drive the most visitors to my site. I use this for general interest purposes, as well as to isolate any malicious visitors to the site.

  • Browsers: The Window and The Firehose

    Three years ago, in a post on this blog, I stated that I thought that the browser was becoming less important as more data moved into streams of data through RSS and aggregated feeds, as well as a raft of other consumer-oriented Web services.
    This position was based on the assumption that the endpoint, in the form of installed applications, would continue to serve as the focus for user interactions, and that these applications would be the points where data was accumulated and processed by users. This could be best described as the firehose: The end-user desktop would be at the end of a never-ending flood of data being pushed to it.
    Firefox and Chrome have changed all of that.
    The browser has, instead, become the window through which we view and manipulate our data. It’s now ok, completely acceptable in fact, to use online applications as replacements for installed applications, stripping away a profit engine that has fed so many organizations over the years.
    The endpoint has been shown to be the access point to our applications, to our data. Data is not brought and stored locally: It is stored remotely and manipulated like a marionette from afar.
    While Chrome and Firefox are not perfect, they serve as powerful reminders of what the Web is, and why the browser exists. The browser is not the end of a flood of incoming data; it is the window through which we see our online world.
    While some complain that there is still an endless stream of data, we control and manipulate it. It doesn’t flood us.

  • Google Chrome: First Impressions

    Google Chrome is out. And from first impressions, it is stinking fast. However, I do have some gripes.

    1. Comes with link underlining enabled. I hate this. It’s the first thing I disable in Firefox and any browser that supports disabling underlining.
    2. Where’s the “get your hands dirty under the hood” option list? I love the Firefox about:config list. Chrome needs this.
    3. Ads. I know. There is little chance for built in ad-blocking, but it’s on my wish-list.

    Otherwise, it’s good…so far. And the memory usage is, well, definitely less intrusive.
    I plan to use this for a while and see what happens. I will likely find something that drives me back to Firefox eventually.
    Ok, found a weirdness when you use a <li> tag in the WordPress editor. It seems that it starts injecting <div> tags to differentiate paragraphs after you close out the list.

  • Web Performance: TechCrunch Goes Crunch

    It’s the first day back after the last long weekend of the summer. There is a great amount of news flooding the intertubes, and what happens?
    TechCrunch has a small issue.

    [Figure: TechCrunch outage, September 2, 2008]

    It’s likely they’ll be back soon, but it’s still an interesting thing to see.

    Update – 09:17 EDT (13:17 GMT)

    TechCrunch is back up as of 08:49 EDT (12:49 GMT).

  • Google Chrome: See No Evil, Do No Evil – An Internet Performance Perspective

    The intertubes of the Web are abuzz with talk of the new, open-source Google Chrome browser [two articles here and here]. I will not presume to wade into the debate of whether it is necessary, or what strategic business goals Google has set that rely on having its own browser. I will limit my comments to the area of Web performance.

    Open-Source Browser: Ours or Theirs?

    When I read that Google Chrome was an open-source browser, the first thought was: is it theirs or a re-branded Firefox? No one knows at this point, but that will have a direct effect on how the browser performs, and how extensible it will be.

    HTTP Standards

    Unlike other standards, HTTP standards set out how a browser uses the underlying TCP stack. MSIE6/7 have very broken implementations, and MSIE8 is building on those by increasing the number of connections per host to 6, up from 2 set out in RFC 2616.
    Firefox can be configured to mangle this as well, but by default it plays by the standard, adding the option of HTTP pipelining into its mix of persistent HTTP connections.
    It will be VERY interesting to see how Google Chrome comes configured out of the box, and how much control users have over the HTTP behaviour of this new browser.

    (X)HTML/CSS/JS Standards

    This area is a mess. No browser implements these standards in a way that is completely consistent with the written text, and page designers have to use a variety of page testing products (such as BrowserCam) prior to release to ensure that their design is somewhat presentable in all browsers on all platforms.
    The rendering of Javascript will be crucial in this new browser, as so much of the new Web is built on applications that are almost completely Javascript-driven.
    I am sure that there will be sites that will be completely mangled by the new browser, but, knowing Google, we will be getting a 2.0 release, the 1.0 release being used within Google for a while now to test it under real-world conditions.

    Caching

    As a few sites in the world do use cache-control headers properly, it will be interesting to see how a browser created by one of the major ad-serving and search providers on the Web tracks page objects. Will it follow explicit/implicit caching rules? Or will it impose a heavy penalty on bandwidth by downloading objects more frequently than other production browsers do?

    Proxies, and the Debacle of the Google Web Accelerator

    Back in 2005, Google launched a badly designed and highly flawed product called the Google Web Accelerator. This product proxied Web traffic through the Google network and allowed the company to develop a pattern of user browsing habits and search selections that would allow them to better target their ad products.
    I have a great fear that this will be an integrated part of the Google browser project. If it is, it should be a configurable option, not an out-of-the box standard.
    I am sure that there will be a few performance conversations that occur around the Google Chrome browser in the weeks ahead. I look forward to hearing what the community has to say about this new addition to the browser wars.

  • Thoughts on the China Market

    At The China Vortex, Paul Denlinger discusses how there is no unified “China market”, no monolithic, simplistic, single-minded Goliath that the rest of the world is trying to deal with. While I do not have the depth of on-the-ground experience that Mr. Denlinger has (I have not yet been blessed with the opportunity to visit or do business in China), I can see the truth he brings to the discussion.
    One of the great pits that Western culture falls into when dealing with the China problem is just that: It is seen as a problem, not an opportunity to expand and learn from a culture that deals with life, philosophy, and business in a very different manner.
    This should come as no surprise to any astute student of History, or even modern geopolitics, as the way that nations deal with perceived threats or challenges is to create a national culture of The Other, the us-v-them foreign policy.
    When Japan was the country du jour in the 1980s, the Western World respected it, in a very shallow way, as a fellow industrial nation with a strong warrior culture. However, it was treated in a simple way, with Western media portrayals that strengthened perceived stereotypes, and plastered over the profound differences that exist within Japan, and within the Japanese people.
    China is even more of a victim of this Politics of the Other, having spent more than 50 years as one of the adversaries in the Cold War, being vilified and portrayed in the least flattering light possible. Even without the base human tendency toward simplistic interpretations of the Other, the West is crippled from the start in its attempts to understand a nation as large, diverse, and fractured as China.
    China is far more than Beijing, Shanghai, Hong Kong and a small cadre of smaller, but no less important, industrial / post-industrial metropolitan areas.
    Drawing on my experience in trying to interpret Internet performance data from within this nation, it is clear to even the casual observer that the Chinese Internet does not simply exist in the major cities. It extends into the far reaches of the country, fractured by the internal conflicts of the connectivity providers, government officials at many levels, and the unstoppable drive and creativity of the people who see the Internet as an opportunity to make their way in their world.
    Cultural and national stereotypes are the way that humans ineffectively deal with the differences that exist. But just as the terms “All Brits..”, “All the French..”, “All Germans…”, “All Argentinians..”, et al. should be treated with disdain and seen as a sign of ignorance, using the words “All Chinese…” or “All of China…” should be quickly quashed and carted off to the dustbin of simplistic paranoia and xenophobia.
    There is no such thing as a threat. As it is often stated in other contexts, a threat is simply an opportunity that is hidden by your own prejudices.

  • A US Presidential Election Survey…for Immigrants and Visa-Holders

    This is a poll designed for those of us who are here legally, but who cannot influence the outcome of this election which will affect us so profoundly. Tell us here at Newest Industry what scares you the most.

    [poll id=”4″]