Tag: php

GrabPERF: What and Why

Why GrabPERF?

About four years ago, I had a bright idea that I would like to learn more about how to build and scale a small Web performance measurement platform. I’ve worked in the Web performance industry for nearly a decade now, and this was an experimental platform for me to examine and encounter many of the challenges that I see on a daily basis.

The effort was so successful and garnered enough attention during the initial blogging boom that I was able to sell the whole platform for a tiny (that is not a typo) sum to Technorati.

The name is taken from another experimental tool I wrote called GrabIT2, which uses the PHP cURL libraries to capture timings and HTML data for HTTP requests. It is an extension of my articles and writings on Web performance that started at Webperformance.org and have since moved to this blog.

What is GrabPERF?

GrabPERF is a multi-location measurement platform, based on Perl, cURL, PHP, and MySQL, that is designed to:

  • Measure the base HTML or a single-object target using HTTP or HTTPS
  • Report the data to a central database (located in the San Francisco Area)
  • Report the data using a GUI or through text based download
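
As a rough illustration of the first bullet, here is a minimal sketch of how a single-object measurement could be taken with the PHP cURL functions. It is not the GrabPERF or GrabIT2 code; the function names measure_url and summarize_timings, the 30-second timeout, and the shape of the returned array are all assumptions made for this example.

```php
<?php
// Hypothetical helper: reduce curl_getinfo() output to the fields worth storing.
// All times are cumulative seconds from the start of the request.
function summarize_timings(array $info, int $bytes, int $errno): array
{
    return [
        'http_code'    => (int) ($info['http_code'] ?? 0),
        'dns_s'        => (float) ($info['namelookup_time'] ?? 0),
        'connect_s'    => (float) ($info['connect_time'] ?? 0),
        'first_byte_s' => (float) ($info['starttransfer_time'] ?? 0),
        'total_s'      => (float) ($info['total_time'] ?? 0),
        'bytes'        => $bytes,
        'curl_errno'   => $errno,
    ];
}

// Hypothetical helper: fetch only the base object at $url (no redirects,
// no embedded objects) and return its timing summary.
function measure_url(string $url, int $timeoutSeconds = 30): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // capture the body instead of printing it
        CURLOPT_FOLLOWLOCATION => false,  // measure only the requested object
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);
    $body  = curl_exec($ch);
    $info  = curl_getinfo($ch);
    $errno = curl_errno($ch);

    return summarize_timings($info, is_string($body) ? strlen($body) : 0, $errno);
}
```

Calling measure_url('https://example.com/') from each measurement location and inserting the returned array into a central table would cover the first two bullets in spirit; the real agents are Perl-based and may work quite differently.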

Why not Full Pages with all Objects?

Reason 1: I work for a company that already does that. Lawyers and MBAs among you, do the math.

Reason 2: I am an analyst, not a programmer. The best I can say about my measurement script is that it is a hack job.

Why is the GrabPERF interface so clunky?

See reason 2 above.

If you want to write your own interface to the data, let me know.

Why has the interface not changed in nearly three years?

The current interface works. It’s simple, clean, and delivers the data that I and the regular users need to analyze performance issues. If there is something more that you would like to see, let me know!

I like what I see. How can I host a measurement location?

Just contact me, and I can provide you with a list of Perl modules you will need to install on your Linux server. In return, I need the static IP address of the machine hosting the measurement agent.

How stable is GrabPERF?

Most of the time, I forget it’s even running. I have logged onto the servers and typed in uptime and discovered that it’s been 6 months or more since the servers have been re-booted.

It was designed to be simple, because that’s all I know how to do. The lack of complexity makes it effectively self-managing.

Shouldn’t all systems be that way?

What if my question isn’t asked / answered here?

You should know the answer to this by now: contact me.

Compressing Web Output Using mod_deflate and Apache 2.0.x


A previous paper discussed the use of mod_gzip to dynamically compress the output from an Apache server. With the growing use of the Apache 2.0.x family of Web servers, the question arises of how to perform a similar GZIP-encoding function within this server. The developers of the Apache 2.0.x servers have included a module in the codebase for the server to perform just this task.

mod_deflate is included in the Apache 2.0.x source package, and compiling it in is a simple matter of adding it to the configure command.

	./configure --enable-modules=all --enable-mods-shared=all --enable-deflate

When the server is made and installed, the GZIP-encoding of documents can be enabled in one of two ways: explicit exclusion of files by extension, or explicit inclusion of files by MIME type. These methods are specified in the httpd.conf file.


Explicit Exclusion

SetOutputFilter DEFLATE
DeflateFilterNote ratio
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
SetEnvIfNoCase Request_URI \.(?:exe|t?gz|zip|bz2|sit|rar)$ no-gzip dont-vary
SetEnvIfNoCase Request_URI \.pdf$ no-gzip dont-vary

Explicit Inclusion

DeflateFilterNote ratio
AddOutputFilterByType DEFLATE text/*
AddOutputFilterByType DEFLATE application/ms* application/vnd* application/postscript

Both methods enable the automatic GZIP-encoding of all MIME-types, except image and PDF files, as they leave the server. Image files and PDF files are excluded as they are already in a highly compressed format. In fact, PDFs become unreadable by Adobe’s Acrobat Reader if they are further compressed by mod_deflate or mod_gzip.

On the server used for testing mod_deflate for this article, no Windows executables or compressed files are served to visitors. However, for safety’s sake, please ensure that compressed files and binaries are not GZIP-encoded by your Web server application.

For the file-types indicated in the exclude statements, the server is told explicitly not to send the Vary header. The Vary header indicates to any proxy or cache server which particular condition(s) will cause this response to Vary from other responses to the same request.

If a client sends a request that does not include the Accept-Encoding: gzip header, a GZIP-encoded copy stored in the cache cannot be returned to that client, because the Accept-Encoding headers do not match. The request must then be passed directly to the origin server to obtain a non-encoded version. In effect, proxy servers may store two or more copies of the same file, depending on the client request conditions which cause the server response to Vary.

Removing the Vary response requirement for objects not handled means that if the objects do not vary due to any other directives on the server (browser type, for example), then the cached object can be served up without any additional requests until the Time-To-Live (TTL) of the cached object has expired.
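
To make the caching behaviour above concrete, here is a small sketch of how a shared cache might derive its storage key from the Vary header. Real proxies normalize and store variants in their own ways; the function name cache_key and the key format are assumptions for illustration only.

```php
<?php
// Hypothetical sketch: the key a cache could use for a stored response.
// Every request header named in Vary becomes part of the key, so requests
// that differ in those headers map to different cached copies.
function cache_key(string $url, string $varyHeader, array $requestHeaders): string
{
    $headers = array_change_key_case($requestHeaders, CASE_LOWER);
    $key = $url;
    foreach (explode(',', $varyHeader) as $name) {
        $name = strtolower(trim($name));
        if ($name === '') {
            continue; // no Vary header: the URL alone identifies the object
        }
        $key .= '|' . $name . '=' . trim($headers[$name] ?? '');
    }
    return $key;
}

// A gzip-capable client and a client that sends no Accept-Encoding header.
$gzipClient  = ['Accept-Encoding' => 'gzip'];
$plainClient = [];

// With "Vary: Accept-Encoding" the two clients need separate cached copies.
var_dump(cache_key('/index.html', 'Accept-Encoding', $gzipClient)
    === cache_key('/index.html', 'Accept-Encoding', $plainClient));

// With no Vary header (the dont-vary case) one cached copy serves both.
var_dump(cache_key('/logo.png', '', $gzipClient)
    === cache_key('/logo.png', '', $plainClient));
```

The first comparison is false and the second is true, which is the reason for excluding images and PDFs from the Vary header: one cached copy can then serve every client.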

In examining the performance of mod_deflate against mod_gzip, the one item that distinguished the two modules in versions of Apache prior to 2.0.45 was the amount of compression that occurred. The examples below demonstrate that the compression algorithm for mod_gzip produces between 4% and 6% more compression than mod_deflate for the same file.[1]

Table 1 – /compress/homepage2.html

Compression                 Size           Compression %
No compression              56380 bytes    n/a
Apache 1.3.x/mod_gzip       16333 bytes    29% of original
Apache 2.0.x/mod_deflate    19898 bytes    35% of original

Table 2 – /documents/spierzchala-resume.ps

Compression                 Size           Compression %
No compression              63451 bytes    n/a
Apache 1.3.x/mod_gzip       19758 bytes    31% of original
Apache 2.0.x/mod_deflate    23407 bytes    37% of original

Attempts to increase the compression ratio of mod_deflate in Apache 2.0.44 and lower using the directives provided for this module produced no further decrease in transferred file size. A comment from one of the authors of the mod_deflate module stated that the module was written specifically to ensure that server performance was not degraded by using this compression method. The module was, by default, performing the fastest compression possible, rather than a mid-range compromise between speed and final file size.

Starting with Apache 2.0.45, the compression level of mod_deflate is configurable using the DeflateCompressionLevel directive. This directive accepts values between 1 (fastest compression speed; lowest compression ratio) and 9 (slowest compression speed; highest compression ratio), with the default value being 6. This simple change makes the compression in mod_deflate comparable to mod_gzip out of the box.
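
The speed-versus-size trade-off behind those levels can be seen outside Apache as well. The sketch below uses PHP's gzencode(), which relies on the same zlib levels 1 through 9, to report the compressed size of a sample string at a few levels. The helper name compression_report and the sample text are assumptions for this example, and the percentages will differ from what mod_deflate produces on real pages.

```php
<?php
// Hypothetical helper: compressed size at each zlib level, expressed as a
// percentage of the original size (smaller means more compression).
function compression_report(string $data, array $levels = [1, 6, 9]): array
{
    if ($data === '') {
        return []; // nothing to measure
    }
    $report = [];
    foreach ($levels as $level) {
        $compressed = gzencode($data, $level);
        $report[$level] = round(100 * strlen($compressed) / strlen($data), 1);
    }
    return $report;
}

// Repetitive HTML-like sample text; real pages compress less dramatically.
$sample = str_repeat("<p>The quick brown fox jumps over the lazy dog.</p>\n", 500);
foreach (compression_report($sample) as $level => $percent) {
    echo "level $level: $percent% of original\n";
}
```

Running the same report against a saved copy of one of your own pages is a quick way to judge whether a higher DeflateCompressionLevel is likely to be worth the extra CPU time.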

Using mod_deflate for Apache 2.0.x is a quick and effective way to decrease the size of the files that are sent to clients. Anything that can produce between 50% and 80% in bandwidth savings with so little effort should definitely be considered for any and all Apache 2.0.x deployments wishing to use the default Apache codebase.


[1] A note on the compression in mod_deflate for Apache 2.0.44 and lower: The level of compression can be modified by changing the ZLIB compression setting in mod_deflate.c from Z_BEST_SPEED (equivalent to “gzip -1”) to Z_BEST_COMPRESSION (equivalent to “gzip -9”). These defaults can also be replaced with a numeric value between 1 and 9.

More info on hacking mod_deflate for Apache 2.0.44 and lower can be found here.

Making PHP-to-MySQL Connections Persistent

I have been seeing these bursts of traffic, mainly from spambot morons, that have suddenly been crushing my server. The main cause: excessive database connections.

This was quickly remedied today when I changed all of the mysql_connect statements to mysql_pconnect statements. This allows PHP to use an existing connection to the MySQL database to serve requests from the same Apache child process.

Now the truly geeky among you are going “DOH! Wadda ya mean you were opening a new connection for every request?”. Well, believe it or not, I will bet you dollars to doughnuts that your blog app doesn’t persist database connections. Not a big deal if your database is on the same machine, and you are using local named pipes to make requests. However, if that database is located on another machine and you do a netstat, you will see a large number of connections on port 3306.

Persisting database connections is particularly important for large hosted services. A great deal of TCP overhead and kernel-space memory can be saved by simply not letting the Web server saturate the database with individual database connections for every page request.

Without persistent database connections, eventually the TCP queue will be full of database connections and no one will be able to connect to the server, or they will get a lovely “can’t connect to database” error.
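
The mysql_connect and mysql_pconnect functions belong to the old mysql extension, which was removed in PHP 7.0. On a current PHP, the same idea can be sketched with PDO's persistent-connection option. This is only an illustration: the helper names, the host, the database name, and the credentials below are placeholders I made up, not anything from my own setup.

```php
<?php
// Hypothetical helper: build the MySQL DSN for PDO.
function persistent_dsn(string $host, string $dbname, int $port = 3306): string
{
    return "mysql:host=$host;port=$port;dbname=$dbname;charset=utf8mb4";
}

// Hypothetical helper: PDO options that request a persistent connection.
function persistent_options(): array
{
    return [
        // Reuse an open connection held by this PHP process when one exists,
        // instead of opening a new TCP connection for every page request.
        PDO::ATTR_PERSISTENT => true,
        // Throw exceptions rather than failing silently.
        PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,
    ];
}

// Hypothetical helper: open (or reuse) the connection.
function open_persistent(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(persistent_dsn($host, $dbname), $user, $pass, persistent_options());
}
```

A page would then call something like open_persistent('db.example.com', 'blog', 'bloguser', 'secret') once per request. As with mysql_pconnect, each Web server process keeps its own connection, so the database has to allow at least as many connections as there are server processes.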

When the tracking site goes down…

So, what is a geek to do when SiteMeter goes down? He writes his own tracking code and embeds it on his blog page!

Really simple PHP Code:

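<?php
// The include is expected to open the database connection and set $link.
include([DATABASE CONNECTION INCLUDE]);
$logtime = date("YmdHis");
// Read request data from $_SERVER rather than register_globals variables.
$remote_addr = $_SERVER['REMOTE_ADDR'];
$ipquery = sprintf("%u", ip2long($remote_addr));

if ($remote_addr != [EXCLUDE SOME IPS]){
        // Escape the client-supplied headers before placing them in the SQL string.
        $agent = mysql_real_escape_string(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '', $link);
        $referer = mysql_real_escape_string(isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '', $link);
        $query2 = "INSERT into logger.blog_log
values ($logtime,$ipquery,'$agent','$referer')";
        mysql_query($query2, $link) or die("Log Insert Failed");
        mysql_close($link);
}

// Single quotes on the outside let the attribute's double quotes through.
print '<META HTTP-EQUIV="Refresh" CONTENT="0; URL=[IMAGE FILE]"/>';

?>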

Then I created a table in my logging database to trap the results. Once I had that, I created an IFRAME call in an MT Typelist and away we go!
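
For anyone who wants to try something similar, here is a rough sketch of what the table and the insert could look like on a current PHP with PDO and a prepared statement. I have not shown my actual table definition, so the column names and types below are assumptions, as are the helper names build_log_row and log_hit.

```php
<?php
// Hypothetical table definition; column names and types are assumptions.
const BLOG_LOG_DDL = <<<SQL
CREATE TABLE blog_log (
    logtime    DATETIME     NOT NULL,
    ip         INT UNSIGNED NOT NULL,
    user_agent VARCHAR(255) NOT NULL,
    referer    VARCHAR(255) NOT NULL
)
SQL;

// Hypothetical helper: turn the request data into one row for the log table.
function build_log_row(array $server, int $now): array
{
    return [
        'logtime'    => gmdate('YmdHis', $now),  // UTC timestamp, YYYYMMDDhhmmss
        // ip2long() returns false for anything that is not a valid IPv4
        // address; sprintf('%u', false) then stores 0.
        'ip'         => sprintf('%u', ip2long($server['REMOTE_ADDR'] ?? '')),
        'user_agent' => substr($server['HTTP_USER_AGENT'] ?? '', 0, 255),
        'referer'    => substr($server['HTTP_REFERER'] ?? '', 0, 255),
    ];
}

// Hypothetical helper: insert one row using bound parameters, so the
// client-supplied headers never become part of the SQL text.
function log_hit(PDO $db, array $row): void
{
    $stmt = $db->prepare(
        'INSERT INTO blog_log (logtime, ip, user_agent, referer)
         VALUES (:logtime, :ip, :user_agent, :referer)'
    );
    $stmt->execute($row);
}
```

On the tracking page, log_hit($db, build_log_row($_SERVER, time())) would record the visit, with $db being whatever PDO connection the database include provides.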

There is always a geeky solution to a customer service issue. If this works, I will cancel my SiteMeter subscription.

Copyright © 2024 Performance Zen
