Hit Tracking with PHP and MySQL

Recently there was an outage at a hit-tracking vendor I was using to track the hits on my externally hosted blog, leaving me with a gap in my visitor data several hours long. While this was an inconvenience for me, I realized that this could be mission critical failure to an online business reliant on this data.
To resolve this, I used the PHP HTTP environment variables and the built-in function for converting IP addresses to IP numbers to create my own hit-tracker. It is a rudimentary tracking tool, but it provides me with the basic information I need to track visitors.
To begin, I wrote a simple PHP script to insert tracking data into a MySQL database. How do you do that? You use the gd features in PHP to draw an image, and insert the data into the database.


header ("Content-type: image/png");
include("dbconnect_logger.php");
$logtime = date("YmdHis");
$ipquery = sprintf("%u",ip2long($_SERVER['REMOTE_ADDR']));
        $query2 = "INSERT into logger.blog_log values \
               ($logtime,$ipquery,'$HTTP_USER_AGENT','$HTTP_REFERER')";
        mysql_query($query2) or die("Log Insert Failed");
mysql_close($link);
$im = @ImageCreate (1, 1)
or die ("Cannot Initialize new GD image stream");
$background_color = ImageColorAllocate ($im, 224, 234, 234);
$text_color = ImageColorAllocate ($im, 233, 14, 91);
// imageline ($im,$x1,$y1,$x2,$y2,$text_color);
imageline ($im,0,0,1,2,$text_color);
imageline ($im,1,0,0,2,$text_color);
ImagePng ($im);
?>

Next, I created the database table.


DROP TABLE IF EXISTS `blog_log`;
CREATE TABLE `blog_log` (
  `date` timestamp NOT NULL default '0000-00-00 00:00:00',
  `ip_num` double NOT NULL default '0',
  `uagent` varchar(200) default NULL,
  `visited_page` varchar(200) NOT NULL default '',
  UNIQUE KEY `date` (`date`,`ip_num`,`visited_page`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

It’s done. I can now log any request I want using this embedded tracker.
Data should begin flowing to your database immediately. This sample snippet of code will allow you to pull data for a selected day and list each individual hit.


$query1 = "SELECT
                bl.ip_num,
                DATE_FORMAT(bl.date,'%d/%b/%Y %H:%i:%s') AS NEW_DATE,
                bl.uagent,
                bl.visited_page
        FROM blog_log bl
        WHERE
                DATE_FORMAT(bl.date,'%Y%m%d') ='$YMD'
		and uagent not REGEXP '(.*bot.*|.*crawl.*|.*spider.*|^-$|.*slurp.*|.*walker.*|.*lwp.*|.*teoma.*|.*aggregator.*|.*reader.*|.*libwww.*)'
        ORDER BY bl.date ASC";
print "<table border=\"1\">\n";
print "<tr><td>IP</td><td>DATE</td><td>USER-AGENT</td><td>PAGE VIEWED</td></tr>";
while ($row = mysql_fetch_array($result1)) {
        $visitor = long2ip($row[ip_num]);
        print "<tr><td>$visitor</td><td nowrap>$row[NEW_DATE]</td><td nowrap>$row[uagent]</td><td>";
	if ($row[visited_page] == ""){
    	    print " --- </td></tr>\n";
	} else {
    	    print "<a href=\"$row[visited_page]\" target=\_blank\">$row[visited_page]</a></td></tr>\n";
	}
}
mysql_close($link);

And that’s it. A few lines of code and you’re done. With a little tweaking, you can integrate the IP number data with a number of Geographic IP databases available for purchase to track by country and ISP, and using graphics applications for PHP, you can add graphs.
For my own purposes, this is an extension of the Geographic IP database I created a number of years ago. This application extracts IP address information from the five IP registrars, and inserts it into a database. Using the log data collected by the tracking bug above and the lookup capabilities of the Geographic IP database, I can quickly track which countries and ISP drive the most visitors to my site, and use this for general interest purposes, as well as the ability to isolate any malicious visitors to the site.

Categories: Uncategorized

7 Comments

  1. Im in desperate need of help!Im doing my Honours Project in web traffic analysis and it must be done in PHP. Can sessions in PHP only hold clickstreams , i.e Forward/back – and page references? Or can they hold more?For example , a button or something – to particularly evaluate usability I need to look deeper at web statistics and see page usage and what is being used on the page itself – along with a normal web-logging tool. Anyone have any ideas? or CODE!!lol 🙂 – Cant seem to find much

  2. Why do you convert the IP using ip2long – are there any benefits to it?

  3. Using ip2long is a holdover from my Geographic IP Database, where performing IP searches was far easier with a numeric value than an IP address. Most application development languages have a way to convert these numbers back to IPs quickly. As well, this conversion can occur natively in MySQL SLECET statements.You could add a column to the database table that added the raw IP address with very little change to the method described in this article.

  4. What about scalability issues? The database size will expand quickly with traffic and it adds another hit on the server with each page load.

  5. This is definitely a solution for a small web site, and was designed to be small and light.I started using this more when I was hosting my blog outside of the tracking infrastructure, so there is no increase in traffic on either infrastructure.Steps could also be taken to aggregate the data based on patterns in the data and then archive the raw results outside the DB.Many ways this can be improved upon. But it gets folks thinking.

  6. What about scalability issues? The database size will expand quickly with traffic and it adds another hit on the server with each page load.

  7. This is definitely a solution for a small web site, and was designed to be small and light.I started using this more when I was hosting my blog outside of the tracking infrastructure, so there is no increase in traffic on either infrastructure.Steps could also be taken to aggregate the data based on patterns in the data and then archive the raw results outside the DB.Many ways this can be improved upon. But it gets folks thinking.

Leave a Reply

Your email address will not be published. Required fields are marked *

15 − 9 =

Copyright © 2024 Performance Zen

Theme by Anders NorenUp ↑