Category: GrabIP

IP Registry Statistics – August 2007

My system has a daily job to collect and aggregate the IP Blocks distributed by the five registrars into a single database, and then provide high-level WHOIS information for this data. If you want to try this yourself, the interface here.
On an extremely irregular basis, I aggregate the statistics from this data, and present it to the masses for the examination. I might actually automate this data someday!So, for August 2007 (as of August 21, 2007), here are the aggregated IP distribution statistics broken down by registrar and country.

Starting with the country data, I figured that a sane cutoff for values would be 1,000,000 IP addresses. The list here is mostly unchanged, except for the addition of the European Union (EU). The EU is a recent addition, as it was not listed in the Country Code lookup table I had been using, and had to be added manually on August 20, 2007.

Continue reading

Geographic IP database using PERL, PHP and MySQL — UPDATE: September 16 2008

Updated September 16 2008 to reflect the numerous changes that have resulted since the original article was posted in 2005 – smp


Targeting Web site content to the specific visitors who view the site is a very important marketing advantage. Being able to track incoming visitors by the country that they originate from is an additional item that can assist companies in ensuring that visitors are presented with relevant content. This may seem like a daunting task, but it can be achieved with a high degree of accuracy using publicly accessible data, and Open Source software.

IP to Country Mapping

The idea for IP to Country mapping is one that has started to appear more frequently on the Internet in recent months. All GeoIP systems do warn users that they are not 100% accurate. The accuracy of GeoIP mapping can be affected by things such as large corporate or ISP networks where traffic is routed out a small number of public access points, regardless of the traffic’s point of origin.

Making IP Addresses Searchable

The first issue that needs to be addressed is how to determine if an IP Address is in one of the ranges that is defined as originating from a distinct country. The simplest way to range-match IP Addresses is to abandon the dotted-quad notation we are all familiar with, and convert the IP Address to an IP Number.
All IP Addresses can be converted into decimal numbers that fall into a known range between 0 and (2^32)-1 (4294967295). In reality, the range is even smaller than that, as public IP Addresses fall between 0.0.0.0 (IP Number: 0) and 223.255.255.255 (IP Number: 3758096383).

A quick search of the Web showed me that there is a way to create functions to convert IP Addresses to their numerical (IP Number) equivalent and reverse the process.

sub ip2long {
	return sprintf("%u",unpack("l*", pack("l*", unpack("N*", inet_aton(shift)))));
}
sub long2ip {
	return inet_ntoa(pack("N*", shift));
}
<--snip-->
if ($start_ip =~ /\d+\.\d+\.\d+\.\d+/) {
         ## my $ip_address = shift($start_ip);
         chomp($start_ip);
         $long_start = ip2long($start_ip);
         ## print "$ip_address converts to $ip_number\n";
 }

If you are using PHP in your applications, this conversion process is made even easier by native function calls.

Convert IP Address to IP Number
$ip_number = sprintf("%u",ip2long($ip_address));

IP Address Location Data

Now that we have settled on a format for the IP data to be used in the database, we now have to find IP data that allows us to map IP Addresses to countries. This is easier than it sounds, as this data is centrally held by the 5 regional IP Registries — ARIN, RIPE, APNIC, LACNIC, and AFRINIC. After poking around in the depths of their Web sites, I found that they actually provide text formatted versions of the allocated and assigned IP ranges that they are responsible for. All of the registries use the same format, which makes parsing the these files a simple process.

As of March, 2005, the distribution of IPV4 networks in the database by registry is:

+----------+--------------+
| registry | num_networks |
+----------+--------------+
| arin     |        36073 |
| ripencc  |        14813 |
| apnic    |        10474 |
| lacnic   |         1460 |
| afrinic  |          443 |
+----------+--------------+
5 rows in set (0.52 sec)

I chose to use the PERL module WWW::CURL to retrieve the files. You could re-write the application to use LWP or some other method on systems where cURL is not supported, as it is a simple file download over FTP. I update the data once a day, which may at first appear excessive. However, I have seen upwards of 40-50 new rows added to the database in a single day.

Some may ask why I chose to write the downloaded files to a file rather than immediately inserting them into the database. Using this two-step process gives me the ability to manually rollback to an older database if there is a problem retrieving one of the registry files. I have set an arbitrary limit of 75,000 lines for the entire aggregated file; if the file is less than that, the remainder of the process is aborted.

The data retrieved from the registries is in the following format.
Registry Raw Data Format

<snip>
apnic|CN|ipv4|202.127.4.0|256|19950610|assigned
apnic|BN|ipv4|202.160.0.0|2048|19950610|allocated
apnic|NP|asn|4613|1|19950611|allocated
apnic|LK|ipv4|203.143.0.0|1024|19950612|allocated
apnic|MO|asn|4609|1|19950615|allocated
apnic|KR|asn|4670|1|19950616|allocated
apnic|SB|ipv4|202.63.254.0|512|19950618|assigned
apnic|JP|ipv4|202.232.0.0|262144|19950618|allocated
apnic|SG|ipv4|203.127.192.0|8192|19950618|allocated
apnic|PK|asn|4615|1|19950629|allocated
apnic|HK|asn|4614|1|19950704|allocated
</snip>

The fields are all “|” (pipe-character) separated, and are described below.

COLUMN		VALUES
---------------------------------------------------------------------
REGISTRY:	apnic,arin,ripencc,lacnic,iana
COUNTRY_CODE:	One of 240 unique 2-character country codes or "*"
ADDRESS_TYPE:	asn,ipv4,ipv6
ADDRESS:	Either the starting IP Address or AS Number or "*"
NUMBER:		Number of IPs in range or "1" if ADDRESS_TYPE is "asn"
DATE:		Date IP range or AS Number was added to database or "*"
RANGE_TYPE:	"allocated" -> borrowed; "assigned" -> owned

Storing the Data in MySQL

To store the data, I created a two-table MySQL database named “ip_registry”, using the script below.

Database creation statement for ‘ip_registry’

CREATE DATABASE ip_registry;

Table structure for table ‘country_code’

CREATE TABLE ip_registry.country_code (
code char(2) default NULL,
country varchar(50) default NULL,
UNIQUE KEY code (code)
);

Table structure for table ‘ip_map’

CREATE TABLE ip_registry.ip_map (
code char(2) default NULL,
registry char(10) default NULL,
ip_from double default NULL,
ip_to double default NULL,
UNIQUE KEY registry (registry,ip_from,ip_to)
);

As of March 2005 September 2008, the “ip_map” data table for my system runs to 63,263 88,000+ rows. This value will change daily, and may decrease suddenly at times. The registries make an effort to aggregate as many IP networks as possible into the largest possible contiguous block, and this aggregation process will reduce the number of individual entries by 2,000 – 3,000 rows in a single day.

The recognized standard for country codes is ISO 3166. In this standard, each nation is assigned a unique, two-character code. The ONLY exception I found to this rule is that, for historical reasons, the IP registries have entries for the United Kingdom listed with two country codes (GB and UK). I could have corrected this in the Perl script by standardizing on a single country code, but I preferred the solution of adding another row to the “country_code” table.

From the raw Registry data, I determined that only four of the fields useful for the project that I was working on: REGISTRY, COUNTRY_CODE, ADDRESS, and NUMBER. I then wrote PERL code to read the raw IP Registry data from the data file I created previously, convert the starting IP address to a number, use this starting IP Number that to generate the end IP Number, and then insert the rows into a database.

PERL: IP Number conversion and database insert

<--snip-->
if ($line_count >= 115000) {
## Establish Database Connection
print "\n\nOpening database connection";
my $dbh = DBI->connect("DBI:mysql:host=[host];database=logger","[username]","[password]",{PrintError=>0});
## Remove existing values
my $sth = $dbh->do("TRUNCATE TABLE ip_map");
print " --> Data from ip_map table dropped";
$sth = $dbh->prepare("INSERT into ip_map values (?,?,?,?,?,?)");
$count = 0;
print " --> Completed\n";
## Insert Data Into Database
print "Inserting data into the database";
open (PROCESS, "<$file");
  while ($line =
) {
        chomp ($line);
        if (($line =~ m/\|ipv4\|/) and ($line !~ m/\|\*\|/)) {
                ($registrar,$country_code,$item_type,$start_ip,$num_ip,$entry_date,$registry_type) = split(/\|/, $line);
                $long_start = 0;
                if ($start_ip =~ /\d+\.\d+\.\d+\.\d+/) {
                        ## my $ip_address = shift($start_ip);
                        chomp($start_ip);
                        $long_start = ip2long($start_ip);
                        ## print "$ip_address converts to $ip_number\n";
                        $long_end = $long_start + ($num_ip-1);
                        $count += $sth->execute($country_code,$registrar,$long_start,$long_end,$num_ip,$start_ip);
                }
        }
  }
}
<--snip-->

The “TRUNCATE” statement in the script has the affect of dropping the table and re-creating it using the column names and types defined in the initial create statement. It is easier to rebuild the data table each time new data is inserted to ensure that duplicates and overlaps do not enter into the database.
Why is the value of “$long_end” defined by “$long_start + ($num_ip-1)”? The IP address ranges delivered by the registries count the starting value as one of the items in the set — i.e. counting using ordinal numbers.

START_IP:   12.236.236.0
END_IP:     12.236.236.255
NUMBER_IP:  256

If cardinal numbering is used to calculate the address range, incorrect IP addresses will be generated.

IP Number Calculations
WRONG!	216853760 = 216853504 + 256 	-> END_IP = 12.236.237.0
RIGHT!	216853759 = 216853504 + (256-1)	-> END_IP = 12.236.236.255

I have also added a sanity-check that stops the insertion process if the number of lines in the data file is less than 75,0000 115,000. This would prevents the creation of a truncated database if one of the registries does not update their data files or the script is unable to retrieve the data files. The value of 75,000 115,000 appears high, but the data files that the data is extracted from also contain autonomous system and IPV6 data, as well as the IPV4 data that is inserted into the database. Currently, the data file runs to over 90,000 130,000 lines, so the 75,000 115,000 line barrier seems very reasonable to prevent inserting a broken dataset.

Querying the Database

Now that the database is constructed, we can start to run queries against it.

mysql> select ip.code,ip.registry, ip.ip_from, ip.ip_to, co.country
-> from ip_map ip, country_code co
-> where (ip.code = 'IS') and (ip.code = co.code);
+------+----------+------------+------------+---------+
| code | registry | ip_from    | ip_to      | country |
+------+----------+------------+------------+---------+
| IS   | ripencc  | 1049722880 | 1049731071 | ICELAND |
| IS   | ripencc  | 1359937536 | 1359970303 | ICELAND |
| IS   | ripencc  | 1383088128 | 1383096319 | ICELAND |
| IS   | ripencc  | 1385447424 | 1385455615 | ICELAND |
| IS   | ripencc  | 1390215168 | 1390280703 | ICELAND |
| IS   | ripencc  | 1403846656 | 1403863039 | ICELAND |
| IS   | ripencc  | 1433681920 | 1433690111 | ICELAND |
| IS   | ripencc  | 1439023104 | 1439039487 | ICELAND |
| IS   | ripencc  | 1440481280 | 1440514047 | ICELAND |
| IS   | ripencc  | 2644312064 | 2644377599 | ICELAND |
| IS   | ripencc  | 3238264832 | 3238330367 | ICELAND |
| IS   | ripencc  | 3245150208 | 3245154303 | ICELAND |
| IS   | ripencc  | 3261718528 | 3261726719 | ICELAND |
| IS   | ripencc  | 3264217088 | 3264282623 | ICELAND |
| IS   | ripencc  | 3556884480 | 3556892671 | ICELAND |
| IS   | ripencc  | 3558785024 | 3558793215 | ICELAND |
| IS   | ripencc  | 3565084672 | 3565092863 | ICELAND |
| IS   | ripencc  | 3584524288 | 3584532479 | ICELAND |
| IS   | ripencc  | 3585114112 | 3585122303 | ICELAND |
| IS   | ripencc  | 3585433600 | 3585441791 | ICELAND |
| IS   | ripencc  | 3586023424 | 3586031615 | ICELAND |
| IS   | ripencc  | 3587538944 | 3587547135 | ICELAND |
| IS   | ripencc  | 3587981312 | 3587997695 | ICELAND |
| IS   | ripencc  | 3641278464 | 3641282559 | ICELAND |
| IS   | ripencc  | 3642535936 | 3642540031 | ICELAND |
| IS   | ripencc  | 3650592768 | 3650596863 | ICELAND |
| IS   | ripencc  | 3650596864 | 3650600959 | ICELAND |
| IS   | ripencc  | 3651915776 | 3651919871 | ICELAND |
+------+----------+------------+------------+---------+
28 rows in set (0.09 sec)

So the database structure is sound. It is important to build the file using all four five registries; even though Iceland is now covered by RIPE, older IP allocations and assignments were been handled by both RIPE and ARIN.

Having the registry information helps build in the flexibility to add a WHOIS functionality using this database, something that I have done on for my own IP Tracking and logging tool. This allows for further drilldowns on the data, beyond the scope of this article.

The main item that will be of interest to most Web site administrators is that they can now build dynamic pages using a data source which tracks their visitors’ announced IP address to the country of origin with a high degree of accuracy. This is particulary useful if you are attempting to distribute users to geographically diverse mirror sites. You can also do fun things, such as displaying the flag of the country that the visitor is coming from.

A generic copy of the registry retrieval and database insertion script can be yours just by contacting me.

GrabIP: This week’s IP Registrar Stats

Here are the IP Block and Address allocation by registrar, as well as the number of officially allocated IP Addresses in each country (with 65536 Hosts or more) as of August 17, 2005 00:30 UTC.

This data has been extracted from the GrabIP database


IP REGISTRAR BREAKDOWN

registry   NUM_HOSTS  NUM_BLOCK
--------  ----------  ---------
arin      1481414400      36355
ripencc    361039644      18864
apnic      285163264      11402
lacnic      38433536       1534
afrinic     10534912        473

NUMBER OF ALLOCATED IP ADDRESSES BY COUNTRY (65536 Hosts or more)**

COUNTRY                                     COUNTRY_CODE   NUM_HOSTS  NUM_BLOCKS
------------------------------------------  ------------  ----------  ----------
UNITED STATES                               US            1313949696       31216
JAPAN                                       JP             140817920        1502
EUROPEAN UNION                              EU             113810588        4595
CHINA                                       CN              69473536         871
CANADA                                      CA              65890048        4921
GERMANY                                     DE              50366576        1651
FRANCE                                      FR              41834624        1188
UNITED KINGDOM                              UK              39048024        1630
KOREA, REPUBLIC OF                          KR              37513472         350
UNITED KINGDOM                              GB              33562368          25
AUSTRALIA                                   AU              26410240        4743
NETHERLANDS                                 NL              16426144         995
SPAIN                                       ES              16301856         417
ITALY                                       IT              16213856         728
TAIWAN, PROVINCE OF CHINA                   TW              15157248         322
BRAZIL                                      BR              15075072         261
SWEDEN                                      SE              12033664         526
MEXICO                                      MX              11030528         154
RUSSIAN FEDERATION                          RU              10389536        1150
SOUTH AFRICA                                ZA               8184064         251
POLAND                                      PL               7318686         704
FINLAND                                     FI               7100800         260
SWITZERLAND                                 CH               6875584         754
DENMARK                                     DK               6839072         274
HONG KONG                                   HK               6124800         534
INDIA                                       IN               5891072         352
AUSTRIA                                     AT               5667232         383
NORWAY                                      NO               5054240         242
BELGIUM                                     BE               4309760         290
NEW ZEALAND                                 NZ               4274944        1126
TURKEY                                      TR               3664064         167
CHILE                                       CL               3002112         314
ARGENTINA                                   AR               2923520         237
IRELAND                                     IE               2903200         114
THAILAND                                    TH               2853376         295
CZECH REPUBLIC                              CZ               2804480         186
ROMANIA                                     RO               2764160         274
ISRAEL                                      IL               2726272         118
SINGAPORE                                   SG               2677760         304
PORTUGAL                                    PT               2604864          96
MALAYSIA                                    MY               2569472         191
HUNGARY                                     HU               2220288         131
GREECE                                      GR               1766400         161
VENEZUELA                                   VE               1753088          79
UKRAINE                                     UA               1354458         618
BULGARIA                                    BG               1352192         191
EGYPT                                       EG               1280768          55
INDONESIA                                   ID               1267200         224
COSTA RICA                                  CR               1254144          12
PHILIPPINES                                 PH               1136384         186
SLOVAKIA                                    SK                975872          88
UNITED ARAB EMIRATES                        AE                964608          14
SAUDI ARABIA                                SA                902400          79
COLOMBIA                                    CO                864768         102
LATVIA                                      LV                798528         101
LITHUANIA                                   LT                788352          66
IRAN, ISLAMIC REPUBLIC OF                   IR                778752          52
SLOVENIA                                    SI                667392         100
PERU                                        PE                646656          58
ESTONIA                                     EE                645376          44
VIET NAM                                    VN                625664          38
ICELAND                                     IS                507904          28
PANAMA                                      PA                467712          30
YUGOSLAVIA                                  YU                347392          35
PAKISTAN                                    PK                345600          77
CROATIA                                     HR                338240          32
LUXEMBOURG                                  LU                307968          50
KUWAIT                                      KW                297728          24
CYPRUS                                      CY                274720          58
BOLIVIA                                     BO                263424          20
KAZAKHSTAN                                  KZ                242688          27
ECUADOR                                     EC                242432         125
MALTA                                       MT                240384          31
MOROCCO                                     MA                233728           9
PUERTO RICO                                 PR                218880          29
URUGUAY                                     UY                206592          25
JORDAN                                      JO                191744          29
BANGLADESH                                  BD                178688          52
QATAR                                       QA                176128           6
PALESTINIAN TERRITORY, OCCUPIED             PS                173824          17
BRUNEI DARUSSALAM                           BN                172544          11
SRI LANKA                                   LK                167936          30
OMAN                                        OM                163840           5
NIGERIA                                     NG                159744          24
EL SALVADOR                                 SV                156416          12
GUATEMALA                                   GT                145152          19
MACEDONIA, THE FORMER YUGOSLAV REPUBLIC OF  MK                143360          11
DOMINICAN REPUBLIC                          DO                128768          21
MACAO                                       MO                127232          15
BOSNIA AND HERZEGOVINA                      BA                124928          23
BELARUS                                     BY                119552          16
GEORGIA                                     GE                118784          18
LEBANON                                     LB                112640          16
KENYA                                       KE                108800          23
TUNISIA                                     TN                106752           4
CUBA                                        CU                 95232          12
AZERBAIJAN                                  AZ                 92160          11
ALGERIA                                     DZ                 90880          16
FIJI                                        FJ                 88832          10
ARMENIA                                     AM                 86016          10
NICARAGUA                                   NI                 81152           9
UZBEKISTAN                                  UZ                 77824           9
GHANA                                       GH                 74496          13
BERMUDA                                     BM                 66048          18
KYRGYZSTAN                                  KG                 65536           9

I will try and post these weekly.

**Please note that the total number of IP Addresses for the United Kingdom is the sum of totals for the two unique country codes used when registering IP Blocks — UK and GB.

IP-Country Database Information Offline Permanently

I have been getting a great deal of interest in the IP-to-Country data that I have been working on over the last few years. However, I have had to take this data down from my site.

  1. The bandwidth costs are starting to become noticeable
  2. My old hardware is starting to creak under the weight
  3. There are commercial sources for this data, and frankly, I say more power to them if they can make money of this; I sure haven’t

It’s been a depressing week here in the Newest Industry Factory, and we are starting to have to dim the lights in some corners, turn off some of the machines, and [gulp!] consider other drastic measures.

To paraphrase the Bare Naked Ladies: “All this food ain’t free”.

I apologize for the inconvenience, and hope that I can re-activate these pages in the future.


UPDATE: GrabIP is now back up, but limited to 5 requests in a 24-hour period.

Copyright © 2024 Performance Zen

Theme by Anders NorenUp ↑