Category: GrabIP

  • IP Registry Statistics – August 2007

    My system has a daily job to collect and aggregate the IP Blocks distributed by the five registrars into a single database, and then provide high-level WHOIS information for this data. If you want to try this yourself, the interface here.
    On an extremely irregular basis, I aggregate the statistics from this data, and present it to the masses for the examination. I might actually automate this data someday!So, for August 2007 (as of August 21, 2007), here are the aggregated IP distribution statistics broken down by registrar and country.

    Starting with the country data, I figured that a sane cutoff for values would be 1,000,000 IP addresses. The list here is mostly unchanged, except for the addition of the European Union (EU). The EU is a recent addition, as it was not listed in the Country Code lookup table I had been using, and had to be added manually on August 20, 2007.

    (more…)
  • Geographic IP database using PERL, PHP and MySQL — UPDATE: September 16 2008

    Updated September 16 2008 to reflect the numerous changes that have resulted since the original article was posted in 2005 – smp


    Targeting Web site content to the specific visitors who view the site is a very important marketing advantage. Being able to track incoming visitors by the country that they originate from is an additional item that can assist companies in ensuring that visitors are presented with relevant content. This may seem like a daunting task, but it can be achieved with a high degree of accuracy using publicly accessible data, and Open Source software.

    IP to Country Mapping

    The idea for IP to Country mapping is one that has started to appear more frequently on the Internet in recent months. All GeoIP systems do warn users that they are not 100% accurate. The accuracy of GeoIP mapping can be affected by things such as large corporate or ISP networks where traffic is routed out a small number of public access points, regardless of the traffic’s point of origin.

    Making IP Addresses Searchable

    The first issue that needs to be addressed is how to determine if an IP Address is in one of the ranges that is defined as originating from a distinct country. The simplest way to range-match IP Addresses is to abandon the dotted-quad notation we are all familiar with, and convert the IP Address to an IP Number.
    All IP Addresses can be converted into decimal numbers that fall into a known range between 0 and (2^32)-1 (4294967295). In reality, the range is even smaller than that, as public IP Addresses fall between 0.0.0.0 (IP Number: 0) and 223.255.255.255 (IP Number: 3758096383).

    A quick search of the Web showed me that there is a way to create functions to convert IP Addresses to their numerical (IP Number) equivalent and reverse the process.

    sub ip2long {
    	return sprintf("%u",unpack("l*", pack("l*", unpack("N*", inet_aton(shift)))));
    }
    sub long2ip {
    	return inet_ntoa(pack("N*", shift));
    }
    <--snip-->
    if ($start_ip =~ /\d+\.\d+\.\d+\.\d+/) {
             ## my $ip_address = shift($start_ip);
             chomp($start_ip);
             $long_start = ip2long($start_ip);
             ## print "$ip_address converts to $ip_number\n";
     }

    If you are using PHP in your applications, this conversion process is made even easier by native function calls.

    Convert IP Address to IP Number
    $ip_number = sprintf("%u",ip2long($ip_address));

    IP Address Location Data

    Now that we have settled on a format for the IP data to be used in the database, we now have to find IP data that allows us to map IP Addresses to countries. This is easier than it sounds, as this data is centrally held by the 5 regional IP Registries — ARIN, RIPE, APNIC, LACNIC, and AFRINIC. After poking around in the depths of their Web sites, I found that they actually provide text formatted versions of the allocated and assigned IP ranges that they are responsible for. All of the registries use the same format, which makes parsing the these files a simple process.

    As of March, 2005, the distribution of IPV4 networks in the database by registry is:

    +----------+--------------+
    | registry | num_networks |
    +----------+--------------+
    | arin     |        36073 |
    | ripencc  |        14813 |
    | apnic    |        10474 |
    | lacnic   |         1460 |
    | afrinic  |          443 |
    +----------+--------------+
    5 rows in set (0.52 sec)

    I chose to use the PERL module WWW::CURL to retrieve the files. You could re-write the application to use LWP or some other method on systems where cURL is not supported, as it is a simple file download over FTP. I update the data once a day, which may at first appear excessive. However, I have seen upwards of 40-50 new rows added to the database in a single day.

    Some may ask why I chose to write the downloaded files to a file rather than immediately inserting them into the database. Using this two-step process gives me the ability to manually rollback to an older database if there is a problem retrieving one of the registry files. I have set an arbitrary limit of 75,000 lines for the entire aggregated file; if the file is less than that, the remainder of the process is aborted.

    The data retrieved from the registries is in the following format.
    Registry Raw Data Format

    <snip>
    apnic|CN|ipv4|202.127.4.0|256|19950610|assigned
    apnic|BN|ipv4|202.160.0.0|2048|19950610|allocated
    apnic|NP|asn|4613|1|19950611|allocated
    apnic|LK|ipv4|203.143.0.0|1024|19950612|allocated
    apnic|MO|asn|4609|1|19950615|allocated
    apnic|KR|asn|4670|1|19950616|allocated
    apnic|SB|ipv4|202.63.254.0|512|19950618|assigned
    apnic|JP|ipv4|202.232.0.0|262144|19950618|allocated
    apnic|SG|ipv4|203.127.192.0|8192|19950618|allocated
    apnic|PK|asn|4615|1|19950629|allocated
    apnic|HK|asn|4614|1|19950704|allocated
    </snip>

    The fields are all “|” (pipe-character) separated, and are described below.

    COLUMN		VALUES
    ---------------------------------------------------------------------
    REGISTRY:	apnic,arin,ripencc,lacnic,iana
    COUNTRY_CODE:	One of 240 unique 2-character country codes or "*"
    ADDRESS_TYPE:	asn,ipv4,ipv6
    ADDRESS:	Either the starting IP Address or AS Number or "*"
    NUMBER:		Number of IPs in range or "1" if ADDRESS_TYPE is "asn"
    DATE:		Date IP range or AS Number was added to database or "*"
    RANGE_TYPE:	"allocated" -> borrowed; "assigned" -> owned

    Storing the Data in MySQL

    To store the data, I created a two-table MySQL database named “ip_registry”, using the script below.

    Database creation statement for ‘ip_registry’

    CREATE DATABASE ip_registry;

    Table structure for table ‘country_code’

    CREATE TABLE ip_registry.country_code (
    code char(2) default NULL,
    country varchar(50) default NULL,
    UNIQUE KEY code (code)
    );

    Table structure for table ‘ip_map’

    CREATE TABLE ip_registry.ip_map (
    code char(2) default NULL,
    registry char(10) default NULL,
    ip_from double default NULL,
    ip_to double default NULL,
    UNIQUE KEY registry (registry,ip_from,ip_to)
    );

    As of March 2005 September 2008, the “ip_map” data table for my system runs to 63,263 88,000+ rows. This value will change daily, and may decrease suddenly at times. The registries make an effort to aggregate as many IP networks as possible into the largest possible contiguous block, and this aggregation process will reduce the number of individual entries by 2,000 – 3,000 rows in a single day.

    The recognized standard for country codes is ISO 3166. In this standard, each nation is assigned a unique, two-character code. The ONLY exception I found to this rule is that, for historical reasons, the IP registries have entries for the United Kingdom listed with two country codes (GB and UK). I could have corrected this in the Perl script by standardizing on a single country code, but I preferred the solution of adding another row to the “country_code” table.

    From the raw Registry data, I determined that only four of the fields useful for the project that I was working on: REGISTRY, COUNTRY_CODE, ADDRESS, and NUMBER. I then wrote PERL code to read the raw IP Registry data from the data file I created previously, convert the starting IP address to a number, use this starting IP Number that to generate the end IP Number, and then insert the rows into a database.

    PERL: IP Number conversion and database insert

    <--snip-->
    if ($line_count >= 115000) {
    ## Establish Database Connection
    print "\n\nOpening database connection";
    my $dbh = DBI->connect("DBI:mysql:host=[host];database=logger","[username]","[password]",{PrintError=>0});
    ## Remove existing values
    my $sth = $dbh->do("TRUNCATE TABLE ip_map");
    print " --> Data from ip_map table dropped";
    $sth = $dbh->prepare("INSERT into ip_map values (?,?,?,?,?,?)");
    $count = 0;
    print " --> Completed\n";
    ## Insert Data Into Database
    print "Inserting data into the database";
    open (PROCESS, "<$file");
      while ($line =
    ) {
            chomp ($line);
            if (($line =~ m/\|ipv4\|/) and ($line !~ m/\|\*\|/)) {
                    ($registrar,$country_code,$item_type,$start_ip,$num_ip,$entry_date,$registry_type) = split(/\|/, $line);
                    $long_start = 0;
                    if ($start_ip =~ /\d+\.\d+\.\d+\.\d+/) {
                            ## my $ip_address = shift($start_ip);
                            chomp($start_ip);
                            $long_start = ip2long($start_ip);
                            ## print "$ip_address converts to $ip_number\n";
                            $long_end = $long_start + ($num_ip-1);
                            $count += $sth->execute($country_code,$registrar,$long_start,$long_end,$num_ip,$start_ip);
                    }
            }
      }
    }
    <--snip-->

    The “TRUNCATE” statement in the script has the affect of dropping the table and re-creating it using the column names and types defined in the initial create statement. It is easier to rebuild the data table each time new data is inserted to ensure that duplicates and overlaps do not enter into the database.
    Why is the value of “$long_end” defined by “$long_start + ($num_ip-1)”? The IP address ranges delivered by the registries count the starting value as one of the items in the set — i.e. counting using ordinal numbers.

    START_IP:   12.236.236.0
    END_IP:     12.236.236.255
    NUMBER_IP:  256

    If cardinal numbering is used to calculate the address range, incorrect IP addresses will be generated.

    IP Number Calculations
    WRONG!	216853760 = 216853504 + 256 	-> END_IP = 12.236.237.0
    RIGHT!	216853759 = 216853504 + (256-1)	-> END_IP = 12.236.236.255

    I have also added a sanity-check that stops the insertion process if the number of lines in the data file is less than 75,0000 115,000. This would prevents the creation of a truncated database if one of the registries does not update their data files or the script is unable to retrieve the data files. The value of 75,000 115,000 appears high, but the data files that the data is extracted from also contain autonomous system and IPV6 data, as well as the IPV4 data that is inserted into the database. Currently, the data file runs to over 90,000 130,000 lines, so the 75,000 115,000 line barrier seems very reasonable to prevent inserting a broken dataset.

    Querying the Database

    Now that the database is constructed, we can start to run queries against it.

    mysql> select ip.code,ip.registry, ip.ip_from, ip.ip_to, co.country
    -> from ip_map ip, country_code co
    -> where (ip.code = 'IS') and (ip.code = co.code);
    +------+----------+------------+------------+---------+
    | code | registry | ip_from    | ip_to      | country |
    +------+----------+------------+------------+---------+
    | IS   | ripencc  | 1049722880 | 1049731071 | ICELAND |
    | IS   | ripencc  | 1359937536 | 1359970303 | ICELAND |
    | IS   | ripencc  | 1383088128 | 1383096319 | ICELAND |
    | IS   | ripencc  | 1385447424 | 1385455615 | ICELAND |
    | IS   | ripencc  | 1390215168 | 1390280703 | ICELAND |
    | IS   | ripencc  | 1403846656 | 1403863039 | ICELAND |
    | IS   | ripencc  | 1433681920 | 1433690111 | ICELAND |
    | IS   | ripencc  | 1439023104 | 1439039487 | ICELAND |
    | IS   | ripencc  | 1440481280 | 1440514047 | ICELAND |
    | IS   | ripencc  | 2644312064 | 2644377599 | ICELAND |
    | IS   | ripencc  | 3238264832 | 3238330367 | ICELAND |
    | IS   | ripencc  | 3245150208 | 3245154303 | ICELAND |
    | IS   | ripencc  | 3261718528 | 3261726719 | ICELAND |
    | IS   | ripencc  | 3264217088 | 3264282623 | ICELAND |
    | IS   | ripencc  | 3556884480 | 3556892671 | ICELAND |
    | IS   | ripencc  | 3558785024 | 3558793215 | ICELAND |
    | IS   | ripencc  | 3565084672 | 3565092863 | ICELAND |
    | IS   | ripencc  | 3584524288 | 3584532479 | ICELAND |
    | IS   | ripencc  | 3585114112 | 3585122303 | ICELAND |
    | IS   | ripencc  | 3585433600 | 3585441791 | ICELAND |
    | IS   | ripencc  | 3586023424 | 3586031615 | ICELAND |
    | IS   | ripencc  | 3587538944 | 3587547135 | ICELAND |
    | IS   | ripencc  | 3587981312 | 3587997695 | ICELAND |
    | IS   | ripencc  | 3641278464 | 3641282559 | ICELAND |
    | IS   | ripencc  | 3642535936 | 3642540031 | ICELAND |
    | IS   | ripencc  | 3650592768 | 3650596863 | ICELAND |
    | IS   | ripencc  | 3650596864 | 3650600959 | ICELAND |
    | IS   | ripencc  | 3651915776 | 3651919871 | ICELAND |
    +------+----------+------------+------------+---------+
    28 rows in set (0.09 sec)

    So the database structure is sound. It is important to build the file using all four five registries; even though Iceland is now covered by RIPE, older IP allocations and assignments were been handled by both RIPE and ARIN.

    Having the registry information helps build in the flexibility to add a WHOIS functionality using this database, something that I have done on for my own IP Tracking and logging tool. This allows for further drilldowns on the data, beyond the scope of this article.

    The main item that will be of interest to most Web site administrators is that they can now build dynamic pages using a data source which tracks their visitors’ announced IP address to the country of origin with a high degree of accuracy. This is particulary useful if you are attempting to distribute users to geographically diverse mirror sites. You can also do fun things, such as displaying the flag of the country that the visitor is coming from.

    A generic copy of the registry retrieval and database insertion script can be yours just by contacting me.

  • GrabIP: This week’s IP Registrar Stats

    Here are the IP Block and Address allocation by registrar, as well as the number of officially allocated IP Addresses in each country (with 65536 Hosts or more) as of August 17, 2005 00:30 UTC.

    This data has been extracted from the GrabIP database


    IP REGISTRAR BREAKDOWN

    registry   NUM_HOSTS  NUM_BLOCK
    --------  ----------  ---------
    arin      1481414400      36355
    ripencc    361039644      18864
    apnic      285163264      11402
    lacnic      38433536       1534
    afrinic     10534912        473

    NUMBER OF ALLOCATED IP ADDRESSES BY COUNTRY (65536 Hosts or more)**

    COUNTRY                                     COUNTRY_CODE   NUM_HOSTS  NUM_BLOCKS
    ------------------------------------------  ------------  ----------  ----------
    UNITED STATES                               US            1313949696       31216
    JAPAN                                       JP             140817920        1502
    EUROPEAN UNION                              EU             113810588        4595
    CHINA                                       CN              69473536         871
    CANADA                                      CA              65890048        4921
    GERMANY                                     DE              50366576        1651
    FRANCE                                      FR              41834624        1188
    UNITED KINGDOM                              UK              39048024        1630
    KOREA, REPUBLIC OF                          KR              37513472         350
    UNITED KINGDOM                              GB              33562368          25
    AUSTRALIA                                   AU              26410240        4743
    NETHERLANDS                                 NL              16426144         995
    SPAIN                                       ES              16301856         417
    ITALY                                       IT              16213856         728
    TAIWAN, PROVINCE OF CHINA                   TW              15157248         322
    BRAZIL                                      BR              15075072         261
    SWEDEN                                      SE              12033664         526
    MEXICO                                      MX              11030528         154
    RUSSIAN FEDERATION                          RU              10389536        1150
    SOUTH AFRICA                                ZA               8184064         251
    POLAND                                      PL               7318686         704
    FINLAND                                     FI               7100800         260
    SWITZERLAND                                 CH               6875584         754
    DENMARK                                     DK               6839072         274
    HONG KONG                                   HK               6124800         534
    INDIA                                       IN               5891072         352
    AUSTRIA                                     AT               5667232         383
    NORWAY                                      NO               5054240         242
    BELGIUM                                     BE               4309760         290
    NEW ZEALAND                                 NZ               4274944        1126
    TURKEY                                      TR               3664064         167
    CHILE                                       CL               3002112         314
    ARGENTINA                                   AR               2923520         237
    IRELAND                                     IE               2903200         114
    THAILAND                                    TH               2853376         295
    CZECH REPUBLIC                              CZ               2804480         186
    ROMANIA                                     RO               2764160         274
    ISRAEL                                      IL               2726272         118
    SINGAPORE                                   SG               2677760         304
    PORTUGAL                                    PT               2604864          96
    MALAYSIA                                    MY               2569472         191
    HUNGARY                                     HU               2220288         131
    GREECE                                      GR               1766400         161
    VENEZUELA                                   VE               1753088          79
    UKRAINE                                     UA               1354458         618
    BULGARIA                                    BG               1352192         191
    EGYPT                                       EG               1280768          55
    INDONESIA                                   ID               1267200         224
    COSTA RICA                                  CR               1254144          12
    PHILIPPINES                                 PH               1136384         186
    SLOVAKIA                                    SK                975872          88
    UNITED ARAB EMIRATES                        AE                964608          14
    SAUDI ARABIA                                SA                902400          79
    COLOMBIA                                    CO                864768         102
    LATVIA                                      LV                798528         101
    LITHUANIA                                   LT                788352          66
    IRAN, ISLAMIC REPUBLIC OF                   IR                778752          52
    SLOVENIA                                    SI                667392         100
    PERU                                        PE                646656          58
    ESTONIA                                     EE                645376          44
    VIET NAM                                    VN                625664          38
    ICELAND                                     IS                507904          28
    PANAMA                                      PA                467712          30
    YUGOSLAVIA                                  YU                347392          35
    PAKISTAN                                    PK                345600          77
    CROATIA                                     HR                338240          32
    LUXEMBOURG                                  LU                307968          50
    KUWAIT                                      KW                297728          24
    CYPRUS                                      CY                274720          58
    BOLIVIA                                     BO                263424          20
    KAZAKHSTAN                                  KZ                242688          27
    ECUADOR                                     EC                242432         125
    MALTA                                       MT                240384          31
    MOROCCO                                     MA                233728           9
    PUERTO RICO                                 PR                218880          29
    URUGUAY                                     UY                206592          25
    JORDAN                                      JO                191744          29
    BANGLADESH                                  BD                178688          52
    QATAR                                       QA                176128           6
    PALESTINIAN TERRITORY, OCCUPIED             PS                173824          17
    BRUNEI DARUSSALAM                           BN                172544          11
    SRI LANKA                                   LK                167936          30
    OMAN                                        OM                163840           5
    NIGERIA                                     NG                159744          24
    EL SALVADOR                                 SV                156416          12
    GUATEMALA                                   GT                145152          19
    MACEDONIA, THE FORMER YUGOSLAV REPUBLIC OF  MK                143360          11
    DOMINICAN REPUBLIC                          DO                128768          21
    MACAO                                       MO                127232          15
    BOSNIA AND HERZEGOVINA                      BA                124928          23
    BELARUS                                     BY                119552          16
    GEORGIA                                     GE                118784          18
    LEBANON                                     LB                112640          16
    KENYA                                       KE                108800          23
    TUNISIA                                     TN                106752           4
    CUBA                                        CU                 95232          12
    AZERBAIJAN                                  AZ                 92160          11
    ALGERIA                                     DZ                 90880          16
    FIJI                                        FJ                 88832          10
    ARMENIA                                     AM                 86016          10
    NICARAGUA                                   NI                 81152           9
    UZBEKISTAN                                  UZ                 77824           9
    GHANA                                       GH                 74496          13
    BERMUDA                                     BM                 66048          18
    KYRGYZSTAN                                  KG                 65536           9
    

    I will try and post these weekly.

    **Please note that the total number of IP Addresses for the United Kingdom is the sum of totals for the two unique country codes used when registering IP Blocks — UK and GB.

  • IP-Country Database Information Offline Permanently

    I have been getting a great deal of interest in the IP-to-Country data that I have been working on over the last few years. However, I have had to take this data down from my site.

    1. The bandwidth costs are starting to become noticeable
    2. My old hardware is starting to creak under the weight
    3. There are commercial sources for this data, and frankly, I say more power to them if they can make money of this; I sure haven’t

    It’s been a depressing week here in the Newest Industry Factory, and we are starting to have to dim the lights in some corners, turn off some of the machines, and [gulp!] consider other drastic measures.

    To paraphrase the Bare Naked Ladies: “All this food ain’t free”.

    I apologize for the inconvenience, and hope that I can re-activate these pages in the future.


    UPDATE: GrabIP is now back up, but limited to 5 requests in a 24-hour period.