Public Suffix

I have written a simple PHP class to parse, store and access the Public Suffix List, including a pluggable storage interface for caching the list in a database.

It is easy to find the top-level domain (TLD) of a domain name: simply grab the bit after the last dot (e.g., ‘www.example.com’ => ‘com’).  However, for countries like the UK, it is the second-level domain, the ‘.co.uk’ or ‘.ac.uk’ bit, that acts more like the principal suffix.  The Public Suffix List is an initiative by Mozilla and other browser vendors to track these suffixes.  The full list is freely available (under GNU licence) and has a simple, well-documented text file format (pretty much just a list!).
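To make the difference concrete, here is a minimal sketch of both approaches. The function names (naiveTld, parseSuffixRules, publicSuffix) are made up for this illustration and are not the ManageSLD API. The rules are a hand-picked sample in the list's syntax rather than the real list, and wildcards are only handled as the leftmost label of a rule.

```php
<?php
// Sketch only: these function names are hypothetical, not the ManageSLD API.

// Naive approach: the label after the last dot.
function naiveTld(string $domain): string
{
    $labels = explode('.', strtolower(trim($domain, '.')));
    return end($labels);
}

// Parse text in Public Suffix List syntax into a set of rules.
// Blank lines and lines starting with '//' are skipped; each rule is
// the text up to the first whitespace on its line.
function parseSuffixRules(string $text): array
{
    $rules = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $rule = preg_split('/\s+/', trim($line), 2)[0];
        if ($rule === '' || strncmp($rule, '//', 2) === 0) {
            continue;
        }
        $rules[strtolower($rule)] = true;
    }
    return $rules;
}

// Find the public suffix of $domain. Exception rules ('!') win outright,
// otherwise the longest matching rule wins; with no match, the last label
// is used. Wildcards are only handled as the leftmost label of a rule.
function publicSuffix(string $domain, array $rules): string
{
    $labels = explode('.', strtolower(trim($domain, '.')));
    $count = count($labels);

    // Pass 1: exception rules; the suffix is the rule minus its first label.
    for ($i = 0; $i < $count; $i++) {
        if (isset($rules['!' . implode('.', array_slice($labels, $i))])) {
            return implode('.', array_slice($labels, $i + 1));
        }
    }
    // Pass 2: exact and wildcard rules, longest candidate first.
    for ($i = 0; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        $parent = implode('.', array_slice($labels, $i + 1));
        if (isset($rules[$candidate])
            || ($parent !== '' && isset($rules['*.' . $parent]))) {
            return $candidate;
        }
    }
    return $labels[$count - 1];
}

// A hand-picked sample of rules, not the real list.
$rules = parseSuffixRules("// sample rules\ncom\nuk\nco.uk\nac.uk\n*.ck\n!www.ck\n");
```

With these sample rules, naiveTld('www.cs.bham.ac.uk') gives ‘uk’, whereas publicSuffix('www.cs.bham.ac.uk', $rules) gives ‘ac.uk’.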

The parsing/access code has its own microsite where you can try out a live demonstrator to check a domain against the list, and download the source code.  The microsite also includes code examples and documentation.

Here’s some example code that creates an instance of the ManageSLD class, attaches a database store to it, and checks whether the list has been re-read within the last hour.  If it has not, the code re-reads the list into the database.  Finally, it checks a few domain names against the list.  This is pretty much the code that sits behind the Check a Domain demonstration page.

require LIB_DIR . 'manage_sld.class.php';
define( 'CACHE_REFRESH_PERIOD', 3600 );  // refresh once an hour

// setup store
$store = new NosqliteRuleStoreSLD( $nosqlite ); // or other user defined store
$manageSLD = new ManageSLD($store);

// get meta information (when last read)
$meta = $store->getMeta();
$now = time();
// use the cached copy only while it is younger than the refresh period
$usecache = ( $meta && ($now - $meta['timestamp']) < CACHE_REFRESH_PERIOD );

if ( ! $usecache ) {
    // if necessary re-read from public list
    $manageSLD->parseToStore();
    $parseOK = $manageSLD->parseFile();
    if ( $parseOK ) $store->save($manageSLD);
}

// check domains against the list

list( $sld, $label, $rest, $registerable, $pattern, $flags )
                  =  $manageSLD->lookup( 'www.talis.com' );
// returns ( 'com', 'talis', 'www', 'talis.com', 'com', 0 )

list( $sld, $label, $rest, $registerable, $pattern, $flags )
                  =  $manageSLD->lookup( 'www.cs.bham.ac.uk' );
// returns ( 'ac.uk', 'bham', 'www.cs', 'bham.ac.uk', 'ac.uk', 0 )
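
The cache freshness test in the example is easy to get backwards, so it may be worth isolating it in a small function. This is a minimal sketch: isCacheFresh is a hypothetical helper, not part of the ManageSLD class, and it only assumes that the meta information is an array with a 'timestamp' entry, as in the example above.

```php
<?php
// Hypothetical helper, not part of ManageSLD. $meta is assumed to look
// like the value returned by $store->getMeta() in the example above:
// an array with a 'timestamp' entry, or an empty value if nothing is stored.
function isCacheFresh($meta, int $now, int $maxAge): bool
{
    if (!is_array($meta) || !isset($meta['timestamp'])) {
        return false; // nothing stored yet, so the list must be read
    }
    // fresh while the stored copy is younger than the refresh period
    return ($now - (int) $meta['timestamp']) < $maxAge;
}
```

The example could then use $usecache = isCacheFresh( $meta, time(), CACHE_REFRESH_PERIOD ); in place of the inline comparison.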