using the Public Suffix list

On a number of occasions I have wanted to decompose domain names, for example in the URL recogniser in Snip!t.  However, one problem has always been the bit at the end.  It is clear that ‘com’ and ‘ac.uk’ are the principle suffixes of ‘www.alandix.com’ and ‘www.cs.bham.ac.uk’ respectively.  However, while I know that for UK domains it is the last two components that are important (second level domains), I never knew how to work this out in general for other countries.  Happily, Mozilla and other browser vendors have an initiative called the Public Suffix List , which provides a list of just these important critical second level (and deeper level) suffixes.

I recently found I needed this again as part of my Talis research.  There is a Ruby library and a Java sourceforge project for reading the Public Suffix list, and an implementation by the DKIM Reputation project, that transforms the list into generated tables for C, PHP and Perl.  However, nothing for easily and automatically maintaining access to the list.  So I have written a small PHP class to parse, store and access the Public Suffix list. There is an example in the public suffix section of the ‘code’ pages in this blog, and it also has its own microsite including more examples, documentation and a live demo to try.

2 thoughts on “using the Public Suffix list

Leave a Comment

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>