open data: for all or the few?

On Twitter Jeni Tennison asked:

Question: aside from personally identifiable data, is there any data that *should not* be open?  @JenT 11:19 AM – 14 Jul 12

This sparked a Twitter discussion about limits to openness: exposure of undercover agents, information about critical services that could be exploited by terrorists, etc.   My own answer was:

maybe all data should be open when all have equal ability to use it & those who can (e.g. Google) make *all* processed data open too   @alanjohndix 11:34 AM – 14 Jul 12

That is, it is not clear that just because data is open to all, it can be used equally by everyone.  In particular it will tend to be the powerful (governments and global companies) who have the computational facilities and expertise to exploit openly available data.

In India, statistics about the use of the country's own open government data [1] showed that the majority of access to the data was by well-off males over the age of 50 (oops, that may include me!) – hardly a cross section of society. At a global scale Google makes extensive use of open data (and in some cases, such as orphaned works or screen-scraped sites, seeks to make non-open works open), but, quite understandably for a profit-making company, Google regards the amalgamated resources as commercially sensitive, definitely not open.

Open data has great potential to empower communities and individuals and to strengthen democracy [2]. However, we need to ensure that this potential is realised, and to develop the tools and education that truly make this resource available to all [3]. If not, then open data, like unregulated open markets, will simply serve to strengthen the powerful and dis-empower the weak.

  1. I had a reference to this at one point, but can’t locate it. Does anyone else have the source for this?
  2. For example, see my post last year “Private schools and open data” about the way Rob Cowen @bobbiecowman used UK government data to refute the government’s own education claims.
  3. In fact there are a variety of projects and activities that work in this area: hackathons; data analysis and visualisation websites such as IBM Many Eyes; data journalism such as Guardian Datablog; and some government and international agencies that go beyond simply publishing data and offer tools to help users interpret it (I recall Enrico Bertini worked on this with one of the UN bodies some years ago). Indeed there will be some interesting data for mashing at the next Tiree Tech Wave in the autumn.

  1. Thanks Barry, it was not the article I had seen before, which had been more plain stats, but the one you reference is really interesting as Tom Slee gives a concrete example of how the exploitation of open data can pan out in practice.

    It reminds me a little of the classic ‘success’ story for small-scale technology in developing countries, reported initially in a paper by Jensen and often repeated. Fishing boats in Kerala, India started to use mobile phones to allow them to come to land at the ports with the highest fish prices. The fishing boats were shown to earn more on average (marginally), but much more significantly the price in each market became more stable. Classic economics says that price stability is a sign of a good market; however, in a paper with Sri Subramanian, we point out that this increased profit must come from somewhere, as the efficiency of the fishing was not obviously improved. Although we could not be sure, it seems likely that the increased profitability of the fishing boats was at the expense of the poorest of the consumers, and moreover that richer consumers benefited more – a net transfer from poor to rich 🙁

  2. After writing this I read Enrico Bertini’s recent blog post “What Is Progress In Visualization?”

    This touches on a number of areas, but in particular notes that “Progress As Education and Adoption” is maybe “the most neglected kind of progress”. It is precisely this and other forms of education that are essential to create a data-literate society that is able to benefit from, rather than be drowned by, the open data revolution.

