PPIG2008 and the twenty first century coder

Last week I was giving a keynote at the annual workshop PPIG2008 of the Psychology of Programming Interest Group.   Before I went I was politely pronouncing this pee-pee-eye-gee … however, when I got there I found the accepted pronunciation was pee-pig … hence the logo!

My own keynote at PPIG2008 was “as we may code: the art (and craft) of computer programming in the 21st century” and was an exploration of the changes in coding from 1968 when Knuth published the first of his books on “the art of computer programming“.  On the web site for the talk I’ve made a relatively unstructured list of some of the distinctions I’ve noticed between 20th and 21st Century coding (C20 vs. C21); and in my slides I have started to add some more structure.  In general we have a move from more mathematical, analytic, problem solving approach, to something more akin to a search task, finding the right bits to fit together with a greater need for information management and social skills. Both this characterisation and the list are, of course, a gross simplification, but seem to capture some of the change of spirit.  These changes suggest different cognitive issues to be explored and maybe different personality types involved – as one of the attendees, David Greathead, pointed out, rather like the judging vs. perceiving personality distinction in Myers-Briggs1.

One interesting comment on this was from Marian Petre, who has studied many professional programmers.  Her impression, and echoed by others, was that the heavy-hitters were the more experienced programmers who had adapted to newer styles of programming, whereas  the younger programmers found it harder to adapt the other way when they hit difficult problems.  Another attendee suggested that perhaps I was focused more on application coding and that system coding and system programmers were still operating in the C20 mode.

The social nature of modern coding came out in several papers about agile methods and pair programming.  As well as being an important phenomena in its own right, pair programming gives a level of think-aloud  ‘for free’, so maybe this will also cast light on individual coding.

Margaret-Anne Storey gave a fascinating keynote about the use of comments and annotations in code and again this picks up the social nature of code as she was studying open-source coding where comments are often for other people in the community, maybe explaining actions, or suggesting improvements.  She reviewed a lot of material in the area and I was especially interested in one result that showed that novice programmers with small pieces of code found method comments more useful than class comments.  Given my own frequent complaint that code is inadequately documented at the class or higher level, this appeared to disagree with my own impressions.  However, in discussion it seemed that this was probably accounted for by differences in context: novice vs. expert programmers, small vs large code, internal comments vs. external documentation.  One of the big problems I find is that the way different classes work together to produce effects is particularly poorly documented.  Margaret-Anne described one system her group had worked on2 that allowed you to write a tour of your code opening windows, highlighting sections, etc.

I sadly missed some of the presentations as I had to go to other meetings (the danger of a conference at your home site!), but I did get to some and  was particularly fascinated by the more theoretical/philosophical session including one paper addressing the psychological origins of the notions of objects and another focused on (the dangers of) abstraction.

The latter, presented by Luke Church, critiqued  Jeanette Wing‘s 2006 CACM paper on Computational Thinking.  This is evidently a ‘big thing’ with loads of funding and hype … but one that I had entirely missed :-/ Basically the idea is to translate the ways that one thinks about computation to problems other than computers – nerds rule OK. The tenet’s of computational thinking seem to overlap a lot with management thinking and also reminded me of the way my own HCI community and also parts of the Design (with capital D) community in different ways are trying to say they we/they are the universal discipline  … well if we don’t say it about our own discipline who will …the physicists have been getting away with it for years 😉

Luke (and his co-authors) argument is that abstraction can be dangerous (although of course it is also powerful).  It would be interesting perhaps rather than Wing’s paper to look at this argument alongside  Jeff Kramer’s 2007 CACM article “Is abstraction the key to computing?“, which I recall liking because it says computer scientists ought to know more mathematics 🙂 🙂

I also sadly missed some of Adrian Mackenzie‘s closing keynote … although this time not due to competing meetings but because I had been up since 4:30am reading a PhD thesis and after lunch on a Friday had begin to flag!  However, this was no reflection an Adrian’s talk and the bits I heard were fascinating looking at the way bio-tech is using the language of software engineering.  This sparked a debate relating back to the overuse of abstraction, especially in the case of the genome where interactions between parts are strong and so the software component analogy weak.  It also reminded me of yet another relatively recent paper3 on the way computation can be seen in many phenomena and should not be construed solely as a science of computers.

As well as the academic content it was great to be with the PPIG crowd they are a small but very welcoming and accepting community – I don’t recall anything but constructive and friendly debate … and next year they have PPIG09 in Limerick – PPIG and Guiness what could be better!

  1. David has done some really interesting work on the relationship between personality types and different kinds of programming tasks.  I’ve seen him present before about debugging and unfortunately had to miss his talk at PPIG on comprehension.  Given his work has has shown clearly that there are strong correlations between certain personality attributes and coding, it would be good to see more qualitative work investigating the nature of the differences.   I’d like to know whether strategies change between personality types: for example, between systematic debugging and more insight-based scan and see it bug finding. [back]
  2. but I can’t find on their website :-([back]
  3. Perhaps 2006/2007 in either CACM or Computer Journal, if anyone knows the one I mean please remind me![back]

Firefox 3 seems to have fixed memory problems

I had been reluctantly considering giving up using Firefox as it crawled to a halt so often on so many sites. To be fair I think it is because I keep lots of tabs open and Firefox did not seem to deal will with pages with many refreshing elements … many air and train ticketing sites were particular problems. However, Firefox 3 has been running continuously for some time now and looking at ‘top’ in terminal window has about 1/3 the real memory footprint compared with Firefox 2 … now it is comparable with Word, Dreamweaver, etc. … I had been sticking with Firefox largely because the Firefox Snip!t bookmarklet works better than the Safari one, so now I can continue to do so without the machine crawling to a halt – well done team Mozilla 🙂

local URIs … mashing up the desktop

I’ve worried for a while about desktop URLs.

Within the web it is easy to link things together. If I want to refer to my home page I just add a link like this. However, on the desktop things are not so simple and I end up copying chunks of mail messages into the notes field in iCal rather than simply being able to link to the mail message where I arranged the meeting.

Links from the desktop to the web are easy … just use the URL … many desktop applications including mail clients and word processors will allow you to embed clickable links. Indeed it is often easier to link to a web page than to another object on the desktop! However, things get more difficult if you want to link the other way round, from a web page to a local file or resource. In my browser’s favourites I have several links to local files, but you cannot easily do the same if your bookmarks are in a web service like del.icio.us or even my own Snip!t. It is hard to seamlessly weave your desktop into the global web.

A couple of events brought this issue to a head for me.

First at the CHI workshop on PIM entitled the Disappearing Desktop, I asked if anyone knew of work in the area and I heard from Leo Sauermann that they had made some progress on this as part of the Gnowsis project. Their proposal for a Desktop URI Scheme (edited by Leo) is targeted principally at the first of the scenarios above, being able to link between things within the desktop.

The second event was at the AVI workshop on designing multi-touch interaction techniques for coupled public and private displays. During discussions abut touch-based interactions such as the Microsoft Surface or Apple iPhone, we considered scenarios where peole got together for a meeting (as we were) in a hotel bar (where we split for small group discussion) and had screens on table tops and walls, laptops, tablets, phones … and wanted to seamlessly move material between devices. Clearly an essential requirement for which is some way to identify resources across ad hoc collections of devices.

Finally I was in Athens working with George Lepouras, Akrivi Katifori and others. George had developed a Thunderbird extension to allow Snip!t to snip from mail messages … but while we could snip the text there was no way for the Snip!t page to link back to the mail message. We need full round trip URIs that link desktop and web with no distinction – URIs that can be embedded in a web page and (assuming you have the right permissions and are in an appropriate place) can be clicked and the appropriate mail message, calendar entry or whatever is opened.

Based on this and discussions we had, I drafted a discussion document on globally accessible local URIs. Any feedback very welcome.

Over the summer we hope to put together a demonstrator / reference implementation – if anyone is interested let me know.

It-ness and identity: FOAF, RDF and RDMS

Issues of ‘sameness’ are the underpinnings of any common understanding; if I talk about America, bananas or Caruso, we need to know we are talking about the ‘same’ thing.

Codd’s relational calculus was unashamedly phenomenological – if two things have the same attributes they are the same. Of course in practice, we often have things which look the same and yet we know are different: two cans of beans, two employees called David Jones. So many practical SQL database designs use unique ids as the key field of a table effectively making sure that otherwise identical rows are distinct1.

The id gives a database record identity – it is a something independent of its attributes.

I usually call this quality ‘it-ness’ and struggled to find appropriate (probably German) philosophical term to refer to it. Before we can point at something and say ‘it is a chair’, it must be an ‘it’ something we can refer to. This it-ness must be there before we consider the proeprties of ‘ot’ (legs, seat, etc.). It-ness is related to the substance/accident distinction important in medieval scholastic debate on transubstantiation, but different as the bread needs to be an ‘it’ before we can say that its real nature (substance) is different from its apparent nature (accidents).

In contrast RDF takes identity, as embodied in a URI, as its starting point. The origins of RDF are in web meta-data – talking about web pages … that is RDF is about talking about something else, and that something else has some form of (unique) identity. Although the word ‘ontology’ seems to be misused almost beyond recognition in computer science, here we are talking about true ontology. RDF assumes as a starting point it is discussing things that are, that exist, that have being. Given this of course several distinct things may have similar attributes2.

Whilst RDMS have problems talking about identity, and we often have to add artifices (like the id), to establish identity, in RDF the opposite problem arises. Often we do not have unique names even for web entities, and even less when we have RDF descriptions of people, places … or books. Nad discusses some of the problems of cleaning up book data (MARC, RDF and FRMR), part of which is establishing unique names … and really books are ‘easy’ as librarians have soent a long time thinking about idetifying them already.

FOAF (friend of a friend) is now widely used to represent personal relationships. In this WordPress blog, when I add blogroll entries it prompts for FOAF information: is this a work colleague, family, friend (but not foe or competitor … FOAF is definitely about being friendly!).

FOAF has an RDF format, but examples, both in practice … and in the XMLNS RDF specification, are not full of “rdf:about” links as are typical RDF documents. This is because, while people clearly do have unique identity, there is thankfully no URI scheme that uniquely and universally defies us3.

In practice FOAF says things like “there is a person whose name is John Doe”, or “the blog VirtualChaos is by a person who is a friend and colleague of the author of this blog”.

In terms of identity this is a blank node “the person who …”. The computational representation of the person is a placeholder, or a variable waiting to be associated with other placeholders.

In terms of phenomenological attributes, the values either do not uniquely identify an individual (here may be many John Doe’s) and the individual may have several potential values for a given attribute (John Doe may not be the body’s only name,and a person may have several email addresses).

In order to match individuals in FOAF, we typically need to make assumption: while I may have several email addresses, they are all personal, so if two people have the same email address they are the same person. Of course such reasoning is defeasible: some families share an email address, but serves as a way of performing partial and approximate matching.

I think to the semantic web purist the goal would be to have the unique personal URI. However, to my mind the incomplete, often vague and personally defined FOAF is closer to the way the real world works even when ontologically there is a unique entity in the world that is the subject. FOAF challenges simplistic assumptions and representations of both a phenomenological and ontological nature.

  1. Furthermore if you do not specify a key, RDMS are likely to treat a relation as bag rather than a set of tuples! Try inserting the same record twice.[back]
  2. For those who know their quantum mechanics RDMS records are like Fermions and obey Pauli exclusion principle, whilst RDF entities are like Bosons and several entities can exist with identical attributes.[back]
  3. As it says in The Prisoner “I am not a number” … although maybe one day soon we will all be biometrically identified and have a global URI :-/[back]

modelling entities and the history of ideas

A few weeks ago I was watching my friend Nad and some of his colleagues map out the key semantic entities for a domain ready for creating an open repository on the Talis Platform. Then this morning, by chance, I just came across an entry in the Portland Pattern Repository on “Stars: A Pattern Language for Query Optimized Schema1. This described a form of entity relationship modelling where one identifies “while business entities” (key business things like transactions) and then ‘dimensions’ that relate to them looking specifically for “people, places and things” … and time as a special case. The idea is that information about, say, a product, a customer or a salesperson tends to be scattered in different tables, linked at best implicitly by shared values (e.g. , a product code.) which may not actually be the key to any specifoc table. By giving these key entities their own tables, the linkage between them becomes obvious and queries are easy to form.

This sounded just like the focus in Semantic Web ontologies on having a shared set of classes and a unique identifier for entities in those classes to enable “linked data”. In SemWeb we have a class and the entity URI, but serving the same goal as the table and row key did in the “Stars” pattern, more than a decade ago.

However, this then reminded me again of the similarities between current ontologies and Extended Entity Relationship Models which were popular at least 20 years ago (I recall colleagues at York using this in the Aspect IPSE project2 in the Alvey programme). There were variants of EERM, but key concepts were (like standard ER) to have relationships explicitly defined – usually in tables containing ONLY foreign keys, and (unlike standard ER) to allow sub-typing/sub-classes (e.g. person > employee > academic). Like Semantic Web ontologies, the relationships were reified into tables and became first class, unlike SemWeb, ternary and higher order relationships were allowed, not just binary ones.

I was half way through writing Nad a mail with the Stars link, and was referring to the EERM and did a Google to find the correct acronym … and then that mail turned into this blog, partly because it was getting long and partly because it was so hard to find good references to EERM. There was no Wikipedia page, minimal entries in a few online dictionaries and even hard to find good paper references. The best link seems to be a 1994 book by Martin Gogolla3, but that was at least 8 years after I knew it was popular, but I guess when it had become stable. A bit more Googling unearthed a 1986 Computer Surveys article4 (whch I could NOT find using “extended entity-relationship model” in ACM DL’s own search) and eventually a 1981 paper5, although I’m not sure the latter uses the term in the same way as later EERM.

It was interesting and alarming to find, yet again, how difficult it is to find certain things in the web even in computing… like anything that happened more than 10 years ago! Some of this is to do with effective searching (like the ACM DL search), some because it is not there. However, as we rely more and more on online search, the recency effects produced by combinations of search ranking and availability can only get worse. This feels more like the attributes of pre-literate culture … the new digital stone age?

  1. also reported in: Peterson, S. 1995. Stars: a pattern language for query-optimized schemas. In Pattern Languages of Program Design, J. O. Coplien and D. C. Schmidt, Eds. ACM Press/Addison-Wesley Publishing Co., New York, NY, 163-177. [entry@ACM DL][back]
  2. Hitchcock, P. 1989. The process model of the aspect IPSE. SIGSOFT Softw. Eng. Notes 14, 4 (May. 1989), 76-78. DOI= http://doi.acm.org/10.1145/75111.75120[back]
  3. M. Gogolla. An Extended Entity Relationship Model. Fundamentals and Pragmatics. Springer, Berlin, LNCS 767, 1994[back]
  4. Teorey, T. J., Yang, D., and Fry, J. P. 1986. A logical design methodology for relational databases using the extended entity-relationship model. ACM Comput. Surv. 18, 2 (Jun. 1986), 197-222. DOI= http://doi.acm.org/10.1145/7474.7475[back]
  5. P. De, A. Sen and E. Gudes, An Extended Entity-Relationship Model with Multi Level External Views, in Proceedings of ER’81, North-Holland, 1981, pp. 455-472.[back]

practical RDF

I just came across D2RQ, a notation (plus implementation) for mapping relational databases to RDF, developed over the last four years by Chris Bizer, Richard Cyganiak and others at Freie Universität Berlin. In a previous post, “digging ourselves back from the Semantic Web mire“, I worried about the ghetto-like nature of RDF and the need for “abstractions that make non-triple structures more like the Semantic Web”, D2RQ is exactly the sort of thing, allowing existing relational databases to be accessed (but not updated) as if they were and RDF triple stores, including full SPARQL queries.

As D2RQ has clearly been around for years, I tried to do a bit of a web search to find things the other way around – more programmer-friendly layers on top of RDF (or XML) allowing it to be manipulated with IDL-like or other abstractions closer to ‘normal’ programming. ECMAScript for XML (E4X) seems to be just this allowing reasonably easy access to XML (but I guess RDF would be ‘flat’ in this). E4X has been around a few years (standard since 2005), but as far as I can see not yet in IE (surprise!). I guess for really practical XML it would be JSON, and there’s a nice discussion of different RDF in JSON representation issues on the n2 wiki “RDF JSON Brainstorming“. However, both E4X and RDF in JSON still are just accessing RDF nicely not adding higher level structure.

Going back to the beginning I was wondering about any tools that represent RDF as SQL / RDMS in order to make it available to ‘old technology’ … but then remembered that SPARQL creates tuples not triples so, I guess, one could say that is exactly what it does :-/

HCI and CSCW – is your usability too small

Recently heard some group feedback on our HCI textbook. Nearly all said that they did NOT want any CSCW. I was appalled as considering any sort of user interaction without its surrounding social and organisational settings seems as fundamentally misbegotten as considering a system without its users.

Has the usability world gone mad or is it just that our conception of HCI has become too narrow?

Continue reading

puzzle with pictures

As it was new Years Day and it was too wet to go shift earth in the garden I thought I’d play a bit with Professor Alan’s puzzle square. I’ve had a ‘make your own’ version for years, but you had to chop an image into bits give them special names, etc. Now it works much more easily with any image (try it yourself). This are a couple I made with my own photos:

needs Javascript   needs Javascript

The key is that it is I am now using the CSS clip property which allows you to show selected parts of an image (or in fact any HTML element). This was made a little more complicated due to the fact that the W3C pages for clip give running examples for every other kind of visual effect … but not clip! Googling was a nightmare as it turns up page after page in forums saying “I can’t get clip to work”!

Happily I found seifi.org (a blog that looks like a really great web resource) and a post on Creating Thumbnails Using the CSS Clip Property. This was full of meticulously laid out examples … Mojo Seifi, you are a star!

Continue reading

Christmas lights and crackers

FireFly in at CityLabThe lights are on in Lancaster City Centre. After three years FireFly has its first real public outing. You can see them now and for the next few months in Dalton Square in the centre of Lancaster.

The initial idea for FireFly came after a meeting where we had been discussing various forms of public displays. For some reason, a thought struck – those tiny lights Individual lightsthat you see in trees outside hotels and in city centres throughout the year; what if you could turn those into a display?

That was in early 2004 and now after several years of Joe Finney’s ingenuity and Angie Chandler’s hard work, they are here. Behind each of the three thousand lights in the CityLab display is a microprocessor as fast as my first PC! If you are near Lancaster in the next few months do go see them, they are just by the City Hall, under the beady eye of Queen Victoria’s statue.

facebookIn another bit of Christmas technology, this year I ported crackers to facebook. The good old virtual crackers are still there, but now you can also send facebook crackers 🙂 Mostly this was a straight port, but there are interesting differences because of the way facebook works as a platform. I’ve been watching the stats I collect and it is interesting to see that whilst around 2/3 of email crackers are opened only about 1 in 5 are opened in facebook. I guess with so much food being thrown and the like, facebook users are far more used to ignoring things :-/

Links: Lights in City Lab | Crackers on Facebook | Email Crackers

Physicality and Middle Ages Tech Support

Ansgarr needing help to use a bookOn the forum of our MRes course at Lancaster one of the students posted a link to Middle Ages Tech Support on YouTube. It shows Ansgarr a Mediaeval monk struggling with his first book.

I first saw this video when I was giving a talk at University of Peloponnese in Tripolis. Georgios1 showed the video before I started, just because he thought it was fun. I was talking a little bit about physicality and the video brought up some really interesting issues relating to this and usability. Although it is a comic video we can unpack it and ask which of the problems that Ansgarr has as he changes technology from scroll to book would actually happen and which are more our own anachronistic view of the past.

Continue reading

  1. one of my hosts there as part of the TIM project[back]