databases as people think – dabble DB

I was just looking at Enrico Bertini's blog Visuale for the first time in ages. In particular at his December entry on DabbleDB & Magic/Replace. Dabble DB provides web-based databases and in some ways occupies similar ground to Freebase, Swivel or even Google Docs spreadsheets, all ways to share data of different forms on/through the web.

The USP for Dabble DB amongst other online data-sharing apps is that it appears to really be a complete database solution online … and its USP amongst conventional databases is the way they seem to have really thought about real use.  This focus on real use by ordinary users includes dynamically altering the structure of the data as you gradually understand it more.  The model they have is that you start with plain table data from a spreadsheet or other document and gradually add structure, as opposed to the "first analyse and then enter" model of traditional DBs.
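
Just to make the contrast concrete, here is a tiny sketch of my own (purely illustrative, nothing to do with Dabble DB's actual implementation) of the "start flat, add structure later" style: rows arrive as plain spreadsheet-like records, and only later is a free-text column promoted into a linked category, without re-entering anything.

    # Illustrative sketch only: spreadsheet-style rows imported with no up-front schema.
    rows = [
        {"name": "Alice", "dept": "Sales",     "start": "2007-03-01"},
        {"name": "Bob",   "dept": "Marketing", "start": "2006-11-15"},
        {"name": "Carol", "dept": "Sales",     "start": "2008-01-07"},
    ]

    def promote_to_category(rows, field):
        """Later on, turn a free-text column into a linked set of categories."""
        categories = {}                      # category value -> linked rows
        for row in rows:
            categories.setdefault(row[field], []).append(row)
        return categories

    # Structure is added after the data exists, not decided up front.
    depts = promote_to_category(rows, "dept")
    print(sorted(depts))                          # ['Marketing', 'Sales']
    print([r["name"] for r in depts["Sales"]])    # ['Alice', 'Carol']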

As I read Enrico's blog I remembered that he had mailed me about the 'magic/replace' feature ages ago.  This lets you tidy up data during import (but apparently not data already imported … wonder why?) using a 'by example' approach, and is a really nice example of all that 'programming by example' and related work that was so hot 15 years ago eventually finding its way into real products.
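
For anyone who has not met programming by example, the flavour is roughly this (my own toy sketch, not how Magic/Replace is actually implemented): the user tidies a single value by hand, the system infers a rule from that one before/after pair, and then offers to apply it to the rest of the column.

    def infer_rule(before, after, sep=", "):
        """Guess a very simple rule from one example: if the user swapped the two
        parts around a separator ('Smith, John' -> 'John Smith'), generalise that."""
        if sep in before:
            last, first = before.split(sep, 1)
            if after == first + " " + last:
                return lambda v: " ".join(reversed(v.split(sep, 1))) if sep in v else v
        return None   # no rule recognised; real systems try many richer patterns

    column = ["Smith, John", "Jones, Mary", "O'Brien, Pat"]
    rule = infer_rule("Smith, John", "John Smith")   # the user's single worked example
    if rule:
        print([rule(v) for v in column])   # -> John Smith, Mary Jones, Pat O'Brien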

The downsides to Dabble DB are that editing is via forms only (it is often so much easier to enter data in a spreadsheet view), the API is quite limited, and while they have a 'Dabble DB Commons' for public data (rather like Swivel), there is no directory or other way to see what people have put up 🙁

I was particularly hoping the API was better, as it would have been nice to link it into my web version of Query-by-Browsing, or even to integrate it with the Query-through-Drilldown approach for constructing complex table joins that Damon Oram implemented more recently.

In general, while the DB and (many of the) UI features are strong, it is not really looking outwards towards creating shared linked data (in the broadest sense of the term, not just pure SemWeb-world linked data) … so there is still room for the absolute killer shared-data app!

just hit search

For years I have heard anecdotal stories of how users are increasingly unaware of the URL itself (and certainly the term 'web address' is sometimes better).  I recall having a conversation at a (non-computing) university meeting and it soon became obvious that the term 'browser' was also not one they were familiar with, even though they of course used it daily.  I guess, like the mechanics of the car engine, the mechanics of the web are invisible.

I came across the Google Zeitgeist 2008 page that analyses the popular and the rising search terms of 2008.  The rising ones reveal things in the media: “sarah palin” way up there above “obama” in the global stats … if only Google searches were votes!  However, the ‘most popular’ searches reveal longer-term habits.  For the UK the 10 most popular searches are:

  1. facebook
  2. bbc
  3. youtube
  4. ebay
  5. games
  6. news
  7. hotmail
  8. bebo
  9. yahoo
  10. jobs

Some of these terms, ‘games’, ‘news’, and ‘jobs’ (no Steve, not you), are generic categories … which suggests that people approach these from the search box, not a portal.  However, of the top 10, seven are simply domain names of popular sites.  Instead of typing these into the address bar (which certainly on Firefox autocompletes if I type any I’ve visited before), many users just Google them (and I’m sure the same is true for Live Search and others).

I was told some years ago that AOL browsers swapped the relative sizes (and locations, I think) of the built-in search box and address bar on the assumption that their users rarely typed in URLs (although I knew of AOL users who accidentally typed URLs into the search box).  I also recall the company that used to sell net keywords that were used by Netscape (and possibly others) if you entered terms rather than a URL into the address bar.

… of course if I try that now … Firefox redirects me through Google’s “I’m Feeling Lucky” … of course

Incidentally, I came to this as I was tracing back the source of the, now shown to be incorrect, Sunday Times news story that said two Google searches used the same electricity as boiling an electric kettle.  This was challenged in a TechCrunch blog, refuted by Google, and was effectively (but not explicitly) retracted in a subsequent Times Online item.  The source turns out to be a junior Harvard physicist, Alex Wissner-Gross, whose own source was a blog by Rolf Kersten, one of the Sun Green Team (Sun the computer manufacturer, not Sun the newspaper!), so actually not an unreasonable basis.

In fact Rolf Kersten’s estimate, which was prepared for a talk in 2007, seemed to be based on sensible calculations, although he has recently posted a blog saying the figure was out by a factor of 35 … yes, it actually takes 70 Google searches to boil that kettle.  Looking deeper, the cause of the discrepancy appears to be the figure he used for the number of Google searches per day.  He took 2005 data about the size of the Google server farm and used a figure of 40 million searches per day.  Although Google did not publish their full workings in their response, it is clearly this figure of 40 million that was way too low for 2005, as a Feb 2001 Google press release quoted 60 million searches per day in 2000.  Actually, with a moment’s reflection it is clear that 40 million searches per day (about 500 per second) would hardly have justified a major server farm, and the true figure is clearly in the billions.  However, it is surprisingly difficult to find, and if you Google “google searches per day” you simply find lots of people asking the same question.  In fact, it was looking for further Google press releases in the hope of a more up-to-date figure that got me to the Zeitgeist page!
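
For what it is worth, the back-of-envelope arithmetic behind the correction, using only the figures quoted above:

    # Purely the arithmetic from the figures quoted in the story above.
    assumed_searches_per_day = 40e6                    # Kersten's 2005 assumption
    print(assumed_searches_per_day / (24 * 60 * 60))   # about 463 per second: "hardly 500"

    original_claim = 2          # Sunday Times: two searches per boiled kettle
    correction_factor = 35      # Kersten's later correction: out by a factor of 35
    print(original_claim * correction_factor)          # 70 searches to boil that kettle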

Eamonn Fitzgerald’s Rainy Day blog nicely lays out the timeline of this story and sees it as a triumph of the power of media consumers to challenge the authority of the press, due to the overcoming of what Jay Rosen refers to as ‘audience atomization’.  Fitzgerald also sees the paradox that the story itself was sourced from the somewhat broken sources on the internet; in the past the press would perhaps have used more authoritative sources … and, as I noted a couple of years ago at a Memories for Life panel at the British Library, the move from BBC to YouTube could be read as mass democratisation … or simply signal the end of history.

There is another lesson though, one that I picked up in a blog post “keeping track of history” not long after the Memories for Life meeting: just how hard it is to find pretty straightforward information on the web.  At that point it was Tony Blair’s statement about the execution of Saddam Hussein; in this case it was the number of Google searches per day.  Neither is secret, proprietary or obscure, but both are difficult to track down.

… but we still trust that single hit of a search button

PPIG2008 and the twenty-first century coder

Last week I was giving a keynote at PPIG2008, the annual workshop of the Psychology of Programming Interest Group.  Before I went I was politely pronouncing this pee-pee-eye-gee … however, when I got there I found the accepted pronunciation was pee-pig … hence the logo!

My own keynote at PPIG2008 was “as we may code: the art (and craft) of computer programming in the 21st century” and was an exploration of the changes in coding since 1968, when Knuth published the first of his books on “the art of computer programming“.  On the web site for the talk I’ve made a relatively unstructured list of some of the distinctions I’ve noticed between 20th and 21st century coding (C20 vs. C21); and in my slides I have started to add some more structure.  In general we have a move from a more mathematical, analytic, problem-solving approach to something more akin to a search task: finding the right bits to fit together, with a greater need for information management and social skills.  Both this characterisation and the list are, of course, a gross simplification, but seem to capture some of the change of spirit.  These changes suggest different cognitive issues to be explored and maybe different personality types involved – as one of the attendees, David Greathead, pointed out, rather like the judging vs. perceiving personality distinction in Myers-Briggs1.

One interesting comment on this was from Marian Petre, who has studied many professional programmers.  Her impression, echoed by others, was that the heavy-hitters were the more experienced programmers who had adapted to newer styles of programming, whereas the younger programmers found it harder to adapt the other way when they hit difficult problems.  Another attendee suggested that perhaps I was focused more on application coding, and that system coding and system programmers are still operating in the C20 mode.

The social nature of modern coding came out in several papers about agile methods and pair programming.  As well as being an important phenomenon in its own right, pair programming gives a level of think-aloud ‘for free’, so maybe this will also cast light on individual coding.

Margaret-Anne Storey gave a fascinating keynote about the use of comments and annotations in code.  Again this picks up the social nature of code, as she was studying open-source coding where comments are often for other people in the community, maybe explaining actions or suggesting improvements.  She reviewed a lot of material in the area and I was especially interested in one result showing that novice programmers working with small pieces of code found method comments more useful than class comments.  Given my own frequent complaint that code is inadequately documented at the class or higher level, this appeared to disagree with my own impressions.  However, in discussion it seemed that this was probably accounted for by differences in context: novice vs. expert programmers, small vs. large code, internal comments vs. external documentation.  One of the big problems I find is that the way different classes work together to produce effects is particularly poorly documented.  Margaret-Anne described one system her group had worked on2 that allowed you to write a tour of your code, opening windows, highlighting sections, etc.

I sadly missed some of the presentations as I had to go to other meetings (the danger of a conference at your home site!), but I did get to some and was particularly fascinated by the more theoretical/philosophical session, including one paper addressing the psychological origins of the notions of objects and another focused on (the dangers of) abstraction.

The latter, presented by Luke Church, critiqued Jeannette Wing‘s 2006 CACM paper on Computational Thinking.  This is evidently a ‘big thing’ with loads of funding and hype … but one that I had entirely missed :-/ Basically the idea is to translate the ways that one thinks about computation to problems other than computers – nerds rule OK. The tenets of computational thinking seem to overlap a lot with management thinking, and also reminded me of the way my own HCI community, and parts of the Design (with capital D) community, in different ways are trying to say that we/they are the universal discipline … well if we don’t say it about our own discipline who will … the physicists have been getting away with it for years 😉

Luke’s (and his co-authors’) argument is that abstraction can be dangerous (although of course it is also powerful).  Rather than Wing’s paper, it would perhaps be interesting to look at this argument alongside Jeff Kramer’s 2007 CACM article “Is abstraction the key to computing?“, which I recall liking because it says computer scientists ought to know more mathematics 🙂 🙂

I also sadly missed some of Adrian Mackenzie‘s closing keynote … although this time not due to competing meetings but because I had been up since 4:30am reading a PhD thesis and after lunch on a Friday had begun to flag!  However, this was no reflection on Adrian’s talk, and the bits I heard were fascinating, looking at the way bio-tech is using the language of software engineering.  This sparked a debate relating back to the overuse of abstraction, especially in the case of the genome, where interactions between parts are strong and so the software component analogy is weak.  It also reminded me of yet another relatively recent paper3 on the way computation can be seen in many phenomena and should not be construed solely as a science of computers.

As well as the academic content it was great to be with the PPIG crowd; they are a small but very welcoming and accepting community – I don’t recall anything but constructive and friendly debate … and next year they have PPIG09 in Limerick – PPIG and Guinness, what could be better!

  1. David has done some really interesting work on the relationship between personality types and different kinds of programming tasks.  I’ve seen him present before about debugging and unfortunately had to miss his talk at PPIG on comprehension.  Given that his work has shown clearly that there are strong correlations between certain personality attributes and coding, it would be good to see more qualitative work investigating the nature of the differences.  I’d like to know whether strategies change between personality types: for example, between systematic debugging and more insight-based ‘scan and see it’ bug finding. [back]
  2. but I can’t find it on their website :-( [back]
  3. Perhaps 2006/2007 in either CACM or Computer Journal, if anyone knows the one I mean please remind me![back]

local URIs … mashing up the desktop

I’ve worried for a while about desktop URLs.

Within the web it is easy to link things together. If I want to refer to my home page I just add a link like this. However, on the desktop things are not so simple and I end up copying chunks of mail messages into the notes field in iCal rather than simply being able to link to the mail message where I arranged the meeting.

Links from the desktop to the web are easy … just use the URL … many desktop applications including mail clients and word processors will allow you to embed clickable links. Indeed it is often easier to link to a web page than to another object on the desktop! However, things get more difficult if you want to link the other way round, from a web page to a local file or resource. In my browser’s favourites I have several links to local files, but you cannot easily do the same if your bookmarks are in a web service like del.icio.us or even my own Snip!t. It is hard to seamlessly weave your desktop into the global web.

A couple of events brought this issue to a head for me.

First, at the CHI workshop on PIM entitled the Disappearing Desktop, I asked if anyone knew of work in the area and heard from Leo Sauermann that they had made some progress on this as part of the Gnowsis project. Their proposal for a Desktop URI Scheme (edited by Leo) is targeted principally at the first of the scenarios above: being able to link between things within the desktop.

The second event was at the AVI workshop on designing multi-touch interaction techniques for coupled public and private displays. During discussions about touch-based interactions such as the Microsoft Surface or Apple iPhone, we considered scenarios where people got together for a meeting (as we were) in a hotel bar (where we split for small-group discussion) and had screens on table tops and walls, laptops, tablets, phones … and wanted to seamlessly move material between devices. Clearly an essential requirement for this is some way to identify resources across ad hoc collections of devices.

Finally, I was in Athens working with George Lepouras, Akrivi Katifori and others. George had developed a Thunderbird extension to allow Snip!t to snip from mail messages … but while we could snip the text, there was no way for the Snip!t page to link back to the mail message. We need full round-trip URIs that link desktop and web with no distinction – URIs that can be embedded in a web page and (assuming you have the right permissions and are in an appropriate place) can be clicked so that the appropriate mail message, calendar entry or whatever is opened.
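
Purely as a thought experiment (this is my own hypothetical notation, not the Gnowsis Desktop URI Scheme and not what the discussion document below proposes), such a round-trip URI might carry both a global "whose desktop" part and a local "which resource" part, with a small resolver on each device deciding whether it can open the thing directly or must hand off to the web:

    from urllib.parse import urlparse, parse_qs

    # Hypothetical form: desktop://<owner-host>/<resource-type>?id=<local-identifier>
    uri = "desktop://alan.example.org/mail?id=4F2A-1234@example.org"

    def resolve(uri, local_host, handlers):
        """Open the resource locally if this device owns it, otherwise fall back
        to a (hypothetical) web-accessible proxy for it."""
        parts = urlparse(uri)
        if parts.netloc == local_host:
            resource_type = parts.path.lstrip("/")
            local_id = parse_qs(parts.query)["id"][0]
            return handlers[resource_type](local_id)
        return "https://" + parts.netloc + "/proxy" + parts.path + "?" + parts.query

    handlers = {"mail": lambda msg_id: "open mail client at message <" + msg_id + ">"}
    print(resolve(uri, "alan.example.org", handlers))           # opens the mail message
    print(resolve(uri, "someone-else.example.org", handlers))   # falls back to the web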

Based on this and discussions we had, I drafted a discussion document on globally accessible local URIs. Any feedback very welcome.

Over the summer we hope to put together a demonstrator / reference implementation – if anyone is interested let me know.

HCI and CSCW – is your usability too small

I recently heard some group feedback on our HCI textbook. Nearly all said that they did NOT want any CSCW. I was appalled, as considering any sort of user interaction without its surrounding social and organisational settings seems as fundamentally misbegotten as considering a system without its users.

Has the usability world gone mad or is it just that our conception of HCI has become too narrow?


Usability and Web2.0

Nad did a brilliant guest lecture for our undergraduate HCI class at Lancaster on Monday. His slides and blog post about the lecture are at Virtual Chaos. He touched on issues of democracy vs. authority of information, dynamic content vs. accessibility and, of course, increasing issues of privacy on social networking sites. He also had awesome slides, using loads of Flickr photos under Creative Commons … community content in action, not just words! Of course he also touched on Web3.0 and future convergence between emergent community phenomena and structured Semantic Web technologies.

digging ourselves back from the Semantic Web mire

Discussions on the Talis Platform Advisory Group prompted me to look at some of the APIs of new Semantic-Web-like services such as Freebase1.

Freebase is interesting as its underlying representation is graph/relationship based like RDF, but its Metaweb Query Language (MQL) uses JSON, which is a more programming-like whole-and-parts representation with arrays and slots. Facebook’s new Data Store API also has objects and associations, but does not use RDF or other obvious web technologies.
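
To give a flavour of the difference (a rough sketch from memory, so treat the details as illustrative rather than exact MQL syntax): an MQL query is itself a JSON object whose shape mirrors the answer you want, with empty slots marking what should be filled in, which from a programming language is just a nested dictionary:

    import json

    # Roughly MQL-flavoured query (details from memory, illustrative only).
    # The query has the same shape as the answer; [] marks the blanks to fill in.
    query = {
        "type": "/music/artist",
        "name": "The Police",
        "album": [],          # ask for the list of this artist's albums
    }
    print(json.dumps({"query": query}, indent=2))

    # The triple-store view of the same question would be a set of
    # subject-predicate-object patterns; MQL's nested whole-and-parts shape is
    # much closer to the arrays and slots of everyday programming.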

So the question is – if the closest things to Semantic Web apps on the internet don’t use SemWeb technology like RDF, SPARQL etc. … are these SemWeb technologies fit for purpose, or indeed useful at all?

I think the answer is that (i) partly they are not fit for purpose – caught in a backwater by their history, but (ii) that is like all things and they are what we have got, and (iii) we can use some of the tools of computing to make them work …


  1. listen to Talis’ podcast interview with Jamie Taylor, Freebase’s ‘Minister of Information’ (sic). [back]

Google’s Vint Cerf avoiding responsibility

Yesterday morning I was on my way into Lancaster and listening to the Today programme. Google’s ‘internet evangelist’ Vint Cerf was being interviewed by John Humphrys and the topic was ‘should the internet be regulated like other media’.1

Not surprisingly Vint Cerf thought not, but I was surprised how well he avoided actually saying so. John Humphrys is experienced and politicians fear him in these early morning interviews, but to be honest he was completely outclassed by Vint Cerf who sidestepped, avoided and generally never addressed the question.

Web 2.0 was the heart of the issue. With end-user content now dominating the internet do service providers such as YouTube (of course owned by Google) have any responsibility for the kinds of material hosted?

This was in the context of videos of ‘happy slappers’ and other violent attacks being posted, but more generally that TV in many countries is limited in the kinds of material it can show, particularly early in the evening when children are more likely to be watching, by a mixture of voluntary and statutory codes. Why not the internet?

Vint Cerf repeatedly reiterated the same message: “Google is law abiding”; if content is not legal it is removed. Implicitly the message was “if it is not illegal it is OK”, but as I said he carefully avoided saying so.

The closest point to actually addressing the question was when John Humphrys suggested that technologies could be misused, like research for atomic power being used for nuclear weapons (strange, I thought it went the other way round?). Vint Cerf’s response was the standard neutrality-of-technology stance: that the makers of roads are not responsible for car deaths, strip development … the same argument used by arms dealers, manufacturers of gas-guzzling cars, and scientists in every repressive regime in recent history.

According to Cerf, if you are a worried parent you need to buy good filtering software; the solution is at the edges of the net … and of course does not involve the likes of Google … who, it appears from the context, are at the centre?

Now there are very good arguments against regulation, both ethical (freedom of expression) and practical (volume of material, international access). The disappointing, and worrying, aspect of this interview was that Google’s key public face was unwilling or unable to constructively enter the debate at all.

  1. “The 0810 Interview: Godfather of the Internet”, BBC Radio 4, Today Programme, Wednesday, 29th August 2007 [back]

online image manipulation

I was looking around for online image manipulation programs. I found a few and all seemed quite basic (no real turbocharged Web2.0 one), but iaza caught my eye … basic, but lots of pictures of the site developer’s Tibetan spaniel, Niro!
[Image: iaza - image manipulation and cute doggies!]

The other one that seemed interesting was wiredness, which is quite basic in functionality but clearly aiming to be zappy in its interface … and it has an API so you can send images to it from other sites for manipulation (which is what I was looking for).