endings and beginnings: cycling, HR and Talis

It is the end of the summer, the September rush starts (actually at the end of August) and on Friday I’ll be setting off on the ferry and be away from home for all of September and October 🙁  Of course I didn’t manage to accomplish as much as I wanted over the summer, and didn’t get away on holiday … except of course living next to the sea is sort of like holiday every day!  However, I did take some time off when Miriam visited, joining her on cycle rides to start her training for her Kenyan challenge — neither of us had been on a bike for 10 years!  Also this last weekend saw the world come to Tiree when a group of asylum seekers and refugees from the St Augustine Centre in Halifax visited the Baptist Church here — kite making, songs from Zimbabwe and loads of smiling faces.

In September I also hand over departmental personnel duty (good luck Keith :-)).  I’d taken on the HR role before my switch to part-time at the University, and so most of it stayed with me through the year 🙁 (Note, if you ever switch to part-time, better to do so before duties are arranged!). Not sorry to see it go, the people bit is fine, but so much paper filling!

… and beginnings … in September (next week!) I also start to work part-time with Talis.  Talis is a remarkable story.  A library information systems company that re-invented itself as a Semantic Web company and now, amongst other things, powering the Linked Data at data.gov.uk.

I’ve known Talis as a company from its pre-SemWeb days when aQtive did some development for them as part of our bid to survive the post-dot.com crash.   aQtive did in the end die, but Talis had stronger foundations and has thrived1.  In the years afterwards two ex-aQtive folk, Justin and Nad, went to Talis and for the past couple of years I have also been on the external advisory group for their SemWeb Platform.  So I will be joining old friends as well as being part of an exciting enterprise.

  1. Libraries literally need very strong foundations.  I heard of one university library that had to be left half empty because the architect had forgotten to take account of the weight of books.  As the shelves filled the whole building began to sink into the ground.[back]

Italian conferences: PPD10, AVI2010 and Search Computing

I got back from trip to Rome and Milan last Tuesday, this included the PPD10 workshop that Aaron, Lucia, Sri and I had organised, and the AVI 2008 conference, both in University of Rome “La Sapienza”, and a day workshop on Search Computing at Milan Polytechnic.

PPD10

The PPD10 workshop on Coupled Display Visual Interfaces1 followed on from a previous event, PPD08 at AVI 2008 and also a workshop on “Designing And Evaluating Mobile Phone-Based Interaction With Public Displays” at CHI2008.  The linking of public and private displays is something I’ve been interested in for some years and it was exciting to see some of the kinds of scenarios discussed at Lancaster as potential futures some years ago now being implemented over a range of technologies.  Many of the key issues and problems proposed then are still to be resolved and new ones arising, but certainly it seems the technology is ‘coming of age’.  As well as much work filling in the space of interactions, there were also papers that pushed some of the existing dimensions/classifications, in particular, Rasmus Gude’s paper on “Digital Hospitality” stretched the public/private dimension by considering the appropriation of technology in the home by house guests.  The full proceedings are available at the PPD10 website.

AVI 2010

AVI is always a joy, and AVI 2010 no exception; a biennial, single-track conference with high-quality papers (20% accept rate this year), and always in lovely places in Italy with good food and good company!  I first went to AVI in 1996 when it was in Gubbio to give a keynote “Closing the Loop: modelling action, perception and information“, and have gone every time since — I always say that Stefano Levialdi is a bit like a drug pusher, the first experience for free and ever after you are hooked! The high spot this year was undoubtedly Hitomi Tsujita‘s “Complete fashion coordinator2, a system for using social networking to help choose clothes to wear — partly just fun with a wonderful video, but also a very thoughtful mix of physical and digital technology.


images from Complete Fashion Coordinator

The keynotes were all great, Daniel Keim gave a really lucid state of the art in Visual Analytics (more later) and Patrick Lynch a fresh view of visual understanding based on many years experience and highlighting particularly on some of the more immediate ‘gut’ reactions we have to interfaces.  Daniel Wigdor gave an almost blow-by-blow account of work at Microsoft on developing interaction methods for next-generation touch-based user interfaces.  His paper is a great methodological exemplar for researchers combining very practical considerations, more principled design space analysis and targeted experimentation.

Looking more at the detail of Daniel’s work at Microsoft, it is interesting that he has a harder job than Apple’s interaction developers.  While Apple can design the hardware and interaction together, MS as system providers need to deal with very diverse hardware, leading to a ‘least common denominator’ approach at the level of quite basic touch interactions.  For walk-up-and use systems such as Microsoft Surface in bar tables, this means that users have a consistent experience across devices.  However, I did wonder whether this approach which is basically the presentation/lexical level of Seeheim was best, or whether it would be better to settle at some higher-level primitives more at the Seeheim dialog level, thinking particularly of the way the iPhone turns pull down menus form web pages into spinning selectors.  For devices that people own it maybe that these more device specific variants of common logical interactions allow a richer user experience.

The complete AVI 2010 proceedings (in colour or B&W) can be found at the conference website.

The very last session of AVI was a panel I chaired on “Visual Analytics: people at the heart of data” with Daniel Keim, Margit Pohl, Bob Spence and Enrico Bertini (in the order they sat at the table!).  The panel was prompted largely because the EU VisMaster Coordinated Action is producing a roadmap document looking at future challenges for visual analytics research in Europe and elsewhere.  I had been worried that it could be a bit dead at 5pm on the last day of the conference, but it was a lively discussion … and Bob served well as the enthusiastic but also slightly sceptical outsider to VisMaster!

As I write this, there is still time (just, literally weeks!) for final input into the VisMaster roadmap and if you would like a draft I’ll be happy to send you a PDF and even happier if you give some feedback 🙂

Search Computing

I was invited to go to this one-day workshop and had the joy to travel up on the train from Rome with Stu Card and his daughter Gwyneth.

The search computing workshop was organised by the SeCo project. This is a large single-site project (around 25 people for 5 years) funded as one of the EU’s ‘IDEAS Advanced Grants’ supporting ‘investigation-driven frontier research’.  Really good to see the EU funding work at the bleeding edge as so many national and European projects end up being ‘safe’.

The term search computing was entirely new to me, although instantly brought several concepts to mind.  In fact the principle focus of SeCo is the bringing together of information in deep web resources including combining result rankings; in database terms a form of distributed join over heterogeneous data sources.

The work had many personal connections including work on concept classification using ODP data dating back to aQtive days as well as onCue itself and Snip!t.  It also has similarities with linked data in the semantic web word, however with crucial differences.  SeCo’s service approach uses meta-descriptions of the services to add semantics, whereas linked data in principle includes a degree of semantics in the RDF data.  Also the ‘join’ on services is on values and so uses a degree of run-time identity matching (Stu Card’s example was how to know that LA=’Los Angeles’), whereas linked data relies on URIs so (again in principle) matching has already been done during data preparation.  My feeling is that the linking of the two paradigms would be very powerful, and even for certain kinds of raw data, such as tables, external semantics seems sensible.

One of the real opportunities for both is to harness user interaction with data as an extra source of semantics.  For example, for the identity matching issue, if a user is linking two data sources and notices that ‘LA’ and ‘Los Angeles’ are not identified, this can be added as part of the interaction to serve the user’s own purposes at that time, but by so doing adding a special case that can be used for the benefit of future users.

While SeCo is predominantly focused on the search federation, the broader issue of using search as part of algorithmics is also fascinating.  Traditional algorithmics assumes that knowledge is basically in code or rules and is applied to data.  In contrast we are seeing the rise of web algorithmics where knowledge is garnered from vast volumes of data.  For example, Gianluca Demartini at the workshop mentioned that his group had used the Google suggest API to extend keywords and I’ve seen the same trick used previously3.  To some extent this is like classic techniques of information retrieval, but whereas IR is principally focused on a closed document set, here the document set is being used to establish knowledge that can be used elsewhere.  In work I’ve been involved with, both the concept classification and folksonomy mining with Alessio apply this same broad principle.

The slides from the workshop are appearing (but not all there yet!) at the workshop web page on the SeCo site.

  1. yes I know this doesn’t give ‘PPD’ this stands for “public and private displays”[back]
  2. Hitomi Tsujita, Koji Tsukada, Keisuke Kambara, Itiro Siio, Complete Fashion Coordinator: A support system for capturing and selecting daily clothes with social network, Proceedings of the Working Conference on Advanced Visual Interfaces (AVI2010), pp.127–132.[back]
  3. The Yahoo! Related Suggestions API offers a similar service.[back]

databases as people think – dabble DB

I was just looking at Enrico Bertini‘s blog Visuale for the first time for ages. In particular at his December entry on DabbleDB & Magic/Replace. Dabble DB allows web-based databases and in some ways sits in similar ground with Freebase, Swivel or even Google docs spreadsheet, all ways to share data of different forms on/through the web.

The USP for Dabble DB amongst other online data sharing apps, is that it appears to really be a complete database solution online … and its USB amongst conventional databses is the way they seem to have really thought about real use.  This focus on real use by ordinary users includes dynamically altering the structure of the data as you gradually understand it more.  The model they have is that you start with plain table data from a spreadsheet or other document and gradually add structure as opposed to the “first analyse and then enter” model of traditional DBs.

As I read Enrico’s blog I remembered that he had mailed me about the ‘magic/replace‘ feature ages ago.  This lets you tidy up  data during import (but apparently not data already imported … wonder why?), using a ‘by example’ approach and is a really nice example of all that ‘programming by example‘ and related work that was so hot 15 years ago eventually finding its way into real products.

The downside to Dabble DB is that editing is via forms only … it is often so much easier to enter data in a spreadsheet view, the API is quite limited, and while they have a ‘Dabble DB Commons‘ for public data (rather like Swivel), there is no directory or other way to see what people have put up 🙁

I was particularly hoping the API was better as it would have been nice to link it into my web version of Query-by-Browsing. or even integrate with the Query-through-Drilldown approach for constructing complex table joins that Damon Oram implemented more recently.

In general, while the DB and (many) UI features are strong it is not really looking outwards to creating shared linked data (in the broadest sense of the term, not just pure SemWeb world linked data), … so still room there for the absolute killer shared data app!

making life easier – quick filing in visible folders

It is one of those things that has bugged me for years … and if it was right I would probably not even notice it was there – such is the nature of good design, but …  when I am saving a file from an application and I already have a folder window open, why is it not easier to select the open folder as the destination.

A scenario: I have just been writing a reference for a student and have a folder for the references open on my desktop. I select “Save As …” from the Word menu and get a file selection dialogue, but I have to navigate through my hard disk to find the folder even though I can see it right in front of me (and I have over 11000 folders, so it does get annoying).

The solution to this is easy, some sort of virtual folder at the top level of the file tree labelled “Open Folders …” that contains a list of the curently open folder windows in the finder.  Indeed for years I instinctively clicked on the ‘Desktop’ folder expecting this to contain the open windows, but of course this just refers to the various aliases and files permamently on the desktop background, not the open windows I can see in front of me.

In fact as Mac OSX is built on top of UNIX there is an easy very UNIX-ish fix (or maybe hack), the Finder could simply maintain an actual folder (probably on the desktop) called “Finder Folders” and add aliases to folders as you navigate.  Although less in the spirit of Windows, this would certainly be possible there too and of course any of the LINUX based systems.  … so OS developers out there “fix it”, it is easy.

So why is it that this is a persistent and annoying problem and has an easy fix, and yet is still there in every system I have used after 30 years of windowing systems?

First, it is annoying and persistent, but does not stop you getting things done, it is about efficiency but not a ‘bug’ … and system designers love to say, “but it can do X”, and then send flying fingers over the keyboard to show you just how.  So it gets overshadowed by bigger issues and never appears in bug lists – and even though it has annoyed me for years, no, I have never sent a bug report to Apple either.

Second it is only a problem when you have sufficient files.  This means it is unlikely to be encountered during normal user testing.  There are a class of problems like this and ‘expert slips’1, that require very long term use before they become apparent.  Rigorous user testing is not sufficient to produse usable systems. To be fair many people have a relatively small number of files and folders (often just one enormous “My Documents” folder!), but at a time when PCs ship with hundreds of giga-bytes of disk it does seem slighty odd that so much software fails either in terms of user interface (as in this case) or in terms of functionality (Spotlight is seriously challenged by my disk) when you actually use the space!

Finally, and I think the real reason, is in the implementation architecture.  For all sorts of good software engineering reasons, the functional separation between applications is very strong.  Typically the only way they ‘talk’ is through cut-and-paste or drag-and-drop, with occasional scripting for real experts. In most windowing environments the ‘application’ that lets you navigate files (Finder on the Mac, File Explorer in Windows) is just another application like all the rest.  From a system point of view, the file selection dialogue is part of the lower level toolkit and has no link to the particular application called ‘Finder’.  However, to me as a user, the Finder is special; it appears to me (and I am sure most) as ‘the computer’ and certainly part of the ‘desktop’.  Implementation architecture has a major interface effect.

But even if the Finder is ‘just another application’, the same holds for all applications.  As a user I see them all and if I have selected a font in one application why is it not easier to select the same font in another?  In the semantic web world there is an increasing move towards open data / linked data / web of data2, all about moving data out of application silos.  However, this usually refers to persistent data more like the file system of the PC … which actually is shared, at least physically, between applications; what is also needed is that some of the ephemeral state of interaction is also shared on a moment-to-moment basis.

Maybe this will emerge anyway with increasing numbers of micro-applications such as widgets … although if anything they often sit in silos as much as larger applications, just smaller silos.  In fact, I think the opposite is true, micro-applications and desktop mash-ups require us to understand better and develop just these ways to allow applications to ‘open up’, so that they can see what the user sees.

  1. see “Causing Trouble with Buttons” for how Steve Brewster and I once forced infrequent expert slips to happen often enough to be user testable[back]
  2. For example the Web of Data Practitioners Days I blogged about a couple of months back and the core vision of Talis Platform that I’m on the advisory board of.[back]

web of data practioner’s days

I am at the Web of Data Practitioners Days (WOD-PD 2008) in Vienna.  Mixture of talks and guided hands-on sessions.  I presented first half of session on “Using the Web of Data” this morning with focus (surprise) on the end user. Learnt loads about some of the applications out there – in fact Richard Cyganiak .  Interesting talk from a guy at the BBC about the way they are using RDF to link the currently disconnected parts of their web and also archives.  Jana Herwig from Semantic Web Company has been live blogging the event.

Being here has made me think about the different elements of SemWeb technology and how they individually contribute to the ‘vision’ of Linked Data.  The aim is to be able to link different data sources together.  For this having some form of shared/public vocabulary or ‘data definitions’ is essential as is some relatively uniform way of accessing data.  However, the implementation using RDF or use of SPARQL etc. seems to be secondary and useful for some data, but not other forms of data where tabular data may be more appropriate.  Linking these different representations  together seems far more important than specific internal representations.  So wondering whether there is a route to linked data that allows a more flexible interaction with existing data and applications as well as ‘sucking’ in this data into the SemWeb.  Can the vocabularies generated for SemWeb be used as meta information for other forms of information and can  query/access protocols be designed that leverage this, but include broader range of data types.