why software need never hang

Over 20 years ago I wrote “The Myth of the Infinitely Fast Machine”, about the way software developers effectively assume that everything on the machine side of human interaction happens instantly. Often interaction is programmed in a turn-taking style:

  1. wait for user action
  2. process the event
  3. display changes
  4. back to step 1

This assumption of instant (or at least infinitely fast) response at step 2 often ignores network delays, disk IO or heavy computation. This tends to work fine on a high-spec development or test machine, with a fast network and clean install of all system software … but when the software hits a real machine, a few years old, untidy system, slow network … things fall to pieces.

So 20 years later (as I described in my post last week) I am sitting watching the spinning rainbow ball as Word struggles to save a document (over an hour now, I think I will need to kill it). To be fair I think the root ‘cause’ of the problem … or at least one problem … may be the printer, as the Canon printer driver has never worked properly on an Intel Mac (maybe new driver when I upgrade to Leopard?) and perhaps some change in the rest of the system (maybe the Office install) has tipped it over into not working at all.

As far as I can tell Word then decides to ask the printer things in order to set the margins properly when saving the document, and then gets stuck. I found a post on a Microsoft forum about a different print-related problem and the ‘helpful’ tech support from MS simply said “not our fault, re-install everything”.

So to recap:

  • user asks Word to save – probably the most critical operation in the system, or the system auto-saves, again to ensure safety against crashes, so really critical
  • Word decides it needs information from the printer (although it has been displaying the page to the user using some existing information on page properties)
  • Word asks for info from the printer driver of the currently selected printer
  • if the printer doesn’t respond Word hangs and blocks all user interaction

However, the printer driver may be third party, may be connecting to a shared printer hanging off a different network, or, in the case of a laptop, to a printer on a network that is currently disconnected … and any resulting delay is not the fault of the developers of Word??!

The annoying thing is that such ‘hanging’ delays need never happen.

Basically there are four main causes for delays:

  1. ordinary computation takes a long time due to it being too complex for the available hardware
  2. unbounded internal computation, for example iterative algorithms
  3. waiting for external resources (disk, network, etc.)
  4. bugs that lead to the system going crazy (effectively case 2 by accident!)

Type 1 will surface during testing and may require re-design of the interaction, but is simply ‘slow’ rather than ‘hanging’. Typically it leads to things gradually getting slower as the document or data gets larger or more complicated. This requires standard profiling and optimisation.

Type 4 is hard to deal with – bugs do happen. However, the majority of the problems I’m experiencing in Word at the moment are not a failure of this kind as Word does, most of the time, eventually complete without crashing.

Types 2 and 3, especially the latter, should be detected and then dealt with in the design of the user interface.

Some real-time programming languages have ways of automatically working out how long code will take to run in order to be able to assert “this will respond within a 10 ms interrupt cycle”. However, this is hard, even for relatively simple embedded systems; so not practical for complex operating systems or user interfaces.

However, a simpler version of the above is possible. Certain system functions invoke external resources such as the disk, or the network. If any function or method in your own application invokes one of these system functions, then it could potentially hang – and should be documented to say so or return some sort of ‘promise’: “I’ve started to do X, please check back later to see if it is ready”. Of course the methods that call these themselves need to be documented as potentially hanging … and so forth.
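As a minimal sketch of the ‘promise’ idea, here is how it might look in Python using the standard concurrent.futures module. The function names, the margin values and the simulated delay are all hypothetical stand-ins for a real printer-driver call; the point is only that the caller gets something back at once and checks later.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

# A small pool of worker threads for potentially hanging calls.
_executor = ThreadPoolExecutor(max_workers=2)


def query_printer_margins(delay_seconds: float = 0.2) -> dict:
    """POTENTIALLY HANGING: stands in for a call to an external printer driver."""
    time.sleep(delay_seconds)  # simulated external delay (hypothetical)
    return {"top": 25, "left": 20}  # made-up values for illustration


def start_margin_query(delay_seconds: float = 0.2) -> Future:
    """Returns a promise immediately; the slow call runs on a worker thread."""
    return _executor.submit(query_printer_margins, delay_seconds)


# "I've started to do X, please check back later to see if it is ready."
promise = start_margin_query()
polls = 0
while not promise.done():
    polls += 1        # a real UI would handle user events here
    time.sleep(0.01)
margins = promise.result()
print(f"margins {margins} arrived after {polls} polls")
```

Note that the documentation burden does not go away: start_margin_query is safe to call from the UI, but promise.result() would block if called before the promise is done, so it still needs to be marked as potentially hanging.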

If the response to any form of user interaction ends up calling a potentially hanging function, then it is in danger of having a delay of type 3 above. However, so long as this is known, it can be dealt with at the user interface level by spawning a thread to do the work so that some form of progress indicator or at least “Cancel” button can be active – it should never ‘hang’.
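A rough sketch of the worker thread plus “Cancel” pattern, again in Python. Everything here is an assumption for illustration: save_document is a fake task that sleeps in small chunks, and the cancel_after parameter simply simulates the user pressing “Cancel” after a given time.

```python
import threading
import time


def save_document(cancel: threading.Event, steps: int = 20) -> str:
    """Hypothetical slow save, done in chunks so it can notice a cancel request."""
    for _ in range(steps):
        if cancel.is_set():
            return "cancelled"
        time.sleep(0.01)  # simulated slow chunk of work
    return "saved"


def run_with_cancel(task, cancel: threading.Event,
                    poll_interval: float = 0.02, cancel_after=None) -> str:
    """Run task(cancel) on a worker thread while the calling loop stays live."""
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(status=task(cancel)), daemon=True)
    worker.start()
    started = time.monotonic()
    while worker.is_alive():
        # A real UI would redraw a progress indicator and handle events here.
        if cancel_after is not None and time.monotonic() - started >= cancel_after:
            cancel.set()  # simulated press of the "Cancel" button
        worker.join(poll_interval)
    return result["status"]


print(run_with_cancel(save_document, threading.Event()))
print(run_with_cancel(save_document, threading.Event(), cancel_after=0.05))
```

This is cooperative cancellation, so it only helps when the work checks the flag between chunks. A call that is truly stuck inside an external driver cannot be interrupted this way; the best the UI can do then is stay responsive, stop waiting and abandon the worker.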

This marking of functions as potentially ‘hanging’ could be done by programmers themselves, but equally can be automated as a form of static analysis, simply starting with a known set of hanging system functions and recursively ‘colouring’ functions that call them. This kind of automated checking should be standard practice in any large software project.
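The recursive ‘colouring’ can be sketched in a few lines, assuming the call graph has already been extracted by some other tool. The graph and function names below are invented for illustration.

```python
from typing import Dict, Set


def colour_hanging(call_graph: Dict[str, Set[str]],
                   known_hanging: Set[str]) -> Set[str]:
    """Return every function that can reach a known hanging function."""
    # Reverse the edges: for each callee, who calls it?
    callers: Dict[str, Set[str]] = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    coloured = set(known_hanging)
    worklist = list(known_hanging)
    while worklist:
        fn = worklist.pop()
        for caller in callers.get(fn, ()):
            if caller not in coloured:  # colour each function only once
                coloured.add(caller)
                worklist.append(caller)
    return coloured


# Hypothetical call graph: caller -> set of functions it calls.
call_graph = {
    "on_save_clicked": {"save_document"},
    "save_document": {"compute_layout", "write_file"},
    "compute_layout": {"query_printer"},
    "on_bold_clicked": {"apply_style"},
    "apply_style": set(),
}
hanging = colour_hanging(call_graph, {"query_printer", "write_file"})
print(sorted(hanging))
```

Here on_save_clicked ends up coloured, because it reaches the printer query via compute_layout, while on_bold_clicked does not. Any UI event handler that comes out coloured is one that needs a worker thread and a “Cancel” button.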

The type 2 hanging is a little more complicated. The Ada programming language has a ‘safe’ subset that only allows loops where the bounds are fixed at compile time. This is probably too restrictive for complex software, but certainly any loop with unknown limits could be flagged. If, as part of a code walkthrough or similar practice, it is decided that the loop is ‘safe’, it can be annotated as such; otherwise, just like the case of system calls, the system can propagate the fact that certain functions may have unbounded computation and the UI can then be adjusted accordingly.
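As a very crude illustration of flagging loops with unknown limits, the sketch below uses Python’s standard ast module. The rule it applies is my own simplification, not a real tool’s: any while loop, and any for loop that is not over range(...) with literal arguments, counts as ‘unknown’, and the reviewed_safe set plays the role of the ‘safe’ annotation from a code walkthrough.

```python
import ast


def is_constant_range(node: ast.expr) -> bool:
    """True for range(...) called with literal arguments only."""
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "range"
            and bool(node.args)
            and not node.keywords
            and all(isinstance(arg, ast.Constant) for arg in node.args))


def functions_with_unbounded_loops(source: str, reviewed_safe=frozenset()) -> set:
    """Names of functions containing a loop whose bound is not a literal."""
    flagged = set()
    for fn in ast.walk(ast.parse(source)):
        if not isinstance(fn, ast.FunctionDef) or fn.name in reviewed_safe:
            continue
        for node in ast.walk(fn):
            if isinstance(node, ast.While):
                flagged.add(fn.name)
            elif isinstance(node, ast.For) and not is_constant_range(node.iter):
                flagged.add(fn.name)
    return flagged


# Hypothetical code to analyse.
SAMPLE = '''
def fixed_passes(data):
    for _ in range(3):
        data = sorted(data)
    return data

def converge(x):
    while abs(x * x - 2) > 1e-9:
        x = (x + 2 / x) / 2
    return x

def scan(items):
    for item in items:
        pass
'''

print(sorted(functions_with_unbounded_loops(SAMPLE)))
print(sorted(functions_with_unbounded_loops(SAMPLE, reviewed_safe={"scan"})))
```

The flagged names could then be fed in as the starting set for the same kind of propagation used for hanging system calls, so that anything calling them is treated as potentially unbounded too.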

For small bespoke software development I can be forgiving, but for large vendors like Microsoft, Apple or Adobe, there is no excuse for this form of culpable failure.

… but I have a bad feeling that in 20 years’ time I may be writing again …

[[ News flash – 1.5 hours later Word has finished saving the document! … 14 pages obviously hard work. … but then it has hung again 🙁 ]]

pain, tears and office 2008

Some weeks ago I upgraded Microsoft Office to Office 2008 (yes it does still have menus on the Mac!), and life since has been constant trouble.

OK first there are ‘minor’ niggles like it eating 1/2 my screen space in huge tool bars replicated at the top of every window, or every time I read in an Excel spreadsheet it telling me that old macros no longer work … actually I don’t use Excel macros, but if you do and have lots of spreadsheets that use them what then? … and don’t get me started on the fact that I can no longer cut and paste directly between Word and Dreamweaver.

… and then, just over 2 weeks ago, I was at the AVI conference and, as one does, writing the slides for the presentation the day before. I had produced all the diagrams for the presentation in Powerpoint and then copied them into Word, so thought it would be easy – start with the Powerpoint file with all the diagrams in it and add a few words around them – after all, pictures are always best. However, this was reckoning without Office 2008. The figures had been produced in PPT 2004, and when I opened them in Office 2008 half the images just disappeared. I tried opening them in the old version of Office, but it simply crashed every time I tried to update a file; I assume the Office 2008 install broke the old Office 2004 install in some way. In desperation I tried cutting and pasting the slides between PPT 2004 and PPT 2008, but that failed (I guess because Powerpoint thought it was pasting back into itself!). Eventually I managed to get the crucial images by cutting and pasting via a third program.

But the reason I am blogging now, rather than doing the pile of work that I need to do, is that Word has decided that about every 10 minutes it needs a 15 minute break and disappears into a little spinning rainbow – it does eventually come back, but only after several cups of tea.

To be fair most of the problems seem to be with compatibility mode … but surely backward compatibility is not so difficult … after all we have a lot of old files out here … or if they can’t code it properly, simply produce one-off converters rather than pretending to work when they don’t!

But the spinning disk has at last stopped … so back to another 10 minutes work before it halts again.

Tags and Tagging: from semiology to scatology

I’ve just been at a two-day workshop on “Tags and Tagging” organised by the “Branded Meeting Places” project.

Tags are of course becoming ubiquitous in the digital world: Flickr photos, del.icio.us bookmarks; at the digital/physical boundary: RFID and barcodes; and in the physical world: supermarket price stickers, luggage labels and images of Paddington Bear or wartime evacuees each with a brown paper label round their necks. Indeed we started off the day being given just such brown paper tags to design labels for ourselves.

Alan's tag

As well as being labels so we know each other, they were also used as digital identifiers using a mobile-phone-based image-recognition system, which has been used in a number of projects by the project team at Edinburgh (see some student projects here). We could photograph each other’s tags with our own phones, MMS the picture to a special phone number, then a few moments later an SMS message would arrive with the other person’s profile.

Being focused on a single topic, and even a single word, ‘tag’, soon everything begins to be seen through the lens of “tagging”, so that when we left the building and saw a traffic warden at work outside, instantly the thought came: “tagging the car”!

The workshop covered loads of ground and included the design and then construction of a real application – part of the project’s methodology of research through design. However, there are two things that I want to write about. The first is the way the workshop made me think about the ontology or maybe semiology of tags and tagging, and the second is a particular tag (or maybe label, notice?) … on a toilet door … yes the good old British scatological obsession.


when virtual becomes real

Just read Adam Greenfield’s blog entry “Reality bites”. He describes how a design he produced for a friend’s new restaurant became a solid metal sign within days. Despite knowing about recent rapid fabrication techniques, actually seeing these processes in action for his own design was still shocking.

I too am still amazed at the relative ease with which ideas can be turned into reality. In a presentation “As we may print” at the 2003 Interaction Design for Children conference, Michael Eisenberg described how he and his co-workers at the University of Colorado were using laser cutters to enable children to create their own 3D designs in card or even thin plywood. More recently at the National Centre for Product Design and Development Research in Cardiff, I saw 3D metal printers. I was aware of 3D printers working in various gels and foams, but did not realise it was possible to create parts in titanium and steel, simply printed from 3D CAD designs. Chasing one of Adam’s links I found instructions to make your own 3D printer on the MIT site … however, this constructs your designs in pasta paste not metal!

One of the arguments we are making about our FireFly technology is that it will change lighting from being a matter of engineering and electronics, to a digital medium where the focus moves from hardware to software. While FireFly allows more flexible 2D and 3D arrangements than other technologies we are aware of, it is certainly not alone in making this transformation in lighting. Last week I was talking to Art Lights London and they are planning some large installations using Barco’s LED lighting arrays. Soon anything that you can paint on your computer screen you will also be able to paint in light, from your own Christmas tree to London Bridge.

Although it sometimes seems that technology is simply fuelling war and environmental catastrophe, it is a joy to still glimpse these occasional moments of magic.