at time of publication: hci@hud, The HCI Research Centre, University of Huddersfield.
Position Paper for CHI 96 Basic Research Symposium (April 13-14, 1996, Vancouver, BC)
Timing is critical in our day to day life and also in the effective design of user interfaces. But what is bad timing? Some time scales are clearly due to our psycho-motor abilities (e.g. reaction times limited by nerve impulses and muscles), others to external stimuli. However, time scales of the order of fractions of a second to a second are common in music and speech and seem natural during interaction, but have a less obvious cause. Furthermore, rhythms seem easier to deal with than occasional delays. Are these timescales culturally determined, or something deeper? One approach is to ask why our cognitive abilities are the way they are - we were not designed (nor did we evolve) to be computer users. Perhaps we can understand these phenomena better if we decide to build computers for cavemen.
Keywords: time, rhythm, delays, evolutionary psychology
Time is important
Time and the user interface
How long is too long?
Timing is important in our everyday life: for communication with other people and for the things we do in the world. If a comedian pauses a moment too long or delivers the punch-line a moment too soon, the joke is spoilt. In transatlantic video-conferencing, delays can be of the order of a second. The gaps we leave in speech to tell the other person it is 'their turn' are only a few hundred milliseconds long, so by the time the other person has heard the gap and begun to join in, the first person has started to talk again. Television interviewers are in fact very skilled at this: they ask a question followed by a second or two of 'postamble' which fills the gap before the reply arrives. Think of most sports: if a football is kicked a moment too late or a baseball bat is not swung at exactly the right moment, the game will be lost. However, more prosaic physical activities are also highly skilled: driving a car, typing, even walking down the street all demand that we do the right things in the right order at the right time. The slightest error in timing can lead to a fall, a mistype or a crash.
Figure 1. timing matters
Time is also critical in the user interface. I have argued this strongly before [Dix 1987; Dix 1994] and it was also the subject of a workshop at Glasgow last year [Johnson et al. 1996] as well as discussion groups at other recent workshops and conferences. This was a recognised problem in the days of command line interfaces, where delays ranging from several seconds to minutes were commonplace [Shneiderman 1982], but for many years the problem was largely neglected - researchers and practitioners alike seemed to believe that machines would eventually be 'fast enough'. This is what I have called the myth of the infinitely fast machine. Recently, the heavy use of the world-wide web has made people aware again of the effects of delays. However, perhaps more important than the obvious delays on the web are the much shorter delays which punctuate the use of ordinary 'direct manipulation' user interfaces and which shatter the illusion of a virtual world.
So if time matters, how long a delay is too long? In previous work I have argued that what matters is not so much the absolute timing, but the match between the pace of interaction and the pace of the task which we are performing [Dix 1992], be this 100 milliseconds, 5 seconds or a week. To answer the question, we need to know about the sort of jobs we are doing. In some cases these are driven by the external world. Imagine you are driving down a major road. From time to time you look in your mirror or at the instrument panel, so clearly you do not need to be looking at the road all the time. In fact, you could periodically close your eyes for a fraction of a second and not risk an accident. So, you could probably (this is not a suggestion) drive down the road with your eyes closed, only opening them say every second to make a correction. However, if the period between glances at the road became too long you would eventually crash. How long this period is would depend on the road (sharp bends, other cars), on the car (wheel tracking, steering) and on how fast you were driving! In addition, there are certainly some physiological limits: if you needed millisecond responses (driving a Ferrari in a supermarket), you couldn't cope.
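The glance-interval argument can be caricatured in a few lines of code. This is a deliberately crude sketch, not a driving model: the `drift_rate` (how fast an uncorrected car wanders sideways) and the lane margin are invented, illustrative numbers standing in for the road, the car and the speed discussed above.

```python
def worst_drift(glance_interval_s, drift_rate_m_per_s=0.25):
    """Worst-case lateral drift (metres) if the car wanders sideways at a
    constant rate and every glance at the road re-centres it perfectly.
    drift_rate_m_per_s is an invented, illustrative figure."""
    return drift_rate_m_per_s * glance_interval_s

def longest_safe_glance_gap(lane_margin_m, drift_rate_m_per_s=0.25):
    """Longest eyes-off-road period before drift exceeds the lane margin."""
    return lane_margin_m / drift_rate_m_per_s

# With 1 m of spare lane width, a glance every 4 s would (in this toy
# model) just keep the car in lane; a sharper road or a faster car means
# a larger drift rate and hence a shorter safe period.
print(worst_drift(2.0))              # 0.5 m of drift after 2 s
print(longest_safe_glance_gap(1.0))  # 4.0 s between glances
```

The point of the sketch is only that the acceptable pace of glancing is set by the task (road, car, speed), not by any fixed property of the driver.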
Using a computer system is slightly different from driving a Ferrari. The problem is usually not so much that the computer is too fast for you, but more the often erratic delays which break the flow of your work. The mismatch is between the speeds at which the computer works and the paces of activity that seem natural for you. So, what exactly is a natural timescale?
Some timescales of human activity are easy to see. The timescales for hand-eye coordination tasks (such as mouse positioning) are of the order of a few hundred milliseconds. This is determined by the nature of our nervous and motor systems, as expressed, for example, in the model human processor [Card et al. 1983]. If feedback from a mouse exceeds this sort of time it becomes difficult or impossible to perform close control tasks such as freehand drawing or positioning. Reactions to sudden events, such as catching a glass as it falls from a table, are governed by similar physiological limitations and timescales. At a longer timescale we have our short-term memory, which decays over a few seconds without constant rehearsal, and conversation, where turns often take just a few seconds (growing to a minute or more on the telephone). It is perhaps the fact that we are accustomed to such timescales which led to the recorded tolerance of 5 seconds for response times in command line interfaces [Shneiderman 1982]. Looking over an even greater timescale we have daily events, for which we have an internal clock running at an approximately 24 hour circadian rhythm, but also other regular events which punctuate our days (mealtimes, working hours) and lives (weekends, birthdays, conferences (!)).
One of the key results from early work on delays in command line interfaces is that regularity is often more important than the absolute length of delays. If people can predict how long they are likely to wait they are far happier. This is one of the principal reasons that people find the web annoying. In addition, it appears that rhythms are easier to deal with than occasional short delays. For example, all Mac users occasionally open a file by accident when they are really trying to change its name (fig. 2). Renaming requires them to click, pause just under a second (predictably), and then click again to edit the name. It is rather like those automatic doors in hotels and airports which don't open quite fast enough as you walk up to them! It seems (although there is insufficient empirical evidence to support or refute it) that isolated delays are both difficult to recognise (they appear to be random behaviour) and difficult to proceduralise.
Figure 2. changing a file name on the Mac
In contrast, most people can keep time to regular rhythms of around a second without difficulty. So, are rhythm and music in some sense native to our cognition (or even our soul!)? In fact, we are not equally adept at all rhythms. Try beating a rhythm every 3 seconds. Use a stopwatch to get you started and then try to maintain it with your eyes shut. Rhythms we feel comfortable with in day to day life, from music to language, seem to range from fractions of a second to around a second. When we need longer rhythms, we count faster ones to maintain time, for example, with sea shanties where one does a 'heave-ho' on every bar, not every beat.
This is not just a matter of expertise. Concert conductors count time in beats per minute, so 60 beats per minute is one beat per second. It is known that tempos slower than about 40-45 beats per minute require a faster sub-rhythm to maintain, and an interval of around 1800 milliseconds is said to be a fundamental limit (see Stephen Brewster's work on using audio stimuli to maintain longer rhythms).
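The arithmetic linking tempo to beat interval is worth making explicit, since it connects the musical figures to the timescales above:

```python
def beat_interval_s(bpm):
    """Seconds between beats at a given metronome tempo."""
    return 60.0 / bpm

# 60 bpm is one beat per second; at 40 bpm the gap stretches to 1.5 s,
# heading towards the ~1.8 s interval cited as a fundamental limit.
print(beat_interval_s(60))  # 1.0
print(beat_interval_s(40))  # 1.5
```

So the 40-45 bpm boundary corresponds to beat intervals of roughly 1.3-1.5 seconds: just where unaided rhythm-keeping starts to fail.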
Why a timescale of around a second? Is it cultural - learnt from lullabies as babies? Does it start earlier, listening to our mother's heartbeat in the womb? Or is it the result of some neurological or physiological factor? Of course, we can never finally answer such questions. The lower limits to timing are certainly determined by our neuro-physical make-up, as the model human processor shows - even if we could keep faster rhythms in our heads, we certainly couldn't beat them out! The upper limit is more interesting.
One solution is to look beyond our actual cognitive and physical abilities and ask why we have them in the first place. Depending on one's views on prehistory and the 'descent of man', the age of Homo sapiens is measured in tens or hundreds of thousands of years. For most of this time, most people have been hunter-gatherers. Computer use has come late on the scene. So, we should expect our cognitive abilities to be tuned to being hunter-gatherers rather than computer users.
This is an argument that is being used to understand human social behaviour by evolutionary psychologists such as Leda Cosmides. I want to use it to reason about lower-level behaviour.
If we then consider the different timescales, they begin to make sense. As hunter-gatherers we certainly ought to be able to manipulate things (picking fruit) and have emergency reactions (avoiding that nasty sabre-toothed tiger that is jumping out at us), but what about occasional delays? It is hard to imagine any situation in which a hunter-gatherer should do something, pause a second, and then continue (at least I haven't thought of one). This leaves rhythm.
One of the things you do have to do as a hunter-gatherer is walk and run. In fact, the need for timing during running is slightly complex: some aspects are reactive - we start to fall forward, move a foot to stop ourselves, and so on. This is a lesson robotics learnt the hard way: early attempts at walking robots tried to drive them using sophisticated physical models ... the robots fell over! However, although many aspects of walking and running are reactive, we do need our legs and feet to be in the right place at the right time, and our brains must be able to drive them at the right rate - periods of about a second (an ambling walk) down to a fraction of a second (a 100 metre sprint in 10 seconds works out at around four to five paces per second).
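The sprint arithmetic can be checked in a couple of lines. The stride length here is an assumption (roughly 2 metres per step for a sprinter), chosen only to illustrate the order of magnitude:

```python
def paces_per_second(distance_m, time_s, stride_m):
    """Step rate implied by an average speed and an assumed stride length."""
    return (distance_m / time_s) / stride_m

# 100 m in 10 s is 10 m/s; with an assumed ~2 m sprint stride that is
# about 5 steps per second, i.e. about a fifth of a second per step.
rate = paces_per_second(100, 10, 2.0)
print(rate)        # 5.0 steps per second
print(1.0 / rate)  # 0.2 seconds per step
```

Whatever stride length one assumes within reason, the period per step comes out at a fraction of a second, which is the point being made.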
Finally, in this puzzle: why do we run and walk at these rates? To avoid a circular argument we need some cause external to the psycho-motor system. One answer lies in our physical make-up. When running, our legs are compound pendula, that is, they have two (principal) joints, at the knee and at the hip. When locked straight (walking) they are simple pendula. It so happens that the compound pendulum has a faster natural frequency, which is why it is so hard to walk fast but easier to run at the same speed. Furthermore, it is very difficult to drive these pendula at frequencies very different from their natural frequency, giving us a natural rhythm.
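A rough calculation shows that leg-as-pendulum physics does indeed land in the right range. This is a sketch under stated assumptions: the leg length (0.9 m) is an assumed figure, the straight leg is modelled as a simple pendulum, and the jointed leg is crudely stood in for by a uniform rod pivoted at the hip (a real compound pendulum with a bending knee is faster still, which only strengthens the point).

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def simple_pendulum_period_s(length_m):
    """Period of a point-mass pendulum: the straight, locked leg (walking)."""
    return 2 * math.pi * math.sqrt(length_m / G)

def rod_pendulum_period_s(length_m):
    """Period of a uniform rod pivoted at one end: a crude stand-in for
    the jointed leg, with a shorter period (faster natural swing)."""
    return 2 * math.pi * math.sqrt(2 * length_m / (3 * G))

leg = 0.9  # metres - an assumed leg length
# One step is half a swing (half a period).
print(simple_pendulum_period_s(leg) / 2)  # ~0.95 s per walking step
print(rod_pendulum_period_s(leg) / 2)     # ~0.78 s per step, rod model
```

Both figures fall in the fractions-of-a-second-to-a-second band, the same range as comfortable musical and speech rhythms.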
Looking back at music, this suggests that rather than dancing to the rhythms of music, we in fact appreciate music at the rhythms we can walk, run and dance to: we make music to dance, not dance to music. Anthropologically this also makes sense, as primitive music is built around strong rhythms and dance. And all of this is in the end driven by the dimensions of our bodies - the tempo of Beethoven is ultimately derived from the length of our legs.
Figure 3. feel the rhythm in your compound pendula
This sort of reasoning is not limited to timing phenomena. In general, if a computer system demands cognitive or motor facilities that a hunter-gatherer would not need to possess, then users are likely to have problems. They may succeed, just as people learn to perform complex sporting activities, but not easily. There may also be 'spread' abilities which are tuned to certain activities, but only weakly so (the problem solving we do with computers is hardly 'natural', but it is a simple extension of more basic reasoning abilities).
However, the general lesson to take away is quite simple: build computers for cavemen.
I have been piecing together this story over a year or so and have talked to various people who have contributed pieces to the puzzle (but who do not all agree with my conclusions!). In particular, thanks to Steve Brewster, Steve Draper, Andrew Monk and members of the music and computing SIG at Huddersfield.