Neural networks and human-computer interaction

Alan Dix and Janet Finlay

Entry in Handbook of Neural Computation, Eds. E. Fiesler and R. Beale. CRC Press. pp. G7.1:1-G7.1:7.
(at time of publication Institute of Physics Publishing and Oxford University Press)


Abstract: There has been much interest over several years in the use of neural networks within Human-Computer Interaction. However, this promise has led to surprisingly few published results. This article reviews those applications which have been addressed by neural networks or similar techniques. It also describes the use of the ADAM neural network for task recognition from traces of user interaction with a bibliographic database. This achieved high accuracy rates in training and in some on-line use. However, there were significant problems with its use. These problems are of interest not just for this system, but for any system which attempts to analyse trace data. The two main problems were due to the continuous sequential data and the presence of literal input (personal names, file names, dates etc.). Those systems which have achieved success in this area have not used neural techniques, but instead more traditional (although often ad hoc) methods. However, it is expected that recurrent networks may be suitable, but probably only within a hybrid approach.

Authors' draft text, for final text see the publisher's website

1. Context

The use of neural networks in Human–Computer Interaction (HCI) is largely pragmatic. They are used if they do their job well. The applications to which they are suited are also tackled by other statistical and machine learning techniques. It would be nice to report that the choice between these techniques is based on sound principles, but in fact the choice is usually based on familiarity with a particular technique. So, when considering those applications within HCI it is better to consider neural networks under the wider banner of pattern recognition techniques.

There has been considerable interest in the application of neural networks and pattern recognition within HCI. There have now been several well-attended workshops dedicated to the theme, the results of two of which have been collected in a book (Beale and Finlay 1992). However, despite the apparent interest there are relatively few published articles on actual neural network applications (although many on more traditional AI techniques). This may be because few researchers have skills in both areas and thus do not achieve their desired results.

Applications in HCI

In common with other domains, applications of neural networks in HCI can be divided into those which require only a behavioural or black-box method and those which care about the manner in which the solution is represented (and perhaps even derived). Also, HCI applications differ in the extent to which the network must mimic human behaviour – the network either satisfies a purely computational role or else must be to some extent anthropomorphic.

First consider purely computational uses, that is where there is no requirement that the behaviour is in any way human. Some will be pure black-box applications. One example of this is the use of real-time ECG data by British Aerospace to detect whether pilots are becoming drowsy. This is basically a matter of signal processing (see F.1.8). Other applications include speech, handwriting and gesture recognition (see F.2.1). Of special note is the use of gesture recognition among the disabled. Normally recognition systems have to be very accurate to be acceptable. However, where normal verbal communication is very difficult, even relatively low recognition rates may be acceptable.

In other cases the system must give some explanation of its behaviour – the traditional problem of expert systems. An example of this is Query-by-Browsing (QbB), an intelligent database front-end developed by one of the authors (Dix and Patrick 1994). From examples of required records, the system generates a database query. Although the process of reasoning does not have to resemble that of a human expert, it is important that the query is in a form comprehensible to the user so that it can be verified. For this reason the present version of QbB uses decision tree induction rather than a neural network to perform pattern matching. Similar uses include the vetting of credit and job applications. In both cases explanations may be required for both legal and ethical reasons (Dix 1992).

Now consider the anthropomorphic uses. These include various forms of task analysis and recognition (described below) and various forms of simulation where the network takes the place of the user in the testing of software. An example of the latter is in the evaluation of the readability of computer and paper form layouts. In this case human-like behaviour is sufficient: so long as the system gives similar responses to humans (especially if it can pinpoint the problem parts of the layout) it needs no further explanation. However, other researchers require that the network models more faithfully the process of human reasoning. For example, McGrew (1992) uses the interconnection weights of a PDP network to generate a task analysis graph. Also Booth (1992) models the way misunderstandings give rise to errors in human-computer interaction. An important part of this analysis is an understanding of the way different areas of knowledge are used during (incorrect) reasoning.

Trace analysis and task recognition

An important class of HCI applications is those based on trace analysis: that is, where a record of the user's interaction with a system is analysed to recognise or uncover patterns. The data for this process may be collected automatically, often by keystroke or event logs, or may be generated as the result of observation. This raw data can be recorded for later analysis or used on-line to guide the system during interaction. The off-line data can be used to aid task analysis. Task analysis involves the identification of patterns of behaviour used to accomplish particular goals.

Self-organising networks can be used to find repeated patterns of behaviour which can then be examined by the human analyst as possible task sequences. A particular task may often be accomplished by several sequences of actions and so the use of a network does not replace the human analyst. However, hand analysis is very tedious as the logs are often very long and repetitive and so this is an application where the automatic tools truly augment human skills.

On-line data can be used in various ways:

  • To identify a particular user (Stacey et al. 1992). This may be used to recall user specific preferences or for security purposes.
  • To classify the user, for example as novice/expert (Finlay 1990), in order to adapt the interaction to suit the user's knowledge and ability.
  • To learn repeated sequences of actions so that they can be offered as potential ‘macros’ (Hassell and Harrison 1994; Crow and Smith 1992) or as a predictive accelerator (Cypher 1991; Schlimmer and Hermens 1993).
  • To recognise known task sequences (which may themselves be the result of human or automatic task analysis). Uses of this include driving a user model during computer-based learning and offering context sensitive help.

The system we will describe in the rest of this article addresses the last of these: automatic task identification.

Whose error?

Throughout this article we talk about various user errors, but, in most software, such errors are inevitable because of the design of the system. Hence, the error is most often not so much the user’s but the designer’s. However, to constantly use phrases and language to emphasise this important point would detract from the rest of the description. Hence whenever we talk of the user’s error please bear in mind that this is a gross simplification.

2. System description

Problem domain

We now describe a system designed to recognise tasks in a menu-driven bibliographic database program called REF. More detailed descriptions of this work can be found elsewhere (Finlay 1990; Finlay and Beale 1992; Finlay and Harrison 1990). The program supported a fixed number of tasks and was therefore a very constrained environment in which to examine the issues of task recognition. However, it was far from a trivial domain. The sequence of user commands to accomplish a task varied in length from 3 to 16, and so the neural network had to cater for time series data with variable-length patterns. The trace was complicated by the fact that some user actions involved typing literal inputs: names, titles, dates etc. For the purposes of event logging such literals were reduced to a single user action. However, this was based on the system's idea of whether the user was entering literal input rather than menu commands. Of course, if the user and system got out of sync – a major incident – the logging did not accurately reflect the user's understanding of the interaction.

System overview

Users interacted with the bibliographic database on an IBM compatible PC. The event trace was transmitted along a serial link to a SUN workstation which performed the network calculations. In order to deal with the time series data, the trace was windowed on 2 or 6 characters (although 2 sounds small, many tasks were easily identified by their initial two events). The windowed data was then n-tupled and passed through an ADAM associative array. The output was thresholded to give a task code and associated confidence. Finally this task code was displayed on the experimenter's terminal. For example, in figure G7.1.1, the input trace “SsM#eM” is passed through the ADAM array giving an output of <8,5,6,2,8,0,6,2>; this is thresholded at a level of 6 to yield <1,0,1,0,1,0,1,0>. Finally this binary pattern is recognised as representing the ‘exit’ task, but as it is not an exact match it gets a confidence rating of 70%.

Figure G7.1.1 System components
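The thresholding stage of this worked example can be sketched as follows. The threshold level and output vector are taken from the example above; the confidence formula shown (fraction of bits agreeing with a stored class pattern) is an assumption, as the article does not give the exact calculation.

```python
# Sketch of the n-point thresholding step from the worked example above.
# The confidence measure here (fraction of agreeing bits) is an assumption;
# the article does not state the exact formula used.

def threshold(raw, level):
    """Binarise an ADAM output vector at a fixed threshold level."""
    return [1 if v >= level else 0 for v in raw]

def confidence(pattern, stored):
    """Hypothetical confidence: proportion of bits matching the stored code."""
    agree = sum(1 for p, s in zip(pattern, stored) if p == s)
    return agree / len(pattern)

raw = [8, 5, 6, 2, 8, 0, 6, 2]   # ADAM array output for the trace "SsM#eM"
binary = threshold(raw, 6)        # [1, 0, 1, 0, 1, 0, 1, 0]
```

An exact match against a stored class code would give a confidence of 1.0; partial matches, as in the example, give proportionally lower values.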

Input format

The event logs and the training set both included the user's actions and some system responses. The system responses were also coded as single characters. Since the selection of menu options in REF was case insensitive, all the user's commands were translated into lowercase; uppercase letters and digits were used to code the system responses.

An example trace is “MsSn?2”. This translates to: (M) system shows main menu, (s) user types ‘s’, (S) system shows select sub-menu, (n) user types ‘n’, (?) user types a name to find, (2) system responds that there are two or more matching records. Of course, in the user’s event log such sequences are appended one after the other. Also, whereas this trace represents correct activity, event logs may also include various forms of user error.
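For illustration, the example trace can be unpacked mechanically. The event descriptions below are taken from the worked example; the lookup table and `describe` helper are hypothetical, not part of the original system.

```python
# Hypothetical decoder for the example trace "MsSn?2".
# Lowercase = user menu commands; uppercase and digits = coded system
# responses; '?' stands for a literal input reduced to a single character.
EVENTS = {
    'M': "system shows main menu",
    's': "user types 's' (select)",
    'S': "system shows select sub-menu",
    'n': "user types 'n' (search by name)",
    '?': "user types a name to find",
    '2': "system reports two or more matching records",
}

def describe(trace):
    """Expand each coded character into a human-readable event description."""
    return [EVENTS.get(ch, "unknown event") for ch in trace]

for step in describe("MsSn?2"):
    print(step)
```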

Training set

The REF system has 11 main task types (e.g., selecting a set of references, altering existing references, exiting the program). A complete description of the system was produced in CSP (Hoare 1985) and this was used to enumerate all possible correct task sequences. This gave rise to 529 traces which were used for training (including the example above). As these traces varied in length they were padded to a fixed size. In subsequent experiments traces of some known common problems were added to the training set.
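Padding the variable-length traces to a fixed size might look like the following minimal sketch; the maximum length and the pad character are assumptions, as the article specifies neither.

```python
# Sketch of padding variable-length traces to a fixed size for training.
# MAX_LEN and the pad character are assumptions; the article gives neither.
MAX_LEN = 32
PAD = '.'

def pad_trace(trace, length=MAX_LEN, pad=PAD):
    """Right-pad a trace to a fixed length (truncating if over-long)."""
    return (trace + pad * length)[:length]

padded = pad_trace("MsSn?2")   # "MsSn?2" followed by pad characters
```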

Topology

The neural component in the system was the ADAM binary associative network (Austin 1987). This was chosen mainly because of its speed in learning and recall. It consists of an n-tupling stage followed by a form of Willshaw network. The output of the net was n-point thresholded, yielding a class code and a confidence measure.
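Under strong simplifying assumptions, the two stages can be sketched as follows: n-tupling is shown as one-hot coding of each n-bit tuple, and the Willshaw-style stage as a binary correlation matrix trained by OR-learning. This illustrates the general technique only, not the actual ADAM implementation.

```python
# Illustrative sketch (not the real ADAM): n-tupling followed by a
# Willshaw-style binary associative memory with OR (Hebbian) learning.

def n_tuple(bits, n):
    """Split a binary input into n-bit tuples and one-hot encode each tuple."""
    out = []
    for i in range(0, len(bits), n):
        value = int("".join(str(b) for b in bits[i:i + n]), 2)
        onehot = [0] * (2 ** n)
        onehot[value] = 1
        out.extend(onehot)
    return out

class BinaryAssociativeMemory:
    def __init__(self, in_size, out_size):
        self.w = [[0] * in_size for _ in range(out_size)]

    def train(self, x, y):
        """Set a weight wherever an active input meets an active output bit."""
        for j, yj in enumerate(y):
            if yj:
                for i, xi in enumerate(x):
                    if xi:
                        self.w[j][i] = 1

    def recall(self, x):
        """Raw sums per output line; thresholding to a class code is downstream."""
        return [sum(wj[i] for i in range(len(x)) if x[i]) for wj in self.w]
```

The sparse one-hot codes produced by n-tupling are what make the simple binary matrix an effective associative store.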

Pre-processing

As described earlier, the user's entry of literal input was reduced to a single character in the trace; also the sequence of characters was reduced to a finite length by using a moving window. The ADAM net requires binary input and two representations were used, one using the normal ASCII coding and one using one bit for each possible character. The former led to an input size of 2×8 or 6×8 bits depending on the window size. The latter was much bigger and was expected to give a better performance because of the sparser representation. However, there was no measurable difference in performance, possibly because the latter representation effectively duplicated the job of the n-tupling.
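The two representations can be illustrated as below; the alphabet used for the one-bit-per-character coding is an assumption, since the article does not list the exact character set.

```python
# Sketch of the two binary input codings for a window of trace characters.
# ALPHABET is an assumed character set; the article does not list the real one.
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789?#"

def ascii_encode(window):
    """8 bits per character: a window of 2 gives 2 x 8 = 16 inputs."""
    bits = []
    for ch in window:
        bits.extend(int(b) for b in format(ord(ch), "08b"))
    return bits

def one_per_char_encode(window):
    """One bit per possible character: much larger, but sparser."""
    bits = []
    for ch in window:
        v = [0] * len(ALPHABET)
        v[ALPHABET.index(ch)] = 1
        bits.extend(v)
    return bits
```

For a window size of 2, the ASCII coding gives 16 input bits while the one-per-character coding gives 2 × |alphabet| bits with exactly two bits set.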

3. Evaluation

Results

The system showed very high accuracy and generalisation on the training set. With 50% of the full set of tasks used for training, recognition on the complete set was perfect, and stayed high even when only 10% of the examples were used in training. However, when used on actual traces the picture was more complicated. When the small window size of 2 was used, the accuracy was around 99%. However, this dropped to 65% when the larger window of 6 was used. Apparently the problems with variable-length patterns were getting in the way with the larger window. The smaller window did not have this problem (it was smaller than the shortest task sequence). However, it is likely that it was simply recognising the user's initial main menu choice – acceptable when the user does it right, but not much help when the user and system are out of sync.

Comparison with traditional methods

The results using ADAM were compared with those obtained using a variant of ID3 (Quinlan 1979), a machine learning algorithm which builds decision trees by induction. When tested on the training set it too obtained 100% accuracy using 50% of the full set of tasks, and was highly accurate, though slightly worse than ADAM, on smaller training sets. When used on the actual logs of user interaction, its accuracy was substantially lower than ADAM's, although it followed the same pattern, attaining 85% accuracy with a window size of 2 but only 35% with a window size of 6.

General problems

This application highlights several problems which must be tackled if neural techniques are to be applied within the human-computer interface. Matching varying length subsequences within an event trace was clearly a substantial problem. There are various issues connected with this. The segmentation problem is well known in other fields, for example in separating words within continuous speech. Recall how the accuracy rate for recognising the training set (which was already segmented) was high, but that it fell off dramatically when faced with continuous user traces. For some kinds of task recognition it is possible to use information from the state of the computer dialogue – for example, when the REF system was at the main menu. However, as we saw, an important class of interface errors occur when this does not concur with the user’s notion of the task state.

Assuming we have segmented the trace, we still face problems due to the omission of required actions, or where the task sequence is split by irrelevant or erroneous actions. This is similar to the problem of spelling correction. Windowing techniques are very fragile in the face of changes in the relative position of parts of a sequence.

These problems are also faced by systems (as discussed earlier) which look for repeated sequences in the user input, and by other sequence-based problems. To our knowledge none of these uses neural techniques, instead relying on symbolic AI techniques (Cypher 1991), inductively built finite state machines (Schlimmer and Hermens 1993), hidden Markov models (Hanlon and Boyle 1992) or special-purpose algorithms (Crow and Smith 1992). However, it seems likely that recurrent neural networks could also be used for this purpose. Indeed, many representations of user interaction can be transformed into some form of finite state representation, which could be used to train recurrent networks such that the network's internal representation matches that of the analyst (Dix et al. 1992).

In fact, the sequences we dealt with were not as difficult as they could be. The REF system was an old-fashioned DOS application, with only a single thread of control. Consider a windowed application. These typically allow the user to perform several tasks in parallel, even within one window. From the recognition system's point of view, these appear rather like insertion errors. A typical insertion error is caused by the user accidentally typing an extra character which breaks the original pattern in two. In the case of multiple windows the user may begin to perform a task in one window, then swap to another and perform one or more tasks in the second window, and finally return to the first window to complete the initial task. Just like a mistyping, the original task is broken in two. However, in contrast to simple insertion errors caused by mistyping, the breaks in a windowed system may often be substantial. Neither is it sufficient to regard each window or application separately: part of the power of windowed systems is that tasks involve interaction with multiple applications.

The other major problem we discussed was literal inputs, such as the typing of author names to search for in the bibliography. These cause three problems. First, they act as variable length insertions in the trace. The method used in our system to code them works only when the user and system are in agreement. If the user starts to type a name when the system is expecting further menu choices, then the trace will record the full name. At just the moment when the user is confused and needs help, we find that the network is equally confused!

Second, the value of the literal inputs matters. Although the particular value is typically unimportant, it is often important whether the same name is used several times. For example, in an operating system consider the commands:

    copy onefile.txt another.txt
    delete onefile.txt

It is very important that the two commands in this sequence refer to the same file. The Query-by-Browsing system mentioned earlier uses variable matching techniques, but this is in the context of inductively learnt decision trees, where it is easier to add symbolic constraints (Dix and Patrick 1994).

Third, the insertions resulting from literal input often have a completely different syntactic form to that of the rest of the interaction. This can make them easy to detect, and so segment, although the exact start of the insertion may be less clear. However, this suggests that the pattern recogniser needs to have, either explicitly or implicitly, several modes. A similar problem arises when dealing with multiple applications in a windowed system. As with literal input it is no good relying on the system's interpretation of where input belongs – an important error is precisely when the user mistakenly inputs data to the wrong window.

4. Conclusions

For this application, the ADAM neural network performed better than an alternative machine learning algorithm. However, there were fundamental problems which arose which need to be tackled by anyone wishing to apply neural networks to on-line or off-line trace analysis. The nature of these suggests that a hybrid rather than pure neural approach will be required.

Further reading

R. Beale and J. Finlay, Eds. (1992). Neural Networks and Pattern Recognition in Human–Computer Interaction. Chichester, UK, Ellis-Horwood.
A collection of papers from two workshops held in the US and UK, covering both neural networks and related pattern recognition techniques.
J. Finlay and R. Beale (1993). Neural networks and pattern recognition in Human–Computer Interaction. SIGCHI Bulletin, 25(2): 25–35.
J. E. Finlay and A. J. Dix (1994). Pattern recognition in Human–Computer Interaction a viable approach? SIGCHI Bulletin, 26(10).
Reports on the CHI’91 and CHI’94 workshops of the same names.
There is also a moderated mailing list; interested parties should send a request to be added to prhci@zeus.hud.ac.uk

References

  • J. Austin (1987). ADAM: a distributed associative memory for scene analysis. Proc. First International Conference on Neural Networks, San Diego, IEEE.
  • R. Beale and J. Finlay, Eds. (1992). Neural Networks and Pattern Recognition in Human–Computer Interaction. Chichester, UK, Ellis-Horwood.
  • P. A. Booth (1992). Modelling misunderstandings using artificial neural networks. Neural Networks and Pattern Recognition in Human–Computer Interaction, Eds. R. Beale and J. Finlay. Chichester, UK, Ellis-Horwood. 301–319.
  • D. Crow and B. Smith (1992). DB_Habits: comparing minimal knowledge and knowledge-based approaches to pattern recognition in the domain of user–computer interactions. Neural Networks and Pattern Recognition in Human–Computer Interaction, Eds. R. Beale and J. Finlay. Chichester, UK, Ellis-Horwood. 39–63.
  • A. Cypher (1991). Eager: programming repetitive tasks by example. Proceedings of CHI’91, New Orleans, ACM Press.
  • A. Dix (1992). Human issues in the use of pattern recognition techniques. Neural Networks and Pattern Recognition in Human–Computer Interaction, Eds. R. Beale and J. Finlay. Chichester, UK, Ellis-Horwood. 429–451.
  • A. Dix, J. Finlay and R. Beale (1992). Analysis of user behaviour as time series. Proceedings of HCI’92: People and Computers VII, Cambridge University Press.
  • A. Dix and A. Patrick (1994). Query By Browsing. Proceedings of IDS’94: The 2nd International Workshop on User Interfaces to Databases, Lancaster, UK, Springer Verlag.
  • J. Finlay (1990). Modelling users by classification. D.Phil Thesis, University of York.
  • J. Finlay and R. Beale (1992). Pattern recognition and classification in dynamic and static user modelling. Neural Networks and Pattern Recognition in Human–Computer Interaction, Eds. R. Beale and J. Finlay. Chichester, UK, Ellis-Horwood. 65–89.
  • J. E. Finlay and M. D. Harrison (1990). Pattern recognition and interaction models. Human–Computer Interaction — INTERACT’90, North-Holland.
  • S. J. Hanlon and R. D. Boyle (1992). Syntactic knowledge in word level text recognition. Neural Networks and Pattern Recognition in Human–Computer Interaction, Eds. R. Beale and J. Finlay. Chichester, UK, Ellis-Horwood. 173–193.
  • J. Hassell and M. Harrison (1994). Generalisation and the adaptive interface. Proceedings of HCI’94: People and Computers IX, Glasgow, Cambridge University Press.
  • C. A. R. Hoare (1985). Communicating Sequential Processes. Prentice-Hall International.
  • J. K. McGrew (1992). Task analysis, neural nets, and very rapid prototyping. Neural Networks and Pattern Recognition in Human–Computer Interaction, Eds. R. Beale and J. Finlay. Chichester, UK, Ellis-Horwood. 91–102.
  • J. R. Quinlan (1979). Discovering rules by induction from large collections of examples. Expert Systems in the Micro-Electronic Age, Ed. D. Michie. Edinburgh University Press. 168–201.
  • J. C. Schlimmer and L. A. Hermens (1993). Software agents: completing patterns and constructing user interfaces. Journal of Artificial Intelligence Research, 1: 61–89.
  • D. Stacey, D. Calvert and T. Carey (1992). Artificial neural networks for analysing user interactions. Neural Networks and Pattern Recognition in Human–Computer Interaction, Eds. R. Beale and J. Finlay. Chichester, UK, Ellis-Horwood. 103–113.

 

