An Explorative Analysis of User Evaluation Studies

Geoff Ellis
Lancaster University, UK
http://www.comp.lancs.ac.uk/computing/users/ellisg2/

Alan Dix
Lancaster University, UK
http://www.hcibook.com/alan/

Paper at BELIV06 "BEyond time and errors: novel evaLuation methods for Information Visualization" workshop at AVI'2006, 23rd May 2006, Venezia, ITALY.

Abstract

This paper presents an analysis of user studies from a review of papers describing new visualisation applications and uses these to highlight various issues related to the evaluation of visualisations. We first consider some of the reasons why the process of evaluating visualisations is so difficult. We then dissect the problem by discussing the importance of recognising the nature of experimental design, datasets and participants as well as the statistical analysis of results. Finally, we propose explorative evaluation as a method of discovering new things about visualisation techniques, which may give us a better understanding of the mechanisms of visualisations.

keywords: Explorative evaluation, Information visualisation, Evaluation, Case study

Full reference: G. Ellis and A. Dix (2006). An explorative analysis of user evaluation studies in information visualisation. In Proceedings of the 2006 Conference on Beyond Time and Errors: Novel Evaluation Methods for Information Visualization (Venice, Italy, May 23, 2006). BELIV '06. ACM Press, New York, NY, 1-7.
http://www.hcibook.com/alan/papers/beliv06-evaluation/

more: ACM DOI: http://doi.acm.org/10.1145/1168149.1168152
Download draft paper (PDF, 320K)
related work on visualisation at: http://www.hcibook.com/alan/topics/vis/

elsewhere: Henry Lieberman's rant "The Tyranny of Evaluation" and Shumin Zhai's eloquent reply "Evaluation is the worst form of HCI research except all those other forms that have been tried" reflect some of the same points as we do in this paper.

Key points

Evaluation in visualisation papers is infrequent, and apparently poor when it occurs.
This is because it is hard.
More important, though, is to understand when and why evaluation is needed, and thus to do it, interpret it and judge it better.

Empirical evaluation of generative artefacts is methodologically unsound

Generative artefacts are ones which exist to make other things. This includes methodologies, toolkits, design principles, guidelines and visualisation techniques or algorithms. It is only the software produced from these by a particular designer for a particular purpose that can in principle be evaluated. Even then the software is itself generative, since only an instance of its use, by a particular user for a particular task, can be evaluated.

However, they can be justified through reasoning, and evaluation can form part of this.

Figure 1. The two sides of validation: justification and evaluation

Forms of evaluation

Summative - is it right
Formative - can I make it better
Explorative - can I understand more

Summative and formative evaluation are good in design ... but it is usually explorative evaluation you need in research.

http://www.hcibook.com/alan/papers/beliv06-evaluation/

Alan Dix 22/4/2006