Judith Ramsay, Alessandro Barabesi and Jenny Preece
The Center for People and Systems Interaction
South Bank University
103 Borough Rd., London, England
Tel. + (0) 171 815 7414
E-mail: {ramsayja, barabea, preecej}@sbu.ac.uk
This position paper presents late-breaking findings about informal communication over desktop video-conferencing software. Evidence is presented in support of users coupling media, and in support of video providing some form of shared object perspective (previously known as the 'video as data' hypothesis [1]).
KEYWORDS: Computer-mediated communication (CMC), desktop video conferencing, informal, shared object, media-coupling.
Computer-mediated communication (CMC) can be textual, or it can rely on an audio-visual infrastructure of cameras, monitors, microphones and speakers in users' homes or offices. Either form allows communication with others who are geographically distributed. CMC tends to be less formal in nature than CSCW and can be used in work or non-work situations.
Of particular interest to us is the extent to which the media are coupled with one another, in interactional units, during computer-mediated communication. An example of this is the following:
User One initiates: "Can you see me?" (Typed)
User Two responds: "Yes" (Spoken)
(In fact, our desktop video conferencing system (CU-SeeMe) couples audio and video (audio/visual), but in this study we focus primarily on audio.)
In addition, we believe that focusing on shared objects and their contextual role might provide a fruitful way forward. This reflects Whittaker's [1] view that work in desktop video conferencing has neglected the importance of shared objects and their role in a shared context.
The aim of this research is to investigate: (i) the way that different media are used, and (ii) the requirement for shared objects and operational spaces in informal communication via desktop video conferencing.
The interactional cues that video transmits are likely to play a more important role among people who know each other and who are motivated to carry out the communication task [2]. We therefore investigated informal communication between individuals who know one another. To increase motivation and ecological validity, individuals selected their own issue for discussion.
CU-SeeMe is a desktop video conferencing system designed by Cornell University for use on the Internet or other IP networks. It runs on Macintosh and Windows platforms. Video transmission requires a camera and digitizer. It displays 4-bit grayscale video windows.
Ten pairs of subjects (five male-male, one female-female, and four male-female pairings) were video-taped interacting with one another using CU-SeeMe. The pairs knew one another, but all had little experience of using the software. The subject sample comprised students and professors. Each pair discussed a small problem of personal significance. One individual helped the other in reaching a solution to the problem, or in defining a plan of action. Examples of problems discussed included:
The subjects were seated in two separate rooms, each furnished with a PowerPC with a 14" monitor. A camera (Connectix QuickCam) and a microphone were mounted on each monitor. The internal speakers were used. The audio channel was 64 Kb/s full duplex. Video was transmitted over Ethernet at a maximum speed of 500 Kb/s. Each subject was provided with a self-view window, a remote-view window and a text window (known as the talk window) for sending text to the remote partner. The monitor output was connected to a video recorder through a computer-to-video scanner in order to videotape the interactions.
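The figures above give a rough sense of the video budget. The following is a back-of-envelope sketch only: the 4-bit depth and the 500 Kb/s ceiling come from the text, but the 160 x 120 window size is an assumption (the resolution used in the study is not stated), and "Kb/s" is read here as thousands of bits per second.

```python
# Back-of-envelope video budget. Values marked "assumption" are NOT from the paper.
BITS_PER_PIXEL = 4           # from the text: 4-bit grayscale video
VIDEO_BUDGET_BPS = 500_000   # from the text: 500 Kb/s maximum, read as 500,000 bit/s

ASSUMED_WIDTH = 160          # assumption: window width in pixels
ASSUMED_HEIGHT = 120         # assumption: window height in pixels

# Number of distinct gray levels representable at this bit depth.
gray_levels = 2 ** BITS_PER_PIXEL

# Size of one uncompressed frame under the assumed resolution.
bits_per_frame = ASSUMED_WIDTH * ASSUMED_HEIGHT * BITS_PER_PIXEL

# Frame rate the budget would sustain if every frame were sent uncompressed.
max_uncompressed_fps = VIDEO_BUDGET_BPS / bits_per_frame

print(f"gray levels: {gray_levels}")
print(f"bits per uncompressed frame: {bits_per_frame}")
print(f"uncompressed frames per second within budget: {max_uncompressed_fps:.2f}")
```

Under these assumptions the link would carry only a few uncompressed frames per second. Any compression applied by the software would raise that figure, so the result should be read as an illustration of the constraint rather than as a measured frame rate.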
The taped interactions were examined for instances of: (i) choice of media during interaction, and (ii) the need, if any, for shared objects and operational spaces.
Seven out of the ten pairs occasionally engaged in some form of media mixing or coupling. As the video link is omnipresent in the interaction, and usually accompanies audio channel usage, the media coupling we discuss here refers to audio/visual and textual links. Frequency counts revealed the following:
text (given): audio/visual (returned) - sixteen events;
audio/visual (given): text (returned) - nine events.
Eleven of the sixteen text-audio/visual pairings occurred when there was an audio problem, such as feedback being received, and the users were arranging via the text link how to handle the problem without using the audio channel itself. The remaining five instances were of the following nature: Person A types (text): Person B interrupts (audio/visual). This tended to occur in scenarios in which the person using text embarked upon a detailed or lengthy description, which the other person then interrupted.
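The frequency counts above can be reproduced with a simple tally. The sketch below is illustrative and is not the coding procedure used in the study: the `CouplingEvent` record and its `context` labels are hypothetical, and the event list is rebuilt from the reported totals rather than from the original tapes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CouplingEvent:
    """One media-coupling exchange (hypothetical coding record)."""
    given: str     # medium of the initiating turn: "text" or "audio/visual"
    returned: str  # medium of the responding turn
    context: str   # hypothetical coder label for the circumstances


# One record per reported instance, rebuilt from the totals in the text.
# The context of the audio/visual-to-text events is not reported, so it is
# labelled "unspecified" here.
events = (
    [CouplingEvent("text", "audio/visual", "audio problem")] * 11
    + [CouplingEvent("text", "audio/visual", "interruption")] * 5
    + [CouplingEvent("audio/visual", "text", "unspecified")] * 9
)

# Tally by direction of coupling (given medium, returned medium).
by_direction = Counter((e.given, e.returned) for e in events)

# Break down the text-initiated exchanges by context.
text_initiated = [e for e in events if e.given == "text"]
by_context = Counter(e.context for e in text_initiated)

# Proportion of text-initiated exchanges that arose from an audio problem.
audio_problem_share = by_context["audio problem"] / len(text_initiated)

for (given, returned), count in by_direction.items():
    print(f"{given} (given): {returned} (returned) - {count} events")
print(f"audio-problem share of text-initiated events: {audio_problem_share:.4f}")
```

Keeping the context label on each record, rather than storing only totals, would let the same tally be rerun per pair or per task type if the coded data were available.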
Not only did users couple different media types, they occasionally doubled up on, or overlapped, their media use, for example by talking whilst typing. This was true to a very large extent in three out of the ten pairs. All three of these sessions also showed a desire for some form of shared object, which suggests that the overlap may be linked to a need to annotate the ongoing interaction.
A frequency count was made of the number of times a shared operational space was desired, either through verbal comment or evident physical activity. Five out of the ten pairs spontaneously displayed behavior indicating a need for some type of "shared activity" space. A shared activity facility was desired during tasks of the following nature: those requiring a shared record facility, those that were clarificatory in nature, and those requiring a shared authoring space.
We observed that the audio/visual channel is used to override and interrupt a textual message from the other communication partner. This can be interpreted as a quick communication channel (audio) being used to draw the attention of the other individual using a slower communication channel (text).
Caldwell, Uang and Taha [3] reported contradictory findings in the literature, and the relation between task and media usage is still unclear. The need for a shared record of events is likely a function of the fact that audio and video interactions are dynamic and transitory, whereas a textual record, although it develops during the interaction, remains static enough on the screen to act as an external memory buffer. This may explain the uptake of the textual channel as a form of shared object. It is possibly indicative of a more general human propensity to share objects, especially non-textual ones such as abstract representations, pictures and diagrams.
This paper presents late-breaking work, which will lead to our characterizing the dimensions that are needed to make CMC systems support users well, particularly in terms of media, shared objects and spaces.
Our position is that even in informal communication users display a strong natural need for some form of shared external memory to record key decisions and points raised in their conversation. Shared authoring, editing and drawing/sketching tools are needed for creating documents such as letters or for explaining abstract concepts. The ability to show objects of discussion to a conversation partner also appears to be important.
In the next stage of our work we will replicate our study with more participants to see if these trends are generalizable. If so, then we will characterize the kinds of tasks for which such objects and spaces are required and develop prototypes which can be tested experimentally. We will also analyze the video in detail using play-back interview techniques [4] to elicit users' reasons for their media choices and need for sharing objects and spaces.
[1] Whittaker, S. (1995) Rethinking video as a technology for interpersonal communications: theory and design implications. International Journal of Human-Computer Studies, 42, 501-529.
[2] Isaacs, E. and Tang, J. (1994) What video can and cannot do for collaboration: a case study. Multimedia Systems, 2, 63-73.
[3] Caldwell, B. S., Uang, S-T. and Taha, L. H. (1995) Appropriateness of communications media use in organizations: situation requirements and media characteristics. Behavior and Information Technology, 4, 199-207.
[4] Monk, A., Wright, P., Haber, J. and Davenport, L. (1993) Improving your Human-Computer Interface: A Practical Technique. New York: Prentice-Hall.