Reviewing Designs for a Synchronous-Asynchronous Group Editing Environment

This chapter was published as: V. C. Miles, J. C. McCarthy, A. J. Dix, M. D. Harrison and A. F. Monk (1993). Reviewing designs for a synchronous-asynchronous group editing environment. In Computer Supported Collaborative Writing, Ed. M. Sharples. Springer-Verlag. pp. 137-160.

V.C. Miles, J.C. McCarthy, A.J. Dix, M.D. Harrison, A.F. Monk

Abstract: This paper marks the first steps in the design of a group editing environment for synchronous and asynchronous collaborative document development. A model of cooperative work is presented and applied to the task of collaborative writing. This analysis assists in a review of literature on computer mediated communication and shared editor design, and in the formulation of design ideas for a group editing environment. The designs suggested can be seen as providing users with a range of communication 'channels'.

Keywords: Cooperative work, collaborative writing, computer mediated communication, shared editors, communication channels, interface design.

Introduction

The roots of computer supported cooperative work (CSCW) are in the observation that a great deal of our daily activity involves some form of collaboration [Olson 1989]. Much of the early research in CSCW considered ways of supporting a general concept, that of cooperative work. More recently CSCW research has begun to focus on specific applications. Computer support for collaborative writing, particularly in the form of a shared editing environment, emphasizes a more task focussed approach to CSCW applications.

Collaborative writing involves two or more people working together to produce a document. It involves phases of writing and phases of communicating [Newman 1992,Beck 1993]. It also involves periods of synchronous activity, where the group works together at the same time, and periods of asynchronous activity, where group members work at different times. The diverse range of activities involved and the different modes of interaction (synchronous and asynchronous) make collaborative writing a particularly interesting domain for CSCW support. The CSCW literature discusses a number of shared editor applications which aim to support collaborative writing. It is interesting to note that many of these systems will support either synchronous or asynchronous interaction, but not both.

Work at York on CSCW has echoed the general pattern of CSCW research mentioned earlier. We began by looking at computer mediated communication, building the Conferencer as a 'generic' conferencing environment [McCarthy 1990,McCarthy 1991]. This work was valuable in clarifying our thoughts on CSCW. A natural progression, we felt, would be to attempt to support a more focussed task than conferencing. We chose to examine collaborative writing within the shared editor application domain. By focusing on shared editing, we are taking a narrow definition of collaborative writing. We concentrate on that part of the collaborative writing process which is concerned with the actual creation of text: 'putting pen to paper'. Our proposed shared editing environment does not aim to provide specific support for activities such as brainstorming, ideas organising and decision making, although these are important parts of the overall writing process. However, as this paper will show we did not deem it sufficient to design a shared text editor alone, but also considered the design of accompanying conversation spaces.

This paper, then, marks the first steps in the design of our shared editor environment. We consider literature on collaborative writing, computer mediated communication and group editor applications to arrive at some design ideas for our shared editor environment.

We start, in the next section, by touching on some of the issues surrounding collaborative writing; this is done with reference to a conceptual model of cooperative work. This section sets up the broad requirements for our shared editor environment. Sections 3 and 4 deconstruct the model into its component parts. These sections include discussion of designs for supporting computer mediated communication and group editing. Sections 3 and 4 see a move from the conceptual model to ideas for design. Section 5 attempts to reconstruct the model in the light of these design ideas. The paper concludes with a discussion section.

Cooperative Work

This section looks at collaborative writing, using a conceptual model of cooperative work as a focus. This model is shown in figure 1.

Figure 1. Conceptual Model of Cooperative Work.

Direct Communication

Arrow (a) in the model of cooperative work refers to direct communication between participants. Direct communication implies that participants are aware of each other, and will address messages to their co-participant(s). Direct communication takes place in face-to-face interaction, for example, in informal conversation, formal meetings and so on. It can also be mediated, for example by telephone, or electronic messaging. Computer support for direct communication is often referred to as computer mediated communication (CMC). If we take the example of a group of people writing a document together, direct communication may include the communication of ideas about what to include in the document, how to go about the task, relevant expertise, comment and so on.

Successful direct communication implies the achievement of a common understanding between participants. This can be difficult, particularly when group members have a diversity of backgrounds. Different perspectives and different areas of expertise often mean different languages for addressing a problem [McCarthy 1991a]. Olson refers to this as a coordination cost [Olson 1989]. The complexity of the coordination problem is elucidated by Begeman, Cook, Ellis, Graf, Rein and Smith [1986] in their analysis of meetings. This analysis itemises the variety of activities, information items, contexts, and goals which constitute a meeting. Only a subset of these items will be considered by each participant in a meeting. Clearly the nature of this subset will depend on the participant's background, and disparate backgrounds might be expected to result in ineffective meetings, as there may only be a small degree of overlap between subsets. Thus we see the importance of successful direct communication. If each participant can successfully communicate their perspective, a common understanding can be achieved, with each participant having some knowledge of how others' perspectives differ from their own.

Clark and Brennan [1991] maintain that it will be easier to reach a common understanding in face-to-face interaction, where a high degree of contextual information is available to participants. They maintain that in direct, face-to-face communication context is shared through: copresence; visibility; audibility; cotemporality (where one person receives an utterance at roughly the same time as another produces it); simultaneity (where people can send and receive simultaneously) and sequentiality.

In the model of cooperative work, line (e) represents common understanding. Understanding need not only be an emergent property of direct communication; it may also come from interaction through shared artifacts, discussed in the next section.

Shared Artifacts

Cooperative work emphasizes some shared task, or common purpose. Often, this shared task will involve some artifacts which are the subject of the work. These artifacts may be entirely conceptual, for example a joint decision, or be physical, such as a shared document. The artifacts may or may not be part of a computer system. So, a report being prepared by two or more people may be a conceptual plan they share, a paper artifact, or a document that is represented on a computer.

The model of cooperative work, in figure 1, includes a physical artifact. The (b) arrows represent each participant's interaction with the artifact. Interaction with a common artifact can take place synchronously, where participants work concurrently, or asynchronously, where participants work at different times. One shared artifact central to collaborative writing is the shared document. Posner et al.'s [1991] research into the group writing process has suggested that participants' interaction with a document artifact will be characterised by periods of synchronous and asynchronous activity. They comment on the need groups have to change writing strategies at any time during a collaborative authoring project. Writing strategies identified include: a single person writing the document based on discussion with the group; a scribe in a group meeting; a division of labour, with different members of a group authoring different sections; and a group writing together. The division of labour strategy proved the most widely used.

In addition to each participant's interaction with the artifact, communication can take place through the artifact. One can think of this as indirect communication. In figure 1 indirect communication, through the artifact, is shown by the curved arrow, (c), which links participants and artifact. Imagine two authors sharing paper and pens to write a document. The text of the document, and annotations to it, act as a means of communication, helping to establish the task in which they are engaged, and perhaps indicating to each participant the expertise, perspective, and/or role of the other. The style in which the text is written is informative, as is the type of paper and pen. Cheap paper and sketchy notes written in pencil indicate to participants an informal interaction. A carefully penned letter suggests something more polished. The orientation of the writing material between authors is another form of communication. When one participant pushes the paper across the desk to his partner, the implicit message is, "it's your turn now". Pettersson [1989] has examined the communicative power of the shared artifact in her analysis of document reading within two intensive care units. She analysed the type, use and recognition of documents in the wards. She found that information can be transmitted through: the location of a document; its spatial layout; the appearance of a field, filled or unfilled; and the nature of the handwriting and the pen used to produce it.

In addition to interacting with and through the artifact, participants will use various means to refer to particular artifacts. This is illustrated by arrow (d) in figure 1. Clark and Brennan talk about the importance of mutually establishing references to artifacts [Clark 1991]. They maintain that many conversations focus on objects and their identities, and discuss several common techniques for establishing 'referential identity', that is, the mutual belief that addressees have correctly identified a referent. In face-to-face conversation one such technique is 'indicative gesture', which highlights the importance of pointing, looking and touching as a means of grounding references.

Support for artifact sharing is another target for computerisation. Group editing applications, shared calendars and shared drawing tools all provide computer support for artifact sharing. Such applications will embody the characteristics of arrows (b) and (c) in figure 1. Participants will interact with the shared artifact, shown by arrows (b), and will communicate indirectly through the shared artifact, shown by arrow (c).

Broad-Based Requirements

This brief look at the cooperative work of collaborative writing has highlighted some important issues which can be translated into some broad-based requirements for our group editing environment.

Many of the shared editor applications implemented to date will support either synchronous or asynchronous interaction, but not both. Yet Posner et al.'s analysis of the writing process suggests that a joint writing effort can last from several days to several years, and groups are likely to adopt both synchronous and asynchronous modes at some time during a project. Our choice, then, is to design an editing environment that is capable of supporting synchronous and asynchronous group interaction. In so doing, the aim is to support different modes of writing. For example, support for synchronous interaction would allow group members to write together, at the same time: one of the writing strategies identified by Posner et al. Supporting asynchronous interaction will allow group members to follow the widely used division of labour strategy: each individual writer will be able to produce their part of the text, in their own time, using a single environment. Provision of support for both synchronous and asynchronous interaction is favoured by Posner and her colleagues, who write,

"Technology needs to be flexible and permissive, allowing groups to change strategies... Smooth transitions should be supported between ... synchronous and asynchronous work by group members."

For a more detailed look at synchronous and asynchronous collaborative writing see Baydere et al. in this volume.

When a shared editor is to support synchronous interaction only, then a designer can rely on face-to-face interaction, video and/or audio for direct communication between participants, arrow (a) in figure 1. When asynchronous interaction is also to be supported, then designers must consider other means of facilitating the group's direct communication. One (inexpensive) option is to use textual communication, that is, textual messages displayed at the workstation interface. Direct communication, therefore, is mediated by means of computer support for textual messaging. Other means of mediating asynchronous communication include asynchronous voice. Unfortunately we do not have the facilities to support this at York, so for practical reasons the choice is for textual messaging.

In terms of the conceptual model, we are suggesting that our group editing environment should provide support for both direct communication and work artifact interaction. We require text-based conversation space for direct communication, and shared editing facilities for interaction with and through the document artifact.

Issues in Computer Support for Direct Communication

In this section we begin the process of deconstructing the conceptual model of cooperative work. This section looks at design issues in computer support for direct communication between participants. The analysis is limited to textual messaging.

CSCW systems vary in the approaches they take to structuring direct communication. There are two broadly differing approaches. On the one hand there are systems which enforce a dialogue structure; on the other hand there are systems which encourage users to structure their own conversation. Gibbs [1989] describes these approaches in terms of a "social versus software protocol". Miles, Johnson and McCarthy [1991] talk of 'local' and 'global' structuring. Local structuring is an emergent property of a particular interaction; global structuring is embedded at design time, and is enforced in the same way for every interaction.

Coordinator, an application designed to make clear the commitments of communicating partners, provides an example of a system which enforces a dialogue between its users. A central feature of Coordinator is its use of conversational templates, which are based on speech act theory. These templates define a rigid dialogue structure, which the system enforces. Thus, when a user opens a 'conversation for action' with another user, he is able to predict a closure of some kind, because this is defined within the template. User testing revealed a negative reaction towards Coordinator [Carasik 1988]. Carasik and Grantham blame this in part on the rigidity of the conversational structure. For example, they argue that it may be the norm within a task-oriented group that only certain statements need a response, yet Coordinator required an explicit response, violating interaction norms. This suggests a lack of context sensitivity. Given a certain relationship between participants there may be no need for the intermediate steps in a 'conversation for action'; one utterance might be sufficient to make a number of speech acts. Coordinator's enforced structure limits participants' potential for novel intervention. To that extent, it is an unnecessary constraint on the group process, since it limits the options available for solving the problem under consideration. Carasik and Grantham's evaluation is that,

"The conversational templates appeared to be more a straight-jacket than a communications medium." [Carasik 1988]

Work on the Amsterdam Conversation Environment (ACE) contrasts greatly with the approach taken with Coordinator. Designers of the ACE system [Dykstra 1991] adhere to the principle that users rather than the system should structure communication. ACE is designed for synchronous use only, and aims to support conversation and stimulate interaction among group members. The support that is available is for the expression and preservation of people's views, rather than enforcement of a dialogue structure. Hypertext provides the means by which exchanges can be recorded. Designers of ACE are scornful of systems that "institutionalise a conversation space". They maintain that system enforcement of formalised methods of conversation leads to bureaucracy, and rules that stop conversation.

In terms of a conversation space for our proposed group editing environment, allowing user structuring and hypertext support for conversation may present problems for asynchronous users. The non-linearity inherent in hypertext may make it difficult for asynchronous users to follow a previous conversation. In other words the relationship between a piece of text and its context may be lost. Our earlier discussion of direct communication indicated that context can play a significant role in situating text. When the context of a text is unclear there is potential for that text to be misunderstood. Indeed, Conklin and Begeman, in their examination of linear versus non-linear text, found that the non-linearity of hypertext meant that users found it difficult to follow the thread of a writer's thoughts as it wound through several nodes. They suggest that,

"...traditional linear text provides a continuous, unwinding thread of context as ideas are proposed and discussed".

Lying somewhere between the rigid dialogue structuring of Coordinator, and the fluidity of conversation permitted by ACE, are systems, such as the Information LENS system [Malone 1987], which support semi-structured messages. LENS presents users with a range of semi-structured message types which they can use for information sharing. The writer is aided in structuring a complete message without any constraint on expressivity, while the reader is provided with a recognisable structure that facilitates efficient search and comprehension. Semi-structured messages can be an effective means of establishing the context of a textual message [McCarthy 1991]. This is an important consideration, since LENS is designed for asynchronous use, where the time lag between messages can lead to the relationship between the text and its context being lost. The message templates in LENS and the message frames used in KMS [Yoder 1989] provide information about the subject matter of the message, the identity of participants in the conversation, and so on. In both cases the template or frame surrounds the text of the message. In effect, the message is embedded within its context border.

Semi-structured messaging systems may not be suitable for synchronous use. Message templates must be explicitly filled before the information is shared. It may take some time to complete a template, slowing the pace of interaction. Designers of some synchronous systems view the grain of text transmission inherent in semi-structured messages as too coarse for their needs. Fine grain updating may present valuable contextual information to users, allowing them to review the text as it is generated.

The importance of the grain of message sending is highlighted by Tatar et al. [1991], in their discussion of the development of Colab's Cognoter tool. Cognoter, a shared whiteboard for idea organisation [Stefik 1987], was initially developed to include private edit windows. Users of the Cognoter whiteboard could create an individual item using these private edit windows. When completed, an icon consisting of the first twenty characters of the item was created and placed in the Cognoter window. Once there, any user could change or add to any item by opening a private edit window on it. The private editing windows had an 'aggregating' [Johnson 1990] function, collecting chunks of text before transmission. Studying users' reactions to Cognoter, Tatar et al. found that groups were frustrated by the lack of transparency afforded by the whiteboard. Colab's What You See Is What I See (WYSIWIS) principle had been relaxed [Stefik 1987] to such a degree that the group process was being hampered. Users wanted a finer grain of message transmission than the private windows would allow. Cognoter was redeveloped as a consequence. The redesign featured a much finer grain transmission achieved by updating every few characters.
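The contrast between an 'aggregating' private window and fine grain updating can be made concrete with a small sketch. This is not Cognoter's implementation; the class `GrainedSender`, its `grain` parameter and the `transmit` helper are hypothetical names introduced here purely for illustration.

```python
from typing import Callable, List, Optional


class GrainedSender:
    """Buffers typed characters and transmits them at a chosen grain.

    grain=None -> coarse grain: nothing is sent until commit(), as with
                  an 'aggregating' private edit window.
    grain=k    -> fine grain: a chunk is sent every k characters.
    All names are hypothetical and for illustration only.
    """

    def __init__(self, send: Callable[[str], None], grain: Optional[int] = None):
        if grain is not None and grain < 1:
            raise ValueError("grain must be a positive integer or None")
        self._send = send
        self._grain = grain
        self._buffer: List[str] = []

    def type_char(self, ch: str) -> None:
        """Record one typed character, flushing if the grain is reached."""
        self._buffer.append(ch)
        if self._grain is not None and len(self._buffer) >= self._grain:
            self._flush()

    def commit(self) -> None:
        """The writer signals that the item is complete; send the remainder."""
        self._flush()

    def _flush(self) -> None:
        if self._buffer:
            self._send("".join(self._buffer))
            self._buffer.clear()


def transmit(text: str, grain: Optional[int]) -> List[str]:
    """Return the sequence of chunks that other participants would receive."""
    received: List[str] = []
    sender = GrainedSender(received.append, grain)
    for ch in text:
        sender.type_char(ch)
    sender.commit()
    return received
```

With `grain=None` the other participants receive a single chunk only when the writer commits; with a small `grain` they see the text emerge in pieces, at the cost of more frequent transmissions.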

This review of CSCW systems which support text-mediated direct communication, highlights some of the issues and trade-offs designers must consider. Should the designer support the enforcement of a dialogue structure, or should users be encouraged to structure their own interaction? The rigid system enforced structure in Coordinator presents users with a ready knowledge of the thread of a conversation, but places unnecessary constraints on expression. Non-linear representation of participant structured conversation may present difficulties for the asynchronous user, who may be unable to reestablish the context of the text. Semi-structured messages provide an alternative, preserving and presenting the text and its context. However, in synchronous interaction, explicit completion of semi-structured messages may slow the pace of the conversation, and remove the important contextual information inherent in being able to see the emergent text.

Some Ideas for Conversation Space Design

The previous section looked at some CSCW systems that support textual direct communication. These were reviewed in terms of how well they support synchronous and asynchronous interaction. With this review in mind let us consider possible design ideas for a 'conversation space' within our proposed group editor. It should be stressed that these design ideas represent a first step to producing a prototype. Evaluation is required to assess how well these features support synchronous and asynchronous text-based conversation.

Figure 2. Some Conversation Space Design Ideas.

Reviewable text transcript: Text-based communication has an advantage over ephemeral speech [Dix 1991] in that it is reviewable. For asynchronous users particularly, it is vital that the text of previous conversations remains to be reviewed. If users were able to delete text irrevocably, then the context history of the interaction, which structures the group process, would be lost. In Conklin and Begeman's [1989] terms, the 'existence' of the conversation is important. They use the notion of 'existence' to emphasize that information elements and commitments which are forgotten or are not readily accessible have, in a sense, ceased to exist.

It might be of value to asynchronous users if the system indicated new conversational text, that is, text that had been generated since a participant last used the system. For the system to be able to produce such an indicator, users would have to be registered with the system.

Systems like LENS allow 'directed messages' between users [Malone 1987]. That is, a user can choose to send messages to a particular person, or an enumerated group, rather than to 'all'. Directed messages occur in conventional collaborative writing, and therefore we should seek to support them in the proposed group editing environment. It will be interesting to observe the use of such a facility. Allowing directed messages within a small group (we envisage supporting groups of fewer than ten people) may present problems, in that the conversation would cease to be common to all.

Sequential, linear text: Presenting the text of a conversation in non-linear form may confuse asynchronous participants. They may not be able to follow the thread of the conversation. System enforced sequential, linear text would provide a degree of structure to the conversation. Asynchronous users would be provided with a ready contextual history. Synchronous users would be able to predict that new text is always appended to the transcript. Sequentiality ensures that the presentation and acceptance of text takes place in an orderly way. True sequentiality means appending a message when it is begun, rather than when it is complete.

Indicating participant activity: Participant activity is a valuable source of contextual information. In face-to-face conversation information about participant activity is maximised. Participants will be aware of the interactions that led to the present utterance, they will know the content of the utterance and from whom it originated. In an environment that supports textual, distributed communication, the contextual clues available are limited. The sorts of contextual information that can be provided at the group interface are likely to vary for synchronous and asynchronous users.

In synchronous conversational interaction context can be established by representing participant status information. Indicating the identity of group members is a necessary part of this. Such information can be provided in a variety of ways. 'Conferencer' maintained a 'participant list' giving the name of every user in a session; other systems use face icons. Indicating a participant's conversation space activity is another form of status information. Fine grain updating is one way of indicating synchronous user activity. Our earlier discussion of the grain of text transmission indicates that rapid updating can increase the pace of interaction, and provide an emergent context. An alternative is to allow private message composition, with appropriate activity indicators. 'Conferencer' provided this form of support, by attaching a 'compose flag' to a writer's name on the participant list. A participant might prefer to compose in private if his message is complex, requiring careful articulation. We are interested in supporting both fine and coarse grain message sending in our prototype group editing environment. We should like to observe the effects of providing users of a conversation space with a choice: either to transmit the message as it is typed, providing others with rapid updates, or to transmit the text when it is complete. By offering this option, users can generate their messages in a manner they consider others will understand. Offering a choice of grain size may pose problems for non-expert users, who may find it difficult to understand the distinction between the two choices, and therefore which choice to make.

We have suggested that users be given a choice of message sending grain. The designers of ACE take a similar approach to empowerment. They consider that systems should be capable of supporting constraints, but that the locus of control should remain with the users. Users should be able to choose and generate their own constraints.

Asynchronous users of our group editing environment would require information about previous activity: a contextual history of interaction. Message templates and frames, in systems like LENS and KMS, surround the text of a message with its context. Asynchronous participants are thereby provided with valuable contextual information about the message. Within a conversation space, contextual clues could be automatically generated to accompany the text. Indicating who wrote a message could be facilitated by having the system prefix a textual message with the writer's name. System generated messages could be used to indicate when text was generated, and whether it was produced during a synchronous or asynchronous session.
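Several of these ideas can be drawn together in a rough sketch of a conversation transcript. It is only one possible reading of the design ideas, not a description of a built system: the names `Transcript`, `Entry`, `begin`, `extend`, `finish` and `new_since_last_visit` are assumptions made for illustration, as is the choice to hide directed messages from non-recipients.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Optional


@dataclass
class Entry:
    writer: str                                   # prefixed to the text on display
    mode: str                                     # "synchronous" or "asynchronous"
    recipients: Optional[FrozenSet[str]] = None   # None means 'all'
    text: str = ""
    complete: bool = False


class Transcript:
    """Append-only, linear conversation transcript (hypothetical sketch)."""

    def __init__(self, members: Iterable[str]):
        self._entries: List[Entry] = []
        # Registered users: index of the first entry each has not yet seen.
        self._last_seen: Dict[str, int] = {m: 0 for m in members}

    def begin(self, writer: str, mode: str,
              recipients: Optional[Iterable[str]] = None) -> int:
        """Reserve the next slot when a message is begun, not when it is done."""
        if writer not in self._last_seen:
            raise KeyError(f"{writer} is not a registered participant")
        to = None if recipients is None else frozenset(recipients) | {writer}
        self._entries.append(Entry(writer, mode, to))
        return len(self._entries) - 1

    def extend(self, index: int, more_text: str) -> None:
        """Add text to an entry still being composed; finished text is permanent."""
        entry = self._entries[index]
        if entry.complete:
            raise ValueError("completed entries cannot be changed")
        entry.text += more_text

    def finish(self, index: int) -> None:
        self._entries[index].complete = True

    def render(self, reader: str, start: int = 0) -> List[str]:
        """Lines visible to `reader`, in transcript order, with context prefix."""
        return [f"{e.writer} ({e.mode}): {e.text}"
                for e in self._entries[start:]
                if e.recipients is None or reader in e.recipients]

    def new_since_last_visit(self, reader: str) -> List[str]:
        """Entries added since `reader` last asked (the 'new text' indicator)."""
        start = self._last_seen[reader]
        self._last_seen[reader] = len(self._entries)
        return self.render(reader, start)
```

Because a slot is reserved at `begin`, a message keeps its place in the sequence even if a later message is completed first. A fuller design would also need to track text added to an entry after a reader had already seen it, which this sketch ignores.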

Figure 2 illustrates some of the conversation space design ideas put forward in this section. These design ideas can be seen as striking a balance between system enforced structure and user structuring. A conversation space includes some system enforced structuring, in that a sequential, linear transcript is maintained. However, by giving users a choice of grain-size for message sending we are supporting a degree of flexibility in the interaction.

Issues in Shared Editor Design

This section continues the process of deconstructing the model of cooperative work. Here, we concentrate on the interaction with and through the artifact. The artifact in our shared editing environment is the document. The group's interaction with the document will take place by means of a shared editor. This section examines the literature on shared editor applications. The aim is to consider the design of a shared editor in our proposed group editing environment.

The earlier review of computer support for direct communication discussed the issue of structure. It was suggested that structure can be system enforced or user defined. A similar dichotomy can be seen from a review of some existing shared editor applications, and the approaches they take to structuring the collaborative authoring task. There are applications which use a model of a document to enforce a structure. Others apply a system enforced structure based on roles within the joint authoring task. Others again impose minimal constraint on users, allowing them to structure their own work.

Lubich and Plattner [1990] describe a system embedded, document model approach to structuring a shared editor. Lubich and Plattner are involved in the multimETH project, working on the design of a multi-media conferencing and collaborative authoring system. The aim is to support the collaborative creation, editing and management of multi-media documents. To this end, use of a formal document model has been proposed, which consists of a number of hierarchically related 'structure elements', for example title, headlines, chapters, sections etc. Each of these 'structure elements' has a 'content portion', for example, text, graphics, bitmap, etc. The document itself is represented in a shared work space; the associated document model is shown as a logical tree structure in another window. By clicking on a node in the tree structure and issuing the 'lock' command, the user reserves a complete subtree, or a content portion within that subtree, for exclusive write access.
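A tree of structure elements with subtree locking might be sketched as follows. This is an illustrative reading of the description above rather than the multimETH design itself; the class `StructureElement` and its methods are hypothetical, and the rule that a lock is refused when any node above or below is held by someone else is an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class StructureElement:
    """One node of the logical document tree (hypothetical sketch)."""
    name: str                                   # e.g. "chapter 1", "section 1.1"
    content: Optional[str] = None               # the node's 'content portion'
    children: List["StructureElement"] = field(default_factory=list)
    parent: Optional["StructureElement"] = field(default=None, repr=False)
    locked_by: Optional[str] = None

    def add(self, child: "StructureElement") -> "StructureElement":
        child.parent = self
        self.children.append(child)
        return child

    def subtree(self) -> Iterator["StructureElement"]:
        yield self
        for child in self.children:
            yield from child.subtree()

    def lock_holder(self) -> Optional[str]:
        """The user whose lock covers this node: its own or an ancestor's."""
        node: Optional["StructureElement"] = self
        while node is not None:
            if node.locked_by is not None:
                return node.locked_by
            node = node.parent
        return None

    def lock(self, user: str) -> bool:
        """Reserve this node and everything below it for exclusive writing."""
        if self.lock_holder() not in (None, user):
            return False          # covered by someone else's lock higher up
        if any(n.locked_by not in (None, user) for n in self.subtree()):
            return False          # someone else holds part of this subtree
        self.locked_by = user
        return True

    def unlock(self, user: str) -> None:
        if self.locked_by == user:
            self.locked_by = None

    def can_write(self, user: str) -> bool:
        return self.lock_holder() == user
```

Locking a leaf corresponds to reserving a single content portion; locking an inner node reserves the complete subtree beneath it.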

The approach taken by Beaudouin-Lafon [1990] is similar to that used in multimETH, in that the document is given a certain structure. The emphasis of Beaudouin-Lafon's work is on support for collaborative software development, so the shared document might be a program file. A key issue is version control for collaborative software development. A document is defined as a hierarchy of fragments. Fragmentation is usually performed automatically by the system, following a predefined set of fragmentation rules. Each fragment can have an arbitrary number of versions. These versions are embedded in the document, rather than kept externally. When users open a document they can select a version for display according to the context. For example, users can choose to see the original version of the document, or the version held by a particular user, or their own version. Rule based fragmentation of a document would appear to be particularly well suited to a program file where component procedures are built to follow a regular pattern. Beaudouin-Lafon's work highlights the importance of version control, which, he suggests, is an issue many other shared editing applications ignore.
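The idea of versions embedded in a hierarchy of fragments can be illustrated with a short sketch. It is not Beaudouin-Lafon's implementation: the `Fragment` class, the `"original"` version label and the rule of falling back to the original text when a fragment has no version under the requested label are all assumptions made here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ORIGINAL = "original"   # hypothetical label for the first version of a fragment


@dataclass
class Fragment:
    """A document fragment that stores its own versions (hypothetical sketch)."""
    versions: Dict[str, str] = field(default_factory=dict)   # label -> text
    children: List["Fragment"] = field(default_factory=list)

    def text_for(self, preferred: str) -> str:
        """Text of the preferred version, else the original, else nothing."""
        if preferred in self.versions:
            return self.versions[preferred]
        return self.versions.get(ORIGINAL, "")

    def render(self, preferred: str) -> str:
        """Assemble the document as seen when one version label is selected."""
        parts = [self.text_for(preferred)]
        parts.extend(child.render(preferred) for child in self.children)
        return "".join(parts)


# A two-fragment program file in which only the first fragment was changed by "ann".
program = Fragment(children=[
    Fragment({ORIGINAL: "def f():\n", "ann": "def f(x):\n"}),
    Fragment({ORIGINAL: "    return 1\n"}),
])
```

Calling `program.render("ann")` shows ann's version of the first fragment together with the unchanged original of the second, so a reader can choose which view of the document to display without any version being stored outside it.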

Having a system enforced document model means that users are presented with a recognisable and consistent structure. Use in multimETH of a logical document tree provides a concise representation of the document. This could be particularly useful for asynchronous users, who will be able to establish the current state of a document from its graphical representation. A document tree structure also provides participants with a ready means of establishing a referent within a document. Users can point to a node on the tree rather than the document itself. A document model approach to structuring a shared editor can be seen as an object oriented approach. The components of a document model can be viewed as objects, with certain attributes attached to them.

One concern with the document model approach is that rigid system enforcement of a model might hinder the group writing process, particularly the initial stages. Conklin and Begeman [1989] discuss the early stages of the individual's writing process. They maintain that,

"...the early phase of consideration of a writing or design problem is critical and fragile, and must be allowed to proceed in a vague, contradictory, and incomplete form for as long as necessary."

Analysis of the writing process by Sharples and Pemberton [1990] maintains that writing is "an open ended design task". They describe a writer as "under-constrained", facing an infinite number of possible texts that could fill his goal, and an infinite number of actions that he can take at any stage. It is not clear that system level enforcement of a highly structured document model would allow sufficient flexibility.

The document model in multimETH makes some use of social roles. When a document is created in the shared workspace the corresponding access rights must be declared by a chairman or the owner of the document. Similarly it is the role of the chairman or document owner to change access rights during a collaboration. Access rights are 'write', 'read' and 'annotate'. The Quilt application for asynchronous collaborative document production also makes use of system enforced social roles [Leland 1988,Fish 1988]. Each user of Quilt is assumed to have a specific role in regard to a particular document, for example, 'co-author', 'commenter', 'editor' and 'reader'. These roles combined with the chosen document style define access privileges to the document. Document styles are assigned to sections of text within the document. Setting the document style to 'exclusive' means only the author of a section can modify it; 'shared' means any co-author can modify any section; 'editor' ensures that only a designated editor can modify any section, and other authors must make submissions to the editor. Roles and style must be specifically defined at the beginning of a collaboration.
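
The interaction between roles and document styles described for Quilt can be written down as a small decision function. This is a sketch of the rules as summarised above, not Quilt's actual code: the function name can_modify and its parameters are hypothetical, and it assumes that under the 'shared' style only users in the 'co-author' role may modify text.

```python
def can_modify(role: str, style: str, user: str, section_author: str) -> bool:
    """Decide whether `user` may modify a section, following the rules above."""
    if style == "exclusive":
        # Only the author of the section can modify it.
        return user == section_author
    if style == "shared":
        # Any co-author can modify any section (assumed: no other role can).
        return role == "co-author"
    if style == "editor":
        # Only a designated editor can modify; others submit to the editor.
        return role == "editor"
    raise ValueError(f"unknown document style: {style}")
```

For example, can_modify("co-author", "exclusive", "bob", "ann") is False because bob did not write the section, while the same call with style "shared" is True.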

Role enforcement can be a useful means of structuring and organising collaborative writing. The allocation of roles helps ensure that activities are neither neglected nor unnecessarily duplicated. Knowledge of their own and others' roles enables users to predict their responsibilities within the interaction. One contention, however, is that role structure may not always be obvious at the beginning of a collaborative endeavour. It may be an emergent property of the interaction. Neuwirth et al. [1990] make the point that 'premature' definition of roles can lead to undesirable consequences. They suggest that if, for example, authorship is denied at the outset, then this may reduce the motivation of someone who has been defined as 'non-author', and the person may be disinclined to contribute. Roles formed one element of Posner et al.'s writing process taxonomy. Their findings support the notion that roles change during a protracted interaction. They cite the example of groups whose members all started out with the intention of contributing. Later the group member with the least time to dedicate to the project fell into a consultant role, and ceased to be an active writer. There may be a conflict between roles and control in a collaborative writing task. Roles in a collaborative writing task may not be accurately reflected when other relationships interfere. A supervisor and their student may both be defined as co-authors, but the relationship between them may make this declared equality hard to sustain.

The applications discussed so far enforce a structure, be it role or document model based, upon users. An alternative is to allow users to structure their own interaction by means of the emergent social protocols. The GROVE group editor [Ellis 1991,Ellis 1989] provides an example of an application which imposes minimal system constraint. The default in GROVE is a mode in which every user can see and edit any part of the shared document, and there is absolutely no locking while editing. Mode changes can be achieved by redefining the 'view' of some portion of the shared environment. A user can move from the default 'public' view to a 'private' view, where items can be accessed by that user only, or to a 'shared' view where items are accessible to an enumerated set of users. Studies of the use of GROVE provide evidence of social protocol mediation. Ellis et al. describe how users organised themselves to work in parallel; how they organised 'partitioned entry' where the group assigns particular members to refine or reorganise particular parts of the text and how the individuals avoided colliding with others working on a particular piece of text at a particular time.

ShrEdit, a shared text editor developed at the University of Michigan [Bellotti 1991], also enforces minimal system constraint. ShrEdit is a synchronous shared editing tool designed for use in face-to-face design meetings. Users can work simultaneously in any part of the document, although insertion points are locked, so that no two insertion points can be co-located. No continuous feedback is given to users to indicate the location of other users' editing activity; however, users can elect to 'find' or 'track' other users to gain such information. Bellotti, Dourish and MacLean [1991] report users' reactions to ShrEdit. They gave eight groups of three co-located designers three different design problems to solve using ShrEdit. Bellotti et al. were struck by the great diversity of ways in which ShrEdit was used. A number of design themes emerged from their analysis, including users' desires to know what people were doing and where. Users described "bumping into each other" because of the lack of user activity information. One group described constructing a tree representation of the texts they had created, permitting "a fairly quick address for where we were talking". In other words they were providing themselves with the means to make 'referential identity' easier.

GROVE and ShrEdit are both designed for synchronous use; they allow parallel editing and are characterised by fine grain (or no) locking, and rapid updating. Frequent updates minimise the divergence between different views on a text [Olson 1990], ensuring that each user sees the most up-to-date version of a shared document. Continuous updating can be thought of as a way of 'communicating through the artifact'. Participants in a synchronous interaction will be able to watch others' text as it is created. Fine grain text transmission militates against what Chafe [1986] cites as one of the advantages of written textual communication. Writers, he claims, can revise and polish their thoughts before communicating them; a facility not available to speakers whose second thoughts and revisions are laid bare before their audience. Providing private workspaces [Olson 1990] gives users the option to do this polishing and revision of their contributions before including them in the shared document. Olson et al. [1990] comment that many group editors provide this for free, in that they are embedded in multi-tasking environments where other windows are available for private work. Adopting this solution, however, would mean that important user activity information would be lost to other participants. Ellis et al. [1991] have considered the issue of private work, and suggest a novel means by which private user activity information can be denoted. They suggest that private user text editing activity can be shown in the form of a 'cloudburst'. Textual modifications are shown immediately to the person who initiates them, but are indicated on other users' screens by the appearance of 'clouds' over the original text. The position and size of a cloud indicate the approximate location and extent of the modification.

GROVE and ShrEdit are designed for synchronous use only. They do not attempt to support asynchronous interaction. In ShrEdit, for example, this means that no provision is made for recording the context of a text, for maintaining versions or for reviewing changes. The LENS and KMS asynchronous messaging systems discussed earlier deliberately separate text and context, 'framing' the text within a context border. Users can re-establish the context of a message by referring to its border. We require our proposed group editor to support synchronous and asynchronous interaction. It is important, therefore, to consider firstly the separation and recording of text and context, and then the means by which asynchronous users are able to re-establish the context of a text. The document model discussed earlier lends itself well to the separation and reinstatement of text and context. Contextual information can be associated with document model 'objects'. For example, it would be possible to establish who wrote what text object; when it was written; whether it had been changed, by whom and when; whether there were access privileges associated with it, and so on.

This review of shared editing applications has once again shown up the various trade-offs that designers must consider. Rigid enforcement of roles or a document model may be too inflexible to encompass the range of techniques groups employ when writing together. Role enforcement, however, can help ensure that activities are not ignored or duplicated. A document model lends itself well to 'object' status, and is amenable to graphical representation. Single editor, multiple cursor systems with minimal system intervention allow users to structure their own interaction, but do not support asynchronous interaction.

Some Ideas for Shared Editor Design

The previous section offered a brief review of some existing shared editor applications. In the light of that review, this section considers possible ideas for the design of a shared editor artifact within our proposed group editing environment. Once again it should be made clear that evaluation of the prototype is required to assess how well the system supports the group editing task.

Structuring

Group editing systems differ in the support they provide for structuring the editing task. In some systems the shared editor is structured by an embedded model of the document under production. Other applications use system enforced role assignment to structure the access to a shared document. Others still allow the group to structure their own work, and impose minimal system constraint. We should like to consider combining elements of these approaches in the design of our group editing environment.

An objection to system enforced document model structuring is that it is too rigid, and does not reflect the way people write together. This would appear to be the case particularly at the initial stages of a collaborative authoring endeavour. An advantage of document model structuring is that it readily supports the graphical representation of a document. Such a logical model could be a useful point of group reference for synchronous users, and a good means of assessing the current state of a document for each asynchronous user.

One might consider allowing users to determine their own document structure. Support could be given in the form of a graphical representation of the structure they choose. Users would be able to chunk a document into segments which they considered appropriate, as the interaction progressed. Each segment would be graphically illustrated. The graphical representation of the document could be used as a means of gaining edit access to a segment. Thus, clicking on the representation of a segment would allow users access to the editor containing the existing text of that segment. Allowing users to manipulate the graphical representation of their self structured document could provide a means of refining and altering the ordering of segments.

Each user defined segment could be given 'object' status. Attributes of such an object could include details of when the segment was edited and by whom. Associating such information with the text of the segment would provide asynchronous users with support for re-establishing the context in which a piece of text was written. Users could be provided with a history of the current text of a segment. Such information could also be used by the system to automatically indicate new text to each asynchronous user.
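
To make the idea of a segment 'object' concrete, the following sketch attaches an edit history to a segment and uses it to pick out what an asynchronous user has not yet seen. All names (Edit, Segment, changes_since) are hypothetical, and integer logical timestamps stand in for whatever notion of time a prototype would actually record.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Edit:
    author: str
    time: int     # logical timestamp; a real system might use wall-clock time
    text: str     # full text of the segment after this edit


@dataclass
class Segment:
    title: str
    history: List[Edit] = field(default_factory=list)

    def edit(self, author: str, time: int, text: str) -> None:
        self.history.append(Edit(author, time, text))

    @property
    def text(self) -> str:
        """The current text is the result of the most recent edit."""
        return self.history[-1].text if self.history else ""

    def changes_since(self, last_visit: int) -> List[Edit]:
        """Edits made after `last_visit`, i.e. not yet seen by that user."""
        return [e for e in self.history if e.time > last_visit]


intro = Segment("Introduction")
intro.edit("ann", 1, "CSCW is about collaboration.")
intro.edit("bob", 5, "CSCW is about supporting collaboration.")

unseen = intro.changes_since(last_visit=3)   # contains bob's edit only
```

Storing the full text with every edit is the simplest choice for a sketch; a prototype might store differences instead.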

There are various ways in which the history of a segment could be presented. One could envisage provision of a 'meta segment'. This could take the form of a copy of the text of the segment, suitably annotated with information about who wrote and changed pieces of text and when. Another possibility would be to allow users to query the text of the segment itself. For example, a user selecting a line or paragraph could be presented with the history of that selection.

Included among the ideas for a conversation space design was the suggestion that users be given a choice of support for message sending. Continuing this theme of providing optional tool support, one could consider providing users with the option of segment ownership. As owner a user could be entitled to restrict access to the segment, for example, assign read-only roles to the other members of the group.

Version control

The process by which people write is highly complex. Text will inevitably require revision. Co-ordinating revisions among a group is a non-trivial task. The work of Beaudouin-Lafon has highlighted the importance of version control in collaborative writing. He has implemented system supported version control, embedding versions within the parent document or program file.

We are interested to see how groups handle the issue of versioning. To this end we would like to provide optional tools, which users can select to assist in version control if they choose. Participants will be able to use a conversation space discussion to consider different versions. In addition, one might consider the provision of a 'dumping bay' where users can off-load segments they do not wish to include in the final document. Such a facility could be operationalised by allowing users to move a graphically represented segment from the logical structure to the 'dumping bay'.

Another feature which may be useful in version control, is a voting tool. Participants may not be able to agree on a particular version using the conversation space. In such a situation an optional voting tool could be used. Design would have to consider notifying asynchronous users that their vote is required.
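
One possible shape for such a voting tool is sketched below. The class VersionVote is hypothetical, and the decision rule (a strict majority of all group members, otherwise no decision) is an assumption chosen for the example; the list of pending voters is the information a design would need in order to notify asynchronous users.

```python
from typing import Dict, List, Optional


class VersionVote:
    """A hypothetical ballot over candidate versions of a segment."""

    def __init__(self, members: List[str], candidates: List[str]) -> None:
        self.members = list(members)
        self.candidates = list(candidates)
        self.ballots: Dict[str, str] = {}       # member -> chosen candidate

    def cast(self, member: str, candidate: str) -> None:
        if member not in self.members or candidate not in self.candidates:
            raise ValueError("unknown member or candidate")
        self.ballots[member] = candidate        # a later vote replaces an earlier one

    def pending(self) -> List[str]:
        """Members still to vote, e.g. asynchronous users to be notified."""
        return [m for m in self.members if m not in self.ballots]

    def winner(self) -> Optional[str]:
        """Assumed rule: strict majority of all members, else no decision."""
        for candidate in self.candidates:
            count = sum(1 for c in self.ballots.values() if c == candidate)
            if count * 2 > len(self.members):
                return candidate
        return None


vote = VersionVote(["ann", "bob", "cas"], ["v1", "v2"])
vote.cast("ann", "v2")
waiting = vote.pending()      # bob and cas have not voted yet
early = vote.winner()         # no majority so far
vote.cast("bob", "v2")
decided = vote.winner()       # two of three members now agree
```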

Figure 3. Some Shared Editor Design Ideas.

Sharing a segment

Many existing synchronous CSCW systems provide users with the facility to work privately, as well as publicly. Private workspaces allow participants to polish and revise their contributions before communicating them.

Allowing private work means explicit support must be given to the locking of a segment. When one user cannot see where his colleague is working, then he cannot ensure that he will avoid editing the same text. The potential for contention is obvious. One possibility would be to support segment level locking. Thus when a user selects a graphically represented segment, that segment is locked. A locked segment would allow write access to the lock owner, and read access to the other users. Allowing synchronous users the option to work privately means that the readers of a locked segment will see an out-of-date version of the text. The system must indicate this important contextual information. Informing users of where a colleague is working may be insufficient. The 'cloud burst' metaphor suggested by Ellis et al. gives users an idea of the size of text currently under preparation. Information as to the degree of activity in a colleague's private workspace may also be of value.

Our aim is to allow users to alternate freely between private and public work. In this way users are able to choose how they 'communicate through the artifact'. Segment level locking would avoid the problems of users 'bumping into each other' and experiencing unpredictable or chaotic interaction. To facilitate public work, rapid updating is required. Even with public work there may be the need for user activity information to be denoted. Indicating a pause in a colleague's work, for example, may be of value.

Segment level locking ensures that contention is avoided, but at a cost. Users cannot work in parallel on the same segment. However, with a segment based model, it would be possible for each member of the group to work on a different segment at the same time. Thus there is support for some form of parallel working.
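
The segment level locking strategy discussed in this section might be sketched as follows. The class LockedSegment and its methods are hypothetical, and the sketch assumes that a private draft becomes visible to readers only when the lock owner releases the segment.

```python
from typing import Optional


class LockedSegment:
    """Hypothetical segment with one writer and a published read-only copy."""

    def __init__(self, text: str = "") -> None:
        self.published = text               # what readers see
        self.draft = text                   # the lock owner's private working copy
        self.owner: Optional[str] = None

    def acquire(self, user: str) -> bool:
        """Grant the lock unless another user already holds it."""
        if self.owner not in (None, user):
            return False
        self.owner = user
        return True

    def write(self, user: str, text: str) -> None:
        if self.owner != user:
            raise PermissionError(f"{user} does not hold the lock")
        self.draft = text

    def read(self, user: str) -> str:
        return self.draft if user == self.owner else self.published

    def is_stale(self) -> bool:
        """True when readers are looking at an out-of-date version."""
        return self.owner is not None and self.draft != self.published

    def release(self, user: str) -> None:
        """Publish the draft and free the segment for other users."""
        if self.owner != user:
            raise PermissionError(f"{user} does not hold the lock")
        self.published = self.draft
        self.owner = None


seg = LockedSegment("Old text.")
got_ann = seg.acquire("ann")
got_bob = seg.acquire("bob")            # refused while ann holds the lock
seg.write("ann", "New, longer text.")
bob_sees = seg.read("bob")              # still the published version
stale = seg.is_stale()                  # the interface should signal this
seg.release("ann")
after = seg.read("bob")                 # ann's text is now public
```

The is_stale check corresponds to the contextual information that the system must present to readers of a locked segment.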

Figure 3 illustrates some of the design ideas proposed. Once again, these ideas can be seen as combining elements of system enforced structuring and user defined structure. We are keen that users structure their own document segmentation in a way that is agreeable to the group. We wish to provide a choice of private or public editing. However, a degree of system enforcement is inherent in the segment locking strategy proposed. Many of our design ideas focus on supporting the presentation of contextual information. An important consideration here has been to present not only the current interaction state, but also a context history for asynchronous users.

Design Ideas for Integrating Conversation Spaces and Shared Editor

The last two sections deconstructed the model of cooperative work introduced in section 2. First we looked at computer support for direct communication between participants. This involved consideration of designs for a conversation space. Next we looked at interaction with and through the document artifact. This involved reviewing designs for shared editing.

In order to support asynchronous, as well as synchronous interaction, we need both conversation space and shared editing facilities in our shared editing environment. In this section we consider how to support the connectivity between these two communication spaces. We are broaching the subject of how to design an environment rather than its individual components. The aim is to consider how the system as a whole would function. In effect, what we want to do here, is to reconstruct the model of cooperative work, looking at the integration of conversation spaces and shared editor in a single environment.

Establishing Referential Identity

Earlier discussion mentioned the importance of grounding references to artifacts. In face-to-face communication pointing, looking and touching the object of a conversation helps the group establish 'referential identity'. In some CSCW systems, e.g. rIBIS and Boardnoter [Stefik 1987], identifying referents is facilitated by providing telepointers which allow users to 'point' to screen objects. Control is usually held by one person at a time, while others see the movement of the pointer on their own screens. Telepointing is suited to synchronous communication, but not to asynchronous. So what alternatives are there that would be effective for both synchronous and asynchronous interaction? One possibility would be to associate a conversation space with each document segment. Our suggestion is that users should be free to segment a document as they choose, rather than having to follow a rigid document model such as 'title', 'abstract', 'section' and so on. A conversation space could be generated automatically to accompany each segment. The conversation space would be provided for users to talk about the associated segment, while the segment itself would be available for the document text. A criticism of this approach is that it may not be sufficiently context sensitive. It would allow a conversation about a segment to be associated with that segment, but would not support the sort of direct deictic reference that telepointing provides.

An alternative option would be to provide a facility more akin to annotation. In traditional paper and pencil authoring, asynchronous communication about the identity of a referent is easily established by annotation, for example by writing comments and highlighting text. There are several CSCW systems which support text annotation. The Collaborative Annotator system [Koszarek 1990] is a multi-user document review system, which allows graphic, voice and video annotations to be associated with a shared document image. The document is scanned, and its image displayed on the group interface. This image can be annotated with software "highlighter pens, multi-font text and yellow stickeys". The Quilt system also supports annotation. It differs from the Collaborative Annotator in that it supports editing of the document as well as review. Indication of an annotation is embedded within the text of a document. This ensures that the referent of the annotation is clear.

We should like to explore the potential for embedding annotations within the shared document of our group editing environment. That is, having annotations that 'point' to specific text. In terms of structure, annotations could be structured as conversation spaces, allowing sequential, reviewable contributions. Thus, rather than having one conversation space per segment as suggested earlier, users would be able to generate a number of annotation conversation spaces. These they could embed within a segment's text, assisting with accurate referential identity.
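
A minimal sketch of annotations structured as conversation spaces and anchored to specific text is given below. The names Annotation and AnnotatedSegment are hypothetical. The sketch anchors an annotation by character offsets and, as a deliberate simplification, does not adjust those offsets when the segment text is later edited.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Annotation:
    """A small conversation space anchored to a span of segment text."""
    start: int
    end: int                                   # exclusive character offset
    messages: List[Tuple[str, str]] = field(default_factory=list)

    def say(self, author: str, message: str) -> None:
        """Add a contribution; the sequence of messages remains reviewable."""
        self.messages.append((author, message))


@dataclass
class AnnotatedSegment:
    text: str
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, phrase: str) -> Annotation:
        """Anchor a new annotation to the first occurrence of `phrase`."""
        start = self.text.index(phrase)        # raises ValueError if absent
        note = Annotation(start, start + len(phrase))
        self.annotations.append(note)
        return note

    def referent(self, note: Annotation) -> str:
        """The piece of text that the annotation 'points' to."""
        return self.text[note.start:note.end]


seg = AnnotatedSegment("Collaborative writing involves two or more people.")
note = seg.annotate("two or more")
note.say("ann", "Should we say 'several'?")
note.say("bob", "I prefer the original.")
```

Because each annotation carries its own message sequence, several reviewers can respond to one another within the same anchored discussion.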

Annotations that are structured as topic-specific conversation spaces may overcome some of the problems of annotation put forward by Neuwirth, Kaufer, Chandhok and Morris [1990]. Neuwirth et al. maintain that writers are frustrated by the lack of consistency in comments that contradict each other. The more reviews there are the more pressing this problem becomes. Annotations that are communication spaces may help alleviate such difficulties, since different reviewers would be able to see the comments of others, and add their suggestions to the appropriate topic-specific area.

Figure 4 illustrates these two approaches to establishing referential identity: (a) shows a conversation space accompanying a document segment; (b) shows annotations embedded within the document segment, these are displayed as 'think bubbles' that users can open, read and edit.

Figure 4. Two Alternative Ideas for Establishing Referential Identity.

Providing a 'Global' Conversation Space

Design ideas for our proposed group editing environment included support for user defined document segmentation, with an annotation facility available for each segment. What is lacking is a facility for 'global' coordination. A 'global', ever-present conversation space may help to provide this function. It would provide a window for procedural group discussion, for example, conversation about how to perform the task, what division of labour is appropriate, what document segmentation to choose, and so on.

Play-Back Facilities

An important requirement of our proposed group editing environment is that both synchronous and asynchronous modes of interaction are supported. For asynchronous users particularly, it is important to make clear the context of previous interactions. One way to support this reconstruction of context is to provide 'play-back' facilities. These would allow users to see what interaction had taken place since they last used the environment. Play-back could be provided for individual components of the group editing environment. For example users could play-back events within a particular document segment. One could also envisage 'global play-back', where users are invited to step through the changes to all components of the group editing environment in sequence. A degree of context sensitivity is provided with this strategy since users would be able to integrate changes to different components. For example, users would be able to integrate the sequence of changes to the document with new or changed annotations.
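
'Global play-back' could be approached by merging the event logs of the individual components into a single time-ordered sequence. The sketch below assumes that each component keeps its own log already sorted by time; the Event record and the function global_playback are hypothetical names introduced for the example.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    time: int            # logical timestamp
    component: str       # e.g. a document segment or an annotation
    author: str
    description: str


def global_playback(logs: Iterable[List[Event]], last_visit: int) -> List[Event]:
    """Merge per-component logs (each already in time order) into one
    sequence containing only the events after `last_visit`."""
    merged = heapq.merge(*logs, key=lambda e: e.time)
    return [e for e in merged if e.time > last_visit]


segment_log = [
    Event(1, "segment 1", "ann", "wrote first draft"),
    Event(6, "segment 1", "bob", "rewrote opening"),
]
annotation_log = [
    Event(4, "annotation A", "bob", "suggested new opening"),
]

# A user who last visited at time 2 steps through the later events in order.
replay = global_playback([segment_log, annotation_log], last_visit=2)
```

Here the returning user sees the annotation suggesting a change before the change itself, which is the kind of integration across components described above. Play-back of a single component is the special case of passing one log.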

Discussion

This paper marks the first steps in a design process. The aim has been to consider support for collaborative writing in the form of a shared editing environment.

We began by introducing a model of cooperative work, and applying that model to the task of collaborative writing. Group writing involves direct communication between participants, and interaction with and through the document artifact. Group writing can be a protracted exercise and is amenable to both concurrent and asynchronous work. With these observations in mind, we proceeded by deconstructing the model of cooperative work into its component parts. First we considered direct communication and drew from the literature on computer mediated communication to propose some ideas for conversation space design. Next, we looked at interaction with, and indirect communication through, the shared document. This involved a review of some of the literature on shared editing applications, and some ideas for shared editor design. One fundamental design issue, which affects both computer mediated communication systems and shared editing applications, concerns the approach to task structuring. One can identify two broadly differing approaches. Some systems are designed to enforce a particular structure, others encourage users to structure their own interaction. The design ideas that have been put forward for possible inclusion in our group editing environment reflect an attempt to strike a balance between these two approaches. An important design consideration has been the presentation of contextual information. One aim has been to present not only the current interaction state, but also a context history for asynchronous users.

By proposing that our group editing environment will have both conversation spaces and shared editing we are suggesting provision of a range of different 'channels' for different types of communication (for a similar classification see Crowcroft in this volume). For example a document segment will provide a task specific communication channel, while the global conversation space might be used as a channel for general procedural discussion. These 'channels' are not physical, in the sense of the visual, auditory and linguistic channels that are available in face-to-face communication. They can be thought of as 'virtual channels': these describe the communication media that participants perceive [McCarthy 1990]. Participants' perception of a communication channel will be influenced by its appearance, and the way they interact with it. Giving users a range of differently styled channels can increase the effectiveness of their communication. For example, when a channel is perceived as supporting a particular type of communication, then that channel will help to specify the intended meaning of messages received on it. Receiving related information on more than one channel can have a strengthening effect. For instance, seeing an annotation suggesting a change to a segment, and then seeing the change made to the text reinforces the intention. Provision of a range of channels also allows for deictic referencing, so users can refer to "this segment" or "that annotation". Deictic references can increase the efficiency of communication [McCarthy 1991b].

Having separated the model of cooperative work into its component parts, we then considered its reconstruction. The focus was on how to integrate conversation spaces and shared editing into a single environment. In conceptual terms the aim is to provide interrelated and complementary channels: interrelated in the sense that the relationship between the channels is made clear, complementary in the sense that different channels can mutually supply each other's lack. For example, document segments and segment annotations will be closely related, both concerned with a particular piece of text. The communication channels they provide will be complementary, with a document segment providing a channel for indirect communication through the document, and the annotations providing a channel of direct communication about the document. The hope is that providing a range of complementary and interrelated channels will help users to achieve the common understanding ((e) in figure 1) that is central to successful cooperative work.

Bibliography

E. E. Beck (1993). A Survey of Experiences of Co-Authoring. In Computer Supported Collaborative Writing, Ed. M. Sharples. London, Springer-Verlag.
M. Beaudouin-Lafon (1990). Collaborative Development of Software. Multi-User Interfaces and Applications - Proc. of the IFIP WG 8.4 Conference on Multi-User Interfaces and Applications, Eds. S. Gibbs and A. Verrijn-Stuart. Heraklion, Greece, North-Holland. pp. 103-114.
M. Begeman, P. Cook, C. Ellis and M. Graf (1986). Project NICK: meetings augmentation and analysis. In CSCW'86: Conference on Computer Supported Cooperative Work. ACM. pp. 1-6.
V. Bellotti, P. Dourish and A. MacLean (1991). From Users' Themes to Designers' Dreams: Developing a Design Space for Shared Interactive Technologies. AMODEUS RP6/WP7, EPC-91-112, Rank Xerox EuroPARC, 61, Regent Street, Cambridge, CB2 1AB, UK.
R. P. Carasik and C. E. Grantham (1988). A Case Study of CSCW in a Dispersed Organisation. Proceedings CHI'88. ACM Press. pp. 61-66.
W. L. Chafe (1986). Writing in the perspective of speaking. In Studying Writing: linguistic approaches, Eds. C. R. Cooper and S. Greenbaum. Beverly Hills, Sage. pp. 12-39.
H. H. Clark and S. E. Brennan (1991). Grounding in Communication. In Perspectives on Socially Shared Cognition, Eds. L. B. Resnick, J. M. Levine and S. D. Teasley. Washington, APA Books. pp. 127-149.
J. Conklin and M. L. Begeman (1989). gIBIS: A tool for all reasons. Journal of the American Society for Information Science, 40(3) pp. 200-213.
A. J. Dix (1991). Formal Methods for Interactive Systems. (see chapter 10). Academic Press.
E. A. Dykstra and R. P. Carasik (1991). Structure and support in cooperative environments: the Amsterdam Conversation Environment. Int. J. Man-Machine Studies, 34(3) pp. 419-434.
C. A. Ellis, S. J. Gibbs and G. L. Rein (1989). Design and use of a group editor. IFIP WG2.7 Working Conference on Engineering for Human Computer Interaction. Napa Valley, C.A., U.S.A, August 1989, pp.
C. A. Ellis, S. J. Gibbs and G. L. Rein (1991). Groupware: Some Issues and Experiences. Communications of the ACM, 34(1) pp.
R. Fish, R. Kraut, M. Leland and M. Cohen (1988). Quilt: A collaborative tool for cooperative writing. Proceedings of the Conference on Office Information Systems. Palo Alto, Calif., ACM, New York. pp. 30-37.
S. J. Gibbs (1989). LIZA: An extensible groupware toolkit. CHI'89 Proceedings. pp. 29-35.
C. W. Johnson and M. D. Harrison (1992). Using temporal logic to support the specification and prototyping of interactive control systems. International Journal of Man-Machine Studies, 36 pp. 357-385.
J. L. Koszarek, T. L. Lindstrom, J. R. Ensor and S. R. Ahuja (1990). A multi-user document review tool. In Multi-User Interfaces and Applications, Eds. S. Gibbs and A. A. Verrijn-Stuart. Amsterdam, Elsevier Science Publishers. pp. 207-214.
M. D. P. Leland, R. S. Fish and R. E. Kraut (1988). Collaborative document production using Quilt. In Proceedings of CSCW'88. pp. 206-215.
H. Lubich and B. Plattner (1990). A proposed model and functionality definition for a collaborative editing and conferencing system. In Multi-User Interfaces and Applications, Eds. S. Gibbs and A. A. Verrijn-Stuart. Elsevier Science Publishers. pp. 215-232.
T. W. Malone, K. R. Grant, K. Lai, R. Rao and D. Rosenblitt (1987). Semistructured messages are surprisingly useful for computer supported coordination. ACM Transactions on Office Information Systems, 5(2) pp. 115-131.
J. C. McCarthy and V. C. Miles (1990). Elaborating Communication Channels in Conferencer. In Multi-User Interfaces and Applications, Eds. S. Gibbs and A. A. Verrijn-Stuart. Elsevier Science Publishers. pp. 181-193.
J. C. McCarthy, V. C. Miles, A. F. Monk and M. D. Harrison (1991). Building Expectations From Context in On-line Conferencing: A Review. In Human Jobs and Computer Interfaces, Eds. M. I. Nurminen and G. R. S. Weir. Elsevier Science Publishers. pp. 147-162.
J. C. McCarthy, V. C. Miles and A. F. Monk (1991a). An experimental study of common ground in text-based communication. Reaching Through Technology: Proceedings of the CHI'91 Conference, Eds. S. Robertson, G. Olson and J. Olson. New Orleans, ACM Press. pp. 209-217.
J. C. McCarthy and A. F. Monk (1991b). Communication and Computer Supported Cooperative Work (CSCW). Internal Report, University of York, UK.
V. C. Miles, C. W. Johnson, J. C. McCarthy and M. D. Harrison (1991). Supporting Prediction in Complex Dynamic Systems. Proceedings of HCI'91, People and Computers VI. Cambridge University Press. pp. 133-144.
C. M. Neuwirth, D. S. Kaufer, R. Chandhok and J. H. Morris (1990). Issues in the design of computer support for co-authoring and commenting. In CSCW'90 Proceedings of the Conference on Computer-Supported Cooperative Work, Los Angeles. ACM SIGCHI & SIGOIS. pp. 183-195.
R. Newman and J. Newman (1992). Social Writing: Premisses and Practices in Computerised Contexts. In Computer Supported Collaborative Writing, Ed. M. Sharples. Springer Verlag. pp.
G. M. Olson (1989). The Nature of Group Work. Proceedings of the Human Factors Society 33rd Annual Meeting.
J. S. Olson, G. M. Olson, L. A. Mack and P. Wellner (1990). Concurrent editing: the group's interface. Human-Computer Interaction - INTERACT'90, Eds. D. Diaper, D. Gilmore, G. Cockton and B. Shackel. Elsevier Science Publishers, Amsterdam. pp. 835-840.
E. Pettersson (1989). Automatic information processes in document reading. A study of information handling in two intensive care units. ECSCW'89. pp. 63-73.
L. R. Posner, R. M. Baecker and M. M. Mantei (1991). How people write together. Technical Report, Computer Systems Research Institute and Department of Computer Science, University of Toronto, Canada.
M. Stefik, D. Bobrow, G. Foster, S. Lanning and D. Tatar (1987). WYSIWIS revisited: early experiences with multiuser interfaces. ACM Transactions on Office Information Systems, 5(2) pp. 147-167.
M. Sharples and L. Pemberton (1990). Starting from the Writer: Guidelines for the Design of User-Centred Document Processors. Computer Assisted Language Learning, pp. 37-57.
D. G. Tatar, G. Foster and D. G. Bobrow (1991). Design for Conversation: Lessons from Cognoter. International Journal of Man-Machine Studies, 34(2) pp. 185-209.
E. Yoder, R. Akscyn and D. McCracken (1989). Collaboration in KMS, a shared hypermedia system. In Proceedings CHI'89 Conference on Computer-Human Interaction. ACM, New York. pp. 37-42.