Providing a structured method for integrating non-speech audio into human-computer interfaces.

All of the citations here are in my bibliography. The full thesis can be downloaded from my publications list page, where papers I have written on the topic can also be found.

Thesis Abstract

This thesis provides a framework for integrating non-speech sound into human-computer interfaces. Previously there was no structured way of doing this; it was done in an ad hoc manner by individual designers. This led to ineffective uses of sound. In order to add sounds to improve usability two questions must be answered: What sounds should be used and where is it best to use them? With these answers a structured method for adding sound can be created.

An investigation of earcons as a means of presenting information in sound was undertaken. A series of detailed experiments showed that earcons were effective, especially if musical timbres were used. Parallel earcons were also investigated (where two earcons are played simultaneously) and an experiment showed that they could increase sound presentation rates. From these results guidelines were drawn up for designers to use when creating usable earcons. These formed the first half of the structured method for integrating sound into interfaces.

An informal analysis technique was designed to investigate interactions to identify situations where hidden information existed and where non-speech sound could be used to overcome the associated problems. Interactions were considered in terms of events, status and modes to find hidden information. This information was then categorised in terms of the feedback needed to present it. Several examples of the use of the technique were presented. This technique formed the second half of the structured method.

The structured method was evaluated by testing sonically-enhanced scrollbars, buttons and windows. Experimental results showed that sound could improve usability by increasing performance, reducing time to recover from errors and reducing workload. There was also no increased annoyance due to the sound. Thus the structured method for integrating sound into interfaces was shown to be effective when applied to existing interface widgets.

1.1 INTRODUCTION

The combination of visual and auditory information at the human-computer interface is a natural step forward. In everyday life both senses combine to give complementary information about the world; they are interdependent. The visual system gives us detailed data about a small area of focus whereas the auditory system provides general data from all around, alerting us to things outside our peripheral vision. The combination of these two senses gives much of the information we need about our everyday environment. Dannenberg & Blattner ([23], pp xviii-xix) discuss some of the advantages of using this approach in multimedia/multimodal computer systems:

"In our interaction with the world around us, we use many senses. Through each sense we interpret the external world using representations and organisations to accommodate that use. The senses enhance each other in various ways, adding synergies or further informational dimensions".

They go on to say:

"People communicate more effectively through multiple channels. ... Music and other sound in film or drama can be used to communicate aspects of the plot or situation that are not verbalised by the actors. Ancient drama used a chorus and musicians to put the action into its proper setting without interfering with the plot. Similarly, non-speech audio messages can communicate to the computer user without interfering with an application".

These advantages can be brought to the multimodal human-computer interface. Whilst directing our visual attention to one task, such as editing a document, we can still monitor the state of other tasks on our machine. Currently, almost all information presented by computers uses the visual sense. This means information can be missed because of visual overload or because the user is not looking in the right place at the right time. A multimodal interface that integrated information output to both senses could capitalise on the interdependence between them and present information in the most efficient and natural way possible. This thesis aims to investigate the creation of such multimodal interfaces.

The classical uses of non-speech sound can be found in the human factors literature (see Deatherage [48] or McCormick & Sanders [116]). Here it is used mainly for alarms and warnings or monitoring and status information. Alarms are signals designed to interrupt the on-going task to indicate something that requires immediate attention. Monitoring sounds provide information about some on-going task. Buxton [38] extends these ideas and suggests that encoded messages could be used to pass more complex information in sound and it is this type of auditory feedback that will be considered here.

The use of sound to convey information in computers is not new. In the early days of computing programmers used to attach speakers to a computer's bus or program counter [168]. The speaker would click each time the program counter was changed. Programmers would get to know the patterns and rhythms of sound and could recognise what the machine was doing. Another everyday example is the sound of a hard disk. Users often can tell when a save or copy operation has finished by the noise their disk makes. This allows them to do other things whilst waiting for the copy to finish. Sound is therefore an important information provider, giving users information about things in their systems that they cannot see. It is time that sound was specifically designed into computer systems rather than being an add-on or an accident of design that can be taken advantage of by the user. The aim of the research described here is to provide a method to do this.

As DiGiano & Baecker [55] suggest, non-speech audio is becoming a standard feature of most new computer systems. Next Computers [175] have had high quality sound input and output facilities since they were first brought out and Sun Microsystems and Silicon Graphics [154,185] have both introduced workstations with similar facilities. As Loy [110] says, MIDI interfaces are built in to many machines and are available for most others so that high quality music synthesisers are easily controllable. The hardware is therefore available but, as yet, it is unclear what it should be used for. The hardware manufacturers see it as a selling point but its only real use to date is in games or for electronic musicians. The powerful hardware plays no part in the everyday interactions of ordinary users. Another interesting point is made by DiGiano & Baecker [55]: "The computer industry is moving towards smaller, more portable computers with displays limited by current technology to fewer colours, less pixels, and slower update rates". They suggest that sound can be used to present information that is not available on the portable computer due to lack of display capability.

We have seen that users will take advantage of sounds in their computer systems and that there is sophisticated sound hardware available currently doing nothing. The next step that must be taken is to link these two together. The sound hardware should be put to use to enhance the everyday interactions of users with their computers. This is the area addressed by the research described in this thesis.

1.1.1 Research topics in auditory interface design

In 1989 Buxton, Gaver & Bly [39] suggested six topics that needed further research in the area of auditory interfaces. The areas that they suggest for further investigation partly motivated the work in this thesis. The research topics are:

Use of non-speech sound: Research is needed to find out how people use sound and also to find out about the perception of higher-level musical structures to assess their potential to encode information. What sorts of variations of sounds will prove the most useful and the best associated with a particular meaning? What about the annoyance due to sound?
Mapping of information to sound: Research is needed to explore the mapping of information to sound. Everyday sounds can be mapped to everyday events in a computer. This is intuitive but does not work if there is no everyday equivalent to the operation. Some musical properties map easily into sound (high pitch means up) but are there others? How hard is it to learn new mappings?
Sounds in relation to graphics: How do sounds work in relationship to other types of feedback in the interface? Sounds can complement, replace or work independently of other feedback. Can auditory and visual components be designed to create one coherent system?
User manipulation of sounds: What control should users be given over the parameters of sounds in the interface? Should they be allowed to control volume? What other kinds of controls are needed?
Structure of sounds: Can useful sounds be built-up from smaller components? How are complex structures mapped to sound? How easy is it for listeners to perceive and learn the structures?
System support for sound: What architectures (hardware and software) are needed to support sound in the interface? What capabilities of sounds are needed? Are MIDI controllers and synthesisers necessary?

These topics are presented so that the description of research in the thesis which follows is put in context. After the contents of the thesis have been described in section 1.6 the work in the thesis will be explained in terms of this research agenda.

One question that might be asked is: Why use sound to present information? A graphical method could be used instead. The drawback with this is that it puts an even greater load on the visual channel. Furthermore, sound has certain advantages. For example, it can be heard from all around, it does not disrupt the user's visual attention and it can alert the user to changes very effectively. It is for these reasons that this thesis suggests sound should be used to enhance the graphical user interface.

1.2 MOTIVATION FOR RESEARCH INTO AUDITORY INTERFACES

Some of the general advantages that can be gained from adding sound have been described but what are the specific benefits that it offers? There are many reasons why it is important to use sound in user interfaces:

To reduce the load on the user's visual system [114]. Modern, large screen workstations and graphical interfaces use the visual system very intensively. This means that we may miss important information because the visual system is overloaded. Mountford & Gaver ( [119], p 322) suggest that the visual display can be overburdened because:
"system information is traditionally displayed via graphical feedback that remains on the screen, although it may be obsolete or irrelevant soon after it is shown. The result is often crowded, incomprehensible displays".
To stop this overload, information could be displayed in sound. With the limited amount of screen space available, the presentation of some information in sound would allow more important graphical data to be displayed on the screen.
Non-intrusive enhancement [103]. Sound can be added to visual displays without interfering with existing tools and skills. If sounds are introduced redundantly with graphics then users will be able to continue to use the systems as before but gain from the advantages of sound. Kramer [103] suggests that the addition of sound will enhance the perceived quality of systems because it allows increased refinement and subtlety.
The auditory sense is under-utilised. The auditory system is very powerful and would appear to be able to take on the extra capacity. Experiments have shown that a human can distinguish between any two of 400,000 different sounds and remember and identify up to 49 sounds at one time [27]. In certain cases, reactions to auditory stimuli have also been shown to be faster than reactions to visual stimuli [27].
Sound is attention grabbing [99]. Users can choose not to look at the screen but cannot avoid hearing sound (if they are at the machine). This makes the auditory system very good for presenting alarms and warnings.
There is psychological evidence to suggest that sharing information across different sensory modalities can actually improve task performance [36, 132] (See Chapter 3 for more on this). Intermodal correlations resulting from sharing between the senses may make the interface more natural. For example, throwing something into the wastebasket and hearing a smashing noise on a computer reflects real life. Sound also has a greater temporal resolution than vision. This means it is good for representing rapidly changing data.
When information is represented in a visual form users must focus their attention on the output device in order to obtain the presented information and to avoid missing anything. According to Perrott, Sadralobadi, Saberi & Strybel [132] humans view the world through a window of 80 deg. laterally and 60 deg. vertically. Within this visual field focusing capacity is not uniform. The foveal area of the retina (the part with the greatest acuity) subtends an angle of only two degrees around the point of fixation [139]. Sound, on the other hand, is omni-directional. It can be heard without the need to concentrate on an output device, thus providing greater flexibility. Sound does have drawbacks because of its transient nature - once it has been played it cannot be heard again but this may be advantageous, for example, when presenting dynamic, rapidly changing data.
Some objects or actions within an interface may have a more natural representation in an auditory form. Mountford & Gaver ( [119], p 321) suggest sound is useful because "[it] is a familiar and natural medium for conveying information that we use in our everyday lives". Gaver [74] suggests that sounds are good for providing information on background processes or inner workings without disrupting visual attention. Sound is also a very different medium for representing information than graphics. Bly ( [27], p 14) suggests: "... perception of sound is different to visual perception, sound can offer a different intuitive view of the information it presents ...". Therefore, sound could allow us to look at information we already have in different ways.
To make computers more usable by visually disabled users. Developments in graphical user interfaces, such as the Apple Macintosh or Microsoft Windows for the PC, have made it harder for blind people to use computers [63]. In older systems, for example PCs running MS-DOS, all the information presented was in text. A screen reader could be attached which would read all the text displayed on the screen in synthetic speech. Thus a blind person had access to all of the same information as a sighted person. With the development of graphical displays, information is presented in a pictorial form; users click on a picture of the application they want, instead of reading its name in a list. A screen reader cannot read this kind of graphical information. Providing information in an auditory form could help solve this problem and allow visually disabled persons to use the facilities available on modern computers [121].
Buxton ([38], p 3) claims that sighted users can become so overloaded with visual information that they are effectively visually disabled. He says that if our visual channel is overloaded "we are impaired in our ability to assimilate information through the eyes". Therefore research into displaying information in sound for visually disabled users could be used to help the sighted in these situations. This is also the case in `eyes-free' interfaces, for instance where the user must keep visual contact with other elements of the environment or where vision is otherwise impaired, as in the cockpit of a fighter aircraft.

The area of auditory interfaces is growing as more and more researchers see the possibilities offered by sound because, as Hapeshi & Jones ( [89], p 94) suggest, "Multimedia provide an opportunity to combine the relative advantages of visual and auditory presentations in ways that can lead to enhanced learning and recall". There are several examples of systems that use sound and exploit some of its advantages. However, because the research area is still in its infancy, most of these systems have been content to show that adding sound is possible. There are very few examples of systems where sound has been added in a structured way and then formally evaluated to investigate the effects it had. This is one of the aims of this thesis.

1.3 WHAT SOUNDS SHOULD BE USED AND WHERE?

Section 1.2 showed that there are many compelling reasons for using sound at the interface. This brings up two fundamental questions:

What sounds should be used at the interface to communicate information effectively?
Where should sound be used to best effect at the interface?

Prior to the work reported in this thesis there was no structured method a designer could use to add sound. It had to be done in an ad hoc manner for each interface. This led to systems where sound was used but gave no benefit, either because the sounds themselves were inappropriate or because they were used in inappropriate places. If sounds do not provide any advantages then there is little point in the user using them. They may even become an annoyance that the user will want to turn off. However, if the sounds provide information users need then they will not be turned off. The work described in this thesis answers these two questions and from the answers provides a structured method to allow a designer (not necessarily skilled in sound design) to add effective auditory feedback that will improve usability. The structured method provides a series of steps that the designer can follow to find out where to use sound and then to create the sounds needed.

There are several different methods for presenting information in sound and two of the main ones are: Auditory icons [74] and earcons [25]. Auditory icons use natural, everyday sounds to represent actions and objects within an interface. The sounds have an intuitive link to the thing they represent. For example, selecting an icon might make a tapping sound because the user presses on the icon with the cursor. Auditory icons have been used in several interfaces. Whilst they have been shown to improve usability [79] no formal evaluation has taken place. One drawback is that some situations in a user interface have no everyday equivalents and so there are no natural sounds that can be used. For example, there is no everyday equivalent to a database search so a sound with an intuitive link could not be found.

Earcons are the other main method of presenting information in sound. They differ from auditory icons in the types of sounds they use. Earcons are abstract, synthetic tones that can be used in structured combinations to create sound messages to represent parts of an interface. Earcons are composed of motives, which are small sub-units that can be combined in different ways. They have no intuitive link to what is represented; it must be learned. Prior to the research described in this thesis, earcons had never been evaluated. The best ways to create them were not known. It was not even clear if users would be able to learn the structure of earcons or the mapping between the earcon and its meaning. This lack of knowledge motivated the investigation of earcons carried out in this thesis. When more was known about earcons a set of guidelines for their production could be created. These guidelines should also embody knowledge about the perception of sound so that a designer with no skill in sound design could create effective earcons.

Neither of the two sound presentation methods above gives any precise rules as to where in the interface the sounds should be used. The work on auditory icons proposed that they should be used in ways suggested by the natural environment. As discussed above, this can be a problem due to the abstract nature of computer systems; there may be no everyday equivalent of the interaction to which sound must be added. This work also only uses sounds redundantly with graphical feedback. Sounds can do more than simply indicate errors or supply redundant feedback for what is already available on the graphical display. They should be used to present information that is not currently displayed (give more information) or present existing information in a more effective way so that users can deal with it more efficiently. A method is needed to find situations in the interface where sound might be useful and this thesis presents such a method. It should provide for a clear, consistent and effective use of non-speech audio across the interface. Designers will then have a technique for identifying where sound would be useful and for using it in a more structured way rather than it just being an ad hoc decision.

In the research described in this thesis sound is used to make explicit information that is hidden in the interface. Hidden information is an important source of errors because often users cannot operate the interface effectively if information is hidden. There are many reasons why it might be hidden: It may not be available because of hardware limitations such as CPU power; it may be available but just difficult to get at; there may be too much information so that some is missed because of overload; or the small area of focus of the human visual system may mean that things are not seen. This thesis describes an informal analysis technique that can be used to find hidden information that can cause errors. This technique models an interaction in terms of event, status and mode information and then categorises this in terms of the feedback needed to present it.

Many uses of sound at the human-computer interface are never evaluated. One reason for this is that research into the area is very new so that example systems are few in number. Most of the interfaces developed just aimed to show that adding sound was possible. However, for the research area to develop and grow it must be shown that sound can effectively improve usability. Therefore, formal testing of sonically-enhanced interfaces is needed. One aim of this research is to make sure that the effects of sound are fully investigated to discover its impact. In particular annoyance is considered. This is often cited as one of the main reasons for not using sound at the interface. This research investigates if sound is annoying for the primary user of the computer system.

The answers to the two questions of where and what sounds are combined to produce a structured method for adding sound to user interfaces. The analysis technique is used to find where to add sound and then the earcon guidelines are used to create the sounds needed. This method is tested to make sure the guidelines for creating sounds are effective, the areas in which to add sound suggested by the analysis technique work and that usability is improved.

1.4 A DEFINITION OF TERMS

1.4.1 Usability

In the section above one of the aims of the thesis was shown to be creating a structured method for adding sound that would increase usability. What is meant by usability in this case? In ISO standard 9241-11 (reported in [19], p 135 and also described in [126]) it is defined as: "The effectiveness, efficiency and satisfaction with which specific users achieve specified goals in particular environments". Bevan & Macleod [19] suggest that effectiveness can be measured by accuracy, efficiency by time and satisfaction by subjective workload measures. This definition of usability will be used when measuring the effectiveness of the structured method for adding sound.

1.4.2 Multimedia and multimodal systems

The research described in this thesis aims to create multimodal interfaces. What is a multimodal interface and how does it differ from a multimedia one? There are, as yet, no accepted definitions of the terms multimedia and multimodal as Alty and Mayes both describe [3,115]. This thesis uses the definitions proposed by Mayes:

Multimedia: A medium is a carrier of information, for example printed paper, video or a bit-mapped display. As Mayes says (p 2): "It may be used to refer to the nature of the communication technology". A multimedia computer system is one that is capable of the input or output of more than one medium [22]. In this definition a computer screen is a multimedia device because it can display text, graphical images and video. The medium of the display can contain pictures, text, etc.
Multimodal: The term mode has many meanings. In computer system dialogues modes put an interpretation on information and affect what the user is able to do at any given point in the system (see Chapter 6 for more on this). Mode refers to the state of the system. Mode can also refer to the human sense that is used to perceive the information - the sensory modality (see Chapter 2 for more on this). This is the standard psychological definition. In this thesis a multimodal interface is defined as one that presents information in different sensory modalities, specifically visual and auditory.

Almost all computer systems are multimedia by this definition. They all have the ability to present information via different media such as graphics, text, video and sound. They are not all multimodal however. Most of the different media they use present information to the visual system. Very few systems make much of their capacity to produce sound. Errors are sometimes indicated by beeps but almost all interactions take place in the visual modality. The aim of this research is to broaden this and make everyday interactions with computers use the auditory modality as well as the visual.

1.4.3 Musical notation used in the thesis

Standard musical notation is used to describe the earcons in this thesis. In this very brief description only the parts of musical notation used by the sounds in the thesis are described. For a more detailed description of the notation used see Scholes [148]. The earcons used are based around the quarter note. Whole notes are four times as long as quarter notes, half notes twice as long, eighth notes half the length, etc. A quarter note rest is a period of silence for the length of a quarter note. These time divisions and their iconic notations are:

The arrangement of notes on the stave (the series of horizontal lines) defines the rhythm of the earcon. An example earcon might look like this:

These are three quarter notes of increasing pitch. A note with a `>' above it is accented (played slightly louder than normal), with a `<' it is muted. A sequence of notes with a `<' underneath indicates that they get louder (crescendo) and with a `>' they get quieter (decrescendo). The height of the note on the stave indicates its relative pitch. This is only a very simple overview of musical notation.
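The relative note lengths described in this section can be written down as a small sketch. It is an illustration only: the function name `duration_seconds`, the dictionary `QUARTER_NOTE_BEATS` and the default tempo of 120 quarter notes per minute are assumptions made for this example and are not taken from the thesis.

```python
# Note lengths expressed in quarter-note beats, following the text:
# a whole note is four quarter notes, a half note two, an eighth note half.
QUARTER_NOTE_BEATS = {
    "whole": 4.0,
    "half": 2.0,
    "quarter": 1.0,
    "eighth": 0.5,
}


def duration_seconds(note_type: str, tempo_bpm: float = 120.0) -> float:
    """Length in seconds of a note, or of a rest of the same value.

    tempo_bpm is the number of quarter notes per minute. The default of
    120 is an arbitrary assumption for illustration, not a thesis value.
    """
    return QUARTER_NOTE_BEATS[note_type] * 60.0 / tempo_bpm


# The example earcon above: three quarter notes of increasing pitch.
example_earcon = ["quarter", "quarter", "quarter"]
total_seconds = sum(duration_seconds(n) for n in example_earcon)
```

At the assumed tempo each quarter note lasts half a second, so the three-note example lasts one and a half seconds in total.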

1.4.4 Pitch notation used in the thesis

In addition to describing the notes and rhythms used the octave of the notes must be specified. There are eight octaves of seven notes in the western diatonic system [148]. There are many different systems for notating pitch. The one used in this thesis is described in Scholes. In this commonly used system a note, for example `C', is followed by an octave number, for example:


                        Middle C
C1          C2          C3          C4          C5
1046 Hz     523 Hz      261 Hz      130 Hz      65 Hz

So A above middle C (440 Hz) would be A3. This system will be used throughout the thesis to express pitch values.
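The numbering scheme above, in which a lower octave number means a higher pitch, can be sketched as a conversion from a note name and octave number to a frequency. The sketch assumes twelve-tone equal temperament tuned to A3 = 440 Hz, which reproduces the values in the table; the thesis does not state the tuning system, and the names `frequency_hz` and `SEMITONES_FROM_C` are hypothetical.

```python
# Semitone offsets of the seven natural notes above C within one octave.
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

MIDDLE_C_OCTAVE = 3          # from the table: C3 is middle C
A_ABOVE_MIDDLE_C_HZ = 440.0  # from the text: A3 is 440 Hz


def frequency_hz(note: str, octave: int) -> float:
    """Frequency of a natural note in the octave numbering used here.

    Assumes equal temperament (an assumption of this sketch). Octave
    numbers fall as pitch rises: C1 is two octaves above middle C and
    C5 is two octaves below it.
    """
    semitones_from_a = SEMITONES_FROM_C[note] - SEMITONES_FROM_C["A"]
    octaves_above_reference = MIDDLE_C_OCTAVE - octave
    return A_ABOVE_MIDDLE_C_HZ * 2.0 ** (octaves_above_reference + semitones_from_a / 12.0)


# Frequencies of C1 to C5, truncated to whole hertz as in the table.
table_row = [int(frequency_hz("C", octave)) for octave in range(1, 6)]
```

Under these assumptions `table_row` holds 1046, 523, 261, 130 and 65, matching the table, and `frequency_hz("A", 3)` gives 440 Hz.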

1.5 THESIS AIMS

In this section the main aims of the thesis will be summarised. The overall aim of this research is to provide a structured method that designers can use to integrate sounds into human-computer interfaces. By doing this it is also hoped that sound will be shown to be effective at communicating information and able to increase the usability of systems. Before the method can be created two questions must be answered:

What sounds should be used at the human-computer interface? The main aims of this part of the work are:

To investigate whether earcons are an effective method for presenting structured information in sound;
To show the best way to construct earcons;
To investigate whether their rate of presentation could be increased so that they can keep pace with interactions;
To improve upon the current rules for creating them and produce a set of guidelines for interface designers.

Where should sound be used at the human-computer interface? The main aims of this part of the work are:

To analyse some interactions to investigate whether there are problems due to hidden information;
To find out if using event, status and mode analysis will make useful predictions about where to use sound;
To see if the feedback to make the hidden information explicit can be modelled;
To produce an analysis technique that an interface designer could use to find where to add effective sounds.

These two components will be brought together and the structured method will be evaluated. The aim of the evaluation will be:

To determine the effectiveness of the structured method by investigating if the sounds added improve usability;
To find out if sounds used in this way are annoying to the primary user of the computer system.

1.6 CONTENTS OF THE THESIS

Figure 1.1 shows the structure of the thesis and how the chapters contribute to the two questions being investigated. Chapters 2 and 3 set the work in context, Chapters 4 and 5 investigate what sounds are the best to use, Chapter 6 shows where sound should be used and Chapter 7 brings all the work together to show the structured method in action. The following paragraphs give an overview of each chapter.

Figure 1.1: Structure of the thesis.

Chapter 2 gives an introduction to psychoacoustics, the study of the perception of sound. This is important when designing auditory interfaces because using sounds without regard for psychoacoustics may lead to the user being unable to differentiate one sound from another or being unable to hear the sounds. The main aspects of the area are dealt with including: Pitch and loudness perception, timbre, localisation and auditory pattern recognition. The chapter concludes by suggesting that a set of guidelines incorporating this information would be useful so that auditory interface designers would not need to have an in-depth knowledge of psychoacoustics.

Chapter 3 provides a background of existing research in the area of non-speech audio at the interface. It gives the psychological basis for why sound could be advantageously employed at the interface. It then goes on to give detailed information about the main systems that have used sound including: Soundtrack, auditory icons, earcons and auditory windows. The chapter highlights the fact that there are no effective methods in existence that enable a designer to find where to add sound to an interface. It also shows that none of the systems give any real guidance about designing the types of sounds that should be used. One of the main systems, earcons, has not even been investigated to find out if it is effective.

Chapter 4 describes a detailed experimental evaluation of earcons to see whether they are an effective means of communication. An initial experiment shows that earcons are better than unstructured bursts of sound and that musical timbres are more effective than simple tones. The performance of non-musicians is shown to be equal to that of trained musicians if musical timbres are used. A second experiment is then described which corrects some of the weaknesses in the pitches and rhythms used in the first experiment to give a significant improvement in recognition. These experiments formally show that earcons are an effective method for communicating complex information in sound. From the results some guidelines are drawn up for designers to use when creating earcons. These form the first half of the structured method for integrating sound into user interfaces.

Chapter 5 extends the work on earcons from Chapter 4. It describes a method for presenting earcons in parallel so that they take less time to play and can better keep pace with interactions in a human-computer interface. The two component parts of a compound earcon are played in parallel so that the time taken is only that of a single part. An experiment is conducted to test the recall and recognition of parallel compound earcons as compared to serial compound earcons. Results show that there are no differences in the rates of recognition between the two types. Non-musicians are again shown to be equal in performance to musicians. Parallel earcons are shown to be an effective means of increasing the presentation rates of audio messages without compromising recognition. Some extensions to the earcon creation guidelines of the previous chapter are proposed.
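The timing argument for parallel earcons can be illustrated with a small sketch. It assumes that both parts of a compound earcon start together and that nothing else adds delay; the function names and the durations used are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def serial_duration(parts: Sequence[float]) -> float:
    """Total playing time when the parts are played one after another."""
    return sum(parts)


def parallel_duration(parts: Sequence[float]) -> float:
    """Total playing time when all parts start together (assumed)."""
    return max(parts)


# Hypothetical durations in seconds for the two parts of a compound earcon.
parts = [1.5, 1.5]
print(serial_duration(parts))    # both parts in sequence
print(parallel_duration(parts))  # both parts at once
```

With two parts of equal length the parallel form takes the time of a single part, which is the saving described above.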

Chapter 6 investigates the question of where to use sound. It describes an informal analysis technique that can be applied to an interaction to find where hidden information may exist and where non-speech sound might be used to overcome the associated problems. Information may be hidden for reasons such as: it is not available in the interface, it is hard to get at, or there is so much information that it cannot all be taken in. When information is hidden errors can occur because the user may not know enough to operate the system effectively. Therefore, the way this thesis suggests adding sound is to make this information explicit. To do this, interactions are modelled in terms of events, status and modes. When this has been done the information is categorised in terms of the feedback needed to present it. Four dimensions of feedback are used: demanding versus avoidable, action-dependent versus action-independent, transient versus sustained, and static versus dynamic. This categorisation provides a set of predictions about the type of auditory feedback needed to make the hidden information explicit. In the rest of the chapter detailed analyses of many interface widgets are shown. This analysis technique, with the earcon guidelines, forms the structured method for integrating sound into user interfaces.
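The categorisation step can be pictured as a simple record. The sketch below is only an illustration of the vocabulary used above (events, status and modes, plus the four feedback dimensions); the class names, field names and the example item are hypothetical and are not the notation of the thesis.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Kind(Enum):
    """How a piece of hidden information is modelled in the interaction."""
    EVENT = "event"
    STATUS = "status"
    MODE = "mode"


@dataclass(frozen=True)
class HiddenInformation:
    """One item of hidden information and the feedback it is judged to need."""
    description: str
    kind: Kind
    demanding: bool         # True: demanding, False: avoidable
    action_dependent: bool  # True: action-dependent, False: action-independent
    sustained: bool         # True: sustained, False: transient
    dynamic: bool           # True: dynamic, False: static

    def feedback_profile(self) -> Tuple[str, str, str, str]:
        """Return the position of this item on each of the four dimensions."""
        return (
            "demanding" if self.demanding else "avoidable",
            "action-dependent" if self.action_dependent else "action-independent",
            "sustained" if self.sustained else "transient",
            "dynamic" if self.dynamic else "static",
        )


# Hypothetical example item; the categorisation is illustrative only.
item = HiddenInformation(
    description="an example event the user may not notice",
    kind=Kind.EVENT,
    demanding=True,
    action_dependent=True,
    sustained=False,
    dynamic=False,
)
print(item.kind.value, item.feedback_profile())
```

A designer following the method would fill in one such record per item of hidden information and read the profile as a prediction of the kind of sound required.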

Chapter 7 demonstrates the structured method in action. Three sonically-enhanced widgets are designed and tested based on the method. The chapter discusses problems of annoyance due to sound and some ways in which it may be avoided. The first experiment tests a sonically-enhanced scrollbar. The results show that sound decreases mental workload, reduces the time to recover from errors and reduces the overall time taken in one task. Subjects also prefer the new scrollbar to the standard one. Sonically-enhanced buttons are tested next. They are also strongly preferred by the subjects and they also reduce the time taken to recover from errors. Finally, sonically-enhanced windows are tested. Due to a problem with the experiment it is not possible to say whether they improve usability. In none of the three experiments do subjects find the sounds annoying. The structured method for adding sound is therefore shown to be effective.

Chapter 8 summarises the contributions of the thesis, discusses its limitations and suggests some areas for further work.

1.6.1 The thesis in terms of the research topics in auditory interface design

How does the work in this thesis fit into the research agenda described in section 1.1.1? The investigation of earcons in Chapters 4 and 5 falls into three of these areas. It investigates the use of non-speech sound. The experiments investigate the best types of sounds to use: the best timbres, pitches, rhythms, etc. The work deals with the mapping of information to sound and how hard these mappings are to learn. Finally, the chapters look at the structure of sounds. Earcons are investigated to find out if listeners can extract and learn their structure.

Chapter 6 investigates mapping information to sound. The agenda suggests that a method for translating events and data into sound is needed and this is what the research provides. It gives an analysis technique that models hidden information and from this produces rules for creating sounds. The chapter also investigates sound in relation to graphics, suggesting that sound and graphics can be combined to create a coherent system.

Chapter 7 again looks at the use of non-speech sound and particularly at the annoyance due to sound. It considers sound in relation to graphical feedback. Sounds are shown complementing and replacing graphics.

The thesis does not investigate system support for sound, although from the research the types of sounds necessary in an interface are shown. This knowledge could then be used when deciding what hardware and software are needed to support sound in a computer system. The research also does not investigate user manipulation of sounds.

The work undertaken for this thesis has been shown to address many of the major research issues that Buxton et al. suggest are important for the future of auditory interfaces. The answers gained from this thesis will extend knowledge of how sounds can be used at the interface.

These pages are maintained by Stephen Brewster
Email: stephen@dcs.gla.ac.uk