Position Paper for CHI 96 Basic Research Symposium (April 13-14, 1996, Vancouver, BC)
Non-Speech Audio and Human-Computer Interaction
Department of Computing Science
The University of Glasgow
Glasgow, G12 8QQ, UK
Tel: +44 (0)141 330 4966, firstname.lastname@example.org
ABSTRACT
The goal of my research is to improve the usability of human-computer
interfaces by the addition of sound. Sound plays an important role in our everyday world but almost none when we interact with computers. I am investigating the combination of sound with visual feedback to improve graphical interfaces, the combination of sound with speech to improve telephone-based interfaces, and the
use of sound to provide access to otherwise inaccessible systems for visually
disabled people. This research improves upon the simple beeps used now to
provide a rich, powerful and carefully designed set of sounds that will enable
more effective interaction.
The research described in this paper suggests the use of non-speech sound
output to enhance information display at the human-computer
interface. There is a growing body of research which indicates that the
addition of non-speech sounds to interfaces can improve performance and
increase usability, for example [2, 5, 8].
Sound is an important means of communication in the everyday world and the
benefits it offers should be taken advantage of at the interface. Such
multimodal interfaces allow a greater and more natural communication
between the computer and the user. They also allow the user to employ more
appropriate sensory modalities to solve a problem, rather than just using one
modality (usually vision) to solve all problems. In spite of increased interest
in multimedia, little solid research has been done on the effective uses of
sound in computers even though all computer manufacturers now include sound
producing hardware in their machines.
Sound has many advantages. For example, it is omni-directional and attention
grabbing so can be used to indicate problems to users. It can work alongside
synthetic speech in purely auditory interfaces or be integrated with graphical
feedback. Graphical interfaces display a large amount of information and can
result in visual overload of the user. One way to overcome this problem is to
use sound. Important information can be displayed on the screen and other
information in sound, reducing overload of the visual sense. Brewster et al.
showed that by adding sound to a graphical interface both the time taken to
complete certain tasks and the time taken to recover from errors could be
reduced.
The sounds used for this research are all based around structured audio messages
called earcons [7,9]. Earcons are abstract, synthetic tones that can be used in structured combinations
to create sound messages to represent parts of a human-computer interface. Detailed
investigations of earcons by Brewster, Wright & Edwards showed that
they are an effective means of communicating information in sound.
Earcons are constructed from motives. These are short, rhythmic
sequences that can be combined in different ways. The simplest method of
combination is concatenation to produce compound earcons. By using more
complex manipulations of the parameters of sound (timbre, register, intensity,
pitch and rhythm) hierarchical earcons can be created. Using these techniques
structured combinations of sounds can be created and
varied in consistent ways.
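As a rough illustration of the construction described above, the following Python sketch models a motive as a short sequence of notes plus sound parameters, builds a compound earcon by concatenation, and derives a hierarchical earcon by changing one parameter while keeping the rest. The names (`Note`, `Motive`, `compound`, `inherit`) and all example values are hypothetical, chosen for illustration; they are not taken from the cited work.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Note:
    pitch: int        # MIDI note number (an assumed representation)
    duration: float   # length in beats


@dataclass(frozen=True)
class Motive:
    notes: Tuple[Note, ...]   # the rhythm and pitches of the motive
    timbre: str = "piano"
    register: int = 0         # octave shift applied to every note
    intensity: float = 0.8    # 0.0 (silent) to 1.0 (loudest)


def compound(*motives: Motive) -> Tuple[Motive, ...]:
    """Concatenate motives, in order, into a compound earcon."""
    return tuple(motives)


def inherit(parent: Motive, **changes) -> Motive:
    """Derive a hierarchical earcon: copy the parent, override some parameters."""
    return replace(parent, **changes)


# Hypothetical motives for an action and an object.
open_ = Motive(notes=(Note(60, 0.5), Note(64, 0.5)))
file_ = Motive(notes=(Note(67, 1.0),))
open_file = compound(open_, file_)        # "open" followed by "file"

# Hypothetical parent earcon and a child that differs only in register.
error = Motive(notes=(Note(48, 0.25), Note(48, 0.25)), timbre="organ")
disk_error = inherit(error, register=1)   # same rhythm and timbre, higher register
```

Because the child copies everything it does not override, related earcons share audible features, which is one way to read "varied in consistent ways".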
There are two strands to this work: integrating sound into graphical
human-computer interfaces and using sound for navigation cues.
Integrating sound into human-computer interfaces
My main area of investigation in this first strand has been the addition of
sound to graphical interface widgets, such as buttons and scrollbars, to
correct usability problems. Commonly used graphical widgets often have
problems. Using a structured technique I developed, I have analysed the problems with standard widgets and then corrected them
using sound. One problem with button widgets is that the user can slip off the
button by mistake. This problem is exacerbated because the feedback from a
slip-off is the same as that for a correct button press and the user is unlikely to be looking at the button to notice the slip-off; he or she will have moved on to the next part of the task. This problem was
solved by adding sound. An experiment showed that participants could recover
from such slip-off errors significantly faster and with significantly fewer
mouse-clicks with sound; they also significantly preferred the
sonically-enhanced buttons.
This was not at the expense of making the system more annoying to use. A
similar analysis and experimental evaluation were conducted on
sonically-enhanced scrollbars.
In this case the sounds gave location information and stopped `kangarooing'
errors. Again the results were favourable: Participants completed certain tasks
significantly faster and there was significantly reduced mental workload with
the sonically-enhanced scrollbar. These results show that adding sound to
standard graphical widgets can improve the basic usability of an interface. The
next stage of this work is to build a complete toolkit of sonically-enhanced
widgets.
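The slip-off case can be made concrete with a small sketch. The Python below classifies a mouse press-and-release on a button as a successful press or a slip-off, depending on whether the pointer is still inside the button when the mouse button is released, and maps each outcome to a different sound. The rectangle model and the sound names are assumptions made for illustration; the text does not specify the sounds used in the experiment.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def classify_click(button: Rect, down: Tuple[int, int], up: Tuple[int, int]) -> str:
    """Return 'press', 'slip-off' or 'miss' for a mouse down/up pair."""
    if not button.contains(*down):
        return "miss"        # the press never started on the button
    if button.contains(*up):
        return "press"       # released inside: the button fires
    return "slip-off"        # released outside: the button does not fire


# Hypothetical mapping from outcome to sound; None means no sound is played.
SOUNDS = {"press": "success-earcon", "slip-off": "slip-off-earcon", "miss": None}

ok = Rect(10, 10, 80, 24)
print(classify_click(ok, (20, 20), (25, 22)))   # press
print(classify_click(ok, (20, 20), (95, 22)))   # slip-off
```

The point of the distinct sounds is that the two outcomes look the same on screen, so the difference has to be carried by a channel the user is still attending to.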
My final topic of interest in this strand is the addition of non-speech sounds
to aid disabled people who use single-switch scanning input. With scanning
input the set of choices the user can make is laid out in a matrix on the
display. Scanning row-by-row then occurs until the required row is selected,
then scanning item-by-item begins. The user then selects the required item.
Scanning input is a temporal task; users have to press a switch when a cursor
is over the required target, but it is usually presented as a spatial task with
the items laid out in a grid. Research has shown that for temporal tasks the
auditory modality is often better than the visual. I investigated this by
adding non-speech sound to a visual scanning system. The added sound also
supported our natural ability to perceive rhythm, so that rhythm could be used
to aid the scanning process. The results from a preliminary investigation (again using
earcons for the sound output) were favourable, indicating that the idea is
feasible and further research should be undertaken.
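To make the temporal nature of scanning concrete, here is a minimal sketch of row-then-item scanning. Given a matrix of choices, a fixed scan interval, and the times of two switch presses, it works out which item is selected. The function name, the fixed interval, the wrap-around behaviour, and the assumption that item scanning starts at the moment of the first press are all illustrative choices, not details of the system studied.

```python
from typing import Sequence


def scan_select(matrix: Sequence[Sequence[str]],
                interval: float,
                row_press: float,
                item_press: float) -> str:
    """Return the item chosen by two switch presses.

    Rows are highlighted in turn, one per `interval` seconds, starting at
    time 0. The first press (at `row_press`) picks the highlighted row.
    Items in that row are then highlighted in turn, starting at the time of
    that press, and the second press (at `item_press`) picks one of them.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if not 0 <= row_press <= item_press:
        raise ValueError("presses must be non-negative and in time order")
    row = matrix[int(row_press // interval) % len(matrix)]
    item_index = int((item_press - row_press) // interval) % len(row)
    return row[item_index]


grid = [["A", "B", "C"],
        ["D", "E", "F"],
        ["G", "H", "I"]]

# With a 1 s interval, pressing at 1.4 s picks the second row; pressing
# again 2.2 s later picks the third item of that row.
print(scan_select(grid, 1.0, 1.4, 3.6))   # F
```

Note that the selected item depends only on when the switch is pressed, not on where anything is on screen, which is why a rhythmic audio cue for each scan step is a natural fit.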
Using sound for navigation cues
The other main strand of my research is providing navigation cues in sound. In
some situations graphical feedback cannot be used to provide this information.
In completely auditory interactions, such as telephone-based interfaces or
those for visually disabled people, it is impossible to use graphical cues. In
other systems where graphical feedback is available, the display may already be
completely occupied by important information that extra graphical cues would
obscure. One example is an interface for people with speaking difficulties who
need to access a graphical library of pictographic images.
An experiment to discover if earcons could provide navigational cues in a menu
hierarchy was conducted. A hierarchy of 25 nodes and four levels was created
with earcons for each node. Participants had to identify their location in the
hierarchy by listening to an earcon. Results showed that participants could
identify their location with 81.5% accuracy, indicating that earcons are a
powerful method of communicating hierarchy information. Participants were also
tested to see if they could identify where previously unheard earcons would fit
in the hierarchy. The results showed that they could do this with over 90%
accuracy. These results showed that the participants quickly learned the rules
from which the earcons were constructed, so demonstrating that earcons are a
robust and extensible method of communicating hierarchy information in sound.
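One way to picture why previously unheard earcons could be placed correctly is that each earcon follows from rules applied along the path from the root of the hierarchy to the node. The sketch below uses an assumed rule set for illustration only (the text does not say which sound parameter encoded which level): the first level below the root chooses the timbre, the second the register, and the third the rhythm. Any path, including one never heard before, then maps to a predictable description, and the description can be decoded back to a location.

```python
from typing import Dict, Tuple

# Hypothetical rule tables, one per level below the root (four levels in total).
TIMBRES = ("piano", "organ", "violin", "flute")
REGISTERS = ("low", "middle", "high")
RHYTHMS = ("one long note", "two short notes", "three short notes")
LEVELS = (("timbre", TIMBRES), ("register", REGISTERS), ("rhythm", RHYTHMS))


def earcon_for(path: Tuple[int, ...]) -> Dict[str, str]:
    """Map a path of child indices (root = empty path) to sound parameters."""
    if len(path) > len(LEVELS):
        raise ValueError("path is deeper than the rule set")
    return {name: values[index]
            for (name, values), index in zip(LEVELS, path)}


def location_of(earcon: Dict[str, str]) -> Tuple[int, ...]:
    """Invert earcon_for: recover the path from the sound parameters."""
    path = []
    for name, values in LEVELS:
        if name not in earcon:
            break
        path.append(values.index(earcon[name]))
    return tuple(path)


node = (1, 2)                       # second child of the root, then its third child
sound = earcon_for(node)
print(sound)                        # {'timbre': 'organ', 'register': 'high'}
print(location_of(sound) == node)   # True
```

A listener who has learned the rules, rather than memorised individual sounds, can perform the equivalent of `location_of` on an earcon they have never heard.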
Any references by Brewster can be found on-line on my publications page.
1. Blattner, M., Papp, A. and Glinert, E. Sonic enhancements of two-dimensional
graphic displays. In The Proceedings of the First International Conference
on Auditory Display (Santa Fe Institute, Santa Fe) Addison-Wesley, 1992,
2. Blattner, M., Sumikawa, D. and Greenberg, R. Earcons and icons: Their
structure and common design principles. Human Computer Interaction 4, 1
3. Brewster, S.A., Raty, V.-P. and Kortekangas, A. Earcons as a method of
providing navigational cues in a menu hierarchy. Accepted for publication in HCI'96
(London, UK), 1996.
4. Brewster, S.A., Raty, V.-P. and Kortekangas, A. Enhancing scanning input
with non-speech sounds. Accepted for publication in Assets'96
(Vancouver, Canada), 1996.
5. Brewster, S.A., Wright, P.C., Dix, A.J. and Edwards, A.D.N. The sonic
enhancement of graphical buttons. In Proceedings of Interact'95
(Lillehammer, Norway) Chapman & Hall, 1995, pp. 43-48.
6. Brewster, S.A., Wright, P.C. and Edwards, A.D.N. The design and evaluation
of an auditory-enhanced scrollbar. In Proceedings of CHI'94 (Boston,
Ma.) ACM Press, Addison-Wesley, 1994, pp. 173-179.
7. Brewster, S.A., Wright, P.C. and Edwards, A.D.N. An evaluation of earcons
for use in auditory human-computer interfaces. In Proceedings of
INTERCHI'93 (Amsterdam) ACM Press, Addison-Wesley, 1993, pp. 222-227.
8. Gaver, W., Smith, R. and O'Shea, T. Effective sounds in complex systems: The
ARKola simulation. In Proceedings of CHI'91 (New Orleans) ACM Press,
Addison-Wesley, 1991, pp. 85-90.
9. Sumikawa, D., Blattner, M., Joy, K. and Greenberg, R. Guidelines for the
syntactic design of audio cues in computer interfaces. Lawrence Livermore
National Laboratory, 1986.