

All About Earcons

This description of earcons is based on the original one by Blattner and can be found in my thesis (Chapter 3); see my publication list to get a copy. The citations can be found in my bibliography.

Auditory maps and other examples of earcons (continued)

Barfield, Rosenberg & Levasseur [14] carried out experiments where they used earcons to aid navigation through a menu hierarchy. They say (p 102):

"...the following study was done to determine if using sound to represent depth within the menu structure would assist users in recalling the level of a particular menu item".

The earcons they used were very simple, just decreasing in pitch as a subject moved down the hierarchy. The sounds lasted half a second. They describe them thus (p 104):

"...the tones were played with a harpsichord sound and started in the fifth octave of E corresponding to the main or top level of the menu and descended through B of the fourth octave".

These sounds did not fully exploit the advantages offered by earcons (for example, they did not use rhythm or different timbres) and they did not improve user performance in the task. Had better earcons been used, a performance increase might have been found. This work shows that there is a need for a set of guidelines for the creation of effective earcons: if earcons are created without care, they may be ineffective. Later in this thesis an investigation of earcons is undertaken and a set of guidelines is produced to help in such cases. Another possible reason for the lack of success of the earcons in this experiment is that they might not have been used in the best place in the interface. There are no rules for where to use sounds; the choice is up to individual designers, and so mistakes can occur. Later in the thesis an investigation is undertaken of where sound can effectively be used in an interface.
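The depth-to-pitch mapping that Barfield et al. describe can be sketched in a few lines of Python. This is only an illustration, not their implementation. The source gives the start note (E in the fifth octave), the end note (B in the fourth octave), the harpsichord timbre and the half-second duration. The intermediate notes, the MIDI numbering (C4 = 60, so E5 = 76 and B4 = 71), the equal-tempered tuning at A4 = 440 Hz, and the names `LEVEL_NOTES`, `midi_to_hz` and `depth_tone` are all assumptions made for the example.

```python
# Hypothetical sketch of a depth-to-pitch earcon mapping.
# Assumed (not stated in the source): one natural-note step per menu level,
# MIDI numbering with C4 = 60, equal temperament with A4 = 440 Hz.

# Menu depth 0 is the top level; each deeper level gets a lower note.
LEVEL_NOTES = [("E5", 76), ("D5", 74), ("C5", 72), ("B4", 71)]

TONE_DURATION_S = 0.5  # the source says the sounds lasted half a second


def midi_to_hz(note: int) -> float:
    """Convert a MIDI note number to a frequency in Hz (equal temperament)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def depth_tone(depth: int):
    """Return (note name, frequency in Hz, duration in s) for a menu depth."""
    if not 0 <= depth < len(LEVEL_NOTES):
        raise ValueError(f"no tone defined for menu depth {depth}")
    name, midi = LEVEL_NOTES[depth]
    return name, midi_to_hz(midi), TONE_DURATION_S


if __name__ == "__main__":
    for depth in range(len(LEVEL_NOTES)):
        name, hz, dur = depth_tone(depth)
        print(f"depth {depth}: {name} at {hz:.1f} Hz for {dur} s")
```

The sketch makes the limitation discussed above visible: pitch is the only parameter that varies with depth, while rhythm and timbre stay fixed.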

Blattner, Greenberg & Kamegai [24] discuss the use of earcons to present specific information such as speed, temperature or energy, in a system to sonify turbulent liquids. They suggest that earcons could be used to present this information so that the user would not have to take their eyes off the main graphical display. Unfortunately, the system they discuss was never implemented.

Some recent work by Stevens, Brewster, Wright & Edwards [160] used earcons to provide a method for blind readers to 'glance' at algebra. They say that a glance, or overview, is very important for planning the reading process, but there is currently no way for blind readers to obtain one, so they suggest using earcons. The parameters described above for manipulating earcons are very similar to those describing prosody in speech. Stevens et al. combined algebra syntax, algebraic prosody and earcons to produce a system of algebra earcons. They describe a set of rules that can be used to construct algebra earcons from an algebra expression.

Different items within an algebra expression are replaced by sounds with different timbres: for example, piano for base-level operands, violin for superscripts and drums for equals. Rhythm, pitch and intensity are then defined according to other rules. Stevens et al. experimentally tested the earcons to see if listeners could extract algebraic structure from the sounds and identify the expressions they represented. Their results showed that subjects performed significantly better than chance. This work again shows that earcons are a flexible and useful method of presenting complex information at the interface.
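As a rough illustration of the timbre assignment only, the following Python sketch maps the items of a tokenised expression to the three timbres named above. It is not the rule set of Stevens et al. The token format, the use of `^` to mark a superscript, and the names `TIMBRE_BY_ROLE`, `classify` and `timbres` are assumptions made for the example. The rules for rhythm, pitch and intensity are not given here, so they are left out, and the input is expected to contain only operands, `^` and `=`.

```python
# Hypothetical sketch: assign a timbre to each item of a tokenised expression.
# Only the three timbre assignments named in the text are modelled.
TIMBRE_BY_ROLE = {
    "operand": "piano",       # base-level operands
    "superscript": "violin",  # superscripts
    "equals": "drums",        # the equals sign
}


def classify(tokens):
    """Label each token with a role.

    Assumed input format: a list of strings such as ["x", "^", "2", "=", "y"],
    where "^" is not sounded itself but marks the next token as a superscript.
    """
    roles = []
    superscript_next = False
    for tok in tokens:
        if tok == "^":
            superscript_next = True
            continue
        if tok == "=":
            role = "equals"
        elif superscript_next:
            role = "superscript"
        else:
            role = "operand"
        superscript_next = False
        roles.append((tok, role))
    return roles


def timbres(tokens):
    """Return (token, timbre) pairs in reading order."""
    return [(tok, TIMBRE_BY_ROLE[role]) for tok, role in classify(tokens)]


if __name__ == "__main__":
    for tok, timbre in timbres(["x", "^", "2", "=", "y"]):
        print(f"{tok!r} -> {timbre}")
```

A listener hearing piano, violin, drums, piano in sequence could in principle recover the shape "operand, superscript, equals, operand" without hearing the symbols themselves, which is the kind of overview a glance is meant to give.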


Comparison 1: Jones & Furner

There have been two main comparisons of earcons and auditory icons. The first was carried out by Jones & Furner [100]. They performed two experiments. In the first they tried to find out which types of sounds listeners preferred. Subjects were given typical interface commands (for example, delete or copy) and were played a sample earcon, an auditory icon and some synthesised speech. In this experiment, earcons were preferred to auditory icons. A second experiment was carried out to see if subjects could associate sounds with commands. The subjects were played a sound and had to match it to a command in a list. This time auditory icons proved to be more popular than earcons. This may have been because family earcons would initially be harder to associate with commands (as they have no inherent meaning and take time to learn) whereas auditory icons have a semantic relationship with the function they represent.

In both experiments the subjects scored highest with the synthesised speech, which is not surprising as speech contains all the information necessary and does not require any learning. The paper does not describe the nature of the earcons and auditory icons used. It may have been that the cues were not well designed, which would have affected preference and association. The reasons why speech is not being used were outlined at the beginning of this section and these still hold true, as speech loses some of its advantages when used in different situations.

In the preference experiment a further set of abstract commands was presented to the subjects. Interestingly, there was no significant difference in preference between earcons and synthetic speech in this case, but the preference for auditory icons was much lower. This matches expectations, as there is no semantic link between an abstract auditory icon and its meaning.

Comparison 2: Lucas

A further comparison experiment was undertaken by Lucas [111]. He conducted a more detailed analysis of earcons, auditory icons and synthetic speech. He presented subjects with sound stimuli of the three types and they had to choose which command they felt most closely fitted the sound. He conducted a second trial a week after the first.

His results showed that there was no difference in response time between earcons and auditory icons. Speech was significantly faster. There were no differences in response between trials one and two. However, subjects did make fewer errors on trial two. This would indicate that with more training auditory cues could easily be learned. There were no significant differences in error rates between earcons and auditory icons. There were no errors in the speech condition (as the stimuli were self-explanatory). After the first trial, half of the subjects were given an explanation of the sounds used and the design methods. These subjects showed a decrease in error rates on trial two. Lucas (p 7) says: "This indicates that a knowledge of the auditory cue design can improve the accuracy of cue recognition...".

Discussion of the comparisons

These two comparisons have shown little difference between earcons and auditory icons. It may be that each has advantages over the other in certain circumstances and that a combination of both might be best. In some situations the intuitive nature of auditory icons may make them favourable. In other situations earcons might be best because of the powerful structure they contain, especially if there is no real-world equivalent of what the sounds are representing. Indeed, there may be some middle ground where the natural sounds of auditory icons can be manipulated to give the structure of earcons.

Cohen [44] proposes that there is a continuum of sound from the literal everyday sounds of auditory icons to the abstract sounds of earcons. Objects or actions within an interface that do not have an auditory equivalent must have an auditory icon made for them. This then has no semantic link to what it represents: its meaning must be learned. The auditory icon then moves more towards the abstract end of the continuum. When hearing an earcon, the listener may hear and recognise a piano timbre, rhythm and pitch structure as a kind of 'catch-phrase'; he/she will not hear all the separate parts of the earcon and work out the meaning from them. The earcon will be heard more as a whole source and thus the perception of the earcon moves more towards the representational side of the continuum. Therefore, earcons and auditory icons are not necessarily as far apart as they might appear. Both have advantages and disadvantages, but the advantages can be maximised and the disadvantages minimised by looking at the properties of each. This thesis will attempt to look more closely at the properties of earcons because, as yet, little is known about them.

One drawback of earcons is that it is unclear how effective they are. They have not been tested in any implementations (as auditory icons have) so it is not known whether listeners will be able to extract the complex information from them. It is also clear from the experiments of Barfield et al. [14] that creating earcons is not a simple matter; it is easy to create ineffective ones. Therefore some clear experiments to test the usability of earcons are needed, as is a set of guidelines to help designers build effective ones. The work described in this thesis attempts to deal with these problems. Earcons are experimentally tested and from the results a set of guidelines is produced to help designers of earcons.

In the work on earcons and auditory icons little indication is given as to where sounds should be used in the interface. Gaver suggests that they should be used in ways suggested by the natural environment but in a computer interface this is not always possible. One important step forward would be to have a method to find where sound in the interface would be useful. This thesis proposes such a method.