The role of head movement in the analysis of spatial impression

Research Student: Dr 'Ryan' Chungeun Kim
Principal Supervisor: Dr Russell Mason
Co-Supervisor: Dr Tim Brookes
Supported by: EPSRC

Start date: 2006
End date: 2010

Project Outline

This research project focuses on developing objective hearing models that predict the attributes of a sound in a manner similar to the way a human listener would perceive and evaluate it. In particular, it has been suggested that models of auditory perception should take listener head movements into account, since these are known to help listeners to perceive the location of a sound source.

Humans are not usually stationary when listening, but use head movement to explore a sound field and resolve potentially confusing cues. We therefore need to take these movements into account in order to make measurements that accurately predict what listeners hear. For this, we need to find out: what type of head movement listeners make; how to capture the signals at the ears to take these into account; and what it sounds like when physical parameters change as we move our heads. By finding these answers, we can develop a measurement technique that captures the ear signals in a manner that is relevant to normal listening.

The head movements made by listeners were measured in an experiment in which participants listened to a wide range of stimuli and judged one of a number of attributes whilst their head movements were tracked. The range of rotational movement spanned: azimuth ±40 degrees; elevation ±12 degrees; and roll ±13 degrees. The movement range depended on the task, with most movement when the listeners were judging source width and envelopment, less when judging location, and little movement when judging timbre. The ears did not simply move around the horizontal plane, as previously assumed; instead, the movement covered a sloped range of ear positions, in which listeners raised an ear as it moved backwards and lowered it as it moved forwards.
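As a rough illustration of how rotation ranges like those above could be summarised from head-tracker output, the following sketch reports the minimum and maximum angle on each axis. The function name `rotation_ranges` and the assumption that the tracker delivers three equal-length sequences of angles in degrees are hypothetical; the project's actual analysis procedure is not described here.

```python
import numpy as np


def rotation_ranges(azimuth_deg, elevation_deg, roll_deg):
    """Return the (min, max) angle in degrees for each rotation axis.

    Assumes each argument is a sequence of tracked angles in degrees,
    sampled over one listening session (a hypothetical data layout).
    """
    tracks = {
        "azimuth": azimuth_deg,
        "elevation": elevation_deg,
        "roll": roll_deg,
    }
    return {
        name: (float(np.min(values)), float(np.max(values)))
        for name, values in tracks.items()
    }
```

Comparing these per-axis ranges across tasks (for example, width judgements against timbre judgements) would be one simple way to examine the task dependence reported above.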

A sound capture system was then developed to take these movements into account. Two approaches were considered: repeated movement of a head and torso simulator (HATS); or multiple 'ears' in a simpler model of a head. The former is more accurate, but takes more time to capture the signals; the latter is more rapid (and therefore more practical), but is less accurate. Research was undertaken to evaluate the perceptual magnitude (based on just-noticeable difference studies) of the differences between measurements made using each technique. It was found that adding a torso to the sphere improved the accuracy (i.e. the results were more similar to those of the HATS), but adding smaller features such as a nose and pinnae had little effect. Overall, the accuracy of interaural time difference measurements was good in some areas, and the accuracy of interaural level difference measurements was good below approximately 1 kHz. The accuracy of both these parameters was sufficient for a specifically tailored binaural model to predict the perceived location accurately. The accuracy of interaural cross-correlation coefficient measurements was generally good.
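To make the two interaural parameters concrete, here is a minimal sketch of one common way to estimate them from a pair of ear signals: the interaural time difference (ITD) as the lag of the cross-correlation peak, and the interaural level difference (ILD) as a broadband energy ratio in dB. The function names, the ±1 ms search window, the sign conventions and the broadband (rather than per-frequency-band) ILD are all assumptions made for illustration, not the method used in the project.

```python
import numpy as np


def estimate_itd(left, right, fs, max_lag_ms=1.0):
    """Estimate the ITD in seconds as the lag of the cross-correlation peak.

    Assumes equal-length signals. Positive values mean that the left-ear
    signal lags the right-ear signal (an assumed sign convention).
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    max_lag = min(int(round(max_lag_ms * 1e-3 * fs)), len(right) - 1)
    # mode="full" covers lags -(N-1)..(N-1); zero lag sits at index len(right)-1.
    xcorr = np.correlate(left, right, mode="full")
    centre = len(right) - 1
    window = xcorr[centre - max_lag:centre + max_lag + 1]
    lags = np.arange(-max_lag, max_lag + 1)
    return float(lags[np.argmax(window)]) / fs


def estimate_ild(left, right):
    """Estimate the broadband ILD in dB (positive when the left ear is louder)."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    return float(10.0 * np.log10(np.sum(left ** 2) / np.sum(right ** 2)))


# Synthetic check: the left ear receives a quieter copy delayed by 10 samples.
fs = 48000
rng = np.random.default_rng(0)
burst = rng.standard_normal(1000)
right_ear = np.concatenate([burst, np.zeros(20)])
left_ear = 0.5 * np.concatenate([np.zeros(10), burst, np.zeros(10)])
print(estimate_itd(left_ear, right_ear, fs), estimate_ild(left_ear, right_ear))
```

Applying the same estimators to signals captured with the sphere and with the HATS, and comparing the differences against just-noticeable differences, is the kind of comparison the paragraph above describes; a frequency-dependent analysis would require band-pass filtering the signals first.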

Using a combination of spatial sampling theory and perceptually-motivated error tolerances, the spacing of the microphones around the sphere was optimised to reduce the number of processing channels whilst still maintaining measurement accuracy. A demonstration system was created consisting of a sphere containing 20 omnidirectional microphones with a torso.

To determine how best to interpret the results of such measurements, a series of experiments was undertaken to determine the perceived effect of position-dependent variations in the interaural cross-correlation coefficient (IACC). It was found that variations in the IACC when facing forwards affected the source width, distance, and environment width, and that variations in the IACC when facing sideways affected the environment depth, envelopment and spaciousness. The results also showed that the listeners tended to use a 'scanning technique' in which the IACC affected the perceived width along the lateral plane auditioned.
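For reference, the IACC is commonly defined as the maximum absolute value of the normalised cross-correlation between the two ear signals over interaural lags of about ±1 ms. The sketch below follows that common definition; the function name `iacc`, the default lag window and the absence of any band-pass filtering or time windowing are assumptions for illustration, and the exact computation used in the experiments is not stated here.

```python
import numpy as np


def iacc(left, right, fs, max_lag_ms=1.0):
    """Interaural cross-correlation coefficient of two equal-length signals.

    Returns the peak absolute value of the normalised cross-correlation
    within +/- max_lag_ms, which lies between 0 and 1.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    if norm == 0.0:
        return 0.0  # at least one channel is silent
    max_lag = min(int(round(max_lag_ms * 1e-3 * fs)), len(right) - 1)
    # mode="full" covers lags -(N-1)..(N-1); zero lag sits at index len(right)-1.
    xcorr = np.correlate(left, right, mode="full")
    centre = len(right) - 1
    window = xcorr[centre - max_lag:centre + max_lag + 1]
    return float(np.max(np.abs(window)) / norm)
```

Evaluating this function on ear signals captured at several head orientations (for example, facing forwards and facing sideways) gives the kind of position-dependent IACC values whose perceptual effects are discussed above. Identical ear signals give a value of 1, while uncorrelated signals give a value near 0.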