The University of Southampton

Adaptive virtual sound imaging

Advances in computer technology and low-cost cameras open up new possibilities for three-dimensional (3D) sound reproduction. The problem is to update the audio signal processing scheme for a moving listener, so that the listener perceives only the intended virtual sound image.


Binaural technology is often used for the reproduction of virtual sound images. The principle of binaural technology is to control the sound field at the listener's ears so that the reproduced sound field coincides with the desired real sound field. To implement binaural technology over loudspeakers, it is necessary to cancel the cross-talk, so that a signal meant for one ear is not heard at the other. However, such cross-talk cancellation, normally realized by time-invariant filters, works only for a specific listening location, and the sound field can only be controlled in a limited area referred to as the "sweet-spot". If the listener moves away from the optimal listening location, the inverse filters must be updated so that the sweet-spot is steered to the listener's new location. The issues related to filter updates have been investigated intensively in this work.
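The position dependence of cross-talk cancellation can be illustrated with a minimal single-frequency sketch. It assumes free-field point-source loudspeakers and ignores head diffraction; the geometry, the frequency, the speed of sound and all function names are hypothetical choices for illustration and are not taken from the project. The inverse filters are computed for one listening position and then applied unchanged after the listener has moved 5 cm sideways.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def free_field_plant(speakers, ears, freq_hz):
    """Return the 2x2 matrix C[ear][speaker] for point sources in free field."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    return [
        [cmath.exp(-1j * k * math.dist(ear, spk)) / math.dist(ear, spk)
         for spk in speakers]
        for ear in ears
    ]


def invert_2x2(m):
    """Invert a 2x2 complex matrix; the result plays the role of the inverse filters."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(x, y):
    """Multiply two 2x2 matrices."""
    return [[sum(x[i][n] * y[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical geometry in metres: two loudspeakers 1.5 m in front of the head.
speakers = [(-0.3, 1.5), (0.3, 1.5)]
ears_design = [(-0.08, 0.0), (0.08, 0.0)]  # position the filters are designed for
ears_moved = [(-0.03, 0.0), (0.13, 0.0)]   # listener shifted 5 cm sideways
freq_hz = 1000.0

filters = invert_2x2(free_field_plant(speakers, ears_design, freq_hz))

# Overall response from binaural input to ears: plant times filters.
at_design = matmul_2x2(free_field_plant(speakers, ears_design, freq_hz), filters)
at_moved = matmul_2x2(free_field_plant(speakers, ears_moved, freq_hz), filters)

# Off-diagonal terms are the residual cross-talk.
print("cross-talk at design position:", abs(at_design[0][1]))
print("cross-talk after moving:      ", abs(at_moved[0][1]))
```

At the design position the product of plant and filters is the identity matrix, so the off-diagonal (cross-talk) term vanishes up to rounding. After the move the same filters leave a clearly non-zero off-diagonal term, which is the reason the filters have to follow the listener.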

The aim of this project is to find a way to improve filter update techniques as well as to determine the filter update rate necessary to stabilize an acoustic image regardless of listener movement. This work is based on the assumption that the location of the listener is known from a visual head tracking device.

The effectiveness of cross-talk cancellation depends on the geometry of the system and in theory each frequency band can be reproduced from a loudspeaker pair with an optimal source span. Therefore the concept of Frequency Distributed Loudspeakers (FDLs) has been studied, and the idea is to reproduce each frequency from an optimal source angle within a given listening area.
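As a rough illustration of how an optimal source span can vary with frequency, the sketch below uses a simplified free-field, far-field model in which the two-channel inversion is best conditioned when the path-length difference between the two ears equals an odd multiple of a quarter wavelength. This model, the ear spacing, the speed of sound and the function name are assumptions made for illustration; the source does not state which formula the project uses, and head diffraction is ignored.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value


def optimal_half_span_rad(freq_hz, ear_spacing_m=0.16, n=1):
    """Half source span (radians) under the assumed free-field model.

    Far field: path-length difference is roughly ear_spacing * sin(theta).
    Assumed best conditioning: that difference equals n * wavelength / 4, n odd.
    Returns None when no real angle satisfies the condition.
    """
    sin_theta = n * SPEED_OF_SOUND / (4.0 * freq_hz * ear_spacing_m)
    if sin_theta > 1.0:
        return None  # frequency too low for this ear spacing and n
    return math.asin(sin_theta)


for f in (100.0, 1000.0, 4000.0, 16000.0):
    theta = optimal_half_span_rad(f)
    if theta is None:
        print(f"{f:8.0f} Hz: no real solution in this model")
    else:
        print(f"{f:8.0f} Hz: full span of about {2 * math.degrees(theta):.1f} degrees")
```

In this simplified model the span narrows as frequency rises, which is consistent with the idea of assigning low frequencies to widely spaced loudspeakers and high frequencies to closely spaced ones.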

The area within which the listener can move while the filters are updated is described by the concept of the "operational area". The operational area is the region within which the "sweet-spot" can be steered by an adaptive virtual sound system. The extent of the operational area depends on the chosen performance criteria and is investigated thoroughly.

The relatively small "sweet-spot" of a static virtual sound imaging system creates a strong demand for an effective head tracking algorithm within the field of virtual sound. Giving the audio system access to a video camera makes it possible to track head movements and update the inverse filters accordingly. The increasing interest in visual tracking is due in part to the falling cost of computing power, video cameras and memory. A sequence of images grabbed at or near video rate typically does not change radically from frame to frame, and this redundancy of information across multiple images can be extremely helpful when analysing the input in order to track individual objects.

Operational area

The performance of the binaural audio signal processing scheme is limited by the condition number of the associated inversion problem. The condition number as a function of frequency for different listener positions and rotations is examined using an analytical model. The resulting size of the "operational area" with listener head tracking is illustrated for different geometries of loudspeaker configurations, together with related cross-over design techniques.
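A minimal sketch of such a condition-number study is given here, assuming a free-field point-source model rather than the analytical model of the project, which the source does not specify. The geometry, ear spacing, test frequencies and all function names are hypothetical. The condition number of the 2x2 plant matrix is obtained from its singular values, using the facts that their squares sum to the squared Frobenius norm and that their product equals the magnitude of the determinant.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value


def ear_positions(centre, yaw_rad, ear_spacing=0.16):
    """Left and right ear positions for a head centre and a rotation (yaw)."""
    dx = 0.5 * ear_spacing * math.cos(yaw_rad)
    dy = 0.5 * ear_spacing * math.sin(yaw_rad)
    return [(centre[0] - dx, centre[1] - dy), (centre[0] + dx, centre[1] + dy)]


def free_field_plant(speakers, ears, freq_hz):
    """2x2 matrix C[ear][speaker] for point sources in free field."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    return [
        [cmath.exp(-1j * k * math.dist(ear, spk)) / math.dist(ear, spk)
         for spk in speakers]
        for ear in ears
    ]


def condition_number_2x2(m):
    """Ratio of largest to smallest singular value of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    frob_sq = sum(abs(v) ** 2 for v in (a, b, c, d))  # s_max^2 + s_min^2
    det_abs = abs(a * d - b * c)                      # s_max * s_min
    if det_abs == 0.0:
        return math.inf
    disc = math.sqrt(max(frob_sq ** 2 - 4.0 * det_abs ** 2, 0.0))
    s_max_sq = 0.5 * (frob_sq + disc)
    return s_max_sq / det_abs  # s_max^2 / (s_max * s_min)


def cond_at(speakers, centre, yaw_rad, freq_hz):
    """Condition number for one listener position, rotation and frequency."""
    ears = ear_positions(centre, yaw_rad)
    return condition_number_2x2(free_field_plant(speakers, ears, freq_hz))


speakers = [(-0.3, 1.5), (0.3, 1.5)]  # hypothetical loudspeaker positions (m)
for f in (100.0, 500.0, 1000.0, 2700.0, 5000.0):
    centred = cond_at(speakers, (0.0, 0.0), 0.0, f)
    displaced = cond_at(speakers, (0.2, 0.0), math.radians(20.0), f)
    print(f"{f:6.0f} Hz  centred: {centred:8.2f}  moved and rotated: {displaced:8.2f}")
```

A large condition number means the inversion amplifies small errors, so mapping it over frequency, position and rotation gives one possible criterion for delimiting an operational area.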

Figure: Adaptive and static cross-over frequencies


HRTF database

The measurement of what is arguably the most comprehensive KEMAR database of head-related transfer functions (HRTFs) yet available is presented. A complete database of HRTFs measured without the pinna is presented.

Figure: HRTF database measurement rig


Visual tracking

The update of the audio signal processing scheme is initiated by a visual tracking system that performs head tracking without the need for the listener to wear any sensors.

Figure: Colour tracker
Figure: Contour tracker


Filter update techniques

The problem of updating the filters without any audible change is solved by using either a very fine mesh for the inverse filters or commutation techniques. The filter update techniques are evaluated in subjective experiments and have proven to be effective both in an anechoic chamber and in a listening room, which supports the implementation of virtual sound imaging systems under realistic conditions.
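One generic way to change filters without an audible click is to run the old and new filters in parallel and cross-fade their outputs over a short interval. The sketch below shows this idea for FIR filters. It is an illustrative assumption and not necessarily the commutation technique used in this project, which the source does not describe in detail; the function names, the linear fade and the test signal are hypothetical.

```python
def fir_filter(signal, taps):
    """Direct-form FIR filtering with zero initial state."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def crossfade_update(signal, old_taps, new_taps, start, fade_len):
    """Switch from old_taps to new_taps with a linear cross-fade.

    Both filters process the whole signal; the output weight of the new
    filter ramps from 0 to 1 over fade_len samples beginning at start.
    """
    y_old = fir_filter(signal, old_taps)
    y_new = fir_filter(signal, new_taps)
    out = []
    for n in range(len(signal)):
        if n < start:
            w = 0.0
        elif n >= start + fade_len:
            w = 1.0
        else:
            w = (n - start) / fade_len
        out.append((1.0 - w) * y_old[n] + w * y_new[n])
    return out


# Hypothetical example: a constant input and two one-tap "filters" (gains).
signal = [1.0] * 20
smooth = crossfade_update(signal, [1.0], [0.5], start=5, fade_len=10)
abrupt = crossfade_update(signal, [1.0], [0.5], start=5, fade_len=1)

largest_step_smooth = max(abs(b - a) for a, b in zip(smooth, smooth[1:]))
largest_step_abrupt = max(abs(b - a) for a, b in zip(abrupt, abrupt[1:]))
print("largest sample-to-sample step, 10-sample fade:", largest_step_smooth)
print("largest sample-to-sample step, 1-sample fade: ", largest_step_abrupt)
```

A longer fade spreads the change over more samples and so reduces the discontinuity at the switch, at the cost of running two filters during the transition. A fine mesh of precomputed filters attacks the same problem from the other side by keeping each individual change small.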

Integrated virtual sound imaging system

The design and implementation of a visually adaptive Virtual Sound Imaging (VSI) system is carried out. The system is evaluated with respect to filter update rates and cross-talk cancellation effectiveness.

Figure: VSI system

