Microphone Quality Metering & Enhancement

Start date: 2012
End date (phase-1): 2017 (collaboration ongoing)

Research Student: Andy Pearce
Principal Supervisor: Dr Tim Brookes
Co-Supervisor: Dr Martin Dewhirst
Co-Supervisor: Dr Russell Mason
Supported by: EPSRC and Cirrus Logic

Project Outline

The perceptual characteristics of a microphone are not always clear from its technical specification. This thesis documents a first step towards creating more perceptually relevant measures.

Consideration of relevant criteria revealed that the most appropriate method for recording stimuli for perceptual microphone comparisons is to use all microphones under test simultaneously. Experiments determined that a maximum array size of 150 mm will ensure that the perceptual differences between the recorded stimuli are predominantly due to the characteristics of the microphones and not artefacts of the spacing between them.

It was established that there are eight standard physical differences between microphones that may affect the perceived characteristics of a recording. These differences, supplemented with expert opinions, indicated that recording five programme items with eight studio and two MEMS microphones would allow the most prominent inter-microphone perceptual differences to be determined. A combination of indirect and direct elicitation experiments on the resulting 50 recordings identified a hierarchy of 40 perceptual attributes that describe the differences between microphones. A novel attribute contribution experiment conducted on the 31 lowest-level attributes in the hierarchy showed that brightness contributes the most overall to the inter-microphone difference.

The spectral centroid and ratios comparing the relative level of high frequencies have previously been used to predict brightness; however, these metrics did not predict subjective ratings of microphone-related brightness as well as a newly proposed combination metric: the product of the spectral centroid above 3 kHz and the ratio of energy above 3 kHz to total energy. This model performed well on training data (r = 0.909). Validation on independent microphones and programme items suggested that improvements may be necessary for error-free prediction of programme-related aspects of brightness, but showed good correlation within each programme item and overall (r = 0.854), indicating that the model predicts microphone-related brightness well.
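
As a rough illustration of the combination metric described above, the following Python sketch multiplies a spectral centroid computed above 3 kHz by the ratio of energy above 3 kHz to total energy. It is not the project's implementation: the function name `brightness_metric`, the single whole-signal FFT, the Hann window, the magnitude weighting of the centroid and the absence of any scaling or calibration against listener ratings are all assumptions made for this example.

```python
import numpy as np


def brightness_metric(signal, sample_rate, cutoff_hz=3000.0):
    """Hypothetical sketch: (centroid above cutoff) x (energy above cutoff / total energy)."""
    signal = np.asarray(signal, dtype=float)

    # Assumption: one FFT over the whole signal with a Hann window.
    window = np.hanning(len(signal))
    magnitude = np.abs(np.fft.rfft(signal * window))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    energy = magnitude ** 2
    total_energy = energy.sum()
    high = freqs >= cutoff_hz
    high_energy = energy[high].sum()

    # Silent input, or no content above the cutoff: report zero brightness.
    if total_energy == 0.0 or high_energy == 0.0:
        return 0.0

    # Assumption: magnitude-weighted centroid, restricted to bins above the cutoff.
    centroid_hz = (freqs[high] * magnitude[high]).sum() / magnitude[high].sum()
    energy_ratio = high_energy / total_energy
    return centroid_hz * energy_ratio


# Usage: compare a 1 kHz tone with a 6 kHz tone of equal amplitude.
example_rate = 48000
example_time = np.arange(example_rate) / example_rate
example_dull = np.sin(2 * np.pi * 1000 * example_time)
example_bright = np.sin(2 * np.pi * 6000 * example_time)
print(brightness_metric(example_dull, example_rate))
print(brightness_metric(example_bright, example_rate))
```

In this sketch the 6 kHz tone scores higher than the 1 kHz tone, because almost all of its energy lies above the cutoff. The raw value is in hertz and would still need to be mapped onto a rating scale before it could be compared with subjective brightness ratings.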


Data Archive

The data on which the findings of this project are based will be made available as each stage of the project is completed. Data published so far are available in these repositories: