THE AUDITORY MODELING TOOLBOX

This documentation page applies to an outdated major AMT version. We show it for archival purposes only.

JOERGENSEN2011 - the speech-based envelope power spectrum model

Usage

output = joergensen2011(x,y,fs_input,IO_param)

output = joergensen2011(x,y,fs_input,IO_param) returns the signal-to-noise envelope-power ratio (SNRenv) computed with the speech-based envelope power spectrum model (sEPSM) described in Joergensen and Dau (2011).

Input parameters

x         noisy speech mixture
y         noise alone
fs_input  sample rate in Hz
IO_param  (optional) vector of parameters for the ideal observer, which converts the SNRenv to the probability of a correct response for a given speech material. It holds the four ideal-observer parameters in the order [k q m sigma_s].

Output parameters

output.SNRenv     The signal-to-noise envelope-power ratio (SNRenv).
output.P_correct  The probability of a correct response given the SNRenv. This field is only included if IO_param is specified. Its calculation requires the Statistics Toolbox.
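
The role of the four IO_param entries can be illustrated with a minimal sketch of an m-alternative ideal observer. This is not the AMT source. The mapping d' = k*SNRenv^q, the extreme-value approximations for the noise mean and spread, and all numeric values below are assumptions made for illustration, loosely following the ideal-observer description in Joergensen and Dau (2011). The normal distribution is built from the base functions erfc and erfcinv so that the sketch runs without the Statistics Toolbox.

```matlab
% Hypothetical ideal-observer parameters [k q m sigma_s] (illustrative only)
IO_param = [0.61 0.5 8000 0.6];
k       = IO_param(1);   % scaling of SNRenv into sensitivity d'
q       = IO_param(2);   % exponent of that mapping
m       = IO_param(3);   % assumed number of response alternatives
sigma_s = IO_param(4);   % spread attributed to the speech material

SNRenv = [0.5 2 8 32];   % made-up SNRenv values in linear units

% Standard normal CDF and inverse CDF from base functions
norm_cdf = @(z) 0.5 * erfc(-z / sqrt(2));
norm_inv = @(p) -sqrt(2) * erfcinv(2 * p);

% Assumed mapping from SNRenv to sensitivity index d'
d_prime = k * SNRenv.^q;

% Assumed extreme-value approximation for the largest of m noise draws
U_n     = norm_inv(1 - 1/m);
mu_n    = U_n + 0.577 / U_n;     % approximate mean of the maximum
sigma_n = 1.28255 / U_n;         % approximate standard deviation

% Probability of a correct response, as a proportion between 0 and 1
P_correct = norm_cdf((d_prime - mu_n) ./ sqrt(sigma_s^2 + sigma_n^2));
```

In this sketch P_correct rises monotonically with SNRenv. Whether the toolbox reports the result as a proportion or as a percentage is not stated on this page, so check the function output before comparing values.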

The model consists of the following stages:

  1. A gammatone bandpass filterbank to simulate the auditory filters
  2. An envelope extraction stage via the Hilbert Transform
  3. A modulation filterbank
  4. Computation of the long-term envelope power (output.SNRenv)

5) A decision mechanism based on a statistically ideal observer (output.P_correct)
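
Stages 2 and 4 can be sketched in a few lines for a single channel. The following is a simplified illustration under stated assumptions, not the toolbox implementation: it omits the gammatone and modulation filterbanks, uses synthetic amplitude-modulated tones as hypothetical stand-ins for the inputs x and y, and assumes that the envelope power is the AC envelope power normalized by the squared mean envelope and that SNRenv = (P_mix - P_noise) / P_noise in linear units.

```matlab
fs = 16000;                        % sample rate in Hz (illustrative)
t  = (0:fs-1).' / fs;              % one second of samples, column vector
carrier = sin(2*pi*1000*t);        % common 1 kHz carrier

% Hypothetical stand-ins for the model inputs
speech = (1 + 0.8*sin(2*pi*4*t)) .* carrier;      % strongly modulated "speech"
y      = 0.5*(1 + 0.1*sin(2*pi*7*t)) .* carrier;  % weakly modulated "noise"
x      = speech + y;                              % noisy speech mixture

% Stage 2: Hilbert envelope from the FFT-based analytic signal
n = numel(t);                      % n is even in this example
h = zeros(n, 1);
h(1) = 1;                          % keep the DC bin
h(n/2 + 1) = 1;                    % keep the Nyquist bin
h(2:n/2) = 2;                      % double the positive frequencies
envelope = @(sig) abs(ifft(fft(sig) .* h));

% Stage 4: AC envelope power normalized by the squared mean (DC) envelope
env_power = @(e) var(e, 1) / mean(e)^2;

P_mix   = env_power(envelope(x));  % envelope power of the mixture
P_noise = env_power(envelope(y));  % envelope power of the noise alone

SNRenv    = (P_mix - P_noise) / P_noise;   % linear ratio
SNRenv_dB = 10 * log10(SNRenv);            % the same ratio in dB
```

In the full model this computation would be carried out per gammatone channel and per modulation filter before the results are combined. The sketch only shows how a more strongly modulated mixture yields a larger SNRenv than the noise envelope alone would suggest.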

References:

S. Joergensen and T. Dau. Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing. J. Acoust. Soc. Am., 130(3):1475-1487, 2011.