THE AUDITORY MODELING TOOLBOX

Applies to version: 1.6.0

RELANOIBORRA2019
Modulation filterbank (based on DRNL)

Usage:

out = relanoiborra2019(insig_clean, insig_noisy, fs, varargin)
out = relanoiborra2019(insig_clean, insig_noisy, fs, flow, fhigh, varargin)
[out, clean, noisy] = relanoiborra2019([..])

Input parameters:

insig_clean   Clean speech template signal
insig_noisy   Noisy speech target signal
fs            Sampling frequency (Hz)
flow          Lowest center frequency of auditory filterbank (Hz)
fhigh         Highest center frequency of auditory filterbank (Hz)
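
The second usage form, with an explicit frequency range, could be sketched as follows. This is only an illustration: the signals are synthetic noise standing in for speech, the values of fs, flow and fhigh are assumptions rather than documented defaults, and the use of column vectors is likewise assumed. The call is guarded so that the snippet also runs when the toolbox is not on the path.

```matlab
% Illustrative values only: fs, flow and fhigh are assumptions,
% not documented defaults of relanoiborra2019.
fs    = 44100;   % sampling frequency (Hz)
flow  = 100;     % lowest center frequency (Hz)
fhigh = 8000;    % highest center frequency (Hz)

% Placeholder signals: 2 s of noise standing in for a clean sentence,
% and the same "sentence" with additive noise as the target.
N = 2*fs;
insig_clean = 0.1*randn(N,1);
insig_noisy = insig_clean + 0.1*randn(N,1);

% Only call the model when the toolbox is available.
if exist('relanoiborra2019','file')
    out = relanoiborra2019(insig_clean, insig_noisy, fs, flow, fhigh);
    disp(out.dfinal);   % final average correlation
end
```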

Output parameters:

out

Correlation metric structure. It contains the following fields:

dint : Correlation values for each modulation band.

dsegments : Correlation values for each time window and modulation band.

dfinal : Final average correlation.

Description:

relanoiborra2019 builds the internal representations of the template and target signals. For the correct initialisation of the adaptation stage of the model, the speech signals (clean template and noisy targets) need to be prepanned, i.e., padded with non-zero signals. By default, the internal representations are thus assessed for two appended repetitions of each sound, but ultimately only the second repetition is used by the back-end stage of the model. The prepanning can be used in three configurations:

'prepanning' Automatic prepanning by the model, assuming two consecutive presentations of the sound and keeping only the second presentation for modelling. If N_org is provided, the prepanning is done for N_org samples. If N_org is not provided, the prepanning is done for the signal length, with a minimum of 1.5 s (this duration seems to be long enough to ensure statistically equivalent results).
'no_prepanning' No prepanning is applied at all. This option is faster but may lead to an overestimation of the onset of the internal representations during the decision stage and is thus not recommended.
'prepanning_external' External prepanning by the user, i.e., the input signals are already prepanned by N_prepanning samples.
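
For the 'prepanning_external' configuration, one possible way of preparing the inputs is sketched below. It follows the rule stated above (signal length, but at least 1.5 s) and fills the prepanning segment by cyclically repeating the signal so that the segment ends on the last sample of the signal. That filling strategy, the sampling frequency and the placeholder noise signals are assumptions made for illustration; the model's own automatic prepanning may be implemented differently.

```matlab
fs = 44100;                                   % assumed sampling frequency (Hz)
insig_clean = 0.1*randn(fs,1);                % placeholder 1-s "sentence"
insig_noisy = insig_clean + 0.1*randn(fs,1);  % placeholder noisy target

% Prepanning length: the signal length, but at least 1.5 s.
N_sig        = length(insig_clean);
N_prepanning = max(N_sig, round(1.5*fs));

% Assumption: fill the prepanning segment with cyclic repetitions of the
% signal, aligned so that the segment ends on the last signal sample.
idx = mod((N_sig-N_prepanning : N_sig-1)', N_sig) + 1;
clean_pp = [insig_clean(idx); insig_clean];
noisy_pp = [insig_noisy(idx); insig_noisy];

% Pass the prepanned signals and tell the model how many samples to discard.
if exist('relanoiborra2019','file')
    out = relanoiborra2019(clean_pp, noisy_pp, fs, ...
        'prepanning_external', 'N_prepanning', N_prepanning);
end
```

With a 1-s signal at 44100 Hz, the 1.5-s minimum applies, so N_prepanning is 66150 samples and the prepanned inputs are 2.5 s long.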

relanoiborra2019 also takes the following optional key-value pairs:

'N_org',N_org Length of original sentence required for prepanning. Default is double the length of insig_clean.
'subject',sbj Subject profile for the DRNL definition. Default: 'NH'
'N_prepanning',N_prepanning Samples of prepanning, used for truncating the internal representations during the decision stage. Required when using 'prepanning_external'.
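
The key-value pairs can be collected in a cell array and expanded into the call, as in the hedged sketch below. The signals are placeholder noise, fs is an assumed value, and setting N_org to the length of insig_clean is an illustrative choice rather than a recommendation from the toolbox. The variable name opts is introduced here only for the example.

```matlab
fs = 44100;                                    % assumed sampling frequency (Hz)
insig_clean = 0.1*randn(2*fs,1);               % placeholder clean template
insig_noisy = insig_clean + 0.1*randn(2*fs,1); % placeholder noisy target

% Flag followed by key-value pairs; 'NH' is the documented default profile.
opts = {'prepanning', ...
        'N_org',   length(insig_clean), ...
        'subject', 'NH'};

% Request all three outputs when the toolbox is available.
if exist('relanoiborra2019','file')
    [out, clean, noisy] = relanoiborra2019(insig_clean, insig_noisy, fs, opts{:});
end
```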

The model has been optimized to work with speech signals, and the preprocessing and variable names follow this principle. The model is also designed to work with broadband signals. To avoid undesired onset enhancements in the adaptation loops, the model expects to receive a prepanned signal to initialize them.

References:

H. Relaño-Iborra, J. Zaar, and T. Dau. A speech-based computational auditory signal processing and perception model. The Journal of the Acoustical Society of America, 146(5), 2019.

M. Jepsen, S. Ewert, and T. Dau. A computational model of human auditory signal processing and perception. The Journal of the Acoustical Society of America, 124(1), 2008.