out = relanoiborra2019(insig_clean, insig_noisy, fs, varargin)
out = relanoiborra2019(insig_clean, insig_noisy, fs, flow, fhigh, varargin)
[out, clean, noisy] = relanoiborra2019([..])
insig_clean | Clean speech template signal |
insig_noisy | Noisy speech target signal |
fs | Sampling frequency (Hz) |
flow | Lowest center frequency of auditory filterbank (Hz) |
fhigh | Highest center frequency of auditory filterbank (Hz) |
out | Correlation metric structure with the following fields: dint: correlation values for each modulation band; dsegments: correlation values for each time window and modulation band; dfinal: final average correlation. |
relanoiborra2019 builds the internal representations of the template and target signals. For the correct initialisation of the adaptation stage of the model, the speech signals (clean template and noisy targets) need to be prepanned, i.e., padded with non-zero signals. By default, the internal representations are thus assessed for two appended repetitions of each sound, but ultimately only the second repetition is used by the back-end stage of the model. The prepanning can be used in three configurations:
'prepanning' | Automatic prepanning by the model, assuming two subsequent sound presentations of which only the second is kept for modelling. If N_org is provided, the prepanning is done for N_org samples. If N_org is not provided, the prepanning is done for the signal length, with a minimum of 1.5 s (this duration seems to be long enough to ensure statistically equivalent results). |
'no_prepanning' | No prepanning is applied at all. This option is faster but may lead to an overestimation of the onset of the internal representations during the decision stage and is thus not recommended. |
'prepanning_external' | External prepanning by the user, i.e., the input signals are already prepanned by N_prepanning samples. |
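The idea behind prepanning can be sketched in a few lines of MATLAB/Octave. This is an illustration under stated assumptions, not the toolbox implementation: the sampling frequency, the stand-in signal, the repetition strategy used to reach the 1.5 s minimum, and the variable names prepan, insig_prepanned, intrep and intrep_used are all hypothetical.

```matlab
% Illustrative sketch only: mimics the idea of prepanning, not the toolbox code.
fs = 16000;                          % sampling frequency in Hz (assumed value)
insig = randn(round(0.8*fs), 1);     % stand-in for a 0.8 s speech signal

% Prepanning length: the signal length, but at least 1.5 s
N_prepanning = max(length(insig), round(1.5*fs));

% Assumption: fill the prepanning segment by repeating the signal itself
n_rep = ceil(N_prepanning / length(insig));
prepan = repmat(insig, n_rep, 1);
prepan = prepan(end-N_prepanning+1:end);   % keep the last N_prepanning samples

% Prepanned input: non-zero padding followed by the presentation of interest
insig_prepanned = [prepan; insig];

% Placeholder for the model front end (identity here, for illustration)
intrep = insig_prepanned;

% Truncation before the decision stage: drop the prepanning samples
intrep_used = intrep(N_prepanning+1:end);
```

In this sketch the truncated representation has the same length as the original signal, which mirrors the statement above that only the second presentation is used by the back end.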
relanoiborra2019 also takes the following optional key-value pairs:
'N_org',N_org | Length of original sentence required for prepanning. Default is double the length of insig_clean. |
'subject',sbj | Subject profile for the DRNL definition. Default: 'NH' |
'N_prepanning',N_prepanning | Samples of prepanning, used for truncating the internal representations during the decision stage. Required when using 'prepanning_external'. |
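A possible way to combine the flags and key-value pairs above is sketched below. It is a hedged usage example rather than an official demo: the sampling frequency, the noise stand-ins for speech, the column-vector layout, and the variable names clean_pp, noisy_pp and out_ext are assumptions. The calls are guarded so that the script also runs when the toolbox is not on the path.

```matlab
% Hypothetical usage sketch; signal content and fs are placeholder assumptions.
fs = 44100;                               % sampling frequency in Hz (assumed)
clean = randn(2*fs, 1);                   % stand-in for a 2 s clean sentence
noisy = clean + 0.5*randn(size(clean));   % same sentence with added noise

% External prepanning: prepend one repetition of each signal
N_prepanning = length(clean);
clean_pp = [clean; clean];
noisy_pp = [noisy; noisy];

% Only call the model if the toolbox is on the path (after running amt_start)
if exist('relanoiborra2019', 'file') == 2
    % Default behaviour: automatic prepanning, default subject profile 'NH'
    out = relanoiborra2019(clean, noisy, fs);
    disp(out.dfinal);                     % final average correlation

    % Signals already prepanned by the user
    out_ext = relanoiborra2019(clean_pp, noisy_pp, fs, ...
        'prepanning_external', 'N_prepanning', N_prepanning);
    disp(out_ext.dfinal);
end
```

Real speech recordings should replace the noise signals for any meaningful result, since the model is optimized for speech.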
The model has been optimized to work with speech signals, and the preprocessing and variable names follow this principle. The model is also designed to work with broadband signals. To avoid undesired onset enhancements in the adaptation loops, the model expects to receive a prepanned signal to initialize them.
H. Relaño-Iborra, J. Zaar, and T. Dau. A speech-based computational auditory signal processing and perception model. The Journal of the Acoustical Society of America, 146(5), 2019.
M. Jepsen, S. Ewert, and T. Dau. A computational model of human auditory signal processing and perception. The Journal of the Acoustical Society of America, 124(1), 2008.