THE AUDITORY MODELING TOOLBOX

Applies to version: 0.9.8

DIETZ2011 - Dietz 2011 binaural model

Usage

[fine,fc,ild,env] = dietz2011(insig,fs);

Input parameters

insig   binaural signal for which the values should be calculated
fs      sampling rate (Hz)

Output parameters

fine    information about the fine structure (see below)
fc      center frequencies of the gammatone filterbank
ild     interaural level difference in dB
env     information about the envelope (see below)

dietz2011(insig,fs) calculates the interaural phase, time, and level differences of the fine structure and the envelope of the signal, as well as the interaural coherence, which can be used as a weighting function.
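
A minimal sketch of a call, using a synthetic test tone. The tone frequency, the imposed delay, and the sampling rate are illustrative values, and the layout of insig (one column per ear, left first) is an assumption not stated above. The call is guarded so that the script still runs when the toolbox is not on the path.

```matlab
% Build a 1-s binaural test tone with a small delay between the ears.
fs   = 44100;                      % sampling rate in Hz (assumed value)
t    = (0:fs-1).'/fs;              % 1 s time axis as a column vector
f0   = 500;                        % tone frequency in Hz (assumed value)
itd0 = 0.0003;                     % imposed delay of 0.3 ms (assumed value)

left  = sin(2*pi*f0*t);            % left ear signal
right = sin(2*pi*f0*(t - itd0));   % right ear signal: delayed copy of left
insig = [left right];              % assumption: one column per ear

% Run the model only if dietz2011 is available on the path.
if exist('dietz2011','file')
  [fine,fc,ild,env] = dietz2011(insig,fs);
end
```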

The output structures fine and env have the following fields:
itf       : transfer function
ipd       : phase difference in rad
itd       : interaural time difference based on instantaneous frequency
itd_C     : interaural time difference based on center frequency
f_inst_1  : instantaneous frequencies of the left ear signal
f_inst_2  : instantaneous frequencies of the right ear signal
f_inst    : instantaneous frequencies (average of f_inst_1 and f_inst_2)
ic        : interaural coherence
rms       : rms value of the frequency channels, for weighting
ild_lp    : level difference in dB, based on lowpass-filtered insig
ipd_lp    : phase difference in rad, based on lowpass-filtered itf
itd_lp    : interaural time difference, based on lowpass-filtered itf
itd_C_lp  : interaural time difference, based on lowpass-filtered itf
f_inst_lp : lowpass-filtered instantaneous frequencies

The _lp values are not returned if the 'nolowpass' flag is set.
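
The following sketch illustrates how a phase difference can be turned into a time difference using the center frequency, and how the coherence can serve as a weight. It works on hypothetical stand-in data rather than on real model output. The time-by-channel array layout, the relation itd_C = ipd/(2*pi*fc), and the coherence threshold are assumptions made for illustration, not a statement of how the toolbox computes its fields.

```matlab
% Hypothetical stand-in data: 4 time samples (rows) in 2 channels (columns).
fc  = [500 1000];                                      % center frequencies (Hz)
ipd = [0.9 1.8; 1.0 2.0; 1.1 2.2; 1.0 2.0];            % phase differences (rad)
ic  = [0.99 0.50; 0.98 0.99; 0.60 0.99; 0.99 0.97];    % coherence, 0..1

% Phase difference to time difference via the channel center frequency.
itd_C = ipd ./ (2*pi*repmat(fc, size(ipd,1), 1));      % in seconds

% Keep only samples with high coherence (threshold is an assumed value).
ic_threshold = 0.9;
mask = ic > ic_threshold;

% Mean ITD per channel over the retained samples only.
itd_mean = sum(itd_C .* mask, 1) ./ sum(mask, 1);
```

A hard threshold is only one possible weighting; using ic directly as a continuous weight would follow the same pattern.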

The steps of the binaural model to calculate the result are the following (see also Dietz et al., 2011):

  1. Middle ear filtering (500-2000 Hz 1st order bandpass).
  2. Auditory bandpass filtering on the basilar membrane using a 4th-order all-pole gammatone filterbank, employing 23 filter bands between 200 and 5000 Hz with a spacing of 1 ERB. The filter width is set to correspond to 1 ERB.
  3. Cochlear compression, simulated by power-law compression with an exponent of 0.4.
  4. Transduction in the inner hair cells, modelled by half-wave rectification followed by a 770-Hz 5th-order lowpass filter.
  5. Modulation filterbank with three filters applied to every frequency channel: a 2nd-order gammatone filter for the fine structure, centered at the center frequency of the channel; a 2nd-order gammatone filter for the envelope, centered at 135 Hz; and a 2nd-order lowpass filter with a cutoff frequency of 30 Hz to extract the ILD.
  6. Calculation of the binaural parameters: IPD, ITD, and IC for the fine-structure and envelope filter signals, and the ILD for the ILD filter signal.
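
Parts of steps 3, 4 and 6 can be sketched in a few lines. This is an illustration under stated assumptions, not the toolbox implementation: the compression is written in a sign-preserving form, the 770-Hz lowpass of step 4 is omitted, and the transfer function is taken to be the left signal times the complex conjugate of the right signal. All signals are synthetic stand-ins.

```matlab
fs = 16000;                          % sampling rate in Hz (assumed value)
t  = (0:fs/10-1).'/fs;               % 100 ms time axis as a column vector
x  = 0.5*sin(2*pi*500*t);            % stand-in for one filterbank channel

% Step 3: power-law compression with exponent 0.4.
% Assumption: the sign is preserved so that the result stays real-valued.
cpwr   = 0.4;
x_comp = sign(x) .* abs(x).^cpwr;

% Step 4, first half: half-wave rectification.
% The 770-Hz 5th-order lowpass that follows in the model is omitted here.
x_hwr = max(x_comp, 0);

% Step 6: phase difference between two complex-valued filter outputs.
phi = pi/4;                          % imposed phase lag in rad (assumed value)
g_l = exp(1i*2*pi*500*t);            % stand-in for the left filter output
g_r = exp(1i*(2*pi*500*t - phi));    % right output, lagging by phi
itf = g_l .* conj(g_r);              % assumed form of the transfer function
ipd = angle(itf);                    % phase difference in rad, equals phi here
```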

dietz2011 accepts the following optional parameters:

'flow',flow Set the lowest frequency in the filterbank to flow. Default value is 200 Hz.
'fhigh',fhigh Set the highest frequency in the filterbank to fhigh. Default value is 5000 Hz.
'basef',basef Ensure that the frequency basef is a center frequency in the filterbank. The default value is 1000.
'filters_per_ERB',filters_per_erb
 Number of filters per ERB. The default value is 1.
'middle_ear_thr',r Bandpass frequencies for the middle ear transfer. The default value is [500 2000].
'middle_ear_order',n
 Order of middle ear filter. Only even numbers are possible. The default value is 2.
'compression_power',cpwr
 Exponent of the power-law compression applied to the signal on the cochlea (signal^compression_power). The default value is 0.4.
'alpha',alpha Internal noise strength. By convention, 65 dB corresponds to 0.0354. The default value is 0.
'int_randn' Internal noise by adding random noise with rms = alpha. This is the default.
'int_mini' Internal noise by setting all values < alpha to alpha.
'filter_order',fo Filter order of the two gammatone filters used for the fine structure and the envelope in the modulation filterbank. The default value is 2.
'filter_attenuation_db',fadb
 Filter attenuation of the two gammatone filters used for the fine structure and the envelope in the modulation filterbank. The default value is 10.
'fine_filter_finesse',fff
 Filter finesse of the fine structure gammatone filter (the bandwidth is fc/finesse). The default value is 3.
'mod_center_frequency_hz',mcf_hz
 Center frequency of the gammatone envelope filter. The default value is 135.
'mod_filter_finesse',mff
 Filter finesse of the envelope gammatone filter (the bandwidth is fc/finesse). The default value is 8.
'level_filter_cutoff_hz',lfc_hz
 Cutoff frequency of the lowpass filter used for the ILD calculation. The default value is 30.
'level_filter_order',lforder
 Order of the lowpass filter used for the ILD calculation. The default value is 2.
'tau_cycles',tau_cycles
 Temporal resolution of binaural processor in terms of cycles per frequency channel. The default value is 5.
'signal_level_dB_SPL',signal_level
 Sound pressure level of left channel. Used for data display and analysis. Default value is 70.
'lowpass' Calculate the interaural parameters of the lowpassed signal/ITF (_lp return values). This is the default.
'nolowpass' Don't calculate the lowpass based interaural parameters. The _lp values are not returned.
'debug' Display what is happening.
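
As a short worked example, the bandwidths implied by the default finesse values follow directly from bandwidth = fc/finesse, and the key/value pairs and flags listed above are appended after fs in the call. The 1000-Hz channel, the noise input, and the chosen option values are illustrative assumptions, as is the layout of insig (one column per ear). The call is guarded so that the script still runs when the toolbox is not on the path.

```matlab
% Bandwidths implied by the default finesse values (bandwidth = fc/finesse).
fc_fine = 1000;                  % example channel center frequency in Hz
bw_fine = fc_fine / 3;           % fine structure filter: about 333 Hz
bw_env  = 135 / 8;               % envelope filter at 135 Hz: 16.875 Hz

% Call with a few documented key/value pairs and the 'nolowpass' flag.
if exist('dietz2011','file')
  fs    = 44100;                 % sampling rate in Hz (assumed value)
  insig = randn(fs,2);           % 1 s of noise; assumption: one column per ear
  [fine,fc,ild,env] = dietz2011(insig,fs, ...
      'flow',200,'fhigh',1400,'filters_per_ERB',2,'nolowpass');
end
```

With 'nolowpass' set, the _lp fields are not present in fine and env, so code that reads them should check for them first, for example with isfield.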

References:

M. Dietz, S. D. Ewert, and V. Hohmann. Auditory model based direction estimation of concurrent speakers from binaural signals. Speech Communication, 53(5):592-605, 2011.