[fine,fc,ild,env] = dietz2011(insig,fs);
insig | binaural signal for which values should be calculated |
fs | sampling rate (Hz) |
fine | Information about the fine structure (see below) |
fc | center frequencies of gammatone filterbank |
ild | interaural level difference in dB |
env | Information about the envelope (see below) |
dietz2011(insig,fs) calculates the interaural phase, time and level differences of the fine structure and the envelope of the signal, as well as the interaural coherence, which can be used as a weighting function.
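As a rough illustration of using the interaural coherence as a weighting function, the sketch below keeps only the ITD samples whose coherence exceeds a threshold and averages them per channel. The matrices are synthetic stand-ins for the itd and ic fields (assumed here to be arranged as samples x channels), and the threshold of 0.98 is an assumed value chosen for illustration, not a toolbox default.

```matlab
% Synthetic stand-ins for fine.itd (s) and fine.ic, samples x channels.
itd = [ 0.40e-3   0.10e-3;
        0.42e-3  -0.30e-3;
        0.38e-3   0.20e-3;
        0.90e-3   0.22e-3];
ic  = [ 0.990  0.500;
        0.995  0.600;
        0.990  0.990;
        0.700  0.985];

ic_threshold = 0.98;          % assumed value, for illustration only
mask = ic > ic_threshold;     % binary weighting function

% Mean ITD per channel over the samples that pass the coherence mask.
% A channel without any passing sample yields NaN (0/0).
itd_mean = sum(itd .* mask, 1) ./ sum(mask, 1);
```

A binary mask is only one possible weighting; using ic itself (or a power of it) as a continuous weight would follow the same pattern.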
The output structures fine and env have the following fields:
itf interaural transfer function
ipd phase difference in rad
itd interaural time difference based on instantaneous frequency
itd_C interaural time difference based on center frequency
f_inst_1 instantaneous frequencies of left ear signal
f_inst_2 instantaneous frequencies of right ear signal
f_inst instantaneous frequencies (average of f_inst_1 and f_inst_2)
ic interaural coherence
rms rms value of frequency channels for weighting
ild_lp based on lowpass-filtered insig, level difference in dB
ipd_lp based on lowpass-filtered itf, phase difference in rad
itd_lp based on lowpass-filtered itf, interaural time difference
itd_C_lp based on lowpass-filtered itf, interaural time difference based on center frequency
f_inst_lp lowpass-filtered instantaneous frequencies
The _lp values are not returned if the 'nolowpass' flag is set.
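The itd and itd_C fields differ only in the frequency used to convert a phase difference into a time difference. The sketch below assumes the usual relation itd = ipd / (2*pi*f); whether dietz2011 computes the fields in exactly this way is an assumption, and all numbers are made up for illustration.

```matlab
fc     = 500;                    % center frequency of one channel (Hz)
ipd    = [pi/4; pi/2; -pi/4];    % interaural phase difference over time (rad)
f_inst = [480; 500; 520];        % instantaneous frequency over time (Hz)

% ITD based on the instantaneous frequency (s).
itd   = ipd ./ (2*pi*f_inst);

% ITD based on the fixed center frequency of the channel (s).
itd_C = ipd ./ (2*pi*fc);
```

The two estimates coincide whenever the instantaneous frequency equals the center frequency, as in the second sample here.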
The steps of the binaural model to calculate the result are the following (see also Dietz et al., 2011):
dietz2011 accepts the following optional parameters:
'flow',flow | Set the lowest frequency in the filterbank to flow. Default value is 200 Hz. |
'fhigh',fhigh | Set the highest frequency in the filterbank to fhigh. Default value is 5000 Hz. |
'basef',basef | Ensure that the frequency basef is a center frequency in the filterbank. The default value is 1000 Hz. |
'filters_per_ERB',filters_per_erb | Number of filters per ERB. The default value is 1. |
'middle_ear_thr',r | Bandpass frequencies of the middle ear transfer function. The default value is [500 2000] Hz. |
'middle_ear_order',n | Order of the middle ear filter. Only even numbers are possible. The default value is 2. |
'compression_power',cpwr | Exponent of the power-law compression applied to the signal on the cochlea (signal.^compression_power). The default value is 0.4. |
'alpha',alpha | Internal noise strength. By convention, 65 dB corresponds to 0.0354. The default value is 0. |
'int_randn' | Internal noise by adding random noise with rms = alpha. This is the default. |
'int_mini' | Internal noise by setting all values < alpha to alpha. |
'filter_order',fo | Filter order of the two gammatone filters used for the fine structure and the envelope in the modulation filterbank. The default value is 2. |
'filter_attenuation_db',fadb | Filter attenuation of the two gammatone filters used for the fine structure and the envelope in the modulation filterbank. The default value is 10 dB. |
'fine_filter_finesse',fff | Filter finesse of the fine-structure gammatone filter (the bandwidth is fc/finesse). The default value is 3. |
'mod_center_frequency_hz',mcf_hz | Center frequency of the gammatone envelope filter. The default value is 135 Hz. |
'mod_filter_finesse',mff | Filter finesse of the envelope gammatone filter (the bandwidth is fc/finesse). The default value is 8. |
'level_filter_cutoff_hz',lfc_hz | Cutoff frequency of the lowpass filter used for the ILD calculation. The default value is 30 Hz. |
'level_filter_order',lforder | Order of the lowpass filter used for the ILD calculation. The default value is 2. |
'tau_cycles',tau_cycles | Temporal resolution of the binaural processor in terms of cycles per frequency channel. The default value is 5. |
'signal_level_dB_SPL',signal_level | Sound pressure level of the left channel. Used for data display and analysis. The default value is 70 dB SPL. |
'lowpass' | Calculate the interaural parameters of the lowpassed signal/ITF (_lp return values). This is the default. |
'nolowpass' | Don't calculate the lowpass based interaural parameters. The _lp values are not returned. |
'debug' | Display what is happening. |
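A call with optional parameters might look like the sketch below. It assumes that the Auditory Modeling Toolbox is on the path and initialised, and that insig is a two-column matrix with the left channel first; the test tone, its ITD, and the chosen option values are illustrative only. The guard lets the script run even when dietz2011 is not available.

```matlab
fs = 44100;                          % sampling rate (Hz)
t  = (0:fs/2-1)'/fs;                 % 0.5 s time vector
f0 = 500;                            % tone frequency (Hz), illustrative
itd_true = 0.3e-3;                   % imposed ITD (s), illustrative

% Two-column binaural signal: the second channel lags the first one.
% (Channel order left/right is an assumption of this sketch.)
insig = [sin(2*pi*f0*t), sin(2*pi*f0*(t - itd_true))];

if exist('dietz2011', 'file')
  % Call with default parameters.
  [fine, fc, ild, env] = dietz2011(insig, fs);

  % Narrower filterbank, two filters per ERB, and no lowpass-based
  % (_lp) outputs. Key/value pairs and flags follow the required inputs.
  [fine2, fc2] = dietz2011(insig, fs, 'flow', 200, 'fhigh', 1400, ...
                           'filters_per_ERB', 2, 'nolowpass');
end
```

With 'nolowpass' set, the _lp fields should be absent from the returned structures, so code that reads them needs the default 'lowpass' flag.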
M. Dietz, S. D. Ewert, and V. Hohmann. Auditory model based direction estimation of concurrent speakers from binaural signals. Speech Communication, 53(5):592-605, 2011.