This documentation page applies to an outdated major AMT version. We show it for archival purposes only.
[...] = dietz2011(insig,fs);
insig | binaural signal for which values should be calculated |
fs | sampling rate (Hz) |
fine | Information about the fine structure (see below) |
env | Information about the envelope (see below) |
dietz2011(insig,fs) calculates the interaural phase, time, and level differences of the fine structure and the envelope of the signal, as well as the interaural coherence, which can be used as a weighting function.
The output structures fine and env have the following fields:
.s1 | Left signal as put in the binaural processor |
.s2 | Right signal as put in the binaural processor |
.fc | Center frequencies of the channels (f_carrier or f_mod) |
.itf | Interaural transfer function |
.itf_equal | Interaural transfer function without amplitude |
.ipd | Phase difference in rad |
.ipd_lp | Based on lowpass-filtered itf, phase difference in rad |
.ild | Level difference in dB |
.itd | Time difference based on instantaneous frequencies |
.itd_C | Time difference based on central frequencies |
.itd_lp | As .itd, with low-passed itf |
.itd_C_lp | As .itd_C, with low-passed itf |
.f_inst_1 | Instantaneous frequencies in the channels of the filtered s1 |
.f_inst_2 | Instantaneous frequencies in the channels of the filtered s2 |
.f_inst | Instantaneous frequencies (average of f_inst_1 and f_inst_2) |
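The relationships among several of these fields can be illustrated with a short, self-contained MATLAB/Octave sketch. It does not call dietz2011. The two complex-valued test signals, the variable names, the sign convention, and the formulas (transfer function as the product of one signal with the complex conjugate of the other, level difference as a plain amplitude ratio in dB) are assumptions based on the field descriptions above and on Dietz et al. (2011), not a copy of the toolbox code.

```matlab
% Illustrative sketch only; signals and formulas are assumptions,
% not the dietz2011 implementation.
fs  = 48000;                 % sampling rate (Hz)
fc  = 500;                   % centre frequency of one channel (Hz)
t   = (0:fs/10-1).' / fs;    % 100 ms time axis (column vector)
itd_true = 300e-6;           % imposed time difference (s)
ild_true = 6;                % imposed level difference (dB)

% Complex-valued band signals, as a complex bandpass filter would deliver.
s1 = 10^(ild_true/20) * exp(1i*2*pi*fc*t);   % "left": leading and louder
s2 = exp(1i*2*pi*fc*(t - itd_true));         % "right": delayed

itf       = s1 .* conj(s2);                  % transfer function between the ears
itf_equal = itf ./ abs(itf);                 % same phase, unit amplitude
ipd       = angle(itf);                      % phase difference (rad)
itd_C     = ipd ./ (2*pi*fc);                % time difference via centre frequency (s)
ild       = 20*log10(abs(s1) ./ abs(s2));    % level difference (dB)

fprintf('ipd = %.4f rad, itd_C = %.1f us, ild = %.1f dB\n', ...
        mean(ipd), 1e6*mean(itd_C), mean(ild));
```

With these synthetic inputs the recovered time and level differences should match the imposed values. Dividing the phase difference by an instantaneous frequency estimate instead of fc would give the analogue of .itd.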
The steps of the binaural model to calculate the result are the following (see also Dietz et al., 2011):
The interaural temporal disparities are then extracted using a second-order complex gammatone bandpass (see paper for details).
dietz2011 accepts the following optional parameters:
'flow',flow | Set the lowest frequency in the filterbank to flow. Default value is 200 Hz. |
'fhigh',fhigh | Set the highest frequency in the filterbank to fhigh. Default value is 5000 Hz. |
'basef',basef | Ensure that the frequency basef is a center frequency in the filterbank. The default value is 1000. |
'filters_per_ERB',filters_per_erb | Filters per ERB. The default value is 1. |
'middle_ear_thr',r | Bandpass frequencies for middle ear transfer. The default value is [500 2000]. |
'middle_ear_order',n | Order of middle ear filter. Only even numbers are possible. The default value is 2. |
'haircell_lp_freq',hlpfreq | Cutoff frequency for haircell lowpass filter. The default value is 770. |
'haircell_lp_order',hlporder | Order of haircell lowpass filter. The default value is 5. |
'compression_power',cpwr | |
'alpha',alpha | Internal noise strength. Convention FIXME 65dB = 0.0354. The default value is 0. |
'int_randn' | Internal noise XXX. This is the default. |
'int_mini' | Internal noise XXX. |
'filter_order',fo | Filter order for output XXX. Used for both 'mod' and 'fine'. The default value is 2. |
'filter_attenuation_db',fadb | |
'fine_filter_finesse',fff | Only for fine-structure plugin. The default value is 3. |
'mod_center_frequency_hz',mcf_hz | |
'mod_filter_finesse',mff | |
'level_filter_cutoff_hz',lfc_hz | |
'level_filter_order',lforder | |
'coh_param',coh_param | This is a structure used for the localization plugin. It has the following fields: |
'signal_level_dB_SPL',signal_level | Sound pressure level of left channel. Used for data display and analysis. Default value is 70. |
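As a rough illustration of how 'flow', 'fhigh', 'basef', and 'filters_per_ERB' could interact, the following MATLAB/Octave sketch places center frequencies at equal steps on an ERB-rate scale so that basef is hit exactly. The Glasberg and Moore ERB-rate formula, the helper names f2erb and erb2f, and the placement rule are assumptions made for this example; the toolbox may compute its center frequencies differently.

```matlab
% Illustrative sketch only; not the toolbox implementation.
flow  = 200;             % lowest frequency (Hz), documented default
fhigh = 5000;            % highest frequency (Hz), documented default
basef = 1000;            % frequency that must be a center frequency (Hz)
filters_per_ERB = 1;     % documented default

% Assumed ERB-rate scale (Glasberg and Moore) and its inverse.
f2erb = @(f) 21.4 * log10(1 + 0.00437 * f);
erb2f = @(e) (10.^(e / 21.4) - 1) / 0.00437;

step   = 1 / filters_per_ERB;                          % spacing in ERB units
n_down = floor((f2erb(basef) - f2erb(flow))  / step);  % channels below basef
n_up   = floor((f2erb(fhigh) - f2erb(basef)) / step);  % channels above basef
fc     = erb2f(f2erb(basef) + (-n_down:n_up) * step);  % center frequencies (Hz)

fprintf('%d channels from %.0f Hz to %.0f Hz\n', numel(fc), fc(1), fc(end));
```

Anchoring the grid at basef rather than at flow means the outermost channels generally fall slightly inside the [flow, fhigh] range instead of exactly on its edges.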
M. Dietz, S. D. Ewert, and V. Hohmann. Auditory model based direction estimation of concurrent speakers from binaural signals. Speech Communication, 53(5):592-605, 2011.