Applies to version: 1.6.0

Demo of the model estimating the azimuth angles of concurrent speakers

Program code:

%demo_may2011 Demo of the model estimating the azimuth angles of concurrent speakers
%   DEMO_MAY2011 demonstrates the may2011 model by estimating the azimuth
%   angles of three concurrent speakers. It also displays the estimated
%   angles.
%   Set the variable demo to one of the following flags to show other conditions:
%   - 1R: One speaker in reverberant room.
%   - 2: Two speakers in free field.
%   - 3: Three speakers in free field. This is default.
%   - 5: Five speakers in free field. 
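%   Example (a minimal sketch; it assumes demo_may2011 is run as a script,
%   so that it sees the variable demo in the calling workspace):
%     demo = '2';      % two speakers in free field
%     demo_may2011;    % run the demo with the selected condition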
%   Figure 1: Time-frequency-based azimuth estimates
%      This figure shows the azimuth estimates in the time-frequency
%      domain for three speakers.
%   Figure 2: Interaural time differences (ITDs)
%      This figure shows the ITDs in the time-frequency domain estimated
%      from the mixed signal of three concurrent speakers.
%   Figure 3: Interaural level differences (ILDs)
%      This figure shows the ILDs in the time-frequency domain estimated
%      from the mixed signal of three concurrent speakers.
%   Figure 4: Interaural coherence
%      This figure shows the interaural coherence in the time-frequency domain estimated
%      from the mixed signal of three concurrent speakers.
%   Figure 5: GMM pattern
%      This figure shows the pattern and the histogram obtained from the
%      GMM-estimator for the mixed signal of three concurrent speakers.
%   See also: may2011
%   References:
%     T. May, S. van de Par, and A. Kohlrausch. A probabilistic model for
%     robust localization based on a binaural auditory front-end. IEEE Trans
%     Audio Speech Lang Proc, 19:1--13, 2011.
%   #Author: Tobias May (2009): Original implementation for the AMT
%   #Author: Piotr Majdak (2024): Integration in the AMT 1.6.
%   #Author: Michael Mihocic (2024): check if Octave is used -> return warning

if isoctave
    warning([mfilename ' is not supported in Octave.']);
end

%% Select binaural recordings

% Select a demo (default: three speakers in free field)
if ~exist('demo','var')
    demo = '3';
end

% Create signals
switch lower(demo)
    case '1r'
        % One speaker in a reverberant room
        [signal,fs] = sig_competingtalkers('one_speaker_reverb');
        % Number of active sources
        nSources = 1;
    case '2'
        % Two speakers in free field
        [signal,fs] = sig_competingtalkers('two_of_three');
        % Number of active sources
        nSources = 2;
    case '3'
        % Three speakers in free field
        [signal,fs] = sig_competingtalkers('three_of_three');
        % Number of active sources
        nSources = 3;
    case '5'
        % Five speakers in free field
        [signal,fs] = sig_competingtalkers('five_speakers');
        % Number of active sources
        nSources = 5;
end

%% Perform GMM-based sound source localization
% Perform localization
out = may2011(signal,fs);
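% A minimal sketch for inspecting the result before plotting; it assumes out
% is a struct, as the accesses out.azimuth and out.itd below suggest:
%   disp(fieldnames(out));     % list the available output fields
%   disp(size(out.azimuth));   % [channels frames], matching the axis labels below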

%% Plot results
% Plot time-frequency-based azimuth estimates (Figure 1)
figure;
imagesc(out.azimuth,[-90 90]);
xlabel('Number of frames')
ylabel('Number of gammatone channels')
title('Time-frequency-based azimuth estimates')
may2011_cbarlabel('Azimuth (deg)')
axis xy;

% Plot binaural cues, each in its own figure (Figures 2 to 4)
figure;
imagesc(out.itd,[-1e-3 1e-3]);
xlabel('Number of frames')
ylabel('Number of gammatone channels')
title('Interaural time difference (ITD)')
may2011_cbarlabel('ITD (s)')  % color limits of +/-1e-3 correspond to +/-1 ms when the data are in seconds
axis xy;

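% ILD plot; the field name out.ild is an assumption made by analogy with out.itd
figure;
imagesc(out.ild);
xlabel('Number of frames')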
ylabel('Number of gammatone channels')
title('Interaural level difference (ILD)')
may2011_cbarlabel('ILD (dB)')
axis xy;

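% IC plot; the field name out.ic is an assumption made by analogy with out.itd
figure;
imagesc(out.ic);
xlabel('Number of frames')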
ylabel('Number of gammatone channels')
title('Interaural coherence (IC)')
axis xy;

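% Histogram analysis of frame-based localization estimates (Figure 5).
% The call below is reconstructed; the helper's argument list is an assumption.
azEst = may2011_estazimuthgmm(out,'HIST',nSources,1);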
amt_disp(['The estimated azimuth angles are: ' num2str(azEst) ' degrees'],'documentation');