Gaussian Mixture Models

Some people can’t bear uncertainty. I can safely be excluded from that list. Uncertainty is rarely a bad thing. Let me explain with an example.

Imagine yourself in a room filled with people maintaining conversations all around you. Someone then strikes up a conversation with you. You have no problem focusing on your own conversation, even while several others are in progress around you. How do you suppress the background noise in order to concentrate on your dialogue? This is known as the cocktail party effect. There are many ways to attack this problem computationally, but the jury is still out on how the brain actually does it. Interesting, right?

Not too long ago, a friend asked me for help solving a problem. The question was: how can I detect multiple peaks in a noisy spectrum?

So, I thought that if the underlying source of each peak could be represented as a Gaussian distribution, the entire spectrum would just be a weighted sum of Gaussians. Effectively, you’re given one signal from which you must estimate multiple sources, just like the cocktail party above! This formulation is the well-known Gaussian Mixture Model (GMM). Let’s break it down.

Each peak i is modeled as a Gaussian distribution with mean \mu_i, variance \sigma^2_i, and weight w_i. Its probability density function f\left(x; \mu_i,\sigma^2_i\right) is spelled out as

f\left(x; \mu_i,\sigma^2_i\right) = \displaystyle \frac{1}{\sqrt{2\pi\sigma^2_i}}\exp{\left[-\frac{1}{2}\left(\frac{x-\mu_i}{\sigma_i}\right)^2\right]}.
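As a quick sanity check of the formula (hand-rolled, just for illustration), evaluating it directly should match Matlab’s built-in normpdf up to floating-point error:

% evaluate the Gaussian pdf directly from the formula above
mu_i = 0; sigma_i = 1;                      % arbitrary example parameters
xs = (-4:0.1:4)';
f  = exp(-0.5*((xs - mu_i)/sigma_i).^2) / sqrt(2*pi*sigma_i^2);
max(abs(f - normpdf(xs, mu_i, sigma_i)))    % ~0, up to rounding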

Then, for N peaks, the GMM is given by

F\left(x; \boldsymbol{\mu},\boldsymbol{\sigma}^2,\mathbf{w}\right) = \displaystyle \sum_{i=1}^N w_i f\left(x; \mu_i,\sigma^2_i\right)

where all weights w_i are greater than 0 and sum to 1. Great! Unfortunately, we are not done. Since this is an inverse problem, the best we can do is obtain estimates of the 3N parameters \mu_i, \sigma^2_i, and w_i. Expectation-Maximization (EM) is one of a few algorithms used to estimate them, with the caveat that it only converges to a locally optimal fit. Thankfully, the gmdistribution.fit function found in Matlab’s Statistics Toolbox solves this problem. All it asks for is the data and the number of peaks.
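Before tackling the inverse problem, it helps to see the forward direction: given known parameters, F(x) is just the weighted sum above. Here is a minimal sketch (the values mirror the example that follows; the weights 1/3 and 2/3 come from the sample sizes chosen below):

% forward model: evaluate F(x) from known component parameters
mus    = [-3; 6];                  % component means
sigmas = [1.5; 2.5];               % component standard deviations
ws     = [1/3; 2/3];               % mixture weights (positive, sum to 1)
xs = linspace(-10, 15, 500)';
Fx = zeros(size(xs));
for i = 1:numel(ws)
    Fx = Fx + ws(i)*normpdf(xs, mus(i), sigmas(i));   % w_i * f(x; mu_i, sigma_i^2)
end
plot(xs, Fx);                      % two bumps, one per component

Here we go.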

Let’s create some data. We will draw samples from N=2 normal distributions x_1 and x_2 with the known means and standard deviations shown below

\boldsymbol{\mu} = \begin{bmatrix} -3\\ 6 \end{bmatrix} and \boldsymbol{\sigma} = \begin{bmatrix} 1.5\\ 2.5 \end{bmatrix}

then combine them into one data vector x. Since x_2 contributes twice as many samples as x_1, the implied mixture weights are 1/3 and 2/3. We then plot a normalized histogram.

N = 2;                    % number of Gaussian components (peaks)
K = 1000;                 % number of histogram bins / grid points
M1 = 10000;               % samples drawn from component 1
M2 = 2*M1;                % samples drawn from component 2
m  = [-3; 6];             % true means
s  = [1.5; 2.5];          % true standard deviations
x1 = s(1)*randn(M1,1) + m(1);
x2 = s(2)*randn(M2,1) + m(2);
x  = [x1; x2];            % combined data
xaxis = linspace(min(x), max(x), K)';   % grid of exactly K points
xpdf  = hist(x,K)/(M1+M2);              % normalized histogram (sums to 1)
figure(1)
set(gcf, 'color', 'w');
plot(xaxis, xpdf, 'k');
xlabel('x');
ylabel('f(x)');
axis tight;

How does Matlab now recover the parameter estimates \hat{\boldsymbol{\mu}}, \hat{\boldsymbol{\sigma}} and \hat{\mathbf{w}} from x?

% pre-allocation
mu     = zeros(N,1);      % estimated means
sigma  = zeros(N,1);      % estimated standard deviations
weight = zeros(N,1);      % estimated mixture weights
gaussPdfi = zeros(K,N);   % individual component densities

% fit an N-component GMM to the data via EM
options = statset('Display','final');
obj = gmdistribution.fit(x,N,'Options',options);
gaussPdf = pdf(obj,xaxis);    % mixture density evaluated on the grid
A = sum(gaussPdf);            % normalize so it sums to 1, like the histogram
gaussPdf = gaussPdf/A;

% separating N Gaussians
for n = 1:N
    mu(n)          = obj.mu(n);
    sigma(n)       = sqrt(obj.Sigma(1,1,n));   % Sigma stores variances for 1-D data
    weight(n)      = obj.PComponents(n);
    gaussPdfi(:,n) = weight(n)*normpdf(xaxis,mu(n),sigma(n))/A;
end
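As an aside: if your Matlab release flags gmdistribution.fit as outdated, its newer name is fitgmdist (introduced, to the best of my knowledge, around R2014a), which accepts the same data, component count, and options:

obj = fitgmdist(x, N, 'Options', options);   % same fit, newer function name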

Let’s see what the distributions look like.

% overlay the fitted mixture density on the histogram in figure 1
figure(1)
set(gcf, 'color', 'w');
set(gca, 'FontSize', 6);
hold on;    % keep the histogram underneath the fit
plot(xaxis, gaussPdf, 'r', 'linewidth', 1.25);
xlabel('x', 'Fontsize', 6);
ylabel('f(x), F(x)', 'Fontsize', 6);
axis tight;
hold off;

% plot the two separated components as shaded areas, plus the mixture
figure(2)
set(gcf, 'color', 'w');
set(gca, 'FontSize', 6);
area(xaxis, gaussPdfi(:,1));
h = findobj(gca, 'Type', 'patch');
set(h, 'FaceColor', 'g', 'EdgeColor', 'g', 'facealpha', 0.75);
hold on;
area(xaxis, gaussPdfi(:,2));
h = findobj(gca, 'Type', 'patch');   % grabs both patches; alpha applies to both
set(h,'facealpha', 0.75);
plot(xaxis, gaussPdf, 'r', 'linewidth', 1.25);
xlabel('x', 'Fontsize', 6);
ylabel('F(x), f(x_1), f(x_2)', 'Fontsize', 6);
axis tight;
hold off;

Looks good! What are the estimates of the mean \hat{\boldsymbol{\mu}} and standard deviation \hat{\boldsymbol{\sigma}}?

>> mu
mu =
   -2.9997
    5.9863
>> sigma
sigma =
    1.5144
    2.4889

I’ll take that!
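One caveat to keep in mind: EM labels the components in arbitrary order, so when comparing against ground truth it is safest to sort first. A quick side-by-side using the variables from the script above (the true weights are M1/(M1+M2) = 1/3 and 2/3):

% compare sorted estimates against the known parameters
[muHat, idx] = sort(mu);               % EM component order is arbitrary
estimates = [muHat, sigma(idx), weight(idx)]
truth     = [m, s, [M1; M2]/(M1+M2)]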