class2d

class2d.m
Generate the dataset
Set the GP hyperparameters and compute the covariance function
Newton-raphson procedure to optimise the latent function values
Plot the optimised latent function values
Visualise the predictive function via a large grid of test points
Using the full GP distribution - propagating the uncertainity through the sigmoid

class2d.m

Performs binary GP classification with two-dimensional data

From A First Course in Machine Learning Simon Rogers, August 2016 [simon.rogers@glasgow.ac.uk]

clear all; close all;

Generate the dataset

create some random data and then change the means of the two classes to seeparate them

rng(2)
x = randn(20,2);
x(1:10,:) = x(1:10,:) - 2;
x(11:end,:) = x(11:end,:) + 2;
t = [repmat(0,10,1);repmat(1,10,1)];

% Plot the data
figure();
hold off
pos = find(t==0);
plot(x(pos,1),x(pos,2),'ko','markersize',10,'linewidth',2,'markerfacecolor',[0.6 0.6 0.6])
hold on
pos = find(t==1);
plot(x(pos,1),x(pos,2),'ko','markersize',10,'linewidth',2,'markerfacecolor',[1 1 1])
xlabel('$x_1$','interpreter','latex')
ylabel('$x_2$','interpreter','latex')

Set the GP hyperparameters and compute the covariance function

alpha = 10;
gamma = 0.1;
N = size(x,1);
C = zeros(N);
for n = 1:N
    for m = 1:N
        C(n,m) = alpha*exp(-gamma*sum((x(n,:)-x(m,:)).^2));
    end
end

Newton-raphson procedure to optimise the latent function values

Initialise all to zero

f = repmat(0,N,1);
allf = [f'];
% Pre-compute the inverse of C
invC = inv(C);
for iteration = 2:6
    g = 1./(1+exp(-f));
    gra = t - g - invC*f;
    H = -diag(g.*(1-g)) - invC;
    f = f - inv(H)*gra;
    allf(iteration,:) = f';
end
H = -diag(g.*(1-g)) - invC;
% Plot the evolution of the f values
figure()
plot(allf,'k')
xlabel('Iteration')
ylabel('f')

Plot the optimised latent function values

Creates a 3D plot with the function value as the z axis

figure()
hold off
pos = find(t==0);
z = zeros(length(pos),1);
plot3(x(pos,1),x(pos,2),z,'ko','markersize',10,'linewidth',2,'markerfacecolor',[0.6 0.6 0.6])
hold on
pos = find(t==1);
z = zeros(length(pos),1);
plot3(x(pos,1),x(pos,2),z,'ko','markersize',10,'linewidth',2,'markerfacecolor',[1 1 1])
for n = 1:N
    plot3(x(n,1),x(n,2),f(n),'r+','markersize',10,'linewidth',2)
    plot3([x(n,1) x(n,1)],[x(n,2),x(n,2)],[0 f(n)],'k--','color',[0.6 0.6 0.6],'linewidth',2)
end
grid on
zlabel('$f$','interpreter','latex','fontsize',20)
set(gca,'cameraviewangle',8,'cameraposition',[29.3256 -49.3194 29.7317])
set(gca,'position',[0.200 0.16 0.70 0.8150])
axis tight
xlabel('$x_1$','interpreter','latex')
ylabel('$x_2$','interpreter','latex')
zlabel('z')

Visualise the predictive function via a large grid of test points

Create the grid

[X,Y] = meshgrid(-4.5:0.3:4.5,-4.5:0.3:4.5);
testN = prod(size(X));
testX = [reshape(X,testN,1) reshape(Y,testN,1)];
% Create the test covariance function
R = zeros(N,testN);
for n = 1:N
    for m = 1:testN
        R(n,m) = alpha*exp(-gamma*sum((x(n,:) - testX(m,:)).^2));
    end
end
% Compute the mean predictive latent function
testf = R'*invC*f;
Z = reshape(testf,size(X));
% Contour the predictions (function values, not probabilities)
figure()
hold off
[c,h]=contour(X,Y,Z,'color',[0.6 0.6 0.6]);
tl = clabel(c,h);
set(h,'linewidth',2)
hold on
pos = find(t==0);
plot(x(pos,1),x(pos,2),'ko','markersize',10,'linewidth',2,'markerfacecolor',[0.6 0.6 0.6])
pos = find(t==1);
plot(x(pos,1),x(pos,2),'ko','markersize',10,'linewidth',2,'markerfacecolor',[1 1 1])
xlabel('$x_1$','interpreter','latex')
ylabel('$x_2$','interpreter','latex')


% Contour the probabilities
figure()
hold off
[c,h]=contour(X,Y,1./(1+exp(-Z)),'color',[0.6 0.6 0.6]);
tl = clabel(c,h);
set(h,'linewidth',2)
hold on
pos = find(t==0);
plot(x(pos,1),x(pos,2),'ko','markersize',10,'linewidth',2,'markerfacecolor',[0.6 0.6 0.6])
pos = find(t==1);
plot(x(pos,1),x(pos,2),'ko','markersize',10,'linewidth',2,'markerfacecolor',[1 1 1])
xlabel('$x_1$','interpreter','latex')
ylabel('$x_2$','interpreter','latex')

Warning: Text handle output is not supported for managed labels. 
Warning: Text handle output is not supported for managed labels.

Using the full GP distribution - propagating the uncertainity through the sigmoid

pred_var = zeros(testN,1);
pavg = zeros(size(testf));
minpred_var = 1e-3;
% loop over the test points, computing the marginal predictive variance and
% sampling function values before passing them through the sigmoid and
% averaging to get a probability

for n = 1:testN
    pred_var(n) = max(minpred_var,alpha - R(:,n)'*invC*R(:,n));
    u = randn(10000,1).*sqrt(pred_var(n)) + testf(n);
    pavg(n) = mean(1./(1+exp(-u)));
end

% Contour the resulting probabilities
Z = reshape(pavg,size(X));
figure()
hold off
[c,h]=contour(X,Y,Z,'color',[0.6 0.6 0.6]);
tl = clabel(c,h);
set(h,'linewidth',2)

hold on
pos = find(t==0);
plot(x(pos,1),x(pos,2),'ko','markersize',10,'linewidth',2,'markerfacecolor',[0.6 0.6 0.6])
pos = find(t==1);
plot(x(pos,1),x(pos,2),'ko','markersize',10,'linewidth',2,'markerfacecolor',[1 1 1])

xlabel('$x_1$','interpreter','latex')
ylabel('$x_2$','interpreter','latex')

Warning: Text handle output is not supported for managed labels.