mix-SQP experiments

Comparing performance and accuracy of EM, IP and mix-SQP algorithms

In this example, we compare the runtime and accuracy of the EM algorithm, the mix-SQP algorithm, and the interior-point (IP) method implemented in the MOSEK commercial solver (called via the KWDual function in the R package REBayes).
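For context, all three algorithms solve the same convex optimization problem: given the $n \times k$ likelihood matrix $L$, in which entry $L_{ji}$ is the likelihood of the $i$-th mixture component for the $j$-th sample, they estimate the mixture proportions $x$ by maximizing the log-likelihood,

$$\max_{x} \; \sum_{j=1}^n \log \Big( \sum_{i=1}^k L_{ji} \, x_i \Big) \quad \text{subject to} \quad x_i \geq 0, \;\; \sum_{i=1}^k x_i = 1.$$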

Analysis setup

Before attempting to run this Julia code, make sure your computer is properly set up by following the setup instructions in the README of the git repository.

We begin by loading the Distributions, LowRankApprox, RCall and other packages, as well as some function definitions used in the code chunks below.

In [1]:
using Pkg
using Printf
using Random
using Distributions
using LinearAlgebra
using SparseArrays
using LowRankApprox
using RCall
include("../code/datasim.jl");
include("../code/likelihood.jl");
include("../code/mixEM.jl");
include("../code/mixSQP.jl");
include("../code/REBayes.jl");

Next, initialize the sequence of pseudorandom numbers.

In [2]:
Random.seed!(1);

Generate a small data set

Let's begin with a smaller example with 50,000 samples.

In [3]:
z = normtmixdatasim(round(Int,5e4));

Compute the likelihood matrix

Compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals, with $k = 20$. Note that the rows of the likelihood matrix are normalized by default.

In [4]:
sd = autoselectmixsd(z,nv = 20);
L  = normlikmatrix(z,sd = sd);
size(L)
Out[4]:
(50000, 20)
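
For intuition, entry $(j,i)$ of this matrix is proportional to the normal density of sample z[j] under the mixture component with standard deviation sd[i]. The following is a minimal sketch of the unnormalized computation (it assumes all entries of sd are positive; normlikmatrix additionally rescales each row, which shifts the log-likelihood by a constant but does not change the estimated mixture proportions):

# Sketch only, not the code used above: entry (j,i) is the normal density
# of sample z[j] under the zero-centered normal with standard deviation sd[i].
L0 = [pdf(Normal(0,sd[i]),z[j]) for j in 1:length(z), i in 1:length(sd)];
size(L0)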

Fit mixture model

First we run each of the optimization algorithms once to precompile the relevant functions.

In [5]:
outem  = mixEM(L,maxiter = 10);
outsqp = mixSQP(L,maxiter = 10,verbose = false);
outip  = REBayes(L);
┌ Warning: RCall.jl: Warning: package ‘REBayes’ was built under R version 3.4.4
│ Loading required package: Matrix
└ @ RCall /Users/pcarbo/.julia/packages/RCall/iojZI/src/io.jl:113

Next, let's fit the model using the three algorithms.

In [6]:
@time xem, tem = mixEM(L,tol = 1e-4,maxiter = 1000);
@time xip, tip = REBayes(L);
@time outsqp   = mixSQP(L,verbose = false);
  3.929660 seconds (26.65 k allocations: 7.131 GiB, 22.02% gc time)
  1.286706 seconds (3.54 k allocations: 243.581 KiB)
  0.271515 seconds (35.16 k allocations: 252.154 MiB, 12.41% gc time)

The mix-SQP algorithm is much faster than the other two methods, with EM being the slowest.

The quality of the IP and SQP solutions is very similar, whereas the EM solution is much worse:

In [7]:
fem  = mixobjective(L,xem);
fip  = mixobjective(L,xip);
fsqp = mixobjective(L,outsqp["x"]);
fbest = minimum([fem fip fsqp]);
@printf "Difference between EM solution and best solution:  %0.2e\n" fem - fbest
@printf "Difference between IP solution and best solution:  %0.2e\n" fip - fbest
@printf "Difference between SQP solution and best solution: %0.2e\n" fsqp - fbest
Difference between EM solution and best solution:  1.88e+01
Difference between IP solution and best solution:  3.07e-09
Difference between SQP solution and best solution: 0.00e+00
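
The values compared here are objective values to be minimized, so a smaller difference from the best solution indicates a better fit. As a rough sketch, assuming mixobjective evaluates the negative log-likelihood (possibly up to an additive constant), the same quantity could be computed directly as:

# Negative log-likelihood of mixture proportions x given likelihood matrix L.
# This illustrative helper is an assumption, not the mixobjective function used above.
negloglik(L,x) = -sum(log.(L*x));
negloglik(L,outsqp["x"])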

Comparison using a larger data set

Next, let's see what happens when we apply these three algorithms to a larger data set.

In [8]:
z = normtmixdatasim(round(Int,1e5));

As before, we compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals. This time, we use a finer grid of $k = 100$ normal densities.

In [9]:
sd = autoselectmixsd(z,nv = 100);
L  = normlikmatrix(z,sd = sd);
size(L)
Out[9]:
(100000, 100)

Now we fit the model using the three approaches.

In [10]:
@time xem, tem = mixEM(L,tol = 1e-4,maxiter = 1000);
@time xip, tip = REBayes(L);
@time outsqp   = mixSQP(L,verbose = false);
 11.760771 seconds (1.58 k allocations: 11.780 GiB, 45.34% gc time)
 13.678142 seconds (258 allocations: 12.984 KiB)
  0.733824 seconds (473.63 k allocations: 537.169 MiB, 22.60% gc time)

In this example, the mix-SQP algorithm reaches a solution much faster than both the EM and IP approaches.

As before, the quality of the IP and SQP solutions is similar, whereas the EM solution is much worse.

In [11]:
fem  = mixobjective(L,xem);
fip  = mixobjective(L,xip);
fsqp = mixobjective(L,outsqp["x"]);
fbest = minimum([fem fip fsqp]);
@printf "Difference between EM and best solutions:  %0.2e\n" fem - fbest
@printf "Difference between IP and best solutions:  %0.2e\n" fip - fbest
@printf "Difference between SQP and best solutions: %0.2e\n" fsqp - fbest
Difference between EM and best solutions:  1.42e+02
Difference between IP and best solutions:  5.15e-06
Difference between SQP and best solutions: 0.00e+00

Session information

This section gives information about the computing environment used to generate the results contained in this notebook, including the versions of Julia, R and the packages used.

In [12]:
Pkg.status("Distributions")
Pkg.status("LowRankApprox")
Pkg.status("RCall")
versioninfo()
    Status `~/.julia/environments/v1.1/Project.toml`
  [31c24e10] Distributions v0.21.1
    Status `~/.julia/environments/v1.1/Project.toml`
  [898213cb] LowRankApprox v0.2.3
    Status `~/.julia/environments/v1.1/Project.toml`
  [6f49c342] RCall v0.13.3
Julia Version 1.1.1
Commit 55e36cc (2019-05-16 04:10 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin15.6.0)
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

Since we called the KWDual function in R, it is also useful to record information about R and the R packages used.

In [13]:
R"sessionInfo()"
Out[13]:
RObject{VecSxp}
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] REBayes_1.8   Matrix_1.2-12

loaded via a namespace (and not attached):
[1] compiler_3.4.3  Rmosek_8.0.69   grid_3.4.3      lattice_0.20-35

© 2017-2018 Youngseok Kim, Peter Carbonetto, Matthew Stephens & Mihai Anitescu.