mix-SQP experiments

mix-SQP demo with different low-rank matrix approximations

In this example, we illustrate how the QR and singular value decompositions of the likelihood matrix can be used to speed up the SQP algorithm.

Analysis setup

Before attempting to run this Julia code, make sure your computer is properly set up to run this code by following the setup instructions in the README of the git repository.

We begin by loading the Distributions, LowRankApprox and other packages, as well as some function definitions used in the code chunks below.

In [1]:
using Pkg
using Random
using Printf
using Distributions
using LinearAlgebra
using SparseArrays
using LowRankApprox
include("../code/datasim.jl");
include("../code/likelihood.jl");
include("../code/mixSQP.jl");

Next, initialize the sequence of pseudorandom numbers.

In [2]:
Random.seed!(1);

Generate a small data set

Let's start with a smaller example with 50,000 samples.

In [3]:
z = normtmixdatasim(round(Int,5e4));

Compute the likelihood matrix

Compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals, with $k = 20$. Note that the rows of the likelihood matrix are normalized by default.

In [4]:
sd = autoselectmixsd(z,nv = 20);
L  = normlikmatrix(z,sd = sd);
size(L)
Out[4]:
(50000, 20)

Fit mixture model using SQP algorithm

First we run the mix-SQP algorithm a few times to precompile the relevant functions.

In [5]:
out = mixSQP(L,lowrank = "svd",maxiter = 10,verbose = false);
out = mixSQP(L,lowrank = "qr",maxiter = 10,verbose = false);
out = mixSQP(L,lowrank = "nothing",maxiter = 10,verbose = false);

Next, run the mix-SQP solver again with the SVD and QR approximations to the likelihood matrix, and with no approximation. The approximation tolerance is set very low, to 1e-10.

In [6]:
@time outSVD = mixSQP(L,lowrank = "svd",pqrtol = 1e-10,verbose = false);
@time outQR  = mixSQP(L,lowrank = "qr",pqrtol = 1e-10,verbose = false);
@time out    = mixSQP(L,lowrank = "nothing",verbose = false);
  0.227304 seconds (30.38 k allocations: 272.346 MiB, 10.40% gc time)
  0.226690 seconds (30.03 k allocations: 264.711 MiB, 10.62% gc time)
  0.249882 seconds (37.26 k allocations: 432.825 MiB, 13.55% gc time)

You may see a slight improvement in the computation time with the QR and SVD approximations. The solutions using the low-rank approximations are still very close to the solution without an approximation:

In [7]:
mixobjective(L,outSVD["x"]) - mixobjective(L,out["x"]), 
mixobjective(L,outQR["x"]) - mixobjective(L,out["x"])
Out[7]:
(1.8189894035458565e-12, 1.8189894035458565e-12)

Generate a larger data set

Next, let's see what happens when we use the SQP algorithm to fit a mixture model to a much larger data set.

In [8]:
z = normtmixdatasim(round(Int,1e6));

Compute the likelihood matrix

As before, we compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals. This time, we use a finer grid of $k = 100$ normal densities.

In [9]:
sd = autoselectmixsd(z,nv = 100);
L  = normlikmatrix(z,sd = sd);
size(L)
Out[9]:
(1000000, 100)

Fit mixture model using SQP algorithm

As before, let's run the mix-SQP solver with the SVD and QR approximations, and with no approximation.

In [10]:
@time outSVD = mixSQP(L,lowrank = "svd",pqrtol = 1e-10,verbose = false);
@time outQR  = mixSQP(L,lowrank = "qr",pqrtol = 1e-10,verbose = false);
@time out    = mixSQP(L,lowrank = "none",verbose = false);
  9.602211 seconds (485.50 k allocations: 6.457 GiB, 9.72% gc time)
  8.688203 seconds (112.77 k allocations: 6.322 GiB, 8.06% gc time)
 39.652424 seconds (40.08 k allocations: 34.259 GiB, 9.46% gc time)

In the larger data set, the QR and SVD approximations yield much larger improvements in computation time, which is not surprising because the matrix-vector operations (particularly in computing the gradient and Hessian) dominate the computational cost.

As before, the solutions using the low-rank approximations are still close to the solution that is obtained without any approximation:

In [11]:
mixobjective(L,outSVD["x"]) - mixobjective(L,out["x"]), 
mixobjective(L,outQR["x"]) - mixobjective(L,out["x"])
Out[11]:
(9.953510016202927e-9, 5.820766091346741e-9)

Session information

The section gives information about the computing environment used to generate the results contained in this notebook, including the version of Julia, and the versions of the Julia packages used here.

In [12]:
Pkg.status("Distributions");
Pkg.status("LowRankApprox");
versioninfo()
    Status `~/.julia/environments/v1.1/Project.toml`
  [31c24e10] Distributions v0.21.1
    Status `~/.julia/environments/v1.1/Project.toml`
  [898213cb] LowRankApprox v0.2.3
Julia Version 1.1.1
Commit 55e36cc (2019-05-16 04:10 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin15.6.0)
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

© 2017-2018 Youngseok Kim, Peter Carbonetto, Matthew Stephens & Mihai Anitescu.