In this example, we illustrate how the QR and singular value decompositions of the likelihood matrix can be used to speed up the SQP algorithm.
Before attempting to run this Julia code, make sure your computer is properly set up to run this code by following the setup instructions in the README of the git repository.
We begin by loading the Distributions, LowRankApprox and other packages, as well as some function definitions used in the code chunks below.
using Pkg
using Random
using Printf
using Distributions
using LinearAlgebra
using SparseArrays
using LowRankApprox
include("../code/datasim.jl");
include("../code/likelihood.jl");
include("../code/mixSQP.jl");
Next, initialize the sequence of pseudorandom numbers.
Random.seed!(1);
Let's start with a smaller example with 50,000 samples.
z = normtmixdatasim(round(Int,5e4));
Compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals, with $k = 20$. Note that the rows of the likelihood matrix are normalized by default.
sd = autoselectmixsd(z,nv = 20);
L = normlikmatrix(z,sd = sd);
size(L)
First we run the mix-SQP algorithm a few times to precompile the relevant functions.
out = mixSQP(L,lowrank = "svd",maxiter = 10,verbose = false);
out = mixSQP(L,lowrank = "qr",maxiter = 10,verbose = false);
out = mixSQP(L,lowrank = "nothing",maxiter = 10,verbose = false);
Next, run the mix-SQP solver again with the SVD and QR approximations to the likelihood matrix, and with no approximation. The approximation tolerance is set very low, to 1e-10
.
@time outSVD = mixSQP(L,lowrank = "svd",pqrtol = 1e-10,verbose = false);
@time outQR = mixSQP(L,lowrank = "qr",pqrtol = 1e-10,verbose = false);
@time out = mixSQP(L,lowrank = "nothing",verbose = false);
You may see a slight improvement in the computation time with the QR and SVD approximations. The solutions using the low-rank approximations are still very close to the solution without an approximation:
mixobjective(L,outSVD["x"]) - mixobjective(L,out["x"]),
mixobjective(L,outQR["x"]) - mixobjective(L,out["x"])
Next, let's see what happens when we use the SQP algorithm to fit a mixture model to a much larger data set.
z = normtmixdatasim(round(Int,1e6));
As before, we compute the $n \times k$ likelihood matrix for a mixture of zero-centered normals. This time, we use a finer grid of $k = 100$ normal densities.
sd = autoselectmixsd(z,nv = 100);
L = normlikmatrix(z,sd = sd);
size(L)
As before, let's run the mix-SQP solver with the SVD and QR approximations, and with no approximation.
@time outSVD = mixSQP(L,lowrank = "svd",pqrtol = 1e-10,verbose = false);
@time outQR = mixSQP(L,lowrank = "qr",pqrtol = 1e-10,verbose = false);
@time out = mixSQP(L,lowrank = "none",verbose = false);
In the larger data set, the QR and SVD approximations yield much larger improvements in computation time, which is not surprising because the matrix-vector operations (particularly in computing the gradient and Hessian) dominate the computational cost.
As before, the solutions using the low-rank approximations are still close to the solution that is obtained without any approximation:
mixobjective(L,outSVD["x"]) - mixobjective(L,out["x"]),
mixobjective(L,outQR["x"]) - mixobjective(L,out["x"])
The section gives information about the computing environment used to generate the results contained in this notebook, including the version of Julia, and the versions of the Julia packages used here.
Pkg.status("Distributions");
Pkg.status("LowRankApprox");
versioninfo()