Last updated: 2017-01-17

Code version: efcd39d916a87f7322f9c515c668905c36a86f9a

First, we load the necessary libraries.

library(REBayes)
## Loading required package: Matrix
library(ashr)
library(ggplot2)

Set up simulation

The following simulation assumes \(\beta \sim N(0,{\sf betasd}^2)\), with each \(\hat{\beta}\) having standard error 1.
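The marginal standard deviation used in the simulation follows from this model: if \(\beta \sim N(0,{\sf betasd}^2)\) and \(\hat{\beta} = \beta + e\) with \(e \sim N(0,1)\), then marginally \(\hat{\beta} \sim N(0,1 + {\sf betasd}^2)\). A quick sanity check in base R (not part of the benchmark):

```r
# Verify that betahat = beta + noise has marginal sd sqrt(1 + betasd^2).
set.seed(1)
betasd  <- 1
beta    <- rnorm(1e6,sd = betasd)     # true effects
betahat <- beta + rnorm(1e6,sd = 1)   # observed effects, standard error 1
sd(betahat)                           # should be close to sqrt(2)
```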

timed_sims = function(ash.args,nsim = 20,nmin = 100,nmax = 1e3,betasd = 1) {

  # Problem sizes equally spaced on the log10 scale between nmin and nmax.
  n = round(10^seq(log10(nmin),log10(nmax),length = nsim))
  elapsed.time = rep(0,nsim)
  for (i in 1:nsim) {
    set.seed(i)
    with(ash.args,cat(sprintf("%6s %13s %3d\n",method,optmethod,i)))

    # Since beta ~ N(0,betasd^2) and the noise has sd 1, betahat is
    # marginally N(0,1 + betasd^2).
    betahat = rnorm(n[i],sd = sqrt(1 + betasd^2))

    # Record the elapsed ("real") time of the ash call.
    elapsed.time[i] =
      system.time(do.call(ash,args = modifyList(ash.args,
        list(betahat = betahat, sebetahat = 1))))[3]
  }
  return(data.frame(elapsed.time = elapsed.time,seed = 1:nsim,n = n))
}
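The `do.call(ash, modifyList(...))` idiom above merges the fixed arguments in `ash.args` with the per-replicate data before each call. A minimal illustration of the same pattern with a plain function (`f` and its arguments here are hypothetical):

```r
# modifyList() overrides/extends a named argument list;
# do.call() then invokes the function with that list.
f <- function(x,y,method = "a") paste(method,x + y)
fixed.args <- list(method = "b")
do.call(f,modifyList(fixed.args,list(x = 1,y = 2)))  # "b 3"
```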

Now run a simulation study for \(n\) (the number of tests; \(p\) in the paper) ranging from 10 to 100,000. Note that the warnings are generated by the EM algorithm on large problems due to lack of convergence.

df = data.frame()
cat("method --optmethod-- sim\n")
for(method in c("fdr","shrink")) {
  for(optmethod in c("mixIP","cxxMixSquarem")) {
    df = rbind(df,data.frame(method = method,optmethod = optmethod,
           timed_sims(list(method = method,optmethod = optmethod),
                           nsim = 50,nmin = 10,nmax = 1e5)))
  }
}
Warning in estimate_mixprop(data, g, prior, optmethod = optmethod, control
= control): Optimization failed to converge. Results may be unreliable. Try
increasing maxiter and rerunning.

cat("\n")

Summarize the model fitting results in a table.

cat("Average computation time (in seconds):\n")
Average computation time (in seconds):
print(as.table(by(df,df[c("method","optmethod")],
                  function (x) mean(x$elapsed.time))))
        optmethod
method    mixIP cxxMixSquarem
  fdr     6.800        23.699
  shrink  5.972        27.931
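A summary like the one above can also be produced with `aggregate()`; a minimal sketch on synthetic stand-in timings (the real values come from the simulation above):

```r
# Mean elapsed time by (method, optmethod) via aggregate();
# the timings here are made-up stand-ins.
toy <- data.frame(method       = rep(c("fdr","shrink"),each = 2),
                  optmethod    = rep(c("mixIP","cxxMixSquarem"),2),
                  elapsed.time = c(6.8,23.7,6.0,27.9))
aggregate(elapsed.time ~ method + optmethod,data = toy,FUN = mean)
```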

Results

Now plot time as a function of \(n\):

qplot(x = n,y = elapsed.time,data = df,col = optmethod,facets = .~method,
      ylab = "elapsed time (s)")

Zoom in on “small” problems with \(n < 5000\).

qplot(x = n,y = elapsed.time, data = subset(df,n < 5000),
      col = optmethod,facets = .~method,ylab = "elapsed time (s)")
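One way to quantify how each method scales is the slope of a log-log regression of elapsed time on \(n\): the slope estimates the exponent \(k\) in time \(\propto n^k\). A sketch on synthetic timings generated with a known exponent of 1.5 (applying the same fit to each (method, optmethod) subset of `df` would give the empirical scaling of IP vs. EM):

```r
# Fit log(time) ~ log(n); the slope recovers the scaling exponent.
set.seed(1)
n    <- round(10^seq(1,5,length = 50))
time <- 1e-4 * n^1.5 * exp(rnorm(50,sd = 0.1))  # synthetic: exponent 1.5
fit  <- lm(log(time) ~ log(n))
coef(fit)["log(n)"]                             # close to 1.5
```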

Summary

The IP method clearly scales to large problems much better than EM. It is faster, and also more reliable (it sometimes reaches a higher log-likelihood than EM, and never a lower one); see checkIP.html.

However, for small problems (n < 5000) the EM is adequate for many practical purposes, solving within a few seconds. This is particularly true with the penalty term (method = fdr), which helps the EM converge, presumably because it improves identifiability by removing the large flat parts of the likelihood objective function.

Indeed, for small problems with the penalty term (n < 2000), EM with SQUAREM acceleration is a little faster in these comparisons than IP.

Session information

sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] knitr_1.14     ggplot2_2.1.0  ashr_2.0.4     REBayes_0.63  
[5] Matrix_1.2-7.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8       magrittr_1.5      MASS_7.3-45      
 [4] munsell_0.4.3     doParallel_1.0.10 pscl_1.4.9       
 [7] colorspace_1.2-7  SQUAREM_2016.8-2  lattice_0.20-34  
[10] foreach_1.4.3     plyr_1.8.4        stringr_1.1.0    
[13] tools_3.3.2       parallel_3.3.2    grid_3.3.2       
[16] gtable_0.2.0      htmltools_0.3.5   iterators_1.0.8  
[19] assertthat_0.1    yaml_2.1.13       rprojroot_1.1    
[22] digest_0.6.10     reshape2_1.4.1    formatR_1.4      
[25] codetools_0.2-15  evaluate_0.10     rmarkdown_1.2    
[28] labeling_0.3      stringi_1.1.2     Rmosek_7.1.3     
[31] scales_0.4.0      backports_1.0.3   truncnorm_1.0-7