**Last updated:** 2017-01-17

**Code version:** efcd39d916a87f7322f9c515c668905c36a86f9a

First, we load the necessary libraries.

`library(REBayes)`

`## Loading required package: Matrix`

```
library(ashr)
library(ggplot2)
```

The following simulation assumes \(\beta \sim N(0,\sigma_{\sf betasd}^2)\), with \(\hat{\beta}\) having standard error 1.
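Since \(\hat{\beta} = \beta + e\) with \(e \sim N(0,1)\) independent of \(\beta\), the marginal distribution of \(\hat{\beta}\) is \(N(0,1+\sigma_{\sf betasd}^2)\); this is why the code below draws `betahat` directly with `sd = sqrt(1 + betasd^2)`. A quick sanity check of this fact (an illustrative sketch, not part of the original analysis):

```r
# Sanity check: the marginal sd of betahat = beta + N(0,1) noise
# should match sqrt(1 + betasd^2). Illustrative only.
set.seed(1)
betasd  = 1
beta    = rnorm(1e6,sd = betasd)
betahat = beta + rnorm(1e6)  # Add unit-sd "estimation" noise.
c(empirical = sd(betahat),theoretical = sqrt(1 + betasd^2))
```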

```
timed_sims = function (ash.args, nsim = 20, nmin = 100, nmax = 1e3,
                       betasd = 1) {

  # Problem sizes equally spaced on the log scale.
  n            = 10^seq(log10(nmin),log10(nmax),length = nsim)
  elapsed.time = rep(0,nsim)
  for (i in 1:nsim) {
    set.seed(i)
    with(ash.args,cat(sprintf("%6s %13s %3d\n",method,optmethod,i)))

    # Draw betahat from its marginal distribution, N(0,1 + betasd^2).
    betahat = rnorm(n[i],sd = sqrt(1 + betasd^2))

    # Record the elapsed running time (in seconds) of the call to ash.
    elapsed.time[i] =
      system.time(do.call(ash,args = modifyList(ash.args,
                          list(betahat = betahat,sebetahat = 1))))[3]
  }
  return(data.frame(elapsed.time = elapsed.time,seed = 1:nsim,n = n))
}
```
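The grid of problem sizes inside `timed_sims` is equally spaced on the log scale rather than the linear scale; for instance, with the default `nmin` and `nmax` and `nsim = 3` (an illustrative sketch):

```r
# Log-spaced grid of problem sizes, as constructed in timed_sims.
n = 10^seq(log10(100),log10(1000),length = 3)
round(n)  # 100 316 1000
```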

Now run a simulation study for \(n\) (the number of tests; \(p\) in the paper) ranging from 10 to 100,000. Note that the warnings below are generated by the EM algorithm on large problems due to lack of convergence.

```
df = data.frame()
cat("method --optmethod-- sim\n")
for (method in c("fdr","shrink")) {
  for (optmethod in c("mixIP","cxxMixSquarem")) {
    df = rbind(df,data.frame(method = method,optmethod = optmethod,
                 timed_sims(list(method = method,optmethod = optmethod),
                            nsim = 50,nmin = 10,nmax = 1e5)))
  }
}
```

```
Warning in estimate_mixprop(data, g, prior, optmethod = optmethod, control
= control): Optimization failed to converge. Results may be unreliable. Try
increasing maxiter and rerunning.
```

`cat("\n")`

Summarize the model fitting results in a table.

`cat("Average computation time (in seconds):\n")`

`Average computation time (in seconds):`

```
print(as.table(by(df,df[c("method","optmethod")],
                  function (x) mean(x$elapsed.time))))
```

```
        optmethod
method    mixIP cxxMixSquarem
  fdr     6.800        23.699
  shrink  5.972        27.931
```
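One way to summarize these timings is an empirical scaling exponent, obtained by regressing log running time on log \(n\). The sketch below uses synthetic timings so that it is self-contained; with the actual results, one would substitute the `df` computed above:

```r
# Illustrative sketch: estimate an empirical scaling exponent by
# fitting log(time) ~ log(n). The timings here are synthetic,
# roughly O(n^1.2) with multiplicative noise.
set.seed(1)
n    = 10^seq(1,5,length = 50)
time = 0.001 * n^1.2 * exp(rnorm(50,sd = 0.1))
fit  = lm(log(time) ~ log(n))
coef(fit)[["log(n)"]]  # Estimated exponent; should be close to 1.2.
```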

Now plot time as a function of \(n\):

```
qplot(x = n,y = elapsed.time,data = df,col = optmethod,facets = .~method,
      ylab = "elapsed time (s)")
```

Zoom in on “small” problems with \(n < 5000\).

```
qplot(x = n,y = elapsed.time,data = subset(df,n < 5000),
      col = optmethod,facets = .~method,ylab = "elapsed time (s)")
```

The IP method clearly scales to large problems much better than the EM. It is faster, and also more reliable (it sometimes reaches a higher log-likelihood than the EM, and never a lower one); see checkIP.html.

However, for small problems (\(n < 5000\)) the EM is adequate for many practical purposes, solving within a few seconds. This is particularly true with the penalty term (`method = "fdr"`), which helps the EM converge, presumably because it improves identifiability by removing the large flat regions of the likelihood objective function.

Indeed, for small problems (\(n < 2000\)) with the penalty term, the EM with SQUAREM acceleration is a little faster than the IP in these comparisons.

`sessionInfo()`

```
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.14 ggplot2_2.1.0 ashr_2.0.4 REBayes_0.63
[5] Matrix_1.2-7.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.8 magrittr_1.5 MASS_7.3-45
[4] munsell_0.4.3 doParallel_1.0.10 pscl_1.4.9
[7] colorspace_1.2-7 SQUAREM_2016.8-2 lattice_0.20-34
[10] foreach_1.4.3 plyr_1.8.4 stringr_1.1.0
[13] tools_3.3.2 parallel_3.3.2 grid_3.3.2
[16] gtable_0.2.0 htmltools_0.3.5 iterators_1.0.8
[19] assertthat_0.1 yaml_2.1.13 rprojroot_1.1
[22] digest_0.6.10 reshape2_1.4.1 formatR_1.4
[25] codetools_0.2-15 evaluate_0.10 rmarkdown_1.2
[28] labeling_0.3 stringi_1.1.2 Rmosek_7.1.3
[31] scales_0.4.0 backports_1.0.3 truncnorm_1.0-7
```