Last updated: 2017-12-26

Code version: 1e53e98

Set up

library(microbenchmark)
library(mashr)
Loading required package: ashr

Equal S data

set.seed(1)
simdata.equal1 = simple_sims(5000,5,0.5)
simdata.equal2 = simple_sims2(5000, 0.5)
simdata.equal = list(B = rbind(simdata.equal1$B, simdata.equal2$B), 
                     Bhat = rbind(simdata.equal1$Bhat, simdata.equal2$Bhat), 
                     Shat = rbind(simdata.equal1$Shat, simdata.equal2$Shat))
# data contains 20000 rows
data.equal = set_mash_data(simdata.equal$Bhat, simdata.equal$Shat, alpha=0)
U.c.equal = cov_canonical(data.equal)
# data driven
m.1by1 = mash_1by1(data.equal)
strong = get_significant_results(m.1by1,0.05)
# center Z
data.equal.Z = simdata.equal$Bhat/simdata.equal$Shat
data.center = set_mash_data(Bhat = apply(as.matrix(data.equal.Z), 2, function(x) x - mean(x)), alpha = 0)
U.pca = cov_pca(data.center,5,strong)
U.e.equal = cov_ed(data.equal, U.pca, strong)

Non-Equal S data

set.seed(1)
simdata.diff1 = simple_sims(5000, 5, rep(c(0.5,0.4,0.1,1,0.2), 4000))
simdata.diff2 = simple_sims2(5000, rep(c(0.5,0.4,0.1,1,0.2), 4000))
simdata.diff = list(B = rbind(simdata.diff1$B, simdata.diff2$B), 
                    Bhat = rbind(simdata.diff1$Bhat, simdata.diff2$Bhat), 
                    Shat = rbind(simdata.diff1$Shat, simdata.diff2$Shat))
data.diff = set_mash_data(simdata.diff$Bhat, simdata.diff$Shat, alpha=0)
U.c.diff = cov_canonical(data.diff)
# data driven
m.1by1 = mash_1by1(data.diff)
strong = get_significant_results(m.1by1,0.05)
# center Z
data.diff.Z = simdata.diff$Bhat/simdata.diff$Shat
data.center = set_mash_data(Bhat = apply(as.matrix(data.diff.Z), 2, function(x) x - mean(x)), alpha = 0)
U.pca = cov_pca(data.center,5,strong)
U.e.diff = cov_ed(data.diff, U.pca, strong)

Compare likelihood

library(assertthat)
source('~/Documents/GitHub/mashr-stephens/R/likelihoods_scaleddata.R')
source('~/Documents/GitHub/mashr-stephens/R/likelihoods_origdata.R')
source('~/Documents/GitHub/mashr-stephens/R/mash.R')
source('~/Documents/GitHub/mashr-stephens/R/compute_covs.R')
source('~/Documents/GitHub/mashr-stephens/R/RcppExports.R')
source('~/Documents/GitHub/mashr-stephens/R/set_data.R')
source('~/Documents/GitHub/mashr-stephens/R/opt.R')
source('~/Documents/GitHub/mashr-stephens/R/posterior.R')
source('~/Documents/GitHub/mashr-stephens/R/posterior_common_cov.R')
source('~/Documents/GitHub/mashr-stephens/R/posterior_lowmem.R')
compute_likelihood = function(data, Ulist, algorithm.version=c('Rcpp','R')){
  algorithm.version = match.arg(algorithm.version)
  grid = autoselect_grid(data, sqrt(2))
  Ulist = normalize_Ulist(Ulist)
  xUlist = expand_cov(Ulist, grid, TRUE)
  J <- nrow(data$Bhat)
  P <- length(xUlist)
  lm <- calc_relative_lik_matrix(data, xUlist, algorithm.version)
  return(lm)
}

The time for computing likelihood for data with same se is

Unit: milliseconds

exper min lq mean median uq max neval
Rcpp 866.8975 943.1800 959.2484 957.6582 972.0100 1227.550 100
R 857.2996 945.0444 956.5874 958.6668 971.8977 1033.055 100

The time for computing likelihood for data with different se is

Unit: seconds

exper min lq mean median uq max neval
Rcpp 9.8438 10.0709 10.19134 10.1834 10.2815 10.7816 100
R 9.7953 10.0575 10.1800 10.1376 10.2890 10.8415 100

Comparisons:

  • Same SE vs Different SE: The computational time for different se data is longer.

  • Rcpp vs R: The computational time are similar.

However, from my experience, Rcpp tends to use shorter time in likelihood computation.

Compare posterior

The time for computing posterior for data with same se is

Unit: seconds

exper min lq mean median uq max neval
Rcpp 1.0243 1.1350 1.1753 1.1719 1.2050 1.4510 100
R 1.1636 1.2785 1.3250 1.3143 1.3707 1.6741 100

The time for computing posterior for data with different se is

Unit: seconds

exper min lq mean median uq max neval
Rcpp 11.2614 11.5210 11.6741 11.6312 11.773 12.2547 100
R 58.1236 58.6919 59.3430 59.0292 59.9362 61.2729 100

Comparisons:

  • Same SE vs Different SE: The computational time for different se data is much longer.

  • Rcpp vs R: Rcpp is faster.

Session information

sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.2

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] assertthat_0.2.0       mashr_0.2-4            ashr_2.1-27           
[4] microbenchmark_1.4-2.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14             compiler_3.4.3          
 [3] git2r_0.20.0             plyr_1.8.4              
 [5] iterators_1.0.9          tools_3.4.3             
 [7] digest_0.6.13            evaluate_0.10.1         
 [9] tibble_1.3.4             gtable_0.2.0            
[11] lattice_0.20-35          rlang_0.1.6             
[13] Matrix_1.2-12            foreach_1.4.4           
[15] yaml_2.1.16              parallel_3.4.3          
[17] mvtnorm_1.0-6            stringr_1.2.0           
[19] knitr_1.17               REBayes_1.2             
[21] rprojroot_1.2            grid_3.4.3              
[23] rmarkdown_1.8            rmeta_2.16              
[25] ggplot2_2.2.1            magrittr_1.5            
[27] backports_1.1.2          scales_0.5.0            
[29] codetools_0.2-15         htmltools_0.3.6         
[31] MASS_7.3-47              colorspace_1.3-2        
[33] stringi_1.1.6            Rmosek_8.0.69           
[35] lazyeval_0.2.1           munsell_0.4.3           
[37] doParallel_1.0.11        pscl_1.5.2              
[39] truncnorm_1.0-7          SQUAREM_2017.10-1       
[41] ExtremeDeconvolution_1.3

This R Markdown site was created with workflowr