Last updated: 2018-01-28

Loading required package: ashr
corrplot 0.84 loaded

Generate data

The data is generated that some of them depend on two factors.

set.seed(50)
data = sim.flash(nsamp=1000, err_sd = 0.05)
  • flashr Factorization

The factor structures:

Apply mash without factor rank 1 covariance

The mash model puts the majority weights on tFlash. There are 4501 siginificant findings. The log likelihood is 1.820689610^{4}.

Apply mash with factor rank 1 covariance, without rank 2 matrices

Again, the mash model put a large postion of weights on tFlash. There are 4512 siginificant findings. The log likelihood is 2.690128710^{4}.

The mash model with rank 1 factor covariances has higher likelihood value.

Apply mash with factor rank 2 covariance

Method 1: Naive way

flash_get_pve(fb)
[1] 0.2725740 0.3605277 0.3604103 0.0000000

Add \(F_{s}L_{s}'L_{s}F_{s}'\), s is \(\{1,2\}, \{1,3\}, \{2,3\}\)

# every two factor combination
Flash12 = flash_get_l(fb, 1:2) %*% t(flash_get_f(fb,1:2))
Flash12 = t(Flash12) %*% Flash12
Flash13 = flash_get_l(fb, c(1,3)) %*% t(flash_get_f(fb,c(1,3)))
Flash13 = t(Flash13) %*% Flash13
Flash23 = flash_get_l(fb, 2:3) %*% t(flash_get_f(fb,2:3))
Flash23 = t(Flash23) %*% Flash23

U.flash.2.1 = c(mashr::cov_from_factors(t(as.matrix(flash_get_f(fb,1:3))), "Flash"),
            list("Flash12" = Flash12, "Flash13" = Flash13, "Flash23" = Flash23, "tFlash" = t(flash_get_lf(fb)) %*% flash_get_lf(fb) / nrow(mash_data$Bhat)))

U.ed.2.1 = cov_ed(mash_data, U.flash.2.1)
mash_model.2.1 = mash(mash_data, c(U.c,U.ed.2.1), algorithm.version = 'R', verbose = FALSE)
barplot(get_estimated_pi(mash_model.2.1), las=2, cex.names = 0.7)

The mash model identifies the covariacne structure correctly. There are 4516 siginificant findings. The log likelihood is 3.080212410^{4}.

With the correct covariance struture, the mash model has a much higher likelihood value. Adding rank 2 factor covariance improves the fitting.

This method is unrealistic if the number of factors is large.

Method 2: Clustering loadings

See Flash_loading_2

Session information

sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.2

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] corrplot_0.84 mashr_0.2-4   ashr_2.2-3    flashr_0.4-3 

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15             knitr_1.17              
 [3] magrittr_1.5             REBayes_1.2             
 [5] MASS_7.3-47              doParallel_1.0.11       
 [7] pscl_1.5.2               SQUAREM_2017.10-1       
 [9] lattice_0.20-35          ExtremeDeconvolution_1.3
[11] foreach_1.4.4            plyr_1.8.4              
[13] stringr_1.2.0            tools_3.4.3             
[15] parallel_3.4.3           grid_3.4.3              
[17] rmeta_2.16               htmltools_0.3.6         
[19] iterators_1.0.9          assertthat_0.2.0        
[21] yaml_2.1.16              rprojroot_1.2           
[23] digest_0.6.13            Matrix_1.2-12           
[25] codetools_0.2-15         evaluate_0.10.1         
[27] rmarkdown_1.8            stringi_1.1.6           
[29] compiler_3.4.3           Rmosek_8.0.69           
[31] backports_1.1.2          mvtnorm_1.0-7           
[33] truncnorm_1.0-7         

This R Markdown site was created with workflowr