Last updated: 2017-01-15

Code version: 086c3157516dcce3907fd3fd6857a0aec1f35bfa

Outline

The simplest application of ash is to z scores (so shat =1 for all observations).

In this case we can exploit the fact that for similar z the “responsibilities” that are used in the EM algorithm will be similar to one another.

Eg suppose we evaluate the responsibilities for two different observations z1 < z2 and they are very similar. Then they will also be similar for other zs between z1 and z2. In this way we could end up only evaluating responsibilities for a very small fraction of all n z scores. Here we look at how similar the responsibilities are in a simple case.

library(ashr)
set.seed(100)
n=100000
z = rnorm(n,0,2)
#now sort z so that they are in order
z = z[order(abs(z))]

res <- ash(z,1,mixcompdist="normal",outputlevel=4)
lik = res$fit_details$matrix_lik
fitted_g  = get_fitted_g(res)

# set up the initial value of pi that we typically use in ash
normalize=function(x){x/sum(x)}
pi = rep(1/n, ncomp(fitted_g))
pi[1]=1
pi = normalize(pi)

resp = apply(t(lik) * pi,2,normalize)

The following plots shows that the responsibilities for the first half of the points are very close to one another [and close to (1,0,…0)]

plot(resp[1,],ylim=c(0.9,1))

plot(resp[1,],ylim=c(0.999,1))

So how many are within 1e-5 of each responsibility?

n_within_tol = function(i,tol=1e-5){sum(apply(abs(resp-resp[,i]),2,max)<tol)}
n_within_tol(1)
[1] 31280
n_within_tol(25000)
[1] 38054
n_within_tol(50000)
[1] 9558
n_within_tol(75000)
[1] 931
n_within_tol(87500)
[1] 52
n_within_tol(93750)
[1] 7

Session information

R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ashr_2.1        workflowr_0.2.0 rmarkdown_1.3  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8       rstudioapi_0.6    knitr_1.15.1     
 [4] magrittr_1.5      REBayes_0.73      MASS_7.3-45      
 [7] doParallel_1.0.10 pscl_1.4.9        SQUAREM_2016.8-2 
[10] lattice_0.20-34   foreach_1.4.3     stringr_1.1.0    
[13] tools_3.3.1       parallel_3.3.1    grid_3.3.1       
[16] git2r_0.18.0      htmltools_0.3.5   iterators_1.0.8  
[19] assertthat_0.1    yaml_2.1.14       rprojroot_1.1    
[22] digest_0.6.10     Matrix_1.2-7.1    codetools_0.2-15 
[25] evaluate_0.10     stringi_1.1.2     Rmosek_7.1.2     
[28] backports_1.0.4   truncnorm_1.0-7  

This site was created with R Markdown