Last updated: 2017-01-15
Code version: 086c3157516dcce3907fd3fd6857a0aec1f35bfa
The simplest application of ash is to z scores (so shat =1 for all observations).
In this case we can exploit the fact that for similar z the “responsibilities” that are used in the EM algorithm will be similar to one another.
Eg suppose we evaluate the responsibilities for two different observations z1 < z2 and they are very similar. Then they will also be similar for other zs between z1 and z2. In this way we could end up only evaluating responsibilities for a very small fraction of all n z scores. Here we look at how similar the responsibilities are in a simple case.
library(ashr)
set.seed(100)
n=100000
z = rnorm(n,0,2)
#now sort z so that they are in order
z = z[order(abs(z))]
res <- ash(z,1,mixcompdist="normal",outputlevel=4)
lik = res$fit_details$matrix_lik
fitted_g = get_fitted_g(res)
# set up the initial value of pi that we typically use in ash
normalize=function(x){x/sum(x)}
pi = rep(1/n, ncomp(fitted_g))
pi[1]=1
pi = normalize(pi)
resp = apply(t(lik) * pi,2,normalize)
The following plots shows that the responsibilities for the first half of the points are very close to one another [and close to (1,0,…0)]
plot(resp[1,],ylim=c(0.9,1))
plot(resp[1,],ylim=c(0.999,1))
So how many are within 1e-5 of each responsibility?
n_within_tol = function(i,tol=1e-5){sum(apply(abs(resp-resp[,i]),2,max)<tol)}
n_within_tol(1)
[1] 31280
n_within_tol(25000)
[1] 38054
n_within_tol(50000)
[1] 9558
n_within_tol(75000)
[1] 931
n_within_tol(87500)
[1] 52
n_within_tol(93750)
[1] 7
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ashr_2.1 workflowr_0.2.0 rmarkdown_1.3
loaded via a namespace (and not attached):
[1] Rcpp_0.12.8 rstudioapi_0.6 knitr_1.15.1
[4] magrittr_1.5 REBayes_0.73 MASS_7.3-45
[7] doParallel_1.0.10 pscl_1.4.9 SQUAREM_2016.8-2
[10] lattice_0.20-34 foreach_1.4.3 stringr_1.1.0
[13] tools_3.3.1 parallel_3.3.1 grid_3.3.1
[16] git2r_0.18.0 htmltools_0.3.5 iterators_1.0.8
[19] assertthat_0.1 yaml_2.1.14 rprojroot_1.1
[22] digest_0.6.10 Matrix_1.2-7.1 codetools_0.2-15
[25] evaluate_0.10 stringi_1.1.2 Rmosek_7.1.2
[28] backports_1.0.4 truncnorm_1.0-7
This site was created with R Markdown