Advised by Prof. Matthew Stephens.
Thanks to Dr. Peter Carbonetto for the substantial instruction and advice.
Project page [https://stephenslab.github.io/susie-mixture/]
August 2, 2019
SUm of SIngle Effects (SuSiE) model
Linear regression
\[ \textbf y=\textbf X\textbf b+\textbf e,\textbf e\sim N(0,\sigma^2I_n). \]
Number of variables \(p=4\), number of observations \(n\) is large.
Variables are highly correlated, specifically \(\textbf x_1=\textbf x_2\) and \(\textbf x_3=\textbf x_4\).
Assume we know that exactly 2 of the 4 variables are relevant.
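This toy setup can be simulated directly; a minimal sketch (seed, sample size, and variable names are illustrative), showing why ordinary least squares cannot resolve which variable in each correlated pair is the relevant one:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 1000, 4                             # many observations, 4 variables
x1 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x1, x3, x3])      # x1 == x2 and x3 == x4 exactly

b_true = np.array([1.0, 0.0, 1.0, 0.0])    # exactly 2 of the 4 are relevant
sigma = 1.0
y = X @ b_true + sigma * rng.normal(size=n)

# X has rank 2, so least squares cannot tell x1 from x2 (or x3 from x4);
# only the sums b1 + b2 and b3 + b4 are identifiable.
print(np.linalg.matrix_rank(X))            # prints 2
```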
\[ \begin{aligned} &\textbf y= \textbf X\textbf b+\textbf e,\textbf e\sim N(0,\sigma^2I_n),\\ &\textbf b= b\boldsymbol \gamma,\\ &\boldsymbol\gamma\sim Mult(1,\boldsymbol \pi),\\ &b\sim N(0,\sigma_0^2). \end{aligned} \]
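Because this prior puts all its mass on single-variable models, the posterior over \(\boldsymbol\gamma\) is available in closed form from the \(p\) univariate regressions. A minimal sketch (the function name and the default uniform \(\boldsymbol\pi\) are my choices, not the SuSiE package API):

```python
import numpy as np

def ser_posterior(X, y, sigma2, sigma0_2, pi=None):
    """Exact posterior P(gamma_j = 1 | X, y) for the single-effect model."""
    n, p = X.shape
    if pi is None:
        pi = np.full(p, 1.0 / p)           # uniform prior over variables
    xtx = np.sum(X**2, axis=0)             # x_j' x_j for each column
    bhat = (X.T @ y) / xtx                 # univariate OLS estimates
    s2 = sigma2 / xtx                      # their sampling variances
    # log Bayes factor of "variable j is the effect" against the null model
    lbf = 0.5 * np.log(s2 / (s2 + sigma0_2)) \
        + 0.5 * bhat**2 / s2 * sigma0_2 / (sigma0_2 + s2)
    w = np.log(pi) + lbf
    w -= w.max()                           # stabilize before exponentiating
    alpha = np.exp(w)
    return alpha / alpha.sum()             # posterior inclusion probabilities
```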
\[ \begin{aligned} &\textbf y= \textbf X\textbf b+\textbf e,\textbf e\sim N(0,\sigma^2I_n),\\ &\textbf b= \sum_{l=1}^L\textbf b_l=\sum_{l=1}^Lb_l\boldsymbol \gamma_l,\\ &\boldsymbol\gamma_l\sim Mult(1,\boldsymbol \pi),\\ &b_l\sim N(0,\sigma_l^2). \end{aligned} \]
Marginal prior distribution of \(b_i\): spike-and-slab.
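Spelled out (using independence of the \(\boldsymbol\gamma_l\) across effects, with \(S\) the set of effects that land on variable \(i\)):

\[ b_i \sim (1-\pi_i)^L\,\delta_0+\sum_{\varnothing\neq S\subseteq\{1,\dots,L\}}\pi_i^{|S|}(1-\pi_i)^{L-|S|}\,N\Big(0,\sum_{l\in S}\sigma_l^2\Big), \]

a point mass at zero (the spike) mixed with Gaussian slabs.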
\(susie(\textbf X,\textbf y,L)\) returns the posterior inclusion probabilities (PIPs).
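A hedged sketch of how such a function can be implemented via the iterative Bayesian stepwise selection (IBSS) idea: fit each single effect on the residual left by the other effects, then combine the per-effect inclusion probabilities into PIPs. All names and defaults below are illustrative, not the actual package interface:

```python
import numpy as np

def susie_ibss(X, y, L, sigma2=1.0, sigma0_2=1.0, n_iter=100):
    """IBSS sketch for SuSiE: coordinate updates over L single effects.

    Returns per-effect inclusion probabilities alpha (L x p) and the
    overall PIP for each variable.
    """
    n, p = X.shape
    xtx = np.sum(X**2, axis=0)
    alpha = np.full((L, p), 1.0 / p)   # alpha[l, j] = P(effect l is variable j)
    mu = np.zeros((L, p))              # posterior mean of b_l given that choice
    for _ in range(n_iter):
        for l in range(L):
            # residual with every effect except l removed (posterior means)
            b_others = np.sum(alpha * mu, axis=0) - alpha[l] * mu[l]
            r = y - X @ b_others
            # exact single-effect posterior computed on the residual
            bhat = (X.T @ r) / xtx
            s2 = sigma2 / xtx
            post_var = 1.0 / (1.0 / s2 + 1.0 / sigma0_2)
            mu[l] = post_var * bhat / s2
            lbf = 0.5 * np.log(s2 / (s2 + sigma0_2)) \
                + 0.5 * bhat**2 / s2 * sigma0_2 / (sigma0_2 + s2)
            w = lbf - lbf.max()
            alpha[l] = np.exp(w) / np.exp(w).sum()
    # P(variable j appears in at least one of the L effects)
    pip = 1.0 - np.prod(1.0 - alpha, axis=0)
    return alpha, pip
```

With the correlated toy data above, each effect splits its inclusion probability roughly 50/50 within a duplicated pair, which is exactly the ambiguity the PIPs are meant to expose.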
\[ \begin{aligned} &\textbf y= \textbf X\textbf b+\textbf e,\textbf e\sim N(0,\sigma^2I_n),\\ &\textbf b= \sum_{l=0}^L\textbf b_l=\textbf b_0+\sum_{l=1}^Lb_l\boldsymbol \gamma_l,\\ &\boldsymbol\gamma_l\sim Mult(1,\boldsymbol \pi),b_l\sim N(0,\sigma_l^2),\forall l\geq 1,\\ &\textbf b_0\sim N(0,\sigma_0^2I_p). \end{aligned} \]
Marginal prior distribution of \(b_i\): a Gaussian-mixture distribution.
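Spelled out under the same independence assumption, every mixture component now has positive variance; the spike is replaced by the dense background effect \(\textbf b_0\):

\[ b_i \sim \sum_{S\subseteq\{1,\dots,L\}}\pi_i^{|S|}(1-\pi_i)^{L-|S|}\,N\Big(0,\;\sigma_0^2+\sum_{l\in S}\sigma_l^2\Big). \]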
Let
\[ H=\frac{\sigma^2_0}{\sigma^2}\textbf X\textbf X^T+I_n,\qquad \textbf C\textbf C^T= H \]
(writing the Cholesky factor as \(\textbf C\) to avoid a clash with the number of effects \(L\)), and transform the data
\[ \tilde {\textbf X}=\textbf C^{-1}\textbf X,\qquad\tilde {\textbf y}=\textbf C^{-1}\textbf y. \]
Then \(susie(\tilde{\textbf X},\tilde{\textbf y}, L)\) yields the result for the SuSiE-mixture model.
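The transformation is a whitening step: marginalizing out \(\textbf b_0\) inflates the noise covariance to \(\sigma^2 H\), and premultiplying by the inverse Cholesky factor restores i.i.d. noise so ordinary SuSiE applies. A numpy sketch (function name is mine; the Cholesky factor is called C as above):

```python
import numpy as np

def susie_mixture_transform(X, y, sigma0_2, sigma2):
    """Whiten (X, y) so the dense effect b_0 folds into the noise.

    With b_0 ~ N(0, sigma0^2 I_p) marginalized out, the noise covariance
    of y is sigma^2 * H where H = (sigma0^2 / sigma^2) X X' + I_n.
    Factoring H = C C' and premultiplying by C^{-1} makes the transformed
    noise N(0, sigma^2 I_n) again.
    """
    n = X.shape[0]
    H = (sigma0_2 / sigma2) * (X @ X.T) + np.eye(n)
    C = np.linalg.cholesky(H)          # lower triangular, C @ C.T == H
    X_tilde = np.linalg.solve(C, X)    # C^{-1} X
    y_tilde = np.linalg.solve(C, y)    # C^{-1} y
    return X_tilde, y_tilde
```

For large \(n\), a triangular solve (e.g. `scipy.linalg.solve_triangular`) would be the more efficient choice than a general solve.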
Evidence lower bound (ELBO)
False discovery rate (FDR)