Last updated: 2019-08-01


Knit directory: susie-mixture/analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.4.0).




These are the previous versions of the R Markdown and HTML files.

File Version Author Date Message
html 034283b Zhengyang Fang 2019-07-22 Build site.
html 6d9cbaa Zhengyang Fang 2019-07-22 Build site.
html 2a474f9 Zhengyang Fang 2019-07-22 Build site.
html c72a707 Zhengyang Fang 2019-07-22 Build site.
html 8092e5a Zhengyang Fang 2019-07-19 Build site.
html 5d3e3ef Zhengyang Fang 2019-07-11 Build site.
html 3eee187 Zhengyang Fang 2019-07-03 Build site.
html f303e14 Zhengyang Fang 2019-07-01 Build site.
html def2b27 Zhengyang Fang 2019-06-28 Build site.
html 05f778b Zhengyang Fang 2019-06-28 Build site.
html 3f6dd45 Zhengyang Fang 2019-06-26 Build site.
Rmd fd0f282 Zhengyang Fang 2019-06-26 wflow_publish("SuSiE_summary.Rmd")
html 9f3f1a6 Zhengyang Fang 2019-06-23 Build site.
Rmd 485bef4 Zhengyang Fang 2019-06-23 wflow_publish("SuSiE_summary.Rmd")

Summary of SuSiE (Sum of Single Effects)

Reference: Wang et al., 2018

I. Model

\(\textbf y=\textbf X\textbf b+\textbf e,\ \textbf e\sim N(0,\sigma^2 I_n),\ \textbf y\in \mathbb R^n,\ \textbf X\in\mathbb R^{n\times p},\ \textbf b\in \mathbb R^p\).

\(\textbf b=\sum_{l=1}^L b_l\boldsymbol \gamma_l,\ \boldsymbol \gamma_l\sim Mult(1,\boldsymbol\pi),\ b_l\sim N(0,\sigma_{0l}^2)\).

Here \(L\) and \(\boldsymbol\pi\) are given and fixed.

We want to estimate the posterior inclusion probability (PIP)

\[ \alpha_j:=\mathbb P(b_j\neq 0|\textbf X,\textbf y), \]

and the posterior mean \(\mu_{1j}\) and variance \(\sigma_{1j}^2\) for all \(1\leq j\leq p\).
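As a concrete illustration of the generative model, the following numpy sketch simulates \((\textbf X,\textbf y)\). The dimensions, seed, and hyperparameter values are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, L = 100, 50, 3           # sample size, number of variables, number of effects
sigma2 = 1.0                   # residual variance sigma^2
prior_var = 0.5                # prior effect variance (shared across l here)
pi = np.full(p, 1.0 / p)       # uniform prior over which variable carries each effect

X = rng.standard_normal((n, p))

# b = sum_l b_l * gamma_l, with gamma_l ~ Mult(1, pi) one-hot and b_l Gaussian
b = np.zeros(p)
for l in range(L):
    gamma_l = rng.multinomial(1, pi)            # one-hot indicator vector
    b_l = rng.normal(0.0, np.sqrt(prior_var))   # scalar effect size
    b += b_l * gamma_l

# y = X b + e, e ~ N(0, sigma^2 I_n)
y = X @ b + rng.normal(0.0, np.sqrt(sigma2), size=n)
```

By construction \(\textbf b\) has at most \(L\) nonzero entries, which is the sparsity structure SuSiE exploits.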

II. Simple version: single effect regression model (SER)

Here we assume \(L=1\). We introduce the SER model because fitting it is a key step in fitting SuSiE.

Model

\(\textbf y=\textbf X\boldsymbol\beta+\textbf e,\ \textbf e\sim N(0,\sigma^2 I_n),\ \textbf y\in \mathbb R^n,\ \textbf X\in\mathbb R^{n\times p},\ \boldsymbol\beta\in\mathbb R^p\).

\(\boldsymbol\beta=b\boldsymbol \gamma,\ \boldsymbol\gamma\sim Mult(1,\boldsymbol\pi),\ b\sim N(0,\sigma_b^2)\).

Here \(\boldsymbol\pi\) is given and fixed.

Goal: find

\[ PIP_k=\mathbb P(\gamma_k=1|X,y). \]

Method

Assume the variances \(\sigma^2\) and \(\sigma_b^2\) are known. Calculate the Bayes factor

\[ BF(y,X;\sigma^2,\sigma_b^2)=... \]

Also, the posterior distribution

\[ \beta_k|X_k,y,\sigma^2,\sigma_b^2,\gamma_k=1\sim N(\mu_{1k},\sigma_{1k}^2). \]

The Bayes factor, the posterior mean \(\mu_{1k}\), and the variance \(\sigma_{1k}^2\) all have closed forms and are easy to compute.

Therefore, for given \(\sigma^2,\sigma_b^2\), we have

\[ \alpha_k=\mathbb P(\gamma_k=1|X,y,\sigma^2,\sigma_b^2)=\frac{BF(y,X_k;\sigma^2,\sigma_b^2)\cdot\pi_k}{\sum_{j=1}^p BF(y,X_j;\sigma^2,\sigma_b^2)\cdot\pi_j}. \]

This is also easy to compute. Putting everything together, we obtain a function SER with input \((X,y,\sigma^2,\sigma^2_b)\), which outputs the key parameters of the posterior distribution, \((\boldsymbol\alpha=(\alpha_1,\alpha_2,\dots,\alpha_p),\ \boldsymbol\mu_1=(\mu_{11},\mu_{12},\dots,\mu_{1p}),\ \boldsymbol\sigma_1^2=(\sigma^2_{11},\sigma^2_{12},\dots,\sigma^2_{1p}))\).
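In code, the whole SER computation fits in a few lines. The numpy sketch below is my own implementation: the function name and the uniform default prior are illustrative choices, and the log Bayes factor uses the standard closed form for this conjugate normal model, \(\log BF_j=\tfrac12\log(\sigma_{1j}^2/\sigma_b^2)+\mu_{1j}^2/(2\sigma_{1j}^2)\).

```python
import numpy as np

def ser(X, y, sigma2, sigma_b2, pi=None):
    """Single effect regression: returns (alpha, mu1, sigma1^2)."""
    p = X.shape[1]
    if pi is None:
        pi = np.full(p, 1.0 / p)                       # uniform prior by default
    xtx = (X ** 2).sum(axis=0)                         # x_j' x_j for each column j
    xty = X.T @ y                                      # x_j' y
    sigma1_2 = 1.0 / (xtx / sigma2 + 1.0 / sigma_b2)   # posterior variances sigma_{1j}^2
    mu1 = sigma1_2 * xty / sigma2                      # posterior means mu_{1j}
    # closed-form log Bayes factor for the model with gamma_j = 1
    log_bf = 0.5 * np.log(sigma1_2 / sigma_b2) + 0.5 * mu1 ** 2 / sigma1_2
    w = np.log(pi) + log_bf
    w -= w.max()                                       # subtract max for numerical stability
    alpha = np.exp(w) / np.exp(w).sum()                # posterior inclusion probabilities
    return alpha, mu1, sigma1_2

# usage: a single strong effect on column 7
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
y = 2.0 * X[:, 7] + rng.standard_normal(200)
alpha, mu1, sigma1_2 = ser(X, y, sigma2=1.0, sigma_b2=1.0)
```

With these inputs, \(\boldsymbol\alpha\) sums to one and concentrates on column 7, as expected.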

III. Fitting SuSiE: Iterative Bayesian stepwise selection (IBSS)

Algorithm

Inputs: data \(\textbf X,\textbf y\); hyperparameters \(\sigma^2\) and \(\boldsymbol \sigma_0^2=(\sigma_{01}^2,\dots,\sigma_{0L}^2)\); number of effects \(L\).

  • Initialize \(\bar{\textbf b}_l=\textbf 0\) for all \(1\leq l\leq L\) (or use any other initialization)
  • Repeat until convergence
    • For \(l\) in \(1,\dots,L\) do
      • \(\textbf r_l\leftarrow \textbf y-\sum_{l^\prime\neq l}\textbf X\bar{\textbf b}_{l^\prime}\)
      • \((\boldsymbol\alpha_l,\boldsymbol \mu_{1l},\boldsymbol \sigma_{1l}^2)\leftarrow SER(\textbf X,\textbf r_l;\sigma^2,\sigma_{0l}^2)\)
      • \(\bar{\textbf b}_l\leftarrow \boldsymbol{\alpha}_l\circ \boldsymbol{\mu}_{1l}\)
  • Return \((\boldsymbol\alpha_l,\boldsymbol \mu_{1l},\boldsymbol \sigma_{1l}^2)\) for all \(1\leq l\leq L\).
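The algorithm above can be sketched in numpy as follows. This is a self-contained illustration: the `ser` helper repeats the closed-form SER posterior of Section II with a uniform prior, and the convergence check on \(\bar{\textbf b}\) and all names are my own choices, not the paper's (which monitors the ELBO).

```python
import numpy as np

def ser(X, y, sigma2, sigma_b2):
    """Closed-form SER posterior with a uniform prior (Section II)."""
    xtx = (X ** 2).sum(axis=0)
    xty = X.T @ y
    sigma1_2 = 1.0 / (xtx / sigma2 + 1.0 / sigma_b2)
    mu1 = sigma1_2 * xty / sigma2
    log_bf = 0.5 * np.log(sigma1_2 / sigma_b2) + 0.5 * mu1 ** 2 / sigma1_2
    w = log_bf - log_bf.max()
    alpha = np.exp(w) / np.exp(w).sum()
    return alpha, mu1, sigma1_2

def ibss(X, y, sigma2, sigma0_2, L, max_iter=100, tol=1e-6):
    """Iterative Bayesian stepwise selection (sketch)."""
    p = X.shape[1]
    b_bar = np.zeros((L, p))                    # b_bar[l] = alpha_l * mu_1l
    fits = [None] * L
    for _ in range(max_iter):
        b_old = b_bar.copy()
        for l in range(L):
            # residual with every effect except l removed
            r_l = y - X @ (b_bar.sum(axis=0) - b_bar[l])
            alpha, mu1, sigma1_2 = ser(X, r_l, sigma2, sigma0_2[l])
            b_bar[l] = alpha * mu1
            fits[l] = (alpha, mu1, sigma1_2)
        if np.abs(b_bar - b_old).max() < tol:   # simple convergence check
            break
    return fits

# usage: two well-separated effects
rng = np.random.default_rng(2)
X = rng.standard_normal((300, 30))
y = 1.5 * X[:, 3] - 1.5 * X[:, 11] + rng.standard_normal(300)
fits = ibss(X, y, sigma2=1.0, sigma0_2=[1.0, 1.0], L=2)
top = sorted(f[0].argmax() for f in fits)       # each effect's most probable variable
```

Each pass is a coordinate-ascent step: effect \(l\) is refit by SER against the residual left by all other effects.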

A variational inference explanation

Variational inference finds an approximation \(q(\textbf b_1,\dots,\textbf b_L)\) to the posterior distribution \(p_{post}:=p(\textbf b_1,\dots,\textbf b_L|\textbf y)\) by minimizing the KL divergence from \(q\) to \(p_{post}\), \(D_{KL}(q,p_{post})\).

This divergence is hard to compute directly, but it can be rewritten as

\[ D_{KL}(q,p_{post})=\log p(\textbf y|\sigma^2,\boldsymbol \sigma_0^2)-F(q;\sigma^2,\boldsymbol\sigma_0^2), \]

where \(F\) is known as the evidence lower bound (ELBO). Since the log evidence \(\log p(\textbf y|\sigma^2,\boldsymbol \sigma_0^2)\) does not depend on \(q\), minimizing \(D_{KL}(q,p_{post})\) is equivalent to maximizing \(F\), which is tractable.
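The identity follows from a short rearrangement, writing \(\textbf b\) for \((\textbf b_1,\dots,\textbf b_L)\) and suppressing the hyperparameters:

\[
\begin{aligned}
D_{KL}(q,p_{post})
&=\mathbb E_q\left[\log\frac{q(\textbf b)}{p(\textbf b|\textbf y)}\right]
=\mathbb E_q[\log q(\textbf b)]-\mathbb E_q[\log p(\textbf y,\textbf b)]+\log p(\textbf y)\\
&=\log p(\textbf y)-\underbrace{\mathbb E_q[\log p(\textbf y,\textbf b)-\log q(\textbf b)]}_{F(q)},
\end{aligned}
\]

using \(\log p(\textbf b|\textbf y)=\log p(\textbf y,\textbf b)-\log p(\textbf y)\) in the second equality.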