Last updated: 2022-01-24

Knit directory: rss/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0).


The results in this page were generated with repository version 7f4dd4a.

These are the previous versions of the repository in which changes were made to the R Markdown (rmd/function.Rmd) and HTML (docs/function.html) files.

File Version Author Date Message
Rmd 7f4dd4a Xiang Zhu 2022-01-24 wflow_publish("rmd/function.Rmd")
html bab3f58 Xiang Zhu 2020-06-24 Build site.
html 843a7d9 Xiang Zhu 2020-06-23 Build site.
Rmd 3b18fce Xiang Zhu 2020-06-23 wflow_publish("rmd/function.Rmd")

Markov chain Monte Carlo (MCMC)

Details of the MCMC algorithms for rss_bvsr.m and rss_bslmm.m are available in Supplementary Appendix B of Zhu and Stephens (2017). Details of the MCMC algorithm for rss_ash.m are available in this unpublished note (see footnote 1). Note that only rss_bvsr.m and rss_bslmm.m were used to generate the results in Zhu and Stephens (2017).

rss_bvsr.m

Fit the RSS-BVSR model that consists of RSS likelihood and BVSR prior (Guan and Stephens, 2011):

\[ \begin{aligned} \widehat{\boldsymbol\beta} &\sim {\cal N}({\bf SRS}^{-1}{\boldsymbol\beta},{\bf SRS}),\\ \beta_j &\sim \pi\cdot{\cal N}(0,\sigma_B^2) + (1-\pi)\cdot\delta_0, \end{aligned} \]

using a Metropolis-Hastings algorithm.
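
To make the BVSR prior concrete, the following Python sketch draws effects from the spike-and-slab mixture \(\pi\cdot{\cal N}(0,\sigma_B^2) + (1-\pi)\cdot\delta_0\). It is only an illustration of the prior, not a port of rss_bvsr.m; the function name draw_bvsr_prior and the parameter values are assumptions made for this example.

```python
import random

def draw_bvsr_prior(p, pi, sigma_b, rng):
    """Draw p effects from pi * N(0, sigma_b^2) + (1 - pi) * delta_0.

    Hypothetical helper for illustration; not part of the rss code base.
    """
    beta = []
    for _ in range(p):
        if rng.random() < pi:
            # "Slab": a nonzero effect with standard deviation sigma_b.
            beta.append(rng.gauss(0.0, sigma_b))
        else:
            # "Spike": a point mass at zero.
            beta.append(0.0)
    return beta

rng = random.Random(1)
beta = draw_bvsr_prior(p=10000, pi=0.05, sigma_b=0.1, rng=rng)

# The fraction of nonzero effects should be close to pi.
frac_nonzero = sum(b != 0.0 for b in beta) / len(beta)
print(f"fraction of nonzero effects: {frac_nonzero:.3f}")
```

Most draws are exactly zero, which is the sparsity assumption that distinguishes this prior from the BSLMM and ASH priors described next.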

rss_bslmm.m

Fit the RSS-BSLMM model that consists of RSS likelihood and BSLMM prior (Zhou et al, 2013):

\[ \begin{aligned} \widehat{\boldsymbol\beta} &\sim {\cal N}({\bf SRS}^{-1}{\boldsymbol\beta},{\bf SRS}),\\ \beta_j &\sim \pi\cdot{\cal N}(0,\sigma_B^2+\sigma_P^2) + (1-\pi)\cdot{\cal N}(0,\sigma_P^2), \end{aligned} \]

using a component-wise MCMC algorithm.
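
One way to read the BSLMM prior is as the sum of a small "polygenic" effect carried by every SNP and an occasional larger "sparse" effect. The decomposition below follows from the fact that variances of independent zero-mean normal variables add; the symbols \(u_j\), \(b_j\) and \(\gamma_j\) are introduced here only for illustration and are assumed to be mutually independent.

\[ \begin{aligned} \beta_j &= u_j + \gamma_j b_j, \qquad u_j \sim {\cal N}(0,\sigma_P^2), \quad b_j \sim {\cal N}(0,\sigma_B^2), \quad \gamma_j \sim {\rm Bernoulli}(\pi),\\ \beta_j \mid \gamma_j = 1 &\sim {\cal N}(0,\sigma_B^2+\sigma_P^2), \qquad \beta_j \mid \gamma_j = 0 \sim {\cal N}(0,\sigma_P^2). \end{aligned} \]

Setting \(\sigma_P^2 = 0\) recovers the BVSR prior used by rss_bvsr.m.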

rss_ash.m

Fit the RSS-ASH model that consists of RSS likelihood and ASH prior (Stephens, 2017):

\[ \begin{aligned} \widehat{\boldsymbol\beta} &\sim {\cal N}({\bf SRS}^{-1}{\boldsymbol\beta},{\bf SRS}),\\ \beta_j &\sim \pi_0 \cdot \delta_0 + {\textstyle\sum}_{k=1}^K \pi_k \cdot {\cal N}(0,\sigma_k^2), \end{aligned} \]

using a component-wise MCMC algorithm.

Variational Bayes (VB)

Details of the VB algorithms, the SQUAREM accelerator and the parallel implementation are available in the Supplementary Notes of Zhu and Stephens (2018). Note that only rss_varbvsr_squarem.m and rss_varbvsr_bigmem_squarem.m were used to generate the results in Zhu and Stephens (2018). The other functions were developed only for testing and benchmarking.

rss_varbvsr.m

Fit the following extended RSS-BVSR model

\[ \begin{aligned} \widehat{\boldsymbol\beta} &\sim {\cal N}({\bf SRS}^{-1}{\boldsymbol\beta},{\bf SRS}),\\ \beta_j &\sim \pi_j\cdot{\cal N}(0,\sigma_j^2) + (1-\pi_j)\cdot\delta_0, \end{aligned} \]

using a mean-field VB algorithm. The VB algorithm largely follows Carbonetto and Stephens (2012). This is an extended RSS-BVSR model because each SNP \(j\) can have its own hyper-parameters \(\{\pi_j,\sigma_j^2\}\), whereas the standard RSS-BVSR model assumes that all SNPs share the same hyper-parameters \(\{\pi,\sigma_B^2\}\).

rss_varbvsr_squarem.m

This is a variant of rss_varbvsr.m with the SQUAREM accelerator (Varadhan and Roland, 2008) added.
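
For readers unfamiliar with SQUAREM, the Python sketch below shows one common form of the squared-extrapolation update applied to a generic fixed-point map. It is a minimal illustration of the acceleration idea under stated assumptions, not the code in rss_varbvsr_squarem.m: the function names are hypothetical, the step length uses the \(-\|r\|/\|v\|\) choice, and the toy map \(x \mapsto \cos(x)\) stands in for a full VB update.

```python
import math

def squarem_step(x, fixed_point_map):
    """One squared-extrapolation update for a list-valued fixed-point map."""
    x1 = fixed_point_map(x)
    x2 = fixed_point_map(x1)
    r = [a - b for a, b in zip(x1, x)]               # first difference
    v = [(c - b) - d for c, b, d in zip(x2, x1, r)]  # second difference
    norm_r = math.sqrt(sum(t * t for t in r))
    norm_v = math.sqrt(sum(t * t for t in v))
    if norm_v == 0.0:
        return x2  # no curvature information: fall back to two plain updates
    alpha = -norm_r / norm_v  # step length; alpha = -1 reproduces x2
    x_acc = [a - 2.0 * alpha * b + alpha * alpha * c
             for a, b, c in zip(x, r, v)]
    # Apply the map once more to stabilize the extrapolated point.
    return fixed_point_map(x_acc)

def cos_map(x):
    """Toy fixed-point map; its fixed point solves x = cos(x)."""
    return [math.cos(x[0])]

x = [1.0]
for _ in range(10):
    x = squarem_step(x, cos_map)
print(x[0])
```

Each accelerated step costs three evaluations of the map, so the approach pays off when it saves more than that many plain iterations.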

rss_varbvsr_parallel.m

This is a parallel implementation of rss_varbvsr.m based on MATLAB Parallel Computing Toolbox.

rss_varbvsr_pasquarem.m

This is a parallel implementation of rss_varbvsr_squarem.m based on MATLAB Parallel Computing Toolbox.

rss_varbvsr_bigmem.m

This is a memory-efficient implementation of rss_varbvsr_parallel.m.

rss_varbvsr_bigmem_squarem.m

This is a memory-efficient implementation of rss_varbvsr_pasquarem.m.

Miscellaneous

import_1000g_vcf.sh

Output 1000 Genomes phased haplotypes of a given list of SNPs in IMPUTE reference-panel format.

compute_pve.m

Use GWAS summary data to estimate PVE (or SNP heritability), a quantity defined by Equation 2.10 in Guan and Stephens (2011). This function corresponds to Equation 3.7 in Zhu and Stephens (2017).
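
The Python sketch below illustrates the kind of calculation involved. It assumes the estimator has the form \(\sum_{i,j} r_{ij}\beta_i\beta_j / \sqrt{(n s_i^2 + \widehat\beta_i^2)(n s_j^2 + \widehat\beta_j^2)}\), where \(n\) is the sample size, \(s_j\) is the standard error of \(\widehat\beta_j\) and \(r_{ij}\) is an entry of the LD matrix. This form should be checked against Equation 3.7 before use; the function name and arguments are hypothetical and do not mirror the signature of compute_pve.m.

```python
import math

def pve_from_summary(beta, betahat, se, n, R):
    """Summary-data PVE under the assumed formula stated in the text above.

    beta    : multiple-SNP effects (e.g. one posterior draw)
    betahat : single-SNP effect estimates
    se      : standard errors of betahat
    n       : sample size
    R       : LD (correlation) matrix as a list of lists
    """
    # Scale each effect by 1 / sqrt(n * se_j^2 + betahat_j^2).
    z = [b / math.sqrt(n * s * s + bh * bh)
         for b, bh, s in zip(beta, betahat, se)]
    p = len(z)
    # Quadratic form z' R z.
    return sum(R[i][j] * z[i] * z[j] for i in range(p) for j in range(p))

# Toy example with two SNPs in partial LD; only the first has an effect.
R = [[1.0, 0.5], [0.5, 1.0]]
pve = pve_from_summary(beta=[0.1, 0.0], betahat=[0.1, 0.05],
                       se=[0.01, 0.01], n=1000, R=R)
print(pve)
```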

band_storage.m

Convert a symmetric, banded matrix to a compact matrix in such a way that only the main diagonal and the nonzero super-diagonals are stored. This function is used to reduce the file size of a large LD matrix.
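
The idea can be sketched in Python as follows. The row-wise layout chosen here (row \(i\) holds the diagonal entry followed by the super-diagonal entries of row \(i\), padded with zeros) is an assumption for illustration and may differ from the layout used by band_storage.m; band_lookup is a hypothetical helper showing that the full matrix remains recoverable.

```python
def band_storage(A, bandwidth):
    """Keep only the main diagonal and `bandwidth` super-diagonals of A.

    Row i of the result holds A[i][i], A[i][i+1], ..., A[i][i+bandwidth],
    padded with zeros where the band runs past the last column.
    """
    n = len(A)
    B = [[0.0] * (bandwidth + 1) for _ in range(n)]
    for i in range(n):
        for k in range(bandwidth + 1):
            if i + k < n:
                B[i][k] = A[i][i + k]
    return B

def band_lookup(B, i, j):
    """Recover entry (i, j) of the original symmetric banded matrix."""
    lo, hi = min(i, j), max(i, j)  # symmetry: use the upper triangle
    k = hi - lo
    return B[lo][k] if k < len(B[0]) else 0.0

A = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 2.0, 1.0, 0.0],
     [0.0, 1.0, 2.0, 1.0],
     [0.0, 0.0, 1.0, 2.0]]
B = band_storage(A, bandwidth=1)
print(B)
```

For a \(p \times p\) matrix with bandwidth \(w\), this stores \(p(w+1)\) numbers instead of \(p^2\).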

find_bandwidth.m

Find the bandwidth of a symmetric, banded matrix.
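
As a minimal Python sketch, taking the bandwidth to be the largest distance \(|i-j|\) at which a nonzero entry occurs. The tolerance argument tol is a hypothetical addition for this example and is not claimed to exist in find_bandwidth.m.

```python
def find_bandwidth(A, tol=0.0):
    """Largest j - i such that |A[i][j]| > tol, scanning the upper triangle.

    Only the upper triangle is inspected because A is assumed symmetric.
    """
    n = len(A)
    bandwidth = 0
    for i in range(n):
        for j in range(i + 1, n):
            if abs(A[i][j]) > tol:
                bandwidth = max(bandwidth, j - i)
    return bandwidth

tridiagonal = [[2.0, 1.0, 0.0, 0.0],
               [1.0, 2.0, 1.0, 0.0],
               [0.0, 1.0, 2.0, 1.0],
               [0.0, 0.0, 1.0, 2.0]]
print(find_bandwidth(tridiagonal))
```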

get_corr.m

Compute the linkage disequilibrium (LD) matrix using the shrinkage estimator proposed in Wen and Stephens (2010). This estimator is also implemented in the R package ldshrink.

data_maker.m

Simulate phenotype data from the genome-wide multiple-SNP model described in Zhou et al (2013), and then compute the single-SNP summary statistics for each SNP. This function was used in some simulation studies of Zhu and Stephens (2017).
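
The second step, computing single-SNP summary statistics, can be sketched in Python as below. This is an illustration under simplifying assumptions rather than a port of data_maker.m: genotypes are simulated independently with a made-up allele frequency, the phenotype comes from a toy three-SNP linear model, and the standard error uses the usual least-squares convention with \(n-2\) degrees of freedom, which may differ from the convention in the original code.

```python
import math
import random

def single_snp_summary(x, y):
    """Simple linear regression of phenotype y on one SNP x.

    Returns the effect estimate and its standard error.
    """
    n = len(y)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    xc = [a - x_mean for a in x]  # centered genotypes
    yc = [b - y_mean for b in y]  # centered phenotypes
    sxx = sum(a * a for a in xc)
    betahat = sum(a * b for a, b in zip(xc, yc)) / sxx
    resid_ss = sum((b - betahat * a) ** 2 for a, b in zip(xc, yc))
    se = math.sqrt(resid_ss / (n - 2) / sxx)
    return betahat, se

rng = random.Random(2018)
n, p, maf = 2000, 3, 0.3          # hypothetical sample size, SNP count, allele frequency
true_beta = [0.5, 0.0, 0.0]       # only the first SNP is causal

# Genotypes coded 0/1/2 as the sum of two independent allele draws.
X = [[(rng.random() < maf) + (rng.random() < maf) for _ in range(p)]
     for _ in range(n)]
# Phenotype from the multiple-SNP model plus standard normal noise.
y = [sum(b * g for b, g in zip(true_beta, row)) + rng.gauss(0.0, 1.0)
     for row in X]

# One (betahat, se) pair per SNP, each from its own single-SNP regression.
summary = [single_snp_summary([row[j] for row in X], y) for j in range(p)]
for j, (bh, s) in enumerate(summary):
    print(f"SNP {j}: betahat = {bh:.3f}, se = {s:.3f}")
```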

enrich_datamaker.m

Simulate phenotype data from the genetic association enrichment model described in Carbonetto and Stephens (2013), and then compute the single-SNP summary statistics for each SNP. This function was used in some simulation studies of Zhu and Stephens (2018).

null_single.m & null_template.m

Fit genome-wide multiple-SNP “baseline model” to single-SNP summary data, using rss/src_vb functions. These scripts were used in data analyses of Zhu and Stephens (2018).

gsea_wrapper.m & gsea_template.m

Fit genome-wide multiple-SNP “enrichment model” to single-SNP summary data, using rss/src_vb functions. These scripts were used in data analyses of Zhu and Stephens (2018).

null_wrapper_fixsb.m & gsea_wrapper_fixsb.m

Fit genome-wide multiple-SNP “baseline model” and “enrichment model” to single-SNP summary data, using a fixed prior variance of causal genetic effects (\(\sigma_B^2\)) in rss/src_vb functions. These scripts were used in simulation studies of Zhu and Stephens (2018).

ash_lrt_31traits.R

Compute a simple likelihood ratio as a sanity check for the more complicated enrichment analysis method developed in Zhu and Stephens (2018). This likelihood ratio calculation is based on the R package ashr.


Footnotes:

  1. The multiplier term \({\omega_k}/{\omega_1}\) is missing from the right-hand side of Equation 5.1 in this note (page 4). The term is, however, correctly included in the source code of rss_ash.m, in particular Lines 80-81 of update_bz.m. We thank Geyu Zhou for pointing out this typo on 2019-01-14.