Code and data accompanying SuSiE manuscript (Wang et al, 2018)

Overview

This repository contains code and data resources to accompany our research paper:

Wang, G., Sarkar, A., Carbonetto, P., & Stephens, M. (2020). A simple new approach to variable selection in regression, with application to genetic fine mapping. Journal of the Royal Statistical Society: Series B (Statistical Methodology). https://doi.org/10.1111/rssb.12388

We provide several sets of resources:

  1. If you are primarily interested in applying SuSiE to your own data in a generic setting, please check out the susieR package. A good place to start is the vignettes.

  2. To help you understand how SuSiE works, we demonstrate here the code and results for the fine-mapping challenge presented in the "Numerical Comparisons" section of our paper. Additionally, the notebook here implements the example described in the "Background and motivations" section of the manuscript; it illustrates, with a simple example, the inference problem that motivated the development of SuSiE.

  3. If you would like to reproduce the fine-mapping benchmark of SuSiE with other methods (CAVIAR, FINEMAP and DAP-G), please see here for our implementation in the Dynamic Statistical Comparison framework, and here to reproduce figures in the "Numerical Comparisons" section of the manuscript.

  4. If you would like to use SuSiE for fine-mapping of molecular traits, similar to the splice QTL data application in our manuscript, please see our analysis of the Li et al 2016 data for details: we provide a fine-mapping pipeline with SuSiE and DAP-G, and a splicing QTL enrichment analysis pipeline for SuSiE signals in functional regions of the genome. Although it is not used in the manuscript, we also provide an additional enrichment pipeline for matched analysis of other molecular QTLs (not applicable to splicing QTLs), as suggested in Li et al 2016. A series of command lines is provided on that page to reproduce the data application section of the manuscript using these pipelines.

  5. You can explore the example on the change-point problem to learn about applying SuSiE to more generic problems.

Software requirements and general instructions to reproduce SuSiE manuscript results

Please note that these setup instructions are meant to reproduce the software environment that we used to generate the results for our manuscript. They involve a specific, and now outdated, release of susieR, as well as many other software packages that we compared against. If your interest is to analyze data with SuSiE or to evaluate its current, most up-to-date version, please ignore the docker setup instructions on this page and instead follow the setup instructions in the susieR software repository.

In addition to susieR, reproducing the numerical comparisons in the manuscript requires CAVIAR, FINEMAP and DAP-G, plus the benchmarking tool dsc. To reproduce the splicing QTL data analysis, the bioinformatics pipeline tool SoS is also needed.

With some work, it is possible to set up the computational environment on a Linux or Mac computer to reproduce and extend our work. Alternatively, we have developed a docker image that includes all software components necessary to run the analyses, configured to the specific versions used in the manuscript. The image can be used both for evaluating the manuscript and for use in production. Unfortunately, due to potential licensing restrictions on the FINEMAP program, we cannot distribute the docker container on dockerhub. However, it is straightforward to build the docker image yourself. In the rest of this section we discuss building and using the docker image for the manuscript.

Docker can run on most popular operating systems (Mac, Windows and Linux) and cloud computing services such as Amazon Web Services and Microsoft Azure. If you have not used Docker before, you might want to read this to learn the basic concepts and understand the main benefits of Docker.

If you find a bug in any of these steps, please post an issue.

Download and install Docker

Download Docker (note that a free community edition of Docker is available), and install it following the instructions provided on the Docker website. Once you have installed Docker, check that Docker is working correctly by following Part 1 of the "Getting Started" guide. If you are new to Docker, we recommend reading the entire "Getting Started" guide.

Note: Setting up Docker requires that you have administrator access to your computer. Singularity is an alternative that accepts Docker images and does not require administrator access.

Build the docker image

From the root of the repository, where the Dockerfile is located, run:

docker build -t susie/susie-paper .

The build will take a while (more than 30 minutes) to download and install the system and the required software.

Test the docker build

Define this alias in the shell; it will be used below to run commands inside the Docker container:

alias susie-docker='docker run --security-opt label:disable -t --rm '\
'-P -h SuSiE -w $PWD -v $HOME:/home/$USER -v /tmp:/tmp -v $PWD:$PWD '\
'-u $UID:${GROUPS[0]} -e HOME=/home/$USER -e USER=$USER susie/susie-paper'

The -v flags in this command map directories between the standard computing environment and the Docker container. Since the analyses below will write files to these directories, it is important to ensure that:

  • Environment variables $HOME and $PWD are set to valid and writeable directories (usually your home and current working directories, respectively).

  • /tmp should also be a valid and writeable directory.

If any of these statements is not true, please adjust the alias accordingly. The remaining options only affect operation of the container, and so should function the same regardless of your operating system.
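The writability conditions above can be checked before running any analysis. The following is a minimal sketch and is not part of the repository: the function name check_mount_dirs is hypothetical, and it only tests that each directory exists and is writable by the current user.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): report whether each
# directory that the alias bind-mounts exists and is writable.
check_mount_dirs() {
  local rc=0 dir
  for dir in "$@"; do
    if [ -d "$dir" ] && [ -w "$dir" ]; then
      echo "ok: $dir"
    else
      echo "not writable: $dir"
      rc=1
    fi
  done
  return "$rc"
}

# The three host directories referenced by the -v flags of the alias.
check_mount_dirs "$HOME" "$PWD" /tmp \
  || echo "Adjust the -v flags in the alias before continuing."
```

If a directory is reported as not writable, point the corresponding -v flag at a directory that is.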

Next, run a simple command in the Docker container to check that it has loaded successfully:

susie-docker uname -sn

If the container ran successfully, you should see this information about the Docker container printed to the screen:

Linux SuSiE
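To check this output in a script rather than by eye, a small comparison helper can be used. This is a sketch under assumptions: check_container_banner is a hypothetical name, and the carriage-return stripping is there because a pseudo-terminal (requested by the -t flag in the alias) may end lines with a carriage return. The example feeds the helper a literal string so that it runs without Docker.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare captured output against the expected
# "Linux SuSiE" line, ignoring any carriage returns added by a TTY.
check_container_banner() {
  local actual
  actual=$(printf '%s' "$1" | tr -d '\r')
  if [ "$actual" = "Linux SuSiE" ]; then
    echo "container OK"
  else
    echo "unexpected output: $actual"
    return 1
  fi
}

# Stand-in for the real container output, including a trailing CR.
check_container_banner "$(printf 'Linux SuSiE\r\n')"
```

In an interactive shell where the alias is defined, the real check would be check_container_banner "$(susie-docker uname -sn)".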

You can also run these commands to show the information about the image downloaded to your computer and the container that has run (and exited):

docker image list
docker container list --all

Note: If you get the error "Cannot connect to the Docker daemon. Is the docker daemon running on this host?" on Linux or macOS, see here for Linux or here for Mac for suggestions on how to resolve this issue.

Reproducing SuSiE manuscript results

Simply add susie-docker at the beginning of all commands documented in the other documentation pages of this repository to indicate that the commands are to be executed by the software in the docker image. For example, to export the benchmark code and run the numerical comparisons, the documented commands are:

./export.sos
dsc susie.dsc --target run_comparison -o toy_comparison

To run them from docker:

susie-docker ./export.sos
susie-docker dsc susie.dsc --target run_comparison -o toy_comparison
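When these commands are placed in a script, note that bash does not expand aliases in non-interactive shells by default, so a shell function may be more convenient than the alias. The sketch below mirrors the alias options in a function. It makes two assumptions that are not in the original: SUSIE_DOCKER_BIN is a hypothetical variable introduced here to allow a dry run, and id -u / id -g stand in for $UID and ${GROUPS[0]}, which usually yield the same values.

```shell
#!/usr/bin/env bash
# Function counterpart of the susie-docker alias, for use in scripts.
# SUSIE_DOCKER_BIN is a hypothetical override (not defined by Docker or
# by the repository); set it to "echo" to print the command instead.
susie_docker() {
  "${SUSIE_DOCKER_BIN:-docker}" run --security-opt label:disable -t --rm \
    -P -h SuSiE -w "$PWD" -v "$HOME:/home/$USER" -v /tmp:/tmp -v "$PWD:$PWD" \
    -u "$(id -u):$(id -g)" -e "HOME=/home/$USER" -e "USER=$USER" \
    susie/susie-paper "$@"
}

# Dry run: print the docker command that would be executed.
SUSIE_DOCKER_BIN=echo susie_docker dsc susie.dsc --target run_comparison -o toy_comparison
```

Without the override, susie_docker ./export.sos runs the command inside the container in the same way as the alias.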

Citing this repository

If you find any of the source code in this repository useful for your work, please cite our manuscript, Wang et al (2020). The full citation is given above. Please also cite the Zenodo archive for this repository:

Gao Wang, Abhishek Sarkar, Peter Carbonetto and Matthew Stephens (2018), Code and data accompanying SuSiE manuscript (Wang et al, 2018), version 1.0, Zenodo, doi:10.5281/zenodo.2368676.

License

Copyright (c) 2017-2018, Gao Wang, Abhishek Sarkar, Peter Carbonetto and Matthew Stephens.

All source code and software in this repository are made available under the terms of the MIT license.


© 2017-2018 authored by Gao Wang at Stephens Lab, The University of Chicago