GTEx V8 Multivariate Analysis

QTL analysis using European-only data

In this notebook I document the procedure for analyzing eQTL and sQTL data with MASH, using European-only data from GTEx.

Input data

Summary statistics for European-only (EUR) eQTL and sQTL, produced by FastQTL analysis.

Data format conversion for MASH input

First I list the available summary statistics files and the genes present in all of them:

In [1]:
[global]
from glob import glob
parameter: ss_dir = path('/project2/mstephens/gaow/gtex_v8_eqtl_eur_only')
parameter: out_dir = path(f"{str(ss_dir).rstrip('/')}_output")
data_files = glob(f'{ss_dir:a}/*')

[get_meta_1]
with open(f"{out_dir:a}/{ss_dir:b}.sumstats_list", 'w') as f:
    f.write('\n'.join(data_files))

[get_meta_2]
input: data_files, group_by = 1, concurrent = True
output: f"{out_dir:a}/{_input:bn}.genes_list"
task: trunk_workers = 1, trunk_size = 2, walltime = '1h', mem = '1G', cores = 1, tags = f'{_output:bn}'
bash: expand = True
    zcat {_input} | cut -f 1 | tail -n +2 | sort -u > {_output}
    
[get_meta_3]
# obtain the list of loci that show up in all conditions
input: group_by = 'all'
output: f"{out_dir:a}/{ss_dir:b}.genes_list"
python: expand = "${ }"
    data = []
    for item in [${_input:r,}]:
        # read one per-condition gene list, closing the file when done
        with open(item) as f:
            data.append([x.strip() for x in f])
    data = sorted(list(set.intersection(*map(set,data))))
    with open(${_output:r}, 'w') as f:
        f.write('\n'.join(data))
In [ ]:
sos run European_QTL.ipynb get_meta -c midway2.yml -q midway2 --ss-dir /project2/mstephens/gaow/gtex_v8_eqtl_eur_only
sos run European_QTL.ipynb get_meta -c midway2.yml -q midway2 --ss-dir /project2/mstephens/gaow/gtex_v8_sqtl_eur_only
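
For readers who want to check the gene lists outside the workflow, here is a minimal Python sketch of what `get_meta_2` and `get_meta_3` compute: the unique values in the first column of each gzipped, tab-delimited summary statistics file (header skipped), followed by the intersection across files. The function names `genes_in_file` and `shared_genes` are hypothetical and are not part of the notebook. Unlike the `cut`-based pipeline, the sketch ignores blank lines.

```python
import gzip


def genes_in_file(path):
    """Return the unique first-column values of a gzipped TSV, skipping the header."""
    genes = set()
    with gzip.open(path, "rt") as handle:
        next(handle, None)  # skip the header line, like `tail -n +2`
        for line in handle:
            first = line.rstrip("\n").split("\t", 1)[0]
            if first:  # ignore blank lines
                genes.add(first)
    return genes


def shared_genes(paths):
    """Return the sorted genes present in every file (empty list if no files)."""
    gene_sets = [genes_in_file(p) for p in paths]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))
```

Calling `shared_genes(data_files)` on the same files should give the content of the final `.genes_list` file, assuming every input is gzip-compressed, as the use of `zcat` implies.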

With these lists in place, I run the data format conversion:

In [ ]:
[convert]
parameter: cols = list
bash: expand = True
    sos run fastqtl_to_mash.ipynb \
        --cwd {out_dir:a}/fastqtl_to_mash_output \
        --data-list {out_dir:a}/{ss_dir:b}.sumstats_list \
        --gene-list {out_dir:a}/{ss_dir:b}.genes_list \
        -c midway2.yml -q midway2 --common-suffix ".txt" \
        --cols {paths(cols)}
In [ ]:
sos run European_QTL.ipynb convert --ss-dir /project2/mstephens/gaow/gtex_v8_eqtl_eur_only --cols 8 9 7
sos run European_QTL.ipynb convert --ss-dir /project2/mstephens/gaow/gtex_v8_sqtl_eur_only --cols 7 8 6
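
The `--cols` values differ between the two data sets (8 9 7 for eQTL, 7 8 6 for sQTL), presumably because the two sets of summary statistics files are laid out differently. The sketch below illustrates position-based column selection. It assumes 1-based positions, as with `cut -f`, and gzipped tab-delimited input, as in the `get_meta_2` step. How `fastqtl_to_mash.ipynb` actually interprets `--cols` is not shown in this notebook, so `select_columns` is a hypothetical helper for spot-checking a file, not the conversion itself.

```python
import gzip


def select_columns(path, cols, skip_header=True):
    """Yield, for each data row, the fields at the given 1-based positions.

    Fields are returned in the order requested, so cols=[8, 9, 7] yields
    (field 8, field 9, field 7) for every row.
    """
    with gzip.open(path, "rt") as handle:
        if skip_header:
            next(handle, None)  # drop the header line
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            yield tuple(fields[c - 1] for c in cols)
```

Printing the first few tuples from one eQTL file and one sQTL file is a quick way to confirm that both `--cols` settings pick out the same three quantities.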

Run MASH pipeline

In [ ]:
[mash]
bash: expand = True
    sos run mashr_flashr_workflow.ipynb mash \
    --cwd {out_dir:a}/mashr_flashr_workflow_output \
    --data {out_dir:a}/fastqtl_to_mash_output/{ss_dir:b}.mash.rds \
    --vhat mle -c midway2.yml -q midway2
In [ ]:
sos run European_QTL.ipynb mash --ss-dir /project2/mstephens/gaow/gtex_v8_eqtl_eur_only
sos run European_QTL.ipynb mash --ss-dir /project2/mstephens/gaow/gtex_v8_sqtl_eur_only
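
All intermediate files are located relative to `--ss-dir`, using the SoS path formatters `:a` (absolute path) and `:b` (base name). As a rough sketch of where each step reads and writes, the following Python mirrors those format strings for an `ss_dir` that is already absolute. The function name `derived_paths` and its dictionary keys are hypothetical labels chosen for this illustration.

```python
from pathlib import PurePosixPath


def derived_paths(ss_dir):
    """Mirror the notebook's path templates for an absolute ss_dir."""
    ss = PurePosixPath(str(ss_dir).rstrip("/"))
    out_dir = PurePosixPath(f"{ss}_output")  # default value of out_dir
    return {
        "sumstats_list": out_dir / f"{ss.name}.sumstats_list",
        "genes_list": out_dir / f"{ss.name}.genes_list",
        "mash_input": out_dir / "fastqtl_to_mash_output" / f"{ss.name}.mash.rds",
        "mash_cwd": out_dir / "mashr_flashr_workflow_output",
    }


paths = derived_paths("/project2/mstephens/gaow/gtex_v8_eqtl_eur_only")
print(paths["mash_input"])
# /project2/mstephens/gaow/gtex_v8_eqtl_eur_only_output/fastqtl_to_mash_output/gtex_v8_eqtl_eur_only.mash.rds
```

If `--out-dir` is overridden on the command line, the same override must be passed to every step, since the `convert` and `mash` steps both look for their inputs under `out_dir`.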

© 2018 Gao Wang, University of Chicago

Exported from analysis/European_QTL.ipynb committed by Gao Wang on Tue Feb 2 19:11:23 2021 revision 1, c5fe213