A simple Snakemake pipeline to run munge_sumstats and LD score regression, including the download of all necessary dependencies.
Install Snakemake (tested with version 7). This step is described in this repository. Follow these two steps:
- Install Conda
- Install Snakemake via Conda
Add your own summary statistics to the file resources/traits.tsv. The file contains some example records that you can modify. The columns are used as follows:
- trait_name → Name you would like to assign to the respective trait
- ss_path → Path where your summary statistics are stored
- snp → RSID column name
- a1_effect → Effect allele column name
- a2_non_effect → Non-effect allele column name
- add_munge_args → Extra parameters for munge_sumstats
- no_munge → If set to 1, the munge step will be skipped
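Before launching the workflow, it can help to check that the table has the columns listed above. The following is a minimal sketch and not part of the pipeline. It assumes that resources/traits.tsv is tab-separated with a header row that uses exactly these column names; the function name check_traits_table and the example record are hypothetical.

```python
import csv
import io

# Column names taken from the list above; that they appear as the header row
# of a tab-separated file is an assumption.
EXPECTED_COLUMNS = [
    "trait_name", "ss_path", "snp", "a1_effect",
    "a2_non_effect", "add_munge_args", "no_munge",
]


def check_traits_table(handle):
    """Return a list of problems found in a traits table (empty if none)."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        return ["missing column(s): " + ", ".join(missing)]
    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        name = row["trait_name"]
        if not name:
            problems.append(f"line {line_no}: empty trait_name")
        elif name in seen:
            problems.append(f"line {line_no}: duplicate trait_name {name!r}")
        seen.add(name)
        if not row["ss_path"]:
            problems.append(f"line {line_no}: empty ss_path")
    return problems


# Hypothetical record, for illustration only.
example = (
    "\t".join(EXPECTED_COLUMNS) + "\n"
    + "\t".join(["height", "data/height.txt.gz", "SNP", "A1", "A2", "", "0"]) + "\n"
)
print(check_traits_table(io.StringIO(example)))
```

To check the real file, open it with open("resources/traits.tsv", newline="") and pass the handle to the function.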
Once configured, execute Snakemake within the root folder of the repository:
snakemake -c1 --use-conda
# Increase -c1 to e.g. -c4 to use more (here: 4) cores
Heritabilities of individual traits and genetic correlations between all possible pairs of traits are calculated and stored in the results/ldsc/ folder.
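To get a sense of how many genetic correlations to expect: with n traits there are n(n-1)/2 unordered pairs. The sketch below uses hypothetical trait names and assumes each unordered pair is computed once; how the pipeline itself enumerates pairs and names its output files is not shown here.

```python
from itertools import combinations

# Hypothetical trait names, standing in for the trait_name column.
traits = ["height", "bmi", "ldl", "t2d"]

# Every unordered pair of distinct traits appears exactly once.
pairs = list(combinations(traits, 2))

for first, second in pairs:
    print(first, second)

# n traits give n * (n - 1) / 2 pairs.
assert len(pairs) == len(traits) * (len(traits) - 1) // 2
```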
Genetic correlations can be visualized using the script in the notebooks/ folder. To generate the plots:
- Run the bash commands given in gather_genetic_correlations.sh
- Run the R script display_genetic_correlation.R
The Snakemake workflow is based on the rules from this repository.