A pipeline that takes whole-genome sequencing (WGS) fastq or bam files as input and returns the mean telomere length as output. It currently includes two tools: TelSeq and Computel.
- If you haven't yet, install mamba or micromamba. (conda works as well, but I recommend mamba or micromamba because they install packages much faster.)
- Clone this repository and cd into it:
    git clone https://github.com/gmanthey/variant-calling.git
    cd variant-calling
- Create a new environment from the environment specs file:
  Using mamba:
    mamba env create -f environment.yml
- Copy the config.yml.template file to config.yml
- Adjust the following parameters in the config.yml file:
  - fastq_dir: Path to the (trimmed) fastq files per individual.
  - bam_dir: Path to the alignment (bam) files per individual.
  - genome_size: The estimated genome size of the organism you're looking at (in bp).
  - reference_assembly: Path and filename of the index file of the reference genome assembly.
  - n_chr: An integer giving the haploid number of chromosomes.
  - species: The name of the species, or alternatively any project ID.
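
As a rough illustration of the parameters listed above, a filled-in config.yml might look something like the sketch below. This assumes the file is a flat YAML mapping that uses exactly these parameter names. Every path and number is a hypothetical placeholder, and the .fai extension on the index file is a guess. The config.yml.template file shipped with the repository remains the authoritative reference for the layout and for any parameters not shown here.

```yaml
# Hypothetical example values; replace every path and number with your own.
# Assumes config.yml is a flat mapping with the parameter names described above.
fastq_dir: /path/to/trimmed_fastq   # (trimmed) fastq files per individual
bam_dir: /path/to/bam               # alignment (bam) files per individual
genome_size: 1200000000             # estimated genome size in bp (placeholder)
# Index file of the reference assembly; the .fai extension is an assumption.
reference_assembly: /path/to/reference/genome.fasta.fai
n_chr: 40                           # haploid chromosome number (placeholder)
species: my_species                 # species name or any project ID
```

Whether both fastq_dir and bam_dir have to be set, or only the one that matches your input type, is not stated here, so check the template before leaving either one out.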