Simple Suggestions To Increase Speed of Freebayes #58

@cbird808

suggestion

Apply a minimum coverage filter to cov.stats before making the mapped.*.bed files. This ultimately reduces the number of contigs that get genotyped.

the minimum coverage value

Either the number of individuals or 2x the number of individuals. There's really not much point in evaluating a contig if there isn't at least 1 read per allele per locus per individual (for diploids that is 2 reads per individual, hence the 2x).
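As a rough sketch of the arithmetic, the two candidate thresholds can be derived from the number of lines in namelist. The sample names below are made up for illustration, and the 2x option assumes diploid individuals.

```shell
# Work in a scratch directory so nothing real is overwritten
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Toy namelist with one individual per line (hypothetical sample names)
printf 'PopA_001\nPopA_002\nPopB_001\n' > namelist

# Count individuals; reading from stdin keeps the filename out of wc's output,
# and the arithmetic expansion strips any padding some wc versions add
numInd=$(( $(wc -l < namelist) ))

# Option 1: at least one read per individual
minCOV_1x=$(( numInd ))
# Option 2: at least one read per allele per individual (assumes diploids)
minCOV_2x=$(( numInd * 2 ))

echo "individuals=$numInd minCOV_1x=$minCOV_1x minCOV_2x=$minCOV_2x"
```

Which of the two to use is a judgment call; the stricter 2x value drops more contigs up front.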

speedup

2x in my current situation, plus the time saved later in filtering

implementation

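# threshold: 2 x NumInd - 1 (namelist has one individual per line)
minCOV=$(( $(wc -l < namelist) * 2 - 1 ))
# list contigs that have any interval with coverage below the threshold
mawk -v minCOV="$minCOV" '$4 < minCOV {print $1}' cov.stats | sort -u > low.cov.contigs
# keep only contigs absent from that list; match column 1 exactly, because grep -F
# matches substrings and would also drop e.g. Contig_10 when only Contig_1 is listed
mawk -v f=low.cov.contigs 'BEGIN {while ((getline c < f) > 0) low[c]} !($1 in low)' cov.stats > high.cov.stats
# apply changes to cov.stats
mv high.cov.stats cov.stats
# cleanup intermediate files
rm low.cov.contigs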
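
To get a feel for how many contigs the filter removes, the same idea can be tried on toy data first. This is only a sketch: the contig names, coverage values, and the threshold of 5 are invented, it assumes cov.stats is whitespace-delimited with the contig name in column 1 and coverage in column 4 (as the snippet above implies), and it uses plain awk rather than mawk for portability.

```shell
# Work in a scratch directory so a real cov.stats is never touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Toy cov.stats: contig, start, end, coverage (made-up values)
printf 'Contig_1\t0\t100\t3\n'     >  cov.stats
printf 'Contig_10\t0\t100\t40\n'   >> cov.stats
printf 'Contig_2\t0\t100\t25\n'    >> cov.stats
printf 'Contig_2\t150\t250\t2\n'   >> cov.stats
printf 'Contig_3\t0\t100\t12\n'    >> cov.stats
minCOV=5

# The file is read twice. Pass 1 records every contig that has an interval
# below minCOV; pass 2 prints lines whose contig (exact match on column 1)
# was not recorded.
awk -v minCOV="$minCOV" '
    NR == FNR { if ($4 < minCOV) low[$1]; next }
    !($1 in low)
' cov.stats cov.stats > cov.filtered.stats

# Count distinct contigs before and after filtering
before=$(( $(cut -f1 cov.stats | sort -u | wc -l) ))
after=$(( $(cut -f1 cov.filtered.stats | sort -u | wc -l) ))
echo "contigs before: $before, after: $after"
```

Here Contig_1 and Contig_2 are dropped (each has an interval below the threshold) while Contig_10 survives, which is the case where substring matching on contig names would go wrong.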
