Feature Request: parallel matching for prescored variants #82

@yangyxt

Description

When I run CADD on a VCF file with 200k variants, the prescore matching step performed by extract_scored.py is quite time-consuming. I think this step could be accelerated by matching each chromosome in parallel.

I suggest splitting the prescore file into 24 pieces by chromosome and splitting the input VCF by chromosome as well. extract_scored.py would then run once per chromosome, with all of the runs executing in parallel.
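
The splitting and fan-out could look roughly like the sketch below. It is only an illustration under assumptions: `split_by_chromosome`, `match_chromosome`, and `match_in_parallel` are hypothetical names, and `match_chromosome` is a placeholder that merely counts records where a real per-chromosome extract_scored.py run would go (its actual command-line interface is not shown here).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_chromosome(vcf_lines):
    """Group VCF records by CHROM (first column); header lines are copied to every group."""
    header, groups = [], defaultdict(list)
    for line in vcf_lines:
        if line.startswith("#"):
            header.append(line)
        elif line.strip():
            groups[line.split("\t", 1)[0]].append(line)
    return {chrom: header + records for chrom, records in groups.items()}


def match_chromosome(chrom, lines):
    """Hypothetical stand-in for one per-chromosome matching run.

    It only counts variant records so that the sketch stays runnable.
    """
    return chrom, sum(1 for line in lines if not line.startswith("#"))


def match_in_parallel(vcf_lines, max_workers=4):
    """Run the per-chromosome step concurrently and collect results by chromosome."""
    chunks = split_by_chromosome(vcf_lines)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(match_chromosome, c, l) for c, l in chunks.items()]
        return dict(f.result() for f in futures)
```

Threads are used here on the assumption that each real job would launch an external process; if the matching were done in pure Python instead, `ProcessPoolExecutor` would probably be the better fit.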

If that is OK with you, I can submit a PR later. Thanks!
