Hi,
Thank you for the nice and truly open source work you are doing!
I tested the code for VEP using a single example from https://github.com/calico/borzoi/blob/main/tutorials/legacy/score_variants/snps_expr.vcf, with the script https://github.com/calico/borzoi/blob/main/tutorials/legacy/score_variants/score_expr_sad.sh. However, I noticed that the generated sad.h5 file was extremely large: approximately 450 MB for just a single example.
Additionally, when I ran the code on my own VCF file containing about 135 examples, the resulting file grew to 192 GB. Am I making an error in my workflow, or is this file size expected?
What I actually want is to retain only the ref and alt predictions and then compute the SAD score manually. Can the file size be reduced?
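
To make the manual computation concrete, here is a minimal sketch of what I have in mind. It assumes SAD means the alt-minus-ref difference summed over the output positions, and that the ref and alt predictions for one variant are arrays of shape (positions, targets). The function name, the array names, and the shapes are my own assumptions, not the actual layout of sad.h5.

```python
import numpy as np


def sad_from_predictions(ref_preds, alt_preds):
    """Sum the (alt - ref) difference over the position axis, per target.

    Hypothetical helper: assumes both inputs have shape (positions, targets).
    """
    ref = np.asarray(ref_preds, dtype=np.float32)
    alt = np.asarray(alt_preds, dtype=np.float32)
    if ref.shape != alt.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {alt.shape}")
    # Collapse the position axis so only one value per target remains.
    return (alt - ref).sum(axis=0)


# Toy data: 3 positions, 2 targets (values are made up for illustration).
ref = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
alt = np.array([[1.5, 2.0], [3.0, 3.0], [5.0, 6.5]])
sad = sad_from_predictions(ref, alt)
print(sad)
```

If only this per-target score is needed downstream, storing one value per target instead of the full position-by-target arrays would be far smaller, which is why I am asking whether the output can be trimmed.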