Empty output vcf and bam #17

@andosl

Hi,

I am using methphaser as a Singularity image on an HPC. The image works with the test data, so there shouldn't be any technical issues with the image itself.

I am doing adaptive sequencing of a 1.2 Mbp region on chromosome 14 (the IGH locus). I assemble the reads into a personal reference of this region and then re-map the ROI reads to that personal reference. This works very well for my purpose, but when I add methphaser to the pipeline, it yields only empty VCF and BAM files (headers only). Shouldn't the output at least contain the information from the input BAM and VCF files? I am not getting any errors when running methphaser.

I would be extremely grateful if you could help me troubleshoot.

Below are the truncated input BAM file (first two reads), the truncated input VCF file, the GTF file, and the truncated FASTA reference:

Truncated BAM file:

750c6e3f-9e8c-44b9-9e31-fc66c1b1fc2c 16 contig_1 1 60 1189S416M1I1717M1I1167M5D438M1D86M1I193M2D234M2D285M1I59M1D36M2D23M1I365M1D1397M1I793M1D8M1I4M1D198M2D12M1D1249M1D5M1I11M102S * 0 0 CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC&')+069:78A?DHBEDFJGD>C>G=:8400.---,'')255676;<;:<5>?@BFBCIDCBABCDGEBBKCED@=CBECB>DCF@@DCCFCA;@@C=?>CAD>><A@CB?BAE>A?ECFB?@A>CA?>B@EB?D-5640...0>?@>D:9*)'&&%$$$$$$$$(()*,/8...-.03656=>>=>??ESGJMFMGIBEEGHEHKPIG<=6/.4..-,+))-2DGIFGD:'&)%$$$%&&'+++'%%&%&(+,2223-,,*((&$$$%&''+++**) NM:i:42 ms:i:17172 AS:i:17172 nn:i:0 tp:A:P cm:i:1027 s1:i:5636 s2:i:324 de:f:0.0039 SA:Z:contig_1,1578,-,2107M7D7888S,1,83; rl:i:4350 CO:Z:MM:Z:C+h?,4,2,4,1,0,0,1,0,1,0,0,0,1,13,13,59,20,42,26,28,24,11,90,4,28,0,38,11,34,1,32,14,23,48,27,17,68,2,5,0,4,126,0,0,3,1,3,1,4,2,9,0,2,2,4,0,5,2,13,0,1,0,0,5,0,8,2,0,1,0,1,0,8,2,0,1,0,1,0,4,2,0,1,0,8,2,0,1,0,1,0,4,2,0,1,0,8,2,0,1,0,1,0,3,2,0,1,0,9,2,0,1,0,1,0,4,2,0,1,0,9,2,0,0,0,9,2,0,0,0,9,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,3,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,2,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,0,0,4,2,0,0,0,1,2,15,0,0,0,0,0; ML:B:C,36,8,165,2,1,16,3,3,2,2,3,264,92,253,254,253,235,248,253,254,255,253,254,190,240,248,250,206,239,9,182,209,4,221,183,65,185,15,242,0,0,241,254,255,254,244,245,227,250,250,254,254,196,230,252,65,248,254,223,248,129,252,204,250,254,253,208,123,242,253,247,254,250,13,60,196,217,255,255,255,240,116,250,254,255,68,92,0,86,119,249,255,255,254,234,252,251,248,254,253,33,238,224,254,252,253,20,1,225,171,22,182,252,253,254,247,0,242,245,254,149,52,242,239,243,54,50,62,58,123,107,134,177,248,254,156,215,205,224,202,201,241 HP:i:1 PC:i:210 PS:i:2459
40fce1e4-96a0-4a00-8695-fa9590857b1d 16 contig_1 1 60 1363S1821M1D653M24D119M1I10M1D113M6I391M1D171M1I342M1I149M1D168M9M1D41M3D760M1D45M2D395M1D2713M2D1363M2D207M1D18M2D16M2D15M32S * 0 0 AACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA&)135@AABBBF?G=A:EBDGCBDBHCA@D@EDA>DEDED>BCDEDDEBDCB>CBFFB>C@C???A@DB@@@>C>>>DADDA?CAEBA:AAD;=BC@CA?;C?CCA9CADB@BC@FB?;?>?>>=@>BBA?B=?=;;4AA46FIBAGMSLSIJKIIIJJSIISSJHSHKGEGJGSJLQKSHBBJ2?FFIIIGIISLQSIKJOIHSPSNHSFSKSSSEGJSBLEFA??=999;>SLKLSIILSJHSS??@BBBBAEDABHISMSSSLOLMLSGNHJQKJISLJSRNSLJHJIMSSOIOLFISMMMJIKILHEHFSPSKKSSSGGLOJJJLHIKSSJMIKMLMKJSJKHHSJPIJSLJND0<;=AB66666LSSIIIJEC6-,,,2015>DIE###$$$%&')&(.,+'%%%+356/..-.56((((()**')),,,,+'&%&%$$%')))'&&&%$$ NM:i:320 ms:i:112130 AS:i:112134 nn:i:0 tp:A:P cm:i:9642 s1:i:52735 s2:i:2852 de:f:0.0037 rl:i:5199 CO:Z:MM:Z:C+h?,16,0,1,3,2,1,1,5,1,2,2,7,3,0,4,1,37,0,3,1488,2,0,0,0,9,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,2,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,3,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,2,15,0,0,0; ML:B:C,29,104,69,23,0,2,62,2,2,2,1,1,1,1,3,0,0,1,0,11,29,1,9,7,3,6,8,22,2,4,1,2,49,1,2,2,68,1,28,1,0,1,92,1,5,2,224,254,8,8,3,5,1,9,2,9,3,2,1,1,1,3,144,2,18,30,9,5,0,5,1,14,1,4,4,1,3,4,0,5,0,1,4,0,1,42,0,53,2,4,16,24,2,0,1,4,1,2,26,41,115,1,2,0,1,3,2,3,9,1,1,19,1,13,7,7,0,1,3,30,7,10,1,149,29,230,252,251,189,252,255,254,247,245,253,202,0,243,254,197,236,240,253,249,252,251,241,133,120,54,240,249,235,130,196,36,249,253,252,216,228,212,233,1 HP:i:2 PC:i:210 PS:i:2459
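
One detail in the two records above may be worth checking: the base-modification string appears inside a comment tag (`CO:Z:MM:Z:C+h?,...`) rather than as a standalone `MM:Z:` tag, while `ML:B:C` is standalone. Whether this is what makes methphaser skip the reads is an assumption, not something confirmed here. The sketch below uses only the Python standard library to classify records by where the MM string sits. The function names are hypothetical, and it expects real tab-separated SAM text (for example the output of `samtools view`), not the space-separated paste above.

```python
from collections import Counter


def optional_tags(sam_line):
    """Map each optional tag name to its raw value for one tab-separated SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the 11 mandatory SAM columns come first
        parts = field.split(":", 2)  # TAG:TYPE:VALUE, the value may contain ':'
        if len(parts) == 3:
            tags[parts[0]] = parts[2]
    return tags


def mm_tag_status(sam_line):
    """Classify where the base-modification (MM) string lives in one record."""
    tags = optional_tags(sam_line)
    if "MM" in tags or "Mm" in tags:
        return "standalone"
    # Pattern seen in the records above: the MM string wrapped in a CO:Z comment.
    if tags.get("CO", "").startswith(("MM:Z:", "Mm:Z:")):
        return "nested_in_CO"
    return "missing"


def summarize(sam_lines):
    """Count records per status, skipping blank lines and '@' header lines."""
    return Counter(
        mm_tag_status(line)
        for line in sam_lines
        if line.strip() and not line.startswith("@")
    )
```

If most reads come back as `nested_in_CO`, the MM tag was probably wrapped into a comment somewhere between basecalling and re-mapping, and that step of the pipeline would be the first place to look.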

Truncated VCF file:

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##source=Clair3
##clair3_version=1.0.0
##FILTER=<ID=LowQual,Description="Low quality variant">
##FILTER=<ID=RefCall,Description="Reference call">
##INFO=<ID=P,Number=0,Type=Flag,Description="Result from pileup calling">
##INFO=<ID=F,Number=0,Type=Flag,Description="Result from full-alignment calling">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Read depth for each allele">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Phred-scaled genotype likelihoods rounded to the closest integer">
##FORMAT=<ID=AF,Number=1,Type=Float,Description="Estimated allele frequency in the range of [0,1]">
##contig=<ID=contig_1,length=2568895>
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase set identifier">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
contig_1 2459 . C A 10.87 PASS F GT:GQ:DP:AF:PS 0|1:10:24:0.2917:2459
contig_1 2465 . C A 10.99 PASS F GT:GQ:DP:AF:PS 0|1:10:24:0.2917:2459
contig_1 2475 . TACCCCAACCCCAACCCCAACCCCA T 9.78 PASS P GT:GQ:DP:AF 0/1:9:24:0.2083
contig_1 2588 . C T 9.03 PASS P GT:GQ:DP:AF:PS 0|1:9:24:0.375:2459
contig_1 2594 . C T 14.76 PASS F GT:GQ:DP:AF:PS 0|1:14:24:0.375:2459
contig_1 2600 . C T 16.54 PASS F GT:GQ:DP:AF:PS 0|1:16:24:0.4167:2459
contig_1 2618 . T TA 15.35 PASS F GT:GQ:DP:AF 0/1:15:24:0.375
contig_1 2628 . TA T 14.22 PASS F GT:GQ:DP:AF 0/1:14:24:0.3333
contig_1 2804 . G T 19.19 PASS F GT:GQ:DP:AF:PS 0|1:19:24:0.375:2459
contig_1 2894 . G T 23.33 PASS F GT:GQ:DP:AF:PS 0|1:23:24:0.3333:2459
contig_1 3300 . AACCCT A 17.11 PASS F GT:GQ:DP:AF 0/1:17:24:0.5833
contig_1 3305 . T TA 8.56 PASS F GT:GQ:DP:AF 0/1:8:24:0.375
contig_1 66619 . AT A 0 LowQual F GT:GQ:DP:AF 1/1:0:23:0.1739

GTF file:

contig_1 Phasing exon 2459 400591 . + . gene_id "2459"; transcript_id "2459.1";
contig_1 Phasing exon 641089 1001211 . + . gene_id "641089"; transcript_id "641089.1";
contig_1 Phasing exon 1230613 1244186 . + . gene_id "1230613"; transcript_id "1230613.1";
contig_1 Phasing exon 1582658 1670184 . + . gene_id "1582658"; transcript_id "1582658.1";
contig_1 Phasing exon 1858102 1886017 . + . gene_id "1858102"; transcript_id "1858102.1";
contig_1 Phasing exon 2054129 2503491 . + . gene_id "2054129"; transcript_id "2054129.1";
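
The GTF blocks and the VCF phase sets can also be cross-checked against each other. In the excerpts above, the `PS` value 2459 matches both the `gene_id` and the start coordinate of the first GTF block. A minimal sketch of that check follows, assuming whitespace-separated columns with no spaces inside the INFO field and a single sample column. The function names are hypothetical.

```python
def gtf_blocks(gtf_text):
    """Return {(contig, gene_id): (start, end)} for each GTF line."""
    blocks = {}
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split(None, 8)  # the 9th column holds the attributes
        gene_id = cols[8].split('gene_id "', 1)[1].split('"', 1)[0]
        blocks[(cols[0], gene_id)] = (int(cols[3]), int(cols[4]))
    return blocks


def phase_sets(vcf_text):
    """Count phased records (GT contains '|' and PS is present) per (contig, PS)."""
    counts = {}
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split()
        sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
        if "PS" in sample and "|" in sample.get("GT", ""):
            key = (cols[0], sample["PS"])
            counts[key] = counts.get(key, 0) + 1
    return counts


def unmatched_phase_sets(gtf_text, vcf_text):
    """List phase sets in the VCF that have no GTF block with the same id."""
    blocks = gtf_blocks(gtf_text)
    return sorted(key for key in phase_sets(vcf_text) if key not in blocks)
```

An empty list from `unmatched_phase_sets` means every phase set in the VCF has a corresponding block in the GTF, which would point away from a mismatch between those two inputs.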

Truncated FASTA reference:

>contig_1
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC

Thanks a lot in advance!!

Best,
Andreas
