
Symbolic variant length not adjusted for indels between genome builds #27

@davmlaw

Thanks for the tool!

I know about the warning:

Warning: input VCF includes symbolic alleles that might not properly lift over

but I thought it would be useful to explicitly document the issues with symbolic ALT alleles.

To recreate:

  • Lift over a symbolic variant that spans an indel between genome builds, for example:
# GRCh38 variant - NM_139025.4(ADAMTS13):c.1_2057del
9	133422442	1144	G	<DEL>	.	.	SVLEN=-20044;SVTYPE=DEL;END=133442486

Expected behavior:

  • The length of the symbolic variant is altered to reflect the indel in the destination build

Actual behavior:

  • The length always remains the same. END is recalculated as the lifted-over POS plus the original variant length, and SVLEN is left unchanged.

Converting the lifted-over record to c.HGVS gives NM_139025.4(ADAMTS13):c.-1_2055del instead of the original c.1_2057del.
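
To make the difference concrete, here is a minimal sketch of the two strategies. It is not the tool's code: `lift` is a hypothetical stand-in for a real chain-file lookup, and the coordinates and the 1 bp insertion are made up for illustration.

```python
# Toy illustration only: `lift`, the coordinates, and the 1 bp insertion
# are assumptions, not taken from the tool or from a real chain file.

def lift(pos, insertion_at, inserted_len):
    """Hypothetical liftover: positions after `insertion_at` shift right."""
    return pos + inserted_len if pos > insertion_at else pos

def lift_fixed_length(pos, end, **chain):
    """Behavior reported above: lift POS, then END = new POS + old length."""
    new_pos = lift(pos, **chain)
    return new_pos, new_pos + (end - pos)

def lift_both_ends(pos, end, **chain):
    """Expected behavior: lift POS and END independently, recompute SVLEN."""
    new_pos = lift(pos, **chain)
    new_end = lift(end, **chain)
    svlen = -(new_end - new_pos)  # negative for a deletion
    return new_pos, new_end, svlen

# The destination build has 1 extra base inside the deleted span.
chain = dict(insertion_at=150, inserted_len=1)
fixed = lift_fixed_length(100, 200, **chain)
both = lift_both_ends(100, 200, **chain)
print(fixed)  # END ignores the extra base
print(both)   # END and SVLEN account for it
```

With the fixed-length approach the END falls one base short of the true breakpoint in the destination build, which is consistent with the shifted c.HGVS above.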

Workaround:

  • If you convert the symbolic ALT into explicit bases before lifting over, things work fine (input GRCh38 REF size of 20045, output GRCh37 REF size of 20046).
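
The conversion can be sketched roughly as follows, assuming the contig sequence is already available as a Python string. `expand_symbolic_del` is a hypothetical helper, not part of bcftools, and the sequence is a toy one.

```python
def expand_symbolic_del(ref_seq, pos, end):
    """Return explicit (REF, ALT) for a <DEL> record.

    ref_seq: contig sequence as a string (0-based indexing)
    pos, end: 1-based VCF POS and INFO/END
    """
    ref = ref_seq[pos - 1:end]  # anchor base plus the deleted bases
    alt = ref_seq[pos - 1]      # anchor base only
    return ref, alt

contig = "ACGTACGTAC"  # toy sequence, not real genome data
ref, alt = expand_symbolic_del(contig, pos=3, end=7)
print(ref, alt, len(ref))
```

The explicit REF has END - POS + 1 bases. For the record above that is 133442486 - 133422442 + 1 = 20045, which matches the GRCh38 REF size quoted in the workaround.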

I've attached a zip file that contains a working example with both a symbolic and an explicit ALT, along with the output VCFs:

bcftools_liftover_symbolic.zip
