Value of Target attribute gains quotes when it shouldn't in round trip manipulation #230

@photocyte

Hi there,

I've loaded a GFF file into memory like this:

db = gffutils.create_db(args.gff, ":memory:", force=True, merge_strategy="create_unique")

If I do this:

for feature in db.all_features():
    feature.seqid = pepid_to_scafid[feature.seqid]
    feature.start = json_data['p2g'][str(feature.start)]
    feature.end = json_data['p2g'][str(feature.end)]
    # Downstream tools currently require a patched gffutils whose feature.py prints the Target attribute without quotes
    print(feature)

The value of the Target attribute is unquoted in the input, but the printed feature has it wrapped in quotes.

Example input:

12B1-Scaf17	Gene3D	protein_match	2346889	2347135	6.9e-15	+	.	ID="pep1__G3DSA:1.10.1200.10_40657_40739";date="11-10-2023";Target=pep1 40657 40739;Name="ACP39";status="T";Dbxref="InterPro:IPR036736"

Example output:

12B1-Scaf17	Gene3D	protein_match	2346889	2347135	6.9e-15	+	.	ID="pep1-1__G3DSA:1.10.1200.10_40657_40739";date="11-10-2023";Target="pep1 40657 40739";Name="ACP39";status="T";Dbxref="InterPro:IPR036736"
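To make the difference between the two lines easier to see, here is a minimal sketch that compares the attribute columns key by key. The helper names `parse_attributes` and `changed_attributes` are hypothetical (they are not part of gffutils), and the parser is deliberately naive: it splits on `;` and `=` and would mishandle values that contain a `;`.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column into a {key: raw_value} dict.

    Naive on purpose: assumes no ';' appears inside a value.
    """
    pairs = (field.split("=", 1) for field in column9.split(";") if field)
    return {key: value for key, value in pairs}


def changed_attributes(before, after):
    """Return {key: (old, new)} for attributes whose raw text differs."""
    old, new = parse_attributes(before), parse_attributes(after)
    return {
        key: (old.get(key), new.get(key))
        for key in old.keys() | new.keys()
        if old.get(key) != new.get(key)
    }


# Column 9 of the example input and output lines above.
before = ('ID="pep1__G3DSA:1.10.1200.10_40657_40739";date="11-10-2023";'
          'Target=pep1 40657 40739;Name="ACP39";status="T";'
          'Dbxref="InterPro:IPR036736"')
after = ('ID="pep1-1__G3DSA:1.10.1200.10_40657_40739";date="11-10-2023";'
         'Target="pep1 40657 40739";Name="ACP39";status="T";'
         'Dbxref="InterPro:IPR036736"')

for key, (old_value, new_value) in sorted(changed_attributes(before, after).items()):
    print(f"{key}: {old_value} -> {new_value}")
```

On these two lines only ID and Target differ; the Target change is the one this issue is about.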

The value of the Target attribute should not have quotes per the GFF3 spec:
https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md#:~:text=23%20.%20.%20.%20ID%3DMatch1-,%3BTarget%3D,-EST23%201%2021

Should I be exporting the features a different way, or is Target gaining quotes a bug in how gffutils converts a feature to a string (the `__str__` method that `print(feature)` calls)?
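In the meantime, one possible stopgap that avoids patching gffutils is to post-process the printed line. This is only a sketch: `unquote_target` is a hypothetical helper, not a gffutils function, and it assumes the Target value itself never contains a double quote.

```python
import re

# Matches Target="..." at the start of the line, after a tab, or after a ';'.
_QUOTED_TARGET = re.compile(r'(^|;|\t)Target="([^"]*)"')


def unquote_target(gff_line):
    """Strip the quotes around the Target value, leaving other attributes alone."""
    return _QUOTED_TARGET.sub(r'\1Target=\2', gff_line)


# The example output line from above, rebuilt with explicit tabs.
line = "\t".join([
    "12B1-Scaf17", "Gene3D", "protein_match", "2346889", "2347135",
    "6.9e-15", "+", ".",
    'ID="pep1-1__G3DSA:1.10.1200.10_40657_40739";date="11-10-2023";'
    'Target="pep1 40657 40739";Name="ACP39";status="T";'
    'Dbxref="InterPro:IPR036736"',
])

print(unquote_target(line))
```

In the loop above this would mean replacing `print(feature)` with `print(unquote_target(str(feature)))`. It treats the symptom only, so it does not answer whether the quoting itself is a bug.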
