Camparee will silently fail to compile the gene_quant file when gtf contains special characters #12

@geoleven

Hello,

I came across the following problem:
I have an annotation file created by your own util (from my GTF) which has colons (:) in the values of the gene_id field.
Camparee runs until the last step, where it fails because it opens the gene_quant file and another file containing only one line of entries with no gene_id (but with a leading tab, as if an empty gene_id had been used).
The GTF standard does not strictly prohibit colons, so I am reporting this here.
I have verified that removing colons and other special characters from the gene_id and transcript_id fields fixes the problem.
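
For anyone hitting the same thing, here is a minimal sketch of that kind of cleanup, assuming standard 9-column, tab-separated GTF lines with `key "value";` attributes. The function name `sanitize_gtf_line` and the set of characters treated as safe are my own assumptions for illustration, not something Camparee or its util provides.

```python
import re

# Assumption: letters, digits, "_", "." and "-" are safe; everything else becomes "_".
_UNSAFE = re.compile(r"[^A-Za-z0-9_.-]")
# Matches only the gene_id and transcript_id attributes, e.g. gene_id "gene:ABC";
_ID_ATTR = re.compile(r'\b(gene_id|transcript_id) "([^"]*)"')


def _clean(match):
    """Rebuild one attribute with unsafe characters in its value replaced."""
    key = match.group(1)
    cleaned = _UNSAFE.sub("_", match.group(2))
    return f'{key} "{cleaned}"'


def sanitize_gtf_line(line):
    """Return the GTF line with gene_id/transcript_id values sanitized.

    Comment lines and lines without exactly 9 tab-separated fields are
    returned unchanged.
    """
    if line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return line
    fields[8] = _ID_ATTR.sub(_clean, fields[8])
    return "\t".join(fields) + newline
```

Note that replacing characters this way can make two distinct IDs collide (for example `a:b` and `a|b` both become `a_b`), so the output should be checked for duplicates.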

Suggestions:
Proper fix: the quantification step is minimally adjusted to properly handle the tab-delimited file (as per your own spec).
Hack fix: if the user uses the official util to convert the GTF, the offending special characters are converted to something that does not cause problems (_ maybe?).
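
To illustrate what I mean by the proper fix, here is a minimal sketch of reading a tab-delimited file so that only tabs separate fields and characters such as `:` or `|` stay part of the value. I have not looked at how the quantification step actually parses its input, so this is not a patch; the function name `read_tab_delimited` and the column names in the example are hypothetical.

```python
import csv
import io


def read_tab_delimited(handle):
    """Read a tab-delimited file with a header line into a list of dicts.

    Only the tab character separates fields; quoting is disabled so that
    every other character is kept verbatim in the field value.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    return [dict(zip(header, row)) for row in reader]


# Hypothetical two-column example with a colon and a pipe in the gene_id.
sample = io.StringIO("gene_id\tcount\ngene:ABC|1\t42\n")
rows = read_tab_delimited(sample)
print(rows[0]["gene_id"])
```

With this approach the ID is never split or rewritten, so no list of "allowed" characters has to be maintained.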

I would go with the proper fix, as it is both "proper" and probably easier to implement, since you don't have to account for every possible character. It would also work for users who made their own files, which would prevent issues like this one from being reported (the use of the util is neither mandatory nor suggested in the basic readme).

Feel free to ask me if you need more info.

Best regards,
Georgios Levenits.
