Description
Hello,
I came across the following problem:
I have an annotation file, created with your own util from my GTF, in which the gene_id field contains colons (:).
CAMPAREE runs until the last step, where it fails: it opens the gene_quant file and another file that contains only a single line of entries with no gene_id (but with a leading tab, as if an empty gene_id had been used).
The GTF standard does not strictly prohibit colons, so I am reporting this here.
I have verified that removing colons and other special characters from the gene_id and transcript_id fields fixes the problem.
Suggestions:
Proper fix: the quantification step is minimally adjusted to handle the tab-delimited file correctly (as per your own spec).
Hack fix: when the user converts the GTF with the official util, special characters that cause the failure are replaced with something harmless (_ maybe?).
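To illustrate the hack fix, here is a minimal sketch in Python. The helper name `sanitize_id` and the set of "safe" characters are my own assumptions, not anything taken from the CAMPAREE code; the real set of problematic characters may be different.

```python
import re

# Hypothetical helper, not part of CAMPAREE. The "safe" character set below
# (letters, digits, underscore, dot, hyphen) is an assumption.
_UNSAFE = re.compile(r"[^A-Za-z0-9_.-]")


def sanitize_id(identifier: str) -> str:
    """Replace every character outside the assumed safe set with '_'."""
    return _UNSAFE.sub("_", identifier)


print(sanitize_id("gene:ENSG00000139618"))  # gene_ENSG00000139618
```

One drawback of this approach: two distinct IDs that differ only in a replaced character (for example `a:b` and `a;b`) would collapse to the same name.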
I would go with the proper fix: it is both "proper" and seemingly easier to implement, since you don't have to anticipate every possible special character. It would also work for users who made their own annotation files, which avoids further reports like this one (using the util is neither mandatory nor suggested in the basic readme).
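As a rough sketch of what I mean by the proper fix, the line could be split on tabs only, so that colons or other characters inside an ID are left alone. The function name `parse_quant_line` and the column layout (gene_id first, values after it) are assumptions for illustration; I have not checked how the actual quantification code reads the file.

```python
def parse_quant_line(line: str) -> tuple[str, list[str]]:
    """Split one tab-delimited line into (gene_id, remaining fields).

    Hypothetical layout: gene_id in the first column, values after it.
    Only the tab character is treated as a delimiter.
    """
    fields = line.rstrip("\n").split("\t")
    gene_id, values = fields[0], fields[1:]
    if not gene_id:
        # A leading tab means the ID was lost upstream; fail loudly here.
        raise ValueError(f"empty gene_id in line: {line!r}")
    return gene_id, values


print(parse_quant_line("gene:ABC1\t12.5\t3\n"))  # ('gene:ABC1', ['12.5', '3'])
```

Raising on an empty first column would also surface the leading-tab symptom described above at the point where it occurs, rather than at the last step.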
Feel free to ask me if you need more info.
Best regards,
Georgios Levenits.