"genes_with_missing_exons" not present in input gtf file #13

@Lil-Psilocybe

Hello!

Thanks for a great tool! I successfully installed geneext and ran the test_data example without issue. However, when working with my own data I got the same error as in issue #10, and many of my genes were reported as having no exons.

The genes in question came from a Trinity assembly and are listed in genes_with_missing_exons.txt like so:
head $geneext/genes_with_missing_exons.txt
TRINITY_DN26089_c0_g1_1
TRINITY_DN1846_c6_g1_1
TRINITY_DN11332_c0_g1_1
TRINITY_DN4372_c1_g1_1
TRINITY_DN9737_c0_g1_1
TRINITY_DN263_c7_g1_1
TRINITY_DN36841_c1_g1_1
TRINITY_DN4199_c0_g1_1
TRINITY_DN14827_c0_g1_1

When I search for these gene IDs in my gtf file, they do not return a result. However, when I search for them without the final numbered suffix (i.e. TRINITY_DN26089_c0_g1, TRINITY_DN1846_c6_g1, TRINITY_DN11332_c0_g1, etc.), a result is returned, with each of these genes having multiple exons.
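The check described above could be scripted roughly as follows. This is only a sketch, under the assumptions that the gtf stores gene names in a `gene_id "..."` attribute and that the extra suffix is always a trailing `_<number>`. The function names `exon_counts` and `check_ids` are made up for illustration and are not part of geneext.

```python
import re
from collections import defaultdict

# Assumption: gene names live in a standard GTF attribute of the form gene_id "NAME";
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')
# Assumption: the unexpected suffix is an underscore followed by digits at the end of the ID
SUFFIX_RE = re.compile(r"_\d+$")


def exon_counts(gtf_lines):
    """Count exon records per gene_id in an iterable of GTF lines."""
    counts = defaultdict(int)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature type
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = GENE_ID_RE.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


def check_ids(reported_ids, counts):
    """For each reported ID, return (exons for the ID as written,
    the ID with the trailing _<number> removed, exons for that shortened ID)."""
    report = {}
    for gene in reported_ids:
        stripped = SUFFIX_RE.sub("", gene)
        report[gene] = (counts.get(gene, 0), stripped, counts.get(stripped, 0))
    return report
```

To use it, pass an open gtf file to `exon_counts` and the stripped lines of genes_with_missing_exons.txt to `check_ids`. An entry with zero exons for the ID as written but a nonzero count for the shortened ID would match the behaviour described above.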

I generated the input BAM file with the same gtf I am using for geneext, so I'm unsure where the problem could be.

I am working with SPLiT-Seq data from Parse BioSciences, if that helps. The all_genes.csv file from the STAR alignment output does not contain the genes with this suffix either. Any input is greatly appreciated, thanks!
