strand orientation #105

@mmpust

Hi,
I am running Prodigal in meta mode on a gene catalogue in FASTA format to get the GFF and GTF files.

prodigal -f gff -p meta \
 -i "$REF" \
 -o "${REF%.fna}.gff"

The FASTA headers look like this:

>lcl|CP000538.1_cds_EAQ71949.1_1 [gene=dnaA] [locus_tag=CJJ81176_0027] [protein=chromosomal replication initiator protein DnaA] [protein_id=EAQ71949.1] [location=1..1323] [gbkey=CDS]
>lcl|CP000538.1_cds_EAQ72022.1_33 [locus_tag=CJJ81176_0064] [protein=cytochrome c family protein] [protein_id=EAQ72022.1] [location=47275..49344] [gbkey=CDS]
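
Headers in this style carry the genomic coordinates in the [location=...] field, so the original strand can in principle be read from the header itself. Below is a minimal Python sketch of that idea. It assumes the NCBI convention that minus-strand features are written as complement(...) in the location field; parse_header and genomic_strand are hypothetical helper names for illustration, not part of Prodigal.

```python
import re

# Matches bracketed key=value fields such as [location=1..1323]
FIELD_RE = re.compile(r"\[([^=\]]+)=([^\]]*)\]")


def parse_header(header):
    """Split a CDS FASTA header into (sequence ID, dict of bracketed fields)."""
    header = header.lstrip(">").strip()
    # The sequence ID is everything up to the first whitespace
    seq_id = header.split(None, 1)[0]
    fields = dict(FIELD_RE.findall(header))
    return seq_id, fields


def genomic_strand(location):
    """Return '-' if the location is wrapped in complement(...), else '+'.

    Assumption: minus-strand features use the complement(...) notation.
    """
    return "-" if "complement(" in location else "+"
```

Called on the first header above, parse_header returns the ID lcl|CP000538.1_cds_EAQ71949.1_1 and a dict in which fields["location"] is "1..1323".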

The final GFF file after completion looks like this:

# Sequence Data: seqnum=1;seqlen=1323;seqhdr="lcl|CP000538.1_cds_EAQ71949.1_1 [gene=dnaA] [locus_tag=CJJ81176_0027] [protein=chromosomal replication initiator protein DnaA] [protein_id=EAQ71949.1] [location=1..1323] [gbkey=CDS]"
# Model Data: version=Prodigal.v2.6.3;run_type=Metagenomic;model="39|Rickettsia_conorii_Malish_7|B|32.4|11|1";gc_cont=32.40;transl_table=11;uses_sd=1
lcl|CP000538.1_cds_EAQ71949.1_1 Prodigal_v2.6.3 CDS     1       1323    168.6   +       0       ID=1_1;partial=10;start_type=Edge;rbs_motif=None;rbs_spacer=None;gc_cont=0.302;conf=100.00;score=168.58;cscore=165.36;sscore=3.22;rscore=0.00;uscore=0.00;tscore=3.22;
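
To inspect the strand column programmatically, a feature line like the one above can be split into its nine GFF columns. The sketch below assumes the file on disk is tab-delimited, as the GFF format specifies (the spacing in the pasted line may differ); parse_gff_line is a hypothetical helper name. Note that the GFF strand column is expressed relative to the sequence named in column 1, which here is the catalogue entry rather than the source genome.

```python
def parse_gff_line(line):
    """Parse one tab-delimited GFF feature line into a dict."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    # Column 9 holds semicolon-separated key=value pairs
    attributes = dict(
        item.split("=", 1)
        for item in attrs.strip(";").split(";")
        if "=" in item
    )
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "score": score,
        "strand": strand,  # relative to the sequence in column 1
        "phase": phase,
        "attributes": attributes,
    }
```

Comment lines starting with "#" (such as the Sequence Data and Model Data lines) are not feature lines and should be skipped before calling this function.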

How does Prodigal assign the strand orientation?
When I look at the corresponding genomes from which the gene catalogue was generated, the strand orientation is "-" and not "+".
Should I update the output GFF file with the original strand information, or does Prodigal use some algorithm to infer the strand?
Thanks!
