All code used in the genome analysis can be found in code.txt.
Figshare: Shen, Fei (2023): Horseradish genome. figshare. Media. https://doi.org/10.6084/m9.figshare.21780176.v2
T2T/
T2T.pro.fa: protein file
T2T.tran.fa: transcript file
T2T.gene.gff: gene annotation GFF
T2T.intactTE.gff: intact TE GFF
T2T.TEanno.gff: all annotated TEs in the genome
T2T.ncRNA.gff: non-coding RNA GFF
T2T.pro.fa.anno: InterProScan annotation
The genome assembly file can be found on Figshare.
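As a quick sanity check on any of the GFF files listed above, the feature types can be tallied with a few lines of Python. This is only a minimal sketch: it assumes the files follow the standard nine-column, tab-separated GFF layout, and the function name count_feature_types is hypothetical (it is not part of code.txt).

```python
from collections import Counter


def count_feature_types(gff_lines):
    """Count feature types (column 3) in an iterable of GFF lines.

    Assumes tab-separated lines with nine columns; comment lines,
    blank lines, and lines with fewer columns are skipped.
    """
    counts = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


# Example usage (the path is taken from the file list above):
# with open("T2T/T2T.gene.gff") as handle:
#     for feature, n in count_feature_types(handle).most_common():
#         print(feature, n, sep="\t")
```

The function takes any iterable of lines, so it works the same on an open file handle or on a list of strings.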
Important result files from the LTR-RT analysis:
clustered_full_rt.fa: the sequence file of the clustered LTR-RTs
EDTA.pass.list: The result file from EDTA
full_ltr.ltr.acc: list of the LTR-RT accessions and their other names
full_ltr.ltr.fa: The full-length LTR-RT fasta file
Ar.cds.fa.gz: CDS file
Ar.protein.fa.gz: Protein file
Ar.gene.anno.gz: functional annotation
Ar.gene.gff.gz: Gene models
lg.1013.fa.gz: Genome Fasta
Ar.interproscan.tsv.gz: InterProScan output
All the data can be found on Figshare:
Shen, Fei (2022): horseradish genome. figshare. Dataset. https://doi.org/10.6084/m9.figshare.20003408.v4
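Once downloaded, the gzip-compressed FASTA files (for example lg.1013.fa.gz or Ar.cds.fa.gz) can be read directly without decompressing them first. The sketch below uses only the Python standard library and assumes the files are ordinary gzip-compressed FASTA; the helper names read_fasta and sequence_lengths are hypothetical and not part of code.txt.

```python
import gzip


def read_fasta(handle):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sequence_lengths(path):
    """Return {sequence ID: length} for a gzip-compressed FASTA file.

    The sequence ID is taken as the first whitespace-delimited word
    of each header line.
    """
    with gzip.open(path, "rt") as handle:
        return {
            (header.split() or [""])[0]: len(seq)
            for header, seq in read_fasta(handle)
        }


# Example usage (file name taken from the list above):
# for name, length in sequence_lengths("lg.1013.fa.gz").items():
#     print(name, length, sep="\t")
```

Each sequence is assembled in memory one record at a time, which should be adequate for a quick length summary; for heavier work a dedicated library may be more convenient.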
Please contact Fei Shen (shenf1028@gmail.com) if you have any questions.