Releases: PombertLab/SYNY
SYNY-v1.3.2
Bugfix release
list_maker.pl: Fixed an issue arising from LOCUS and ACCESSION tags being different in NCBI GBFF files, e.g.:
LOCUS JBLUPJ010000002
ACCESSION JBLUPJ010000002 JBLUPJ010000000
list_maker.pl now looks for the ACCESSION base (located to the right) and comprehensively reassigns the missing NCBI accessions.
gff3_to_gbff.pl: Fixed an issue where transcript entries in GBFF files were improperly reassigned as mRNA; this reassignment was intended only for AGAT GBFF files.
gff3_to_gbff.pl and fasta_to_gbff.pl: Added a fix for single contigs/chromosomes >= 1 Gbp.
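The LOCUS/ACCESSION handling in list_maker.pl noted above can be illustrated with a minimal Python sketch. This is not the Perl code from list_maker.pl; the function name parse_header and the rule "take the rightmost token on the ACCESSION line" are assumptions drawn only from the example in this release note.

```python
def parse_header(lines):
    """Return (locus, accession_base) from GBFF header lines.

    Assumption: the accession base is the rightmost token on the
    ACCESSION line, as in the example from the release note.
    """
    locus = None
    base = None
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "LOCUS":
            locus = fields[1]
        elif fields[0] == "ACCESSION":
            base = fields[-1]
    return locus, base


header = [
    "LOCUS JBLUPJ010000002",
    "ACCESSION JBLUPJ010000002 JBLUPJ010000000",
]
locus, base = parse_header(header)
print(locus, base)
```

If the ACCESSION line holds a single token, this sketch returns that token as the base.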
SYNY-v1.3.1
Bugfix release
paf_to_barplot.py, paf_to_dotplot.py, linear_maps.py: Added a try/except block to prevent (rare) multiprocessing pool crashes; if exceptions are detected, plots will be generated using a single-threaded for loop instead.
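The fallback pattern described above might look roughly like the sketch below. It is not the code from the SYNY plotting scripts; run_all, fake_plot, broken_pool and the pool_factory parameter are hypothetical names introduced here so the fallback path can be exercised without a real pool failure.

```python
from multiprocessing import Pool


def run_all(func, jobs, threads=2, pool_factory=Pool):
    """Run func over jobs in a pool; fall back to a serial loop on failure."""
    try:
        with pool_factory(processes=threads) as pool:
            return pool.map(func, jobs)
    except Exception as err:
        # Any exception from the pool triggers the single-threaded path.
        print(f"Pool failed ({err!r}); using a single-threaded loop instead")
        return [func(job) for job in jobs]


def broken_pool(processes):
    """Hypothetical stand-in that simulates a pool crash."""
    raise OSError("simulated pool crash")


def fake_plot(name):
    """Hypothetical placeholder for a plotting function."""
    return f"{name}.png"


if __name__ == "__main__":
    print(run_all(fake_plot, ["a", "b"], pool_factory=broken_pool))
```

Note that if the plotting function itself raises, the serial loop will raise again; the fallback only helps when the failure comes from the pool.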
SYNY-v1.3.0
Changes
- Changed versioning to PEP 440
run_syny.pl: Added the --min_asize command line option to filter out alignments/clusters smaller than an integer value (e.g. --min_asize 5000).
run_syny.pl, get_paf.pl, paf_metrics.py: Added the --no_sec command line option to filter out secondary alignments from minimap2 alignments.
gff3_to_gbff.pl:
- Annotations in the GFF3 files are now sorted by start positions => annots in GFF3 files can be out-of-order
- Fixed a bug that crashed the script when dealing with contigs without annotations.
- Added preliminary support for AGAT-converted files (tested on a few user-provided files).
- Added the --id command line option to label products by ID tags. Can be useful with GFF3 files that do not contain product descriptions.
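Sorting out-of-order GFF3 annotations by start position, as gff3_to_gbff.pl now does, could be sketched as follows. This is an illustration only, not the Perl implementation; the function name sort_gff3_by_start and the choice to sort by (seqid, start) are assumptions.

```python
def sort_gff3_by_start(lines):
    """Return GFF3 feature lines sorted by (seqid, start).

    Comment lines, blank lines and lines with fewer than 9 columns are skipped.
    The sort is stable, so features sharing a start keep their input order.
    """
    features = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue
        # GFF3 columns: seqid, source, type, start, end, score, strand, phase, attributes
        features.append((cols[0], int(cols[3]), line))
    features.sort(key=lambda f: (f[0], f[1]))
    return [f[2] for f in features]


records = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t500\t900\t.\t+\t.\tID=g2",
    "chr1\tsrc\tgene\t100\t400\t.\t+\t.\tID=g1",
]
for feature in sort_gff3_by_start(records):
    print(feature)
```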
jgi_to_ncbi_gff.pl: New script that converts JGI GFF files to a NCBI-like GFF3 format compatible with gff3_to_gbff.pl.
list_maker.pl: If locus_tags and GeneID tags are missing from GBFF files, now checks for gene tags and assigns automatic locus_tags.
get_synteny.pl: Fixed a bug with median calculations if # of values < 2.
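The effect of --min_asize and --no_sec on a PAF file can be approximated with the sketch below. It is not SYNY's code: the function name keep_alignment is hypothetical, and it assumes that alignment size means the PAF alignment block length (column 11) and that secondary alignments are those carrying minimap2's tp:A:S tag.

```python
def keep_alignment(paf_line, min_asize=0, no_sec=False):
    """Decide whether a PAF record passes the size and secondary filters."""
    cols = paf_line.rstrip("\n").split("\t")
    block_len = int(cols[10])  # column 11: alignment block length
    if block_len < min_asize:
        return False
    # Optional tags start at column 13; tp:A:S marks a secondary alignment.
    if no_sec and "tp:A:S" in cols[12:]:
        return False
    return True


# Made-up records for illustration only.
primary = "q1\t10000\t0\t6000\t+\tt1\t20000\t100\t6100\t5900\t6000\t60\ttp:A:P"
secondary = "q1\t10000\t0\t6000\t+\tt2\t20000\t100\t6100\t5900\t6000\t0\ttp:A:S"
short = "q1\t10000\t0\t3000\t+\tt1\t20000\t100\t3100\t2900\t3000\t60\ttp:A:P"

for record in (primary, secondary, short):
    print(keep_alignment(record, min_asize=5000, no_sec=True))
```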
SYNY-v1.2a
Changes
- Now generates VCF files from minimap2 genome alignments (min. alignment length = 1000 bp) automatically. VCF file creation can be turned off with the new --no_vcf flag in run_syny.pl (and/or in get_paf.pl). Note that these files can become quite large depending on the size of the genomes being compared.
- nucleotide_biases.pl now calculates GC and AT skews. Corresponding data files are located in the PLOTS/CIRCOS_DATA/ subdirectory.
- GC/AT skews are now plotted automatically with Circos. If desired, these subplots can be turned off independently with the --no_skews option, or together with all nucleotide biases subplots (with --no_ntbiases).
- Added a simple Fasta + GFF3 to GBFF converter (gff3_to_gbff.pl) in the Utils/ subdirectory. This tool was tested on NCBI GFF3 files and expects the GFF3 file(s) to include gene/mRNA/exon/CDS entries in the type column and the ID and Parent tags in the attributes column. It also expects the corresponding Fasta and GFF3 files to share the same prefixes (e.g. genome_1.fasta / genome_1.gff). The GBFF files thus created were designed to work with SYNY but do not adhere exactly to the GBFF format and may not work for other purposes.
- list_maker.pl / run_syny.pl: GenBank Flat file format extensions (gbk, gb, gbf) are now recognized/accepted
- check_mp_colors.py: removed obsolete references to pylab
- Added orient_fastas_to_reference.py to the Utils/ subdirectory. This script reorients contigs in FASTA file(s) based on BLASTN homology searches against a reference. This can be useful when working with newly assembled genomes.
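For readers unfamiliar with the GC and AT skews mentioned in this release, the sketch below uses the standard definitions, (G - C) / (G + C) and (A - T) / (A + T). It is not the code from nucleotide_biases.pl; the function name skews and the choice to return 0.0 when a denominator is zero are assumptions.

```python
def skews(seq):
    """Return (gc_skew, at_skew) for a nucleotide sequence.

    A zero denominator yields 0.0 instead of raising ZeroDivisionError
    (an assumption made for this sketch).
    """
    seq = seq.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    return gc_skew, at_skew


print(skews("GGGC"))
print(skews("NNNN"))
```

In practice such values are usually computed over sliding windows along each contig; window size and step are left out of this sketch.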
SYNY-v1.2
Bugfix release
- Fixed concatenation issue with isoforms in list_maker.pl
- Fixed subranges issues in list_maker.pl
- Adjusted linearmap alpha value and edge color for readability in linear_maps.py
- Slightly reduced memory usage with matplotlib
SYNY-v1.1b
Bugfix release
- Fixed extra length issues with barplots, dotplots and linemaps. Code was missing a line.strip(). Issue created visual artefacts on barplots (longer frames).
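One way a missing line.strip() can inflate lengths is shown in the sketch below: each unstripped line contributes its trailing newline to the total. Where exactly the strip was missing in SYNY is not stated in this note, so the FASTA-based scenario and the function name sequence_length are assumptions for illustration.

```python
def sequence_length(fasta_lines, strip=True):
    """Sum sequence line lengths, optionally stripping whitespace first."""
    total = 0
    for line in fasta_lines:
        if line.startswith(">"):
            continue
        if strip:
            line = line.strip()
        total += len(line)
    return total


lines = [">contig_1\n", "ACGT\n", "ACGT\n"]
print(sequence_length(lines, strip=False))  # newline characters are counted too
print(sequence_length(lines))
```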
Code cleanup:
- Added --version option for all scripts.
- Minor code cleanup / standardisation across scripts
SYNY-v1.1a
Changes
Additions
- Added the --include option to select contigs by name from text file(s); one name per line
- Added the --ranges option to select contig subranges from text file(s); name start end
- Added the --bpmode option to generate pairwise (pair) and/or concatenated (cat) barplots. Possible values are pair (default), cat, and all (for both).
- Added the --bclusters option to color clusters by alternating colors in the barplots. The colors are not related within or between contigs, they are just used to highlight collinear chunks.
- Created check_versions.pl to summarize script versions; this information can now be displayed with run_syny.pl --version.
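The "name start end" layout used by the --ranges text files could be read with something like the sketch below. It is not SYNY's parser; the function name parse_ranges and the assumption that fields are whitespace-separated are introduced here for illustration.

```python
def parse_ranges(lines):
    """Map each contig name to a list of (start, end) subranges.

    Assumes whitespace-separated fields; lines that do not have
    exactly three fields are ignored.
    """
    ranges = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue
        name, start, end = fields[0], int(fields[1]), int(fields[2])
        ranges.setdefault(name, []).append((start, end))
    return ranges


# Made-up contig names and coordinates.
example = ["chr1 1 5000\n", "chr1 9000 12000\n", "\n", "chr2 100 200\n"]
print(parse_ranges(example))
```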
Bugfixes
- list_maker.pl now grabs GeneID tags if locus tags are absent from GBFF annotation files.
- Fixed .txt file extension + added a file size check to paf_metrics.py. Now skips plotting if file is empty.
- Fixed div by zero issue in nucleotide_biases.pl.
- Added a check to detect if annotations parsed are blank. run_syny.pl no longer crashes if annotations are blank when running gene cluster inferences. If blank, it now skips this section automatically.
- Fixed perl env shebangs causing issues with conda
- Fixed wrong exit codes with readmes
Readme / logs
- Added section about memory usage with genome alignments
- Added mashmap barplot examples in the Encephalitozoon section
- Added changes.md summarizing changes between versions
- Improved syny.log file.
SYNY-v1.1
Additions:
- SYNY now generates linear maps (aka linemaps) from PAF files with linear_maps.py.
- Added support for MashMap3 genome alignments. Mashmap can be selected instead of minimap with --aligner mashmap. It runs in a smaller memory footprint than minimap (if using its default percentage identity of 85%). It does not produce exact alignments, however.
- Added the option to exclude contigs by name matching regular expression(s): e.g. --exclude '^AUX' '^CPGT'.
- Added an alternate SYNY installation method that does not require sudo privileges by leveraging conda packages.
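The effect of the --exclude patterns can be sketched in Python as below. SYNY itself applies Perl regular expressions; this example uses Python's re module as a stand-in, and the function name filter_contigs and the contig names are made up for illustration.

```python
import re


def filter_contigs(names, exclude_patterns):
    """Drop contig names matching any of the given regular expressions."""
    regexes = [re.compile(pattern) for pattern in exclude_patterns]
    return [n for n in names if not any(r.search(n) for r in regexes)]


names = ["AUX01", "CPGT_7", "chr1", "chr2_AUX"]
print(filter_contigs(names, ["^AUX", "^CPGT"]))
```

Because both patterns are anchored with ^, a name such as chr2_AUX is kept: only names that start with AUX or CPGT are removed.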
Fixes:
- Fixed the "The number of annotation files (2) does not equal the number of protein files (1)" error => rewrote the corresponding segment and removed the obsoleted subroutine.
- Fixed the unreliable $diamond_check in get_homology.pl (i.e. replaced which by command -v).
- Changed Perl dependency Roman => Text::Roman in nucleotide_biases.pl.
SYNY-v1.0b
Changes:
- run_syny.pl options can now be set from a configuration file (requires Getopt::ArgvFile); e.g. run_syny.pl @commands.conf
- Added the Getopt::ArgvFile dependency to setup_syny.pl => sudo cpanm Getopt::ArgvFile
- Added a minimum contig size option + set defaults to all contigs, i.e. (--minsize 1)
- Added a matplotlib color palette check before computations so that plots won't crash if the color palette entered does not exist
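The idea of reading options from an @file can be illustrated with a Python analog. This is not how run_syny.pl works internally (it relies on the Perl module Getopt::ArgvFile, whose file syntax may differ); the sketch uses argparse's fromfile_prefix_chars, which expects one argument per line, and the --threads default of 1 is an assumption.

```python
import argparse
import os
import tempfile


def build_parser():
    """Parser that expands @file arguments into the options listed in the file."""
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("--threads", type=int, default=1)  # default is an assumption
    parser.add_argument("--minsize", type=int, default=1)
    return parser


with tempfile.TemporaryDirectory() as tmp:
    conf = os.path.join(tmp, "commands.conf")
    with open(conf, "w") as handle:
        # argparse reads one argument per line from the file.
        handle.write("--threads\n8\n--minsize\n1000\n")
    args = build_parser().parse_args(["@" + conf])

print(args.threads, args.minsize)
```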
SYNY-v1.0a
Changes:
- Added --hfsize, --hmin, --hmax and --hauto options to heatmaps
- Added more options to the Circos --labels command line switch. Possible values are now: mixed, roman, arabic and names
- Added --pthreads option to set the limit of plotting instances to run in parallel (in case each plot eats up too much RAM); defaults to the value set by --threads if omitted.
- Added SVG output to paf_metrics.py
- Set fonts as editable in SVG output files
- Removed unnecessary border frames from barplots
- Fixed ambiguous heatmap titles
- Added an example script (Arabidopsis.sh) in Examples/ to download two Arabidopsis genomes (~ 100-150 Mbp each) for testing purposes