Releases: ksahlin/isONcorrect
v0.1.3.5
This release restructures the folders to work well with Python's new suggested way to build packages for PyPI (.toml files).
In essence:
- A build script pyproject.toml was added to the repo.
- A src/isoncorrect folder was created to replace the previous `modules` folder.
- The scripts `run_isoncorrect` and `isONcorrect` were placed in the src/isoncorrect folder and given `.py` file endings so that they behave as modules included in the `isoncorrect` library.
- The build instructions now produce the binaries `run_isoncorrect` and `isONcorrect` automatically from the `run_isoncorrect.py` and `isONcorrect.py` modules by specifying the entry point function `main()` in each file.
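As a rough illustration of the entry point pattern described above, the sketch below shows a module that exposes a `main()` function a packaging tool can wrap into a command-line binary. The program name `example_tool` and the `--name` option are hypothetical placeholders, not part of isONcorrect.

```python
import argparse


def main(argv=None):
    """Console-script style entry point that returns a process exit code.

    When argv is None, argparse reads the real command line (sys.argv[1:]).
    """
    # "example_tool" and "--name" are hypothetical and used for illustration only.
    parser = argparse.ArgumentParser(prog="example_tool")
    parser.add_argument("--name", default="world", help="hypothetical option")
    args = parser.parse_args(argv)
    print(f"hello, {args.name}")
    return 0
```

An installed wrapper script typically imports the module and calls `main()`, so the module itself does not need an `if __name__ == "__main__":` guard to work as a command.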
The new structure requires isONcorrect to be installed with a package manager (conda or pip).
For development (running from the downloaded GitHub source), one needs to temporarily change line 21 in isONcorrect.py from `from isoncorrect import create_augmented_reference, help_functions, correct_seqs` to `import create_augmented_reference, help_functions, correct_seqs`.
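The manual edit above switches between a package-qualified import and a plain top-level import. As an alternative that is not part of the release, a fallback import could cover both layouts without editing the file. The helper name `import_with_fallback` is hypothetical, and this is only a sketch of the idea.

```python
import importlib


def import_with_fallback(package, module):
    """Import `package.module` if available, else the top-level `module`.

    Hypothetical helper: the first form matches an installed package, the
    second matches running directly from a source checkout.
    """
    try:
        return importlib.import_module(f"{package}.{module}")
    except ImportError:
        return importlib.import_module(module)
```

With such a helper, something like `help_functions = import_with_fallback("isoncorrect", "help_functions")` would resolve in either setting, assuming the module is importable in one of the two forms.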
(The version number had to be increased by several increments after several unsuccessful attempts to get the new build to install properly.)
v0.1.0
This version adds the following over previous versions:
- An over-correction checker: the original read and the corrected read are aligned, and any structural over-corrections are removed. Such events should be rare. We never observed any such event with the previous defaults `--k 9 --w 10`, but rare occurrences happened with the new defaults `--k 9 --w 20` introduced in v0.0.8. This should be fixed now. This check adds negligible time (~1-2%) to the overall runtime.
- Better (sparser) minimizer sampling in poly-A/C/G/T regions with two new rules: (1) sample the last minimizer if there are ties, and (2) do not resample a minimizer if the last minimizer is still in the window. This greatly reduces repetitive anchors in poly-regions and improves runtime for instances where long poly regions are frequent.
- Related to point 2: an upper limit on how repetitive a paired-minimizer anchor can be in the data (at most 10x the number of reads). I have not observed such cases yet in ONT data, but I am setting this just in case, as it happened for some degenerate PacBio reads (for which isONcorrect does not typically need to be run anyway).
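The two minimizer sampling rules above can be sketched as follows. This is a minimal illustration and not isONcorrect's actual code: the function name `sample_minimizers`, the lexicographic k-mer ordering, and the definition of a window as `w` consecutive k-mers are all assumptions made for the example.

```python
def sample_minimizers(seq, k, w):
    """Return (position, kmer) pairs sampled from seq.

    Assumptions for this sketch: a window covers w consecutive k-mers and
    the minimizer is the lexicographically smallest k-mer in the window.
    """
    n_kmers = len(seq) - k + 1
    if k < 1 or w < 1 or n_kmers < w:
        return []
    kmers = [seq[i:i + k] for i in range(n_kmers)]

    sampled = []
    last_pos = -1  # position of the most recently sampled minimizer
    for start in range(n_kmers - w + 1):
        end = start + w  # exclusive end of the window
        smallest = min(kmers[start:end])
        # Rule 2: if the last sampled minimizer is still inside the window
        # and still the minimum, do not sample again.
        if last_pos >= start and kmers[last_pos] == smallest:
            continue
        # Rule 1: on ties, take the last (rightmost) occurrence.
        pos = max(i for i in range(start, end) if kmers[i] == smallest)
        sampled.append((pos, smallest))
        last_pos = pos
    return sampled
```

On a poly-A stretch such as `sample_minimizers("A" * 20, 3, 5)`, this sketch samples only positions 4, 9, and 14, that is, one anchor per window length instead of one per position.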