long-read-RNAseq

PIPELINE for identifying and quantifying known and novel genes/isoforms in long-read RNA-seq data

OVERVIEW

Nextflow pipeline for identifying and quantifying known and novel genes/isoforms in long-read RNA-seq data. It currently works with human data from Oxford Nanopore platforms; support for PacBio data and for mouse samples is planned.

alignment - Minimap2 maps Oxford Nanopore reads to the genome

cleaning - TranscriptClean corrects mismatches, microindels, and noncanonical splice junctions in long reads that have been mapped to the genome (this removes artifactual noncanonical splice junctions)

gene / isoform searching - TALON identifies and quantifies known and novel genes / isoforms in long-read transcriptome data sets
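
The order of the three stages above can be sketched as a chain of command lines, where each stage consumes the output of the previous one. This is a simplified illustration in Python, not the pipeline's actual code: the function name, the file names and the choice of flags are assumptions, and the real invocations in the .nf files may differ (for example, TALON also needs an initialized database, which is omitted here).

```python
def build_stage_commands(reads, genome_fasta, splice_junctions, variants_vcf,
                         talon_config, talon_db, build, prefix):
    """Return the three stage commands as argument lists, in run order.

    Hypothetical helper: it only builds the command lines, it runs nothing.
    """
    sam = f"{prefix}.sam"            # alignment output, input to cleaning
    clean_prefix = f"{prefix}_tc"    # prefix for TranscriptClean outputs
    return {
        # Spliced alignment of long reads; --MD adds the MD tag to the SAM.
        "alignment": ["minimap2", "-ax", "splice", "--MD",
                      "-o", sam, genome_fasta, reads],
        # Correction of mismatches, microindels and noncanonical junctions.
        "cleaning": ["python", "TranscriptClean.py",
                     "--sam", sam,
                     "--genome", genome_fasta,
                     "--spliceJns", splice_junctions,
                     "--variants", variants_vcf,
                     "--outprefix", clean_prefix],
        # Annotation and quantification; the TALON config file (assumed to
        # list the cleaned SAM files) is prepared separately.
        "talon": ["talon", "--f", talon_config, "--db", talon_db,
                  "--build", build, "--o", prefix],
    }


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    commands = build_stage_commands(
        reads="sample.fastq", genome_fasta="genome.fa",
        splice_junctions="junctions.tsv", variants_vcf="variants.vcf",
        talon_config="talon_config.csv", talon_db="talon.db",
        build="hg38", prefix="sample")
    for stage, argv in commands.items():
        print(stage, ":", " ".join(argv))
```

The arguments genome_fasta, splice_junctions and variants_vcf play the role of the genome_fasta, spl_jnk and known_variants_vcf variables in params.config.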

INPUTS

In file params.config:

Variables to be changed

samples_file - CSV file with one row per read file. Columns: row_id (row number), ln (sample ID; we use the LN as the ID) and pathway (full path to the read file). Header: row_id,ln,pathway. Example - /net/seq/data2/projects/amuravyova/nf-long-reads-align/FETAL/11_20_fetal_with_pathways.csv
outdir - directory where you want to put the results
description - description of the data (does not affect the analysis)
platform - platform that was used for generating the data (does not affect the analysis)
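
A samples_file in the layout described above can be written and sanity-checked with a short script before the pipeline is launched. The following is a minimal sketch in Python, assuming only the header row_id,ln,pathway and one row per read file; the function names, sample IDs and read file names are hypothetical and are not part of the pipeline.

```python
import csv
import os
import tempfile

HEADER = ["row_id", "ln", "pathway"]  # header expected in samples_file


def write_samples_file(csv_path, reads_by_sample):
    """Write one row per read file: row number, sample ID (LN), full path."""
    row_id = 0
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(HEADER)
        for ln, paths in reads_by_sample.items():
            for path in paths:
                row_id += 1
                writer.writerow([row_id, ln, os.path.abspath(path)])
    return row_id  # number of data rows written


def check_samples_file(csv_path, require_existing=True):
    """Return a list of problems found in the file (empty list if none)."""
    problems = []
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames != HEADER:
            return [f"unexpected header: {reader.fieldnames}"]
        for line_no, row in enumerate(reader, start=2):
            path = row.get("pathway") or ""
            if not os.path.isabs(path):
                problems.append(f"line {line_no}: path is not absolute: {path!r}")
            elif require_existing and not os.path.isfile(path):
                problems.append(f"line {line_no}: file not found: {path}")
    return problems


if __name__ == "__main__":
    # Placeholder sample IDs and read files, for illustration only.
    with tempfile.TemporaryDirectory() as tmp:
        reads = os.path.join(tmp, "reads_1.fastq")
        open(reads, "w").close()
        samples_csv = os.path.join(tmp, "samples.csv")
        write_samples_file(samples_csv, {"LN0001": [reads]})
        print(check_samples_file(samples_csv))
```

Running the check before launching the pipeline catches relative or mistyped paths early, instead of part-way through a run.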

Variables that could be changed (please don't touch them for now)

genome_fasta - fasta file containing the reference genome used in mapping
genome_gtf - gtf file containing the reference annotation
spl_jnk - high-confidence splice junction file. This file is necessary if you want to correct noncanonical splice junctions
known_variants_vcf - vcf file containing variants
conda

HOW TO RUN

create samples_file
set the required variable values (samples_file, outdir, description, platform) in file params.config

Caution

Save the file changes!

run in tmux:

module load nextflow/22.04.3 
nextflow run test_tuples.nf  -profile Altius -entry tuple

Important

Please check that nobody else is running the pipeline at the same time!

Results will be in the folder you set as outdir in file params.config

OUTPUTS

QC description
