CMSC423 (Bioinformatic Algorithms, Databases, and Tools) Project 1 (Variant Detection)

The goal of this project was to build a read aligner with the purpose of discovering variants in the Coronavirus genome.

The problem of variant finding
The problem of using a sequenced sample to determine genomic variants with respect to some reference genome is common. One of the big challenges includes potentially low sequencing coverage of the variant regions, which makes it difficult to distinguish a true variant from sequencing errors.

The task required a tool to align reads to the reference genome. Then, by looking at "pile ups" of the reads on the reference genome, it was easy to determine places where the sequenced sample differes from the reference data, as evidenced by the repeated variation in the reads that cover a certain position in the genome

The Data Provided
I was given the reference genome of a particular strain of the coronavirus (2019-nCoV/USA-lL 1/2020). It was provided as a single "contig".

The single-end sequencing reads were also provided as a gzip file to save space.

The Goal
To write a read alignment tool, based on the seed-and-extend or seed-and-vote paradigm using the FM-Index for efficient seed finding. After this the alignments of the sequencing reads was accomplished. This was done by taking the reads and turning them into a SAM format for easy processing by IGV (Integrative Genomics Viewer).

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
.settings		.settings
bin		bin
src		src
.classpath		.classpath
.project		.project
Project 1 Write Up.pdf		Project 1 Write Up.pdf
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CMSC423 (Bioinformatic Algorithms, Databases, and Tools) Project 1 (Variant Detection)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

VladDobrin/CMSC423Project1

Folders and files

Latest commit

History

Repository files navigation

CMSC423 (Bioinformatic Algorithms, Databases, and Tools) Project 1 (Variant Detection)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages