Bioinformatics toolbox

This project contains solutions to exercises for a course on bioinformatics algorithms at the Faculty of Mathematics and Physics at Charles University. It implements several basic methods from sequence and structural bioinformatics, such as parsing files and assessing the similarity of sequences and structures.

Installation

The project requires biopython, which can be installed using the following command:

pip install biopython

To run the project on your computer it is sufficient to clone the repository:

git clone https://github.com/a1eska/BioToolbox.git

Implemented classes and methods

The project implements wrapper classes around parsers for PDB files, FASTA files, and CLUSTAL files. It also implements functions for computing the Hamming distance and the edit distance of two strings. The corresponding source files are named accordingly.
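
As an illustration of the two string distances mentioned above, here is a minimal sketch in plain Python. It is not taken from hamming_distance.py or edit_distance.py; the function names are hypothetical, and the edit distance assumes unit costs for insertion, deletion, and substitution (Levenshtein distance).

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (an assumption of this sketch)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty prefix of b
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x from a
                current[j - 1] + 1,          # insert y into a
                previous[j - 1] + (x != y),  # keep a match or substitute
            ))
        previous = current
    return previous[-1]
```

Only two rows of the dynamic programming table are kept at a time, so memory use grows with the length of the second string rather than with the product of both lengths.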

Testing

The test files for the parsers are provided in the directory test_data. Every source file can be run with several arguments, which are described when the file is run with the argument --help. Several examples follow.

FASTA parser

The following command prints information about the 10th sequence in the FASTA file ls_orchid.fasta:

python fasta_parser.py test_data/ls_orchid.fasta -s 10

PDB parser

The following prints information about the structure in the PDB file 2mpj.pdb:

python pdb_parser.py test_data/2mpj.pdb -i

This command lists the atoms within distance 2 of atom number 10:

python pdb_parser.py test_data/2mpj.pdb -a 10 2
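
The idea behind such a neighbourhood query can be sketched in plain Python as follows. This is not the code of pdb_parser.py: the function name, the dictionary mapping atom serial numbers to coordinates, and the inclusive distance comparison are all assumptions made for the example.

```python
import math


def atoms_within(atoms, center_serial, radius):
    """Return serial numbers of atoms at most `radius` away from the given atom.

    `atoms` is a hypothetical mapping from serial number to (x, y, z) coordinates.
    """
    center = atoms[center_serial]
    return sorted(
        serial
        for serial, coord in atoms.items()
        # Skip the query atom itself and compare Euclidean distances.
        if serial != center_serial and math.dist(center, coord) <= radius
    )


# Toy coordinates, not taken from any real structure.
toy_atoms = {
    1: (0.0, 0.0, 0.0),
    2: (1.0, 0.0, 0.0),
    3: (0.0, 3.0, 0.0),
    4: (1.0, 1.0, 1.0),
}
neighbours = atoms_within(toy_atoms, 1, 2.0)
```

This linear scan is enough for a single query; for many queries on a large structure, a spatial index such as a k-d tree would usually be preferable.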

CLUSTAL parser

This prints the sum-of-pairs score of the multiple sequence alignment and the list of conservation-like scores of its positions:

python clustal_parser.py test_data/p53_mafft_clustal.txt
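
For readers unfamiliar with the sum-of-pairs score, a minimal sketch follows. The scoring scheme (match 1, mismatch -1, gap -2, two gaps 0) and the function names are assumptions chosen for illustration; the scoring actually used by clustal_parser.py may differ, for example by using a substitution matrix.

```python
from itertools import combinations


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score one pair of aligned symbols under an assumed simple scheme."""
    if x == "-" and y == "-":
        return 0  # two gaps carry no information
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def sum_of_pairs(alignment, **scores):
    """Sum the pair scores over every column and every pair of rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return sum(
        pair_score(x, y, **scores)
        for column in zip(*alignment)        # iterate over alignment columns
        for x, y in combinations(column, 2)  # every unordered pair of rows
    )


# Toy alignment, not taken from the test data.
toy_alignment = ["AC-T", "ACGT", "A-GT"]
toy_score = sum_of_pairs(toy_alignment)
```

In the toy alignment the two fully conserved columns contribute 3 each and the two columns containing a gap contribute -3 each, so the total is 0.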
