MirrorTree_v1.0
This is the PYT and SBI project developed by Elba Raimúndez, Clàudia Fontserè and Lucas Michel.
- What is mirrorTree 1.0
- Requirements before installing
- Installation
- Organization of the code
- How to execute mirrorTree 1.0
##What is mirrorTree 1.0
mirrorTree 1.0 is a program to predict protein-protein interactions using similarity between two protein families (Pazos and Valencia, 2001).
It can be started from two protein sequences in the same FASTA file (with .fa/.fasta extension) or from two multiple sequence alignments (two files with .aln extension, containing the query sequence).
The workflow of mirrorTree is the following:
- Input a fasta file with two proteins (fasta format).
- Find orthologs for each protein. This is done by connecting to BLAST (online) and searching against the Swiss-Prot/UniProt database.
- The parameters controlling the BLAST search are % identity (according to the BLAST alignment) and e-value. By default these parameters are set to ≥30% and ≤1e-5 and ≥60% respectively, and they can be modified by the user.
- Retain only the sequences from species that are present in both families. Only one sequence per organism is kept: the one with the highest homology to the query.
- The resulting sequences are aligned using ClustalW (Thompson JD, 1994) with default parameters. This step is performed locally. As a result of this process, we obtain a multiple sequence alignment of the orthologs for our two query sequences.
- (The user may start the execution here)
- Afterwards, distance matrices for both alignments are computed. Phylogenetic trees are obtained from these alignments with the neighbor-joining (NJ) algorithm implemented in ClustalW (Chenna, et al., 2003) using bootstrap (100 repetitions).
- Finally, the tree similarity between the two families is calculated as the correlation between their distance matrices, using Pearson correlation (Pazos and Valencia, 2001).
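The last step of the workflow can be illustrated with a minimal sketch. It assumes both distance matrices are square, symmetric, and list the same species in the same order. The function names (`upper_triangle`, `pearson`, `mirror_tree_score`) are hypothetical and are not taken from functions.py.

```python
import math


def upper_triangle(matrix):
    """Flatten the values above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def mirror_tree_score(dist_a, dist_b):
    """Correlate the pairwise distances of two families (same species order)."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Toy data for four species; dist_b is dist_a scaled by 2,
# so the two families are perfectly correlated up to rounding.
dist_a = [
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 0.0, 2.5, 3.5],
    [2.0, 2.5, 0.0, 1.5],
    [3.0, 3.5, 1.5, 0.0],
]
dist_b = [[2 * value for value in row] for row in dist_a]
score = mirror_tree_score(dist_a, dist_b)
```

Only the values above the diagonal are used, so each species pair is counted once and the zero self-distances do not inflate the correlation.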
##Requirements before installing
- Beware that this script is written in Python 3.4.2. Check your version before executing mirrorTree.
- Required python modules
- ClustalW locally installed is required to run this script. Remember to change the clustalw path to your local path in modules.py.
- Internet connection is needed to perform BLAST.
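Before installing, the first requirements can be checked with a short script. This is only a sketch: the variable name `CLUSTALW_PATH`, its value, and the minimum version `(3, 4)` are assumptions made for illustration. The actual ClustalW path is the one you set in modules.py.

```python
import shutil
import sys

# Hypothetical value: replace with the ClustalW path you configured in modules.py.
CLUSTALW_PATH = "clustalw2"


def check_python(minimum=(3, 4)):
    """Return True if the running interpreter is at least the given version."""
    return sys.version_info[:2] >= minimum


def find_clustalw(path=CLUSTALW_PATH):
    """Return the absolute path of the ClustalW executable, or None if not found."""
    return shutil.which(path)


if __name__ == "__main__":
    print("Python version OK:", check_python())
    print("ClustalW found at:", find_clustalw())
```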
##Installation
The instructions for the proper installation of mirrorTree 1.0 are the following:
- Remember! Make sure your ClustalW path is modified in modules.py.
- Root privileges are needed for installation.
- The following command should be called in the command line inside the directory /install/:
sudo python3 setup.py install
##Organization of the code
mirrorTree 1.0 package contains:
This program is split into different modules and scripts. Here you can find a reference of what each file contains.
#####mirrorTree
The main scripts are found in this directory.
- __init__.py: A python file required for a proper installation of the program.
- mirrorTree: The workflow of the program. In this script all functions needed are called. It also contains the argument parser.
- functions.py: All the core, helper functions and classes are found here. The documentation of each function is available in this file.
- modules.py: This file contains the ClustalW path and all the modules that the program needs to run. The other files import them from here.
#####setup.py
The script needed to install the package.
##How to execute mirrorTree
This script is executed from the command line as follows:
If starting from the two protein sequences (fasta format with extension .fa/.fasta):
mirrorTree -i input.fa
If starting from the two alignments (ClustalW alignment with extension .aln):
mirrorTree -i input1.aln input2.aln
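The two starting points differ only in the files passed to -i. A minimal sketch of how the mode could be told apart from the file extensions is shown below. The function `detect_input_mode` and its return values are hypothetical and may not match how the mirrorTree script decides.

```python
import os


def detect_input_mode(paths):
    """Classify the -i arguments as one FASTA file or two ClustalW alignments."""
    extensions = [os.path.splitext(path)[1].lower() for path in paths]
    if len(paths) == 1 and extensions[0] in (".fa", ".fasta"):
        return "fasta"
    if len(paths) == 2 and all(ext == ".aln" for ext in extensions):
        return "alignments"
    raise ValueError("expected one .fa/.fasta file or two .aln files")


mode_from_sequences = detect_input_mode(["input.fa"])
mode_from_alignments = detect_input_mode(["input1.aln", "input2.aln"])
```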
There are other parameters:
- -v: verbose. To see more detailed stderr.
- -s: save. To save tmp files.
- -o: output. To save data into a predefined output file. By default they are shown as stdout.
- -e: evalue. Change the e-value threshold of the BLAST filtering. By default it is set to 0.00001.
- -id: identity. Change the identity threshold of the BLAST filtering. By default it is set to 30.
If in doubt:
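To illustrate what the -e and -id thresholds control, here is a minimal sketch of the hit filtering, followed by the selection of one sequence per organism and of the species shared by both families. The `Hit` record and the function names are hypothetical. Using % identity as the measure of "highest homology" is an assumption, since the criterion used in functions.py is not stated here.

```python
from collections import namedtuple

# Hypothetical record for one BLAST hit; the field names are illustrative only.
Hit = namedtuple("Hit", "species accession identity evalue")


def filter_hits(hits, min_identity=30.0, max_evalue=1e-5):
    """Keep hits that pass both thresholds (defaults as documented for -id and -e)."""
    return [h for h in hits if h.identity >= min_identity and h.evalue <= max_evalue]


def best_hit_per_species(hits):
    """Keep one hit per species: the one with the highest identity (assumption)."""
    best = {}
    for hit in hits:
        current = best.get(hit.species)
        if current is None or hit.identity > current.identity:
            best[hit.species] = hit
    return best


def common_species(family_a, family_b):
    """Species that have a retained hit in both families, in sorted order."""
    return sorted(set(family_a) & set(family_b))


hits = [
    Hit("HUMAN", "P1", 95.0, 1e-50),
    Hit("HUMAN", "P2", 60.0, 1e-20),
    Hit("MOUSE", "P3", 25.0, 1e-30),  # fails the identity threshold
    Hit("YEAST", "P4", 40.0, 1e-3),   # fails the e-value threshold
]
kept = best_hit_per_species(filter_hits(hits))
```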
mirrorTree -h
This will show you how to execute the script.
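For reference, the options listed above could be declared with `argparse` roughly as follows. This is a sketch and not the parser shipped in the mirrorTree script: the `dest` names and help texts are assumptions, and only the flags and default values come from this README.

```python
import argparse


def build_parser():
    """Build a parser exposing the flags documented in this README."""
    parser = argparse.ArgumentParser(prog="mirrorTree")
    parser.add_argument("-i", dest="input", nargs="+", required=True,
                        help="one .fa/.fasta file or two .aln files")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="print more detailed messages to stderr")
    parser.add_argument("-s", dest="save", action="store_true",
                        help="keep temporary files")
    parser.add_argument("-o", dest="output", default=None,
                        help="output file (default: stdout)")
    parser.add_argument("-e", dest="evalue", type=float, default=1e-5,
                        help="e-value threshold for BLAST filtering")
    parser.add_argument("-id", dest="identity", type=float, default=30.0,
                        help="identity threshold (%%) for BLAST filtering")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["-i", "input1.aln", "input2.aln", "-id", "40", "-v"])
```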
2015 - Elba Raimúndez, Clàudia Fontserè and Lucas Michel