This repository was archived by the owner on Jul 15, 2022. It is now read-only.

MirrorTree_v1.0

Elba Raimúndez edited this page Mar 16, 2015 · 6 revisions

MIRRORTREE 1.0

This is the PYT and SBI project developed by Elba Raimúndez, Clàudia Fontserè and Lucas Michel.

##What is mirrorTree 1.0

mirrorTree 1.0 is a program to predict protein-protein interactions using similarity between two protein families (Pazos and Valencia, 2001).

It can be started from two protein sequences in the same fasta file (with .fa/.fasta extension) or from two multiple sequence alignments (two files with .aln extension, containing the query sequence).

The workflow of mirrorTree is the following:

  • Input a fasta file with two proteins (fasta format).
  • Find orthologs of each protein. This is done by connecting to BLAST (online) and searching against the swissprot/uniprot database.
    • The parameters controlling the BLAST search are % identity (according to the BLAST alignment) and e-value. By default these are set to ≥30% and ≤1e-5 respectively (a ≥60% threshold is also listed among the defaults), and the user can modify them.
    • Only sequences from species present in both families are retained, with a single sequence per organism: the one with the highest homology to the query.
  • The resulting sequences are aligned using ClustalW (Thompson JD, 1994) with default parameters. This step is performed locally. As a result of this process, we obtain a multiple sequence alignment of the orthologs for our two query sequences.
  • (The user may start the execution here)
  • Afterwards, distance matrices for both alignments are computed. Phylogenetic trees are obtained from these alignments with the neighbor-joining (NJ) algorithm implemented in ClustalW (Chenna, et al., 2003) using bootstrap (100 repetitions).
  • Finally, the tree similarity between the two families is calculated as the Pearson correlation between their distance matrices (Pazos and Valencia, 2001).
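The species filtering and the final correlation step of the workflow above can be sketched in plain Python. This is a minimal illustration, not the code in functions.py: the function names, the tuple layout of a BLAST hit, the nested-dictionary layout of a distance matrix, and the use of % identity as the measure of "highest homology" are all assumptions made for the example.

```python
import math


def best_hit_per_species(hits, min_identity=30.0, max_evalue=1e-5):
    """Keep one sequence per organism: the passing hit with the highest % identity.

    `hits` is an iterable of (organism, seq_id, identity, evalue) tuples.
    This layout is hypothetical, chosen only for the example.
    """
    best = {}
    for organism, seq_id, identity, evalue in hits:
        if identity < min_identity or evalue > max_evalue:
            continue  # fails the BLAST thresholds
        if organism not in best or identity > best[organism][1]:
            best[organism] = (seq_id, identity)
    return {organism: seq_id for organism, (seq_id, _) in best.items()}


def shared_species(family_a, family_b):
    """Species present in both families, in a fixed (sorted) order."""
    return sorted(set(family_a) & set(family_b))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def tree_similarity(dist_a, dist_b, species):
    """Correlate the pairwise distances of two families over the same species.

    dist_a[s][t] and dist_b[s][t] hold the distance between species s and t;
    each unordered pair is used once (upper triangle of the matrix).
    """
    xs, ys = [], []
    for i, s in enumerate(species):
        for t in species[i + 1:]:
            xs.append(dist_a[s][t])
            ys.append(dist_b[s][t])
    return pearson(xs, ys)
```

Only the pairs of species shared by both families enter the correlation, which is why the filtering step has to produce the same species set, in the same order, for both distance matrices.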

##Requirements before installing

  • Beware that this script is written in Python 3.4.2. Check your version before executing mirrorTree.
  • Required python modules
  • ClustalW locally installed is required to run this script. Remember to change the clustalw path to your local path in modules.py.
  • Internet connection is needed to perform BLAST.
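The first and third requirements can be checked before running the program with a short script like the one below. It is only a sketch: `CLUSTALW_PATH` is a hypothetical stand-in for whatever variable modules.py actually uses, and treating Python 3.4 as the minimum version is an assumption based on the version the script was written in.

```python
import shutil
import sys

# Hypothetical placeholder: the real variable name and value in modules.py may differ.
CLUSTALW_PATH = "clustalw"


def check_environment(clustalw_path=CLUSTALW_PATH, min_version=(3, 4)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Compare only (major, minor) against the assumed minimum version.
    if sys.version_info[:2] < tuple(min_version):
        problems.append("Python %d.%d or newer is required" % tuple(min_version))
    # shutil.which accepts either a command name on PATH or a full path to an executable.
    if shutil.which(clustalw_path) is None:
        problems.append("ClustalW executable not found: %s" % clustalw_path)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem, file=sys.stderr)
```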

##Installation

The instructions for the proper installation of mirrorTree 1.0 are the following:

  • Remember! Make sure your ClustalW path is modified in modules.py.
  • Root privileges are needed for installation.
  • Run the following command from the command line, inside the /install/ directory:

sudo python3 setup.py install

##Organization of the code

The program is split into different modules and scripts. The mirrorTree 1.0 package contains the files listed below, with a short reference of what each one holds.

#####mirrorTree

The main scripts are found in this directory.

  • __init__.py: A python file required for a proper installation of the program.
  • mirrorTree: The workflow of the program. In this script all functions needed are called. It also contains the argument parser.
  • functions.py: All the core and helper functions and classes are found here. The documentation of each function is available in this file.
  • modules.py: In this file you can find the ClustalW path and all the modules that the program needs to run. The other files import them from here.

#####setup.py

The script needed to install the package.

##How to execute mirrorTree

This script is executed from the command line as follows:

If starting from the two protein sequences (fasta format with extension .fa/.fasta):

mirrorTree -i input.fa

If starting from the two alignments (ClustalW alignment with extension .aln):

mirrorTree -i input1.aln input2.aln

There are other parameters:

  • -v: verbose. To see more detailed stderr.
  • -s: save. To save tmp files.
  • -o: output. To save data into a predefined output file. By default they are shown as stdout.
  • -e: evalue. Change the e-value threshold of the BLAST filtering. By default it is set to 0.00001.
  • -id: identity. Change the % identity threshold of the BLAST filtering. By default it is set to 30.
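For readers who want to see how such options fit together, here is a minimal `argparse` sketch that accepts the flags listed above with the documented defaults. It is not the parser shipped in the mirrorTree script: the function names, `dest` names, help texts, and the extension check are assumptions made for illustration.

```python
import argparse
import os


def build_parser():
    """Build a parser with the documented flags; dest names are illustrative."""
    parser = argparse.ArgumentParser(
        prog="mirrorTree",
        description="Predict protein-protein interactions from tree similarity.",
    )
    parser.add_argument("-i", dest="input", nargs="+", required=True, metavar="FILE",
                        help="one fasta file (.fa/.fasta) or two alignments (.aln)")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="print more detailed messages to stderr")
    parser.add_argument("-s", dest="save", action="store_true",
                        help="keep temporary files")
    parser.add_argument("-o", dest="output", default=None, metavar="FILE",
                        help="write results to FILE instead of stdout")
    parser.add_argument("-e", dest="evalue", type=float, default=1e-5,
                        help="e-value threshold for BLAST filtering")
    parser.add_argument("-id", dest="identity", type=float, default=30.0,
                        help="percent identity threshold for BLAST filtering")
    return parser


def input_mode(paths):
    """Decide where the workflow starts, based on the number and extension of inputs."""
    extensions = [os.path.splitext(path)[1].lower() for path in paths]
    if len(paths) == 1 and extensions[0] in (".fa", ".fasta"):
        return "fasta"
    if len(paths) == 2 and all(ext == ".aln" for ext in extensions):
        return "alignments"
    raise ValueError("expected one .fa/.fasta file or two .aln files")


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(input_mode(args.input), args.evalue, args.identity)
```

Using `nargs="+"` for `-i` lets the same flag take either the single fasta file or the two alignment files, and the mode is then inferred from the extensions.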

If in doubt:

mirrorTree -h

This will show you how to execute the script.
