Mugration-Analysis is a pipeline for migration analysis of viral sequences using TreeTime mugration. The workflow goes from a time-scaled phylogenetic tree and metadata to region-to-region migration matrices and R-based visualization (circos plots) for exploring patterns of viral spread.

gabalves1/Mugration-Analysis

Migration Analysis with TreeTime Mugration

This README describes a migration analysis pipeline built on the TreeTime mugration command. The required input files and the steps of the analysis are listed below.

Initial Files

  1. Time-scaled tree (timetree.nwk)
  2. Metadata file (states.tsv), which should contain the following columns:
  • name: name or identifier of the sequence, matching the tip names in the tree
  • region: region associated with the sequence

Example Metadata File Format

The states.tsv file should be formatted as follows:

name region
Sample1 Region1
Sample2 Region2
Sample3 Region3
...

Note: Values in the region column must not contain spaces, special characters, or accents.
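
Checking states.tsv before running mugration can catch formatting problems early. The sketch below is not part of this repository: the function name check_states is hypothetical, and reading "no spaces, special characters, or accents" as "ASCII letters, digits, and underscores only" is an assumption that may be stricter than the pipeline needs.

```python
import csv
import io
import re

# Assumption: a "safe" region value uses only ASCII letters, digits, underscores.
SAFE_REGION = re.compile(r"[A-Za-z0-9_]+")


def check_states(handle):
    """Return a list of problems found in a tab-separated states file."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = {"name", "region"} - set(reader.fieldnames or [])
    if missing:
        return [f"missing column(s): {', '.join(sorted(missing))}"]
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        region = row.get("region") or ""
        if not SAFE_REGION.fullmatch(region):
            problems.append(f"line {line_no}: unsafe region {region!r}")
    return problems


# Small in-memory example; the second data row has a space and an accent.
sample = "name\tregion\nSample1\tRegion1\nSample2\tSão Paulo\n"
for problem in check_states(io.StringIO(sample)):
    print(problem)
```

To check a real file, pass an open handle instead of the in-memory sample, for example check_states(open("states.tsv", newline="", encoding="utf-8")).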

Step by Step

1. Running Mugration

To perform the migration analysis, use the command below:

treetime mugration --tree timetree.nwk --states states.tsv --attribute region

2. Preparing the Scripts Folder

  1. After running the above command, go to the output folder generated by Mugration.
  2. Copy the annotated_tree.nexus file to a new folder containing the baltic.py and AncestralChanges.py scripts.

3. Counting Transitions Between Regions

  1. Before this step, the AncestralChanges.py script needs one manual adjustment.
  2. On line 12, enter the date of the most recent sequence in the tree.
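
This README does not say which date format line 12 expects, so check the script before editing it. If it turns out to take a decimal year (an assumption, though a common convention for time-scaled trees), a calendar date can be converted with a small helper such as the hypothetical decimal_year below.

```python
from datetime import date


def decimal_year(d):
    """Convert a calendar date to a fractional year (Jan 1 maps to year.0)."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    # Fraction of the year elapsed, which accounts for leap years.
    return d.year + (d - start).days / (end - start).days


# Example: a most recent sampling date of 2 July 2021 (made-up value).
print(round(decimal_year(date(2021, 7, 2)), 3))
```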

Once done, move on to the next step:

  1. In the newly created folder, run the following command in the terminal (make sure Python 3 is installed):
python3 AncestralChanges.py

This command generates annotated_tree_events.csv, a file with three columns: Year, Origin, and Destination.

4. Conversion to a Matrix

  1. The generated .csv file needs to be converted to a matrix that serves as input for the visualization in R.
  2. The matrix should follow the example in the "inputfiles" folder, with the regions arranged in both rows and columns. Each value is the total number of migration events for one pair of regions (from region X to region Y).
  3. To obtain these counts, open annotated_tree_events.csv and tally the events for each Origin/Destination pair.
  4. Suggestion: To make the plot easier to read, apply a log10 transformation to the values.
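
The counting in this step can also be scripted rather than done by hand. The following is a minimal sketch, not part of this repository, and it rests on several assumptions: the CSV is comma-separated with a header row naming Year, Origin, and Destination; rows of the matrix are origins and columns are destinations; the output is tab-separated with region labels in the first row and column; and log10(count + 1) is used so that pairs with zero events remain defined. Compare the result against the example in the "inputfiles" folder before passing it to circos.R. The function names are hypothetical.

```python
import csv
import io
import math


def migration_matrix(handle):
    """Count events per (Origin, Destination) pair from an events CSV."""
    counts = {}
    regions = set()
    for row in csv.DictReader(handle):
        pair = (row["Origin"], row["Destination"])
        counts[pair] = counts.get(pair, 0) + 1
        regions.update(pair)
    regions = sorted(regions)
    # Assumption: rows are origins, columns are destinations.
    matrix = [[counts.get((o, d), 0) for d in regions] for o in regions]
    return regions, matrix


def log10_scale(matrix):
    # Assumption: log10(count + 1), because log10(0) is undefined.
    return [[math.log10(value + 1) for value in row] for row in matrix]


def write_matrix(regions, matrix, handle):
    # Assumption: tab-separated, labels in the header row and first column.
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow([""] + regions)
    for region, row in zip(regions, matrix):
        writer.writerow([region] + row)


# Small in-memory example with made-up events.
sample = (
    "Year,Origin,Destination\n"
    "2019,North,South\n"
    "2020,North,South\n"
    "2020,South,East\n"
)
regions, matrix = migration_matrix(io.StringIO(sample))
out = io.StringIO()
write_matrix(regions, matrix, out)
print(out.getvalue())
```

For a real run, open annotated_tree_events.csv for reading and a file such as matriz.txt for writing in place of the in-memory handles, and call log10_scale on the matrix before writing if the transformed values are wanted.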

5. Plotting the Graph with RStudio

  1. Open RStudio and load the circos.R script.
  2. Use the matrix as input to generate the migration graph.

Summary of Steps

├── timetree.nwk
├── states.tsv
├── newfolder
│   ├── AncestralChanges.py
│   ├── baltic.py
│   ├── annotated_tree.nexus
│   └── annotated_tree_events.csv
└── rstudio_files
    ├── circos.R
    └── matriz.txt

Additional Notes

  • Prerequisites: Make sure TreeTime, Python 3, and the R packages required by circos.R are installed.
