POTTR is a tool to identify maximum recurrent trajectories in phylogenetic data.

License

Notifications You must be signed in to change notification settings

AlBi-HHU/POTTR

POTTR

POsets for Temporal Trajectory Resolution (POTTR) is a combinatorial method to identify maximum recurrent trajectories in biological processes described by sequences of events with a temporal ordering. POTTR models these processes, such as tumor formation or cell differentiation, as incomplete posets. A recurrent trajectory is a set of events, like mutations, that follow the same incomplete partial order across distinct incomplete posets.

We model conflicting orders between shared events in a conflict graph. POTTR identifies maximum independent sets in subgraphs of the conflict graph; each such set corresponds directly to a maximum recurrent trajectory shared by a subset of the input. Because the input data can be very heterogeneous, POTTR searches for recurrent trajectories that are shared by at least $k$ incomplete posets.
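
To make the conflict-graph idea concrete, here is a toy brute-force sketch on three tiny posets. It is not POTTR's implementation, which relies on Gurobi. The poset representation (a pair of an event set and a transitively closed set of ordered pairs), the rule that two shared events conflict when their relation differs between any two posets, and all function names are assumptions made for illustration. The sketch returns only the event set of a trajectory; the shared order can be read off any of the supporting posets.

```python
from itertools import combinations

def relation(order, u, v):
    """Relation between u and v in a transitively closed order."""
    if (u, v) in order:
        return "<"
    if (v, u) in order:
        return ">"
    return "||"  # incomparable

def conflict_graph(posets):
    """Vertices: events shared by all posets. Edges: pairs ordered inconsistently."""
    shared = set.intersection(*(set(events) for events, _ in posets))
    edges = set()
    for u, v in combinations(sorted(shared), 2):
        if len({relation(order, u, v) for _, order in posets}) > 1:
            edges.add((u, v))
    return shared, edges

def maximum_independent_set(vertices, edges):
    """Exhaustive search, largest candidates first (toy sizes only)."""
    for size in range(len(vertices), 0, -1):
        for candidate in combinations(sorted(vertices), size):
            chosen = set(candidate)
            if not any(u in chosen and v in chosen for u, v in edges):
                return chosen
    return set()

def best_trajectory(posets, k):
    """Largest conflict-free event set over all choices of k posets."""
    best = set()
    for subset in combinations(posets, k):
        events = maximum_independent_set(*conflict_graph(subset))
        if len(events) > len(best):
            best = events
    return best

# Hypothetical input: (events, transitively closed order) per poset.
p1 = ({"A", "B", "C", "D"}, {("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")})
p2 = ({"A", "B", "C", "D"},
      {("A", "B"), ("A", "C"), ("B", "C"), ("D", "A"), ("D", "B"), ("D", "C")})
p3 = ({"A", "B", "D"}, {("A", "B"), ("D", "A"), ("D", "B")})

print(best_trajectory([p1, p2, p3], k=2))
print(best_trajectory([p1, p2, p3], k=3))
```

Enumerating subsets of exactly $k$ posets suffices in this sketch, because any trajectory shared by more than $k$ posets is also shared by each of their $k$-subsets. The exhaustive search is exponential and only meant for illustration.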

Setup

Conda

The easiest way to get POTTR running is by using a conda distribution like miniconda.

After setting up conda (installation and initialization for your shell), create a new environment from the provided environment.yaml, which contains the required dependencies for POTTR.

In the Code directory execute:

conda env create -f environment.yaml
conda activate pottr_env

Gurobi

POTTR uses the Gurobi solver, for which a license is required. Further information can be found here.
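
Before running POTTR, it can help to check that Gurobi starts in the active environment. The sketch below assumes the `gurobipy` Python package is the interface installed by environment.yaml, which is not confirmed here; the helper name `gurobi_status` is hypothetical.

```python
def gurobi_status():
    """Return a short message describing whether Gurobi looks usable."""
    try:
        import gurobipy as gp
    except ImportError:
        return "gurobipy is not installed in this environment"
    try:
        # Starting an environment raises GurobiError if no valid license is found.
        env = gp.Env()
    except gp.GurobiError as err:
        return f"Gurobi license problem: {err}"
    env.dispose()
    return "Gurobi environment started successfully"

print(gurobi_status())
```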

Data

MASTRO comparison

We provide tumor trees of 89 NSCLC and 120 AML patients in the data/data_mastro directory, which we originally obtained from MASTRO [1].

TRACERx data

The phylogenetic trees from TRACERx are available on Zenodo. First download and unpack the zip file from Zenodo, then follow the instructions here to create the phylogenetic trees for use with POTTR.

Cell differentiation data

We provide cell differentiation maps generated with Carta [2] in the data/data_carta directory in Graph Exchange XML Format (GEXF), which POTTR can read directly.

Execution

In the Code directory, run POTTR via the ./run_POTTR.py Python script.

Program arguments

The following arguments are available:

| Argument | Long form | Description |
|---|---|---|
| -h | --help | Show help message and exit |
| -o | --output-path | Path to store output files |
| -d | --dags | File or directory containing transitively closed DAGs (incomplete posets) |
| -k | --k | Number k of incomplete posets in which to search for a common trajectory |
| -c | --cores | Number of cores / threads Gurobi should use; default 0, in which case Gurobi uses all available cores |
| -parallel | --parallelize | Enable parallel processing for creating the conflict graph |
| -pool | --solution-pool-size | Solution pool size for Gurobi to retrieve multiple solutions |
| -v | --verbose | Increase output verbosity |
| -dots | --draw_dots | Create trajectory png files (only recommended for small instances) |
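
Since --dags expects transitively closed DAGs, a small check can be useful before preparing your own input. The following is a minimal sketch on plain edge sets, assuming an acyclic graph given as (source, target) pairs; it is not part of POTTR, and the function names are hypothetical.

```python
def transitive_closure(edges):
    """Return all (u, v) pairs such that v is reachable from u in a DAG."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
        successors.setdefault(v, set())
    closure = set()
    for start in successors:
        stack = list(successors[start])
        seen = set()
        while stack:  # depth-first traversal from start
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(successors[node])
        closure.update((start, node) for node in seen)
    return closure

def is_transitively_closed(edges):
    """True if every implied order relation is already an explicit edge."""
    return set(edges) == transitive_closure(edges)

chain = {("A", "B"), ("B", "C")}
print(is_transitively_closed(chain))                      # (A, C) is missing
print(is_transitively_closed(transitive_closure(chain)))
```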

Example execution with test data

In the Code directory execute:

python run_POTTR.py --dags ../Data/test_data/ --output-path ../Data/output -k 3 -v
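
To try several values of k, the call above can be wrapped in a short script. This sketch uses only the arguments documented in the table; the helper names and the per-k output directory naming are assumptions. By default it only prints the commands, and `dry_run=False` executes them from the Code directory.

```python
import subprocess
import sys

def pottr_command(dags, output_path, k, verbose=False):
    """Build the run_POTTR.py command line from documented arguments."""
    cmd = [sys.executable, "run_POTTR.py",
           "--dags", str(dags),
           "--output-path", str(output_path),
           "-k", str(k)]
    if verbose:
        cmd.append("-v")
    return cmd

def run_for_ks(dags, output_root, ks, dry_run=True):
    """Build (and optionally run) one POTTR call per value of k."""
    commands = []
    for k in ks:
        # Hypothetical naming scheme: one output directory per k.
        cmd = pottr_command(dags, f"{output_root}_k{k}", k, verbose=True)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands

for cmd in run_for_ks("../Data/test_data/", "../Data/output", ks=[2, 3]):
    print(" ".join(cmd))
```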

Program output

output
├── converted_graphs.txt -- identified maximum recurrent trajectories and all input DAGs supporting each trajectory
├── number_of_distinct_dags_per_sample.csv -- overview of distinct DAGs per sample read from the input
├── processed_graphs
│   └── processed_graphs_support.csv
├── significance_output.txt -- statistical significance values of maximum recurrent trajectories
└── trajectories_gexf
    ├── 0_trajectory.gexf -- maximum recurrent trajectory identified by POTTR
    └── traj_graphs_names.csv -- matching between trajectories and supporting input DAGs
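
The trajectory files are GEXF, so they can be inspected with any GEXF reader. As a dependency-free sketch, the snippet below lists the edges of a GEXF graph using only the Python standard library. It assumes the usual GEXF layout of `node` elements with `id` and `label` attributes and `edge` elements with `source` and `target` attributes; whether POTTR stores event names in `label` is an assumption, and the sample document is made up.

```python
import io
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the XML namespace so any GEXF version is accepted."""
    return tag.rsplit("}", 1)[-1]

def read_gexf_edges(source):
    """Return (source_label, target_label) pairs from a GEXF file or file object."""
    root = ET.parse(source).getroot()
    labels, edges = {}, []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "node":
            labels[elem.get("id")] = elem.get("label", elem.get("id"))
        elif name == "edge":
            edges.append((elem.get("source"), elem.get("target")))
    # Resolve node ids to labels after the whole document has been read.
    return [(labels.get(s, s), labels.get(t, t)) for s, t in edges]

# Made-up two-event trajectory, standing in for e.g. 0_trajectory.gexf.
SAMPLE = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="directed">
    <nodes>
      <node id="0" label="event_A"/>
      <node id="1" label="event_B"/>
    </nodes>
    <edges>
      <edge id="0" source="0" target="1"/>
    </edges>
  </graph>
</gexf>"""

print(read_gexf_edges(io.StringIO(SAMPLE)))
```

Passing a path such as the trajectory file in the output directory instead of the `io.StringIO` object reads from disk.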

References

  1. Leonardo Pellegrina, Fabio Vandin, Discovering significant evolutionary trajectories in cancer phylogenies, Bioinformatics, Volume 38, Issue Supplement_2, September 2022, Pages ii49–ii55, https://doi.org/10.1093/bioinformatics/btac467
  2. Palash Sashittal, et al., Inferring cell differentiation maps from lineage tracing data, International Conference on Research in Computational Molecular Biology, Cham: Springer Nature Switzerland, 2025, https://doi.org/10.1007/978-3-031-90252-9_29
