Lionward/ProleTRact

This repository contains a Tandem Repeat Visualization Tool that serves as the companion tool to TandemTwister. The tool processes Variant Call Format (VCF) files generated by TandemTwister and visualizes tandem repeats in an intuitive, interactive format. Users can explore motifs, compare alleles to the reference sequence, and gain insights into the structure of tandem repeats, enhancing their ability to interpret genomic variation.

Why ProleTRact?

Tandem repeats (TRs) are complex: alleles can differ by motif composition, length, and interrupted blocks. ProleTRact visualizes TR regions with color-coded motifs, highlights interruptions, and provides intuitive navigation across regions and samples, enabling quick insight into potentially pathogenic expansions or atypical structures.

Key Features

  • Individual and Cohort modes: Analyze a single VCF or an entire directory of VCFs.
  • Dynamic sequence visualization: Color-coded motifs, clear interruption highlighting, and side-by-side allele comparison.
  • Pathogenic TR reference overlay: Built-in pathogenic_TRs.bed provides context for known loci (disease, gene, thresholds).
  • Fast navigation: Move across TR records with Previous/Next controls or jump to a specific region.

Installation

Requirements: Python 3.9, 3.10, 3.11, or 3.12 (Python 3.13+ may require building dependencies from source)

Install from PyPI:

pip install proletract
proletract --install-deps  # launches the web application

The launcher starts both the backend API server (port 8502) and frontend web server (port 3000). The application will open in your browser automatically. On headless machines, access the frontend at http://localhost:3000 after starting the application.
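On a headless or remote machine it can be handy to confirm that both servers are actually listening before opening the browser. The following is a minimal sketch using only the Python standard library; the helper name `port_is_open` is hypothetical and not part of ProleTRact, and the ports are the ones stated above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the README text: backend API 8502, frontend 3000.
    for name, port in (("backend API", 8502), ("frontend", 3000)):
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

If either port is reported as not reachable, the launcher may still be starting up, or another process may already be using that port.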

Note: If you encounter build errors (e.g., with Python 3.13+), ensure you're using Python 3.9–3.12, or install system dependencies: liblzma-dev (Ubuntu/Debian) or xz-devel (RHEL/CentOS/Fedora).
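To check quickly whether the active interpreter falls inside the supported range before installing, something like the sketch below may help. The function name `is_supported` is an illustrative assumption; the version bounds come from the requirements listed above.

```python
import sys

# Supported range as stated in the requirements: Python 3.9 through 3.12.
SUPPORTED_MIN = (3, 9)
SUPPORTED_MAX = (3, 12)


def is_supported(version=None) -> bool:
    """Return True if the (major, minor) version is within the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside 3.9-3.12; dependencies may need to be built from source.")
```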

Quickstart

  1. Launch the app with the command above: proletract
  2. Open the browser tab to http://localhost:3000 (the URL will be shown in the terminal if you're running headless).
  3. Load an individual VCF or cohort folder from the sidebar and start exploring tandem repeats.

Usage

Individual mode 👤

  1. Select individual sample in the sidebar.
  2. Provide the absolute path to a bgzipped and tabix-indexed VCF (.vcf.gz with .tbi):
    • Enter the path in the sidebar input, then click Load VCF.
    • The app will parse records and enable navigation across TR variants.
  3. Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
  4. Inspect motif blocks, interruptions, and per-allele differences.
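Before pasting a path into the sidebar, the input requirements from step 2 can be checked from a terminal. This is a rough sketch, not ProleTRact's own validation: `check_vcf_path` is a hypothetical helper, and it assumes the tabix index sits next to the VCF under the conventional name `<file>.vcf.gz.tbi`.

```python
from pathlib import Path


def check_vcf_path(path_str: str) -> list:
    """Return a list of problems found; an empty list means the path looks usable."""
    problems = []
    path = Path(path_str)
    if not path.is_absolute():
        problems.append("path is not absolute")
    if not path.name.endswith(".vcf.gz"):
        problems.append("file name does not end with .vcf.gz")
    if not path.is_file():
        problems.append("file does not exist")
    # Assumption: the index uses the usual tabix naming, <file>.vcf.gz.tbi.
    index = Path(str(path) + ".tbi")
    if not index.is_file():
        problems.append(f"tabix index not found: {index}")
    return problems


if __name__ == "__main__":
    import sys

    for arg in sys.argv[1:]:
        issues = check_vcf_path(arg)
        print(arg, "->", "OK" if not issues else "; ".join(issues))
```

If the index is reported missing, it can usually be created with `tabix -p vcf <file>.vcf.gz`.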

Cohort mode 👥👥

Reads-based VCF

  1. Select Cohort in the sidebar and choose Reads-based VCF view.
  2. Provide the absolute path to a directory containing TandemTwister VCF files.
  3. Click Load Cohort to scan the directory and enable cohort navigation.
  4. Browse records and compare across samples.
  5. Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
  6. Inspect motif blocks, interruptions, and per-allele differences.

Assembly VCF

  1. Select Cohort in the sidebar and choose Assembly VCF view.
  2. Provide the absolute path to a directory containing TandemTwister VCF files.
  3. Click Load Cohort to scan the directory and enable cohort navigation.
  4. Browse records and compare across samples.
  5. Use Previous/Next to step through records or jump to a region like chr1:1000-2000.
  6. Inspect motif blocks, interruptions, and per-allele differences.
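The jump-to-region input in all modes uses strings of the form chr1:1000-2000. The sketch below shows one way such a string could be split into its parts when scripting around the tool; `parse_region` and `Region` are illustrative names, and the exact rules ProleTRact applies to this field are not documented here.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# chrom:start-end, e.g. chr1:1000-2000
_REGION_RE = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")


def parse_region(text: str) -> Region:
    """Parse 'chrom:start-end' into a Region, raising ValueError on bad input."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a region of the form chrom:start-end: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is greater than end in region: {text!r}")
    return Region(chrom, start, end)


if __name__ == "__main__":
    print(parse_region("chr1:1000-2000"))
```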

Input Requirements

  • VCF format: Standard VCF generated by TandemTwister.
  • Cohort directory: A folder with multiple .vcf.gz files generated by TandemTwister is required for cohort mode.
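To preview which files a cohort directory would contribute, the folder can be listed ahead of time. This is only a sketch under the assumption that cohort files are recognized by the .vcf.gz suffix at the top level of the folder; `list_cohort_vcfs` is a hypothetical helper, and ProleTRact's own directory scan may behave differently.

```python
from pathlib import Path


def list_cohort_vcfs(directory: str) -> list:
    """Return the sorted .vcf.gz files found directly inside the directory."""
    folder = Path(directory)
    if not folder.is_dir():
        raise NotADirectoryError(directory)
    # Index files such as sample.vcf.gz.tbi are skipped by the suffix check.
    return sorted(
        p for p in folder.iterdir() if p.is_file() and p.name.endswith(".vcf.gz")
    )


if __name__ == "__main__":
    import sys

    for vcf in list_cohort_vcfs(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(vcf)
```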

Demo / Examples

Example screenshots and short walkthrough GIFs will be added here. For now, you can open example.svg for a preview:

Tandem Repeat Visualization Example

  • Planned: Individual-mode walkthrough
  • Planned: Cohort-mode walkthrough

Contributing

Contributions are welcome! Please open an issue to discuss changes.

License

This project is licensed under the BSD 3-Clause Non-Commercial License — see LICENSE for details. Commercial use is prohibited. This software is intended for academic research, educational purposes, and personal/private use only. For commercial licensing inquiries, please contact the author.

Citation

If you use ProleTRact in your work, please cite this repository. A formal citation entry will be added once available.
