Primera is a web application that orchestrates a bioinformatics pipeline for primer design. It combines a React frontend for parameter configuration, a Flask backend for job management, and a Nextflow pipeline that runs inside Docker containers for reproducible analysis.
The Primera GUI lets researchers upload PSL alignment files, configure Primer3 parameters, and filter results by genomic coordinates. Containerized bioinformatics tools keep runs reproducible and scalable.
Core Stack:
- Frontend: React.js (Vite)
- Backend: Python Flask
- Pipeline Orchestration: Nextflow
- Environment: Docker (musakrgzn/primera_test:v15)
The frontend manages user input, state validation, and result retrieval.
- Main Component (App.jsx):
  - State Management: Handles formData (PSL file, chromosomes, filter modes) and primerSettings (Tm, GC content, size).
  - Event Handling: Converts types (checkboxes to integers, strings to floats) before submission.
  - Submission: Merges configs and adds hardcoded constraints (e.g., PRIMER_PRODUCT_SIZE_RANGE: [[100, 300]]).
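The event-handling and submission steps come down to coercing form values and merging two dictionaries. The frontend does this in JavaScript inside App.jsx; the sketch below restates the same logic in Python purely for illustration. The helper names and the example fields (`use_filter`, `PRIMER_OPT_TM`) are assumptions, and the exact shape of the merged payload is not documented here; only the PRIMER_PRODUCT_SIZE_RANGE constraint is taken from the description above.

```python
from typing import Any

# Constraint described above as hardcoded at submission time.
HARDCODED_CONSTRAINTS = {"PRIMER_PRODUCT_SIZE_RANGE": [[100, 300]]}


def coerce_checkbox(checked: bool) -> int:
    """Map a checkbox state to 1 or 0."""
    return 1 if checked else 0


def coerce_number(raw: str) -> float:
    """Convert a text input to float; blank or invalid input raises ValueError."""
    return float(raw.strip())


def build_payload(form_data: dict[str, Any], primer_settings: dict[str, Any]) -> dict[str, Any]:
    """Merge both state objects and add the hardcoded constraints (assumed layout)."""
    settings = {**primer_settings, **HARDCODED_CONSTRAINTS}
    return {"formData": dict(form_data), "primer_settings": settings}


# Hypothetical field names, for illustration only.
payload = build_payload(
    {"chromosomes": ["chr1"], "use_filter": coerce_checkbox(True)},
    {"PRIMER_OPT_TM": coerce_number("60.0")},
)
```

Coercing before the merge keeps the submitted JSON free of string-typed numbers, so the backend does not have to guess types.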
Component Hierarchy:
App
├── Header
├── PipelineForm # Main pipeline config (Files, Chromosomes, Filter Modes)
├── PrimerSettingsForm # Primer3 specific parameters (Size, Tm, GC)
└── ResultsPanel # Conditional rendering of download links
Project Structure:
PROJECT_ROOT/
├── backend/ # Flask application
├── pipeline/ # Nextflow scripts (*.nf)
├── src/tools/ # Python helper scripts
├── data/ # Input files (PSL, 2bit)
└── jobs/ # Dynamic output directories per run
graph TD
A[Input: PSL & Config] --> B[FILTER_BLAT]
B -->|Filtered FASTA| C[RUN_PRIMER3]
C -->|Output .prim| D[MATCH_PRIMERS]
D -->|Matched Pairs| E[PREPARE_FOR_ISPCR]
E -->|Split into 16 chunks| F["RUN_ISPCR (Parallel)"]
F -->|BED Files| G[FILTER_SUCCESSFUL_PRIMERS]
G -->|TSV Data| H[WRITE_RESULTS]
H --> I[Final Output: TSV & BED]
* FILTER_BLAT: Extracts sequences from the 2bit genome based on the PSL alignment.
* RUN_PRIMER3: Designs primers in batches of 500 sequences.
* MATCH_PRIMERS: Pairs forward/reverse primers with probes.
* RUN_ISPCR: Validates primers against the genome (16 parallel instances).
* FILTER_SUCCESSFUL_PRIMERS: Filters amplicons based on exact/intersect/subset modes.
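The two splitting strategies mentioned above (fixed-size batches of 500 for Primer3, a fixed count of 16 chunks for isPcr) can be sketched as follows. This is an illustration under assumptions, not the pipeline's code: the function names are hypothetical, and whether the real pipeline distributes items round-robin or in contiguous blocks is not stated.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive batches of at most `size` items (the last may be shorter)."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def split_into_chunks(items: Sequence[T], n_chunks: int) -> list[list[T]]:
    """Distribute items round-robin into `n_chunks` lists of near-equal length."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be positive")
    chunks: list[list[T]] = [[] for _ in range(n_chunks)]
    for index, item in enumerate(items):
        chunks[index % n_chunks].append(item)
    return chunks
```

Fixed-size batches bound the memory of each Primer3 call, while a fixed chunk count matches a fixed number of parallel workers regardless of input size.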
User Workflow:
- Configuration: The user fills in PipelineForm and PrimerSettingsForm.
- Submission: handleSubmit() sends a POST request to /api/run.
- Processing: The backend creates a job ID, generates params.json, and starts Nextflow.
- Monitoring: The frontend waits for the response.
- Completion: On success, ResultsPanel appears with download links.
- Retrieval: Files are downloaded via GET /api/results/{runId}/{filename}.
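The processing step (create a job ID, write params.json, start Nextflow) could look roughly like the sketch below. It is not the project's backend code: the function name `prepare_job` and the script path `pipeline/main.nf` are assumptions. The Nextflow options used (`-params-file`, `-with-docker`, `-work-dir`) are standard `nextflow run` options.

```python
import json
import uuid
from pathlib import Path


def prepare_job(
    jobs_root: Path,
    params: dict,
    image: str = "musakrgzn/primera_test:v15",
) -> tuple[str, list[str]]:
    """Create a per-run directory, persist params.json, and build the Nextflow command."""
    run_id = str(uuid.uuid4())
    job_dir = jobs_root / run_id
    job_dir.mkdir(parents=True)

    # Keeping the parameters next to the outputs is what makes a run reproducible.
    params_file = job_dir / "params.json"
    params_file.write_text(json.dumps(params, indent=2))

    command = [
        "nextflow", "run", "pipeline/main.nf",  # hypothetical script path
        "-params-file", str(params_file),
        "-with-docker", image,
        "-work-dir", str(job_dir / "work"),
    ]
    return run_id, command
```

A caller would then pass `command` to something like `subprocess.run(command, capture_output=True, text=True)`. Since the frontend simply waits for the response, a blocking call of this kind is consistent with the workflow described, though the actual launch mechanism is not documented here.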
Data Transformations:
- PSL → FASTA (via filter.py)
- FASTA → PRIMER3_OUT (via runPrimer3.py)
- PRIMER3_OUT → MATCHED_PRIMERS (via matchPrimers.py)
- MATCHED_PRIMERS → AMPLICON_BED (via isPcr)
- AMPLICON_BED → RESULTS.TSV (via filterBED.py)
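The last transformation is where the exact/intersect/subset filter modes apply. The semantics of those modes are not spelled out in this document, so the sketch below encodes one plausible reading: an amplicon is compared against a target region on the same chromosome, using BED-style half-open coordinates. The `Interval` type and the `passes` function are hypothetical and are not taken from filterBED.py.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive


def passes(amplicon: Interval, target: Interval, mode: str) -> bool:
    """Return True if the amplicon satisfies the given filter mode (assumed semantics)."""
    if amplicon.chrom != target.chrom:
        return False
    if mode == "exact":
        # Same start and end as the target region.
        return amplicon.start == target.start and amplicon.end == target.end
    if mode == "intersect":
        # Any overlap of at least one base.
        return amplicon.start < target.end and target.start < amplicon.end
    if mode == "subset":
        # Amplicon lies entirely inside the target region.
        return target.start <= amplicon.start and amplicon.end <= target.end
    raise ValueError(f"unknown filter mode: {mode}")
```

Under this reading, every exact match also passes subset, and every subset match also passes intersect, so the modes run from strictest to most permissive.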
Key Features:
- Controlled Input: React forms utilize controlled components with strict type checking.
- Security: Path traversal prevention and file whitelisting in the API.
- Parallelization: The pipeline splits Primer3 and isPcr tasks into chunks/channels for speed.
- Containerization: All bioinformatics tools run in the musakrgzn/primera_test:v15 Docker image.
- Reproducibility: Every run preserves its own params.json and logs.
- Endpoint: POST /api/run
- Body: JSON object containing formData and primer_settings.
- Response (JSON):
  {
    "status": "success",
    "run_id": "uuid-string",
    "stdout": "Pipeline output log..."
  }
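A client for this endpoint can be sketched with the standard library alone. The snippet builds the POST request and parses a response of the shape shown above without sending anything over the network. The function names and the base URL are assumptions; the body keys and response fields follow the description above.

```python
import json
from urllib.request import Request


def build_run_request(base_url: str, form_data: dict, primer_settings: dict) -> Request:
    """Build (but do not send) the POST /api/run request."""
    body = json.dumps(
        {"formData": form_data, "primer_settings": primer_settings}
    ).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/api/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_run_response(raw: bytes) -> str:
    """Return the run_id of a successful response; raise RuntimeError otherwise."""
    payload = json.loads(raw)
    if payload.get("status") != "success":
        raise RuntimeError(f"pipeline run failed: {payload}")
    return payload["run_id"]


# Hypothetical base URL, for illustration only.
request = build_run_request("http://localhost:5000", {"chromosomes": ["chr1"]}, {})
```

Passing `request` to `urllib.request.urlopen` would perform the call; the returned `run_id` is what the results endpoint expects in its URL.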
- Endpoint: GET /api/results/<run_id>/<filename>
- Allowed Files: results.tsv, results.bed, report.html.
- Logic: Searches in jobs/{run_id}/primera_results_*/ or the job root.
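The lookup logic, together with the whitelisting and path traversal prevention it relies on, might be implemented along these lines. This is a sketch under assumptions rather than the project's handler: `resolve_result` is a hypothetical helper, and the search order (result subdirectories first, then the job root) is inferred from the description above.

```python
from pathlib import Path
from typing import Optional

ALLOWED_FILES = {"results.tsv", "results.bed", "report.html"}


def resolve_result(jobs_root: Path, run_id: str, filename: str) -> Optional[Path]:
    """Return the path of a downloadable result file, or None if it must not be served."""
    # Whitelist: anything outside the three known outputs is refused outright.
    if filename not in ALLOWED_FILES:
        return None

    root = jobs_root.resolve()
    job_dir = (root / run_id).resolve()
    # Traversal guard: the job directory must be a direct child of the jobs root,
    # which rejects run IDs such as "../other" or "a/b".
    if job_dir.parent != root or not job_dir.is_dir():
        return None

    # Look in primera_results_*/ first, then fall back to the job root.
    for directory in sorted(job_dir.glob("primera_results_*")) + [job_dir]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None
```

A route handler could return a 404 when this yields `None` and otherwise stream the file. Checking the filename against a fixed set means user input never selects an arbitrary path inside the job directory.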