Democratizing Access to Clinical Data for Retrospective Research
The CORR Cohort Builder is designed to bridge the gap between high-dimensional intensive care unit (ICU) data and clinical researchers.
- Define inclusion and exclusion criteria using a visual logic builder.
- Filter patients based on demographics, clinical events, or temporal logic (e.g., "Sepsis within 24h of admission").
- Seamlessly integrates pre-validated clinical concepts from the CORR-Vars library.
- Supports both Native (raw DB extraction) and Derived (calculated via Python/Polars) variables.
- Allows customization of time windows and aggregation methods (e.g., min, max, mean) directly in the UI.
- Real-time feedback on cohort size and attrition.
- Automated generation of a publication-ready "Table One" (baseline characteristics).
- Integrated data profiling reports (distributions, missingness, correlations).
- Offline Mode: Allows safe feasibility checks on an intranet server without direct database access.
- Online Mode: Enables authorized extraction of full datasets on secure research servers.
- Project-based access control (Owner, Editor, Read-Only).
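A temporal inclusion criterion such as "Sepsis within 24h of admission" might be represented internally along these lines. This is a hypothetical sketch: the class, field names, and patient record layout are illustrative and do not reflect the actual CORR-Vars schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class TemporalCriterion:
    """Illustrative stand-in for a visual-builder rule; not the real schema."""
    concept: str       # pre-validated clinical concept, e.g. "sepsis"
    window: timedelta  # time window relative to admission

    def matches(self, patient: dict) -> bool:
        """True if any event of this concept falls inside the window."""
        admission = patient["admission_time"]
        return any(
            ev["concept"] == self.concept
            and timedelta(0) <= ev["time"] - admission <= self.window
            for ev in patient["events"]
        )

# Toy data: patient 1 develops sepsis 12h after admission, patient 2 after 48h.
patients = [
    {"id": 1, "admission_time": datetime(2024, 1, 1, 8, 0),
     "events": [{"concept": "sepsis", "time": datetime(2024, 1, 1, 20, 0)}]},
    {"id": 2, "admission_time": datetime(2024, 1, 2, 9, 0),
     "events": [{"concept": "sepsis", "time": datetime(2024, 1, 4, 9, 0)}]},
]

criterion = TemporalCriterion(concept="sepsis", window=timedelta(hours=24))
cohort = [p["id"] for p in patients if criterion.matches(p)]
print(cohort)  # [1]
```

Applying the criterion over all patients is also what drives the real-time attrition feedback: each added rule shrinks the candidate list, and the remaining count can be reported immediately.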
The application is built on a scalable microservices architecture to separate user interaction from heavy data processing.
| Component | Technology | Description |
|---|---|---|
| Frontend | Streamlit | Reactive web interface for visual configuration. Handles state management and input validation. |
| Backend | FastAPI | Central REST API managing metadata persistence, authentication (JWT), and orchestration. |
| Executor | Python Worker | Async engine that runs heavy data queries. It translates visual definitions into executable CORR-Vars code. |
| Queue | Redis | Manages async job distribution between the Backend and Executor to prevent blocking the UI. |
| Database | SQLite | Stores project metadata, cohort definitions, and user settings. |
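Conceptually, the Backend enqueues extraction jobs and the Executor consumes them asynchronously, so a long-running query never blocks the UI. In the real stack this hand-off goes through Redis; the same decoupling pattern can be sketched in-process with Python's standard `queue.Queue` (job payload fields here are illustrative):

```python
import json
import queue
import threading

# In production this queue is Redis; queue.Queue shows the same pattern.
jobs: queue.Queue = queue.Queue()
results: dict = {}

def executor_worker() -> None:
    """Executor side: pull jobs off the queue and process them."""
    while True:
        raw = jobs.get()
        if raw is None:  # sentinel value signals shutdown
            break
        job = json.loads(raw)
        # ...here the worker would translate the visual definition into
        # executable CORR-Vars code and run the extraction...
        results[job["job_id"]] = f"cohort for project {job['project_id']}"
        jobs.task_done()

worker = threading.Thread(target=executor_worker, daemon=True)
worker.start()

# Backend side: enqueue a job and return to the caller immediately.
jobs.put(json.dumps({"job_id": "j1", "project_id": 42}))
jobs.join()      # block here only for demonstration purposes
jobs.put(None)   # shut the worker down
worker.join()
print(results["j1"])  # cohort for project 42
```

The JSON payload stands in for whatever serialized job description the Backend actually pushes to Redis; the point is that producer and consumer share only the queue, never a direct call.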
The codebase is organized by service, facilitating independent development and deployment.
- `streamlit/`: Frontend application. Contains the UI layout, plotting logic, and session state management.
- `fastapi/`: Backend API. Contains API routes, Pydantic models, and database repositories.
- `cohort_executor/`: Execution engine. Contains the logic for translating configurations to CORR-Vars objects and executing queries.
- `shared/`: Shared library. Common schemas, enums, and utilities used across all services.
- `docker-compose.yml`: Deployment config. Defines the multi-container setup for Docker.
```bash
git clone https://github.com/CUB-CORR/cohort-builder.git
cd cohort-builder
```

Create a `.env` file in the root directory. You can start by copying the example configuration:
```bash
cp .env.example .env
```

Critical environment variables:

- `SECRET_KEY`: Set a secure random string for JWT token generation.
- `DB_CONNECTION_STRING`: (Optional) Connection string for the metadata database. Defaults to a local SQLite file if not set.
- `CORR_DB_USER` / `CORR_DB_PASSWORD`: Credentials for the clinical source database (if running in extraction mode).
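A minimal `.env` might look like the following. All values are placeholders, and the exact connection-string format depends on your metadata database; this is a sketch, not a working configuration.

```ini
# Placeholder values only - replace before deploying.
SECRET_KEY=change-me-to-a-long-random-string
DB_CONNECTION_STRING=sqlite:///./metadata.db
CORR_DB_USER=research_user
CORR_DB_PASSWORD=change-me
```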
Use Docker Compose to build the images and start the services in detached mode:
```bash
docker compose up -d --build
```

Once the containers are running, the services will be available at:

- 🖥️ Frontend UI: http://localhost:5201
- ⚙️ Backend API: http://localhost:5200
- 📄 API Documentation: http://localhost:5200/docs
To view the logs and monitor the startup process:
```bash
docker compose logs -f
```

For production environments, use the production-specific compose file, which may include stricter restart policies and resource limits:

```bash
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```

To manage the application lifecycle in production, you can use the provided helper scripts:
- `./deploy-prod.sh`: Pulls changes, rebuilds images, and restarts the stack.
- `./backup.sh`: Creates backups of the metadata database.
Part of this README was generated with the help of AI tools. It may not be fully accurate or up-to-date.