Evalynx

Backend control plane for reproducible computational runs.

Evalynx is a backend-first platform for submitting, executing, tracking, and reproducing runs across external computational systems such as simulations and analytics pipelines. The project focuses on the service layer around computation rather than the computation engines themselves.

Why Evalynx Exists

Many technical projects already have strong computational cores but weak execution management around them. In practice, that usually means:

jobs are launched through ad-hoc scripts
execution metadata is incomplete or inconsistent
results are scattered across logs, files, and local folders
comparing runs requires manual reconstruction
failures are hard to diagnose and retries are poorly tracked

Evalynx addresses that gap with a backend control plane that can:

accept structured run requests through an API
persist run metadata in a relational model
execute jobs asynchronously
expose lifecycle states such as queued, running, succeeded, and failed
retain summaries, metrics, and artifact references
capture reproducibility metadata such as normalized config and execution provenance

What This Project Demonstrates

Evalynx is designed as a backend portfolio project, with emphasis on:

API design and request validation
relational data modeling and migrations
background job processing
execution lifecycle management
integration with external systems through explicit adapters
reproducibility-focused backend engineering

Tech Stack

FastAPI
SQLAlchemy + Alembic
PostgreSQL
Redis + RQ
Docker Compose
GitHub Actions

Backend Signals

The strongest backend signals in this MVP are:

explicit separation between a logical Run and concrete RunAttempt records
asynchronous execution through Redis + RQ with a dedicated worker process
subprocess-based runner adapters for bounded integration with external systems
structured persistence of summaries, metrics, artifacts, warnings, errors, and reproducibility metadata
retry semantics that preserve prior failure context instead of overwriting it

MVP Scope

The initial MVP is intentionally narrow and aims to prove one strong vertical slice:

FastAPI application
PostgreSQL persistence
Redis + RQ background execution
Project and Run lifecycle management
one real runner integration
stored summaries, metrics, artifacts, and failure information
retry support for failed runs

The first integrated real runner is solo-wargame-ai, chosen because it already has strong reproducibility surfaces and does not depend on private personal data.

Planned User Flow

The MVP is centered around one main workflow:

Create a project.
Submit a run with a runner type and config.
Persist the run as queued.
Execute the run asynchronously through a worker.
Store terminal state, summary data, metrics, and artifact references per execution attempt.
Inspect the latest run state and attempt history through the API.
Retry failed runs without erasing prior failure context.

Documentation Map

Run It Locally

There are two intended local paths:

Reviewer quickstart: Docker Compose plus the built-in stub runner. This is the fastest way to validate the asynchronous lifecycle with no external checkout.
Host development with the real runner: a local Python environment plus the same Dockerized PostgreSQL and Redis services, with a local checkout of solo-wargame-ai.

If you want the fastest path to understanding the project, start with the reviewer quickstart below. It proves the queue, worker, persistence, retry, and lifecycle story without needing any external repository.

Reviewer Quickstart

Requirements:

Docker Desktop or Docker Engine with Compose support

Setup:

Copy the example environment if you want a local .env: cp .env.example .env
Start infrastructure: docker compose up -d postgres redis
Apply migrations: docker compose run --rm migrate
Start the API and worker: docker compose up -d api worker
Confirm the API is healthy: curl http://localhost:8000/health

If you want a clean demo state, run docker compose down -v before repeating the example requests below. The sample IDs assume a fresh database; otherwise, use the IDs returned by your own API responses.

Create a project:

curl -X POST http://localhost:8000/projects \
  -H 'Content-Type: application/json' \
  -d '{"name":"Reviewer Demo"}'

Submit a successful async stub run:

curl -X POST http://localhost:8000/runs \
  -H 'Content-Type: application/json' \
  -d '{
    "project_id": 1,
    "runner_type": "stub",
    "config": {
      "scenario": "compose-demo-success"
    }
  }'

Submit a failing run that can be retried:

curl -X POST http://localhost:8000/runs \
  -H 'Content-Type: application/json' \
  -d '{
    "project_id": 1,
    "runner_type": "stub",
    "config": {
      "should_fail": true,
      "failure_message": "Reviewer demo failure"
    }
  }'

Inspect the lifecycle and retry the failed run:

curl http://localhost:8000/runs/1
curl http://localhost:8000/runs/2
curl -X POST http://localhost:8000/runs/2/retry
curl http://localhost:8000/runs/2
docker compose logs -f worker

Expected reviewer outcomes:

the health endpoint returns status: "ok"
the first stub run reaches succeeded and persists a summary
the second stub run reaches failed with a stored failure message
retrying the failed run increments attempt_count and current_attempt_number

Artifacts written by the stack are stored under ./artifacts.

Shut the stack down with:

docker compose down

If you also want to remove the PostgreSQL volume:

docker compose down -v

Host Development With The Real Runner

Requirements:

Python 3.11+
Docker Desktop or Docker Engine with Compose support
a local checkout of solo-wargame-ai when you want the real runner path

Setup:

Copy the environment template: cp .env.example .env
Start PostgreSQL and Redis: docker compose up -d postgres redis
Create a virtual environment: python -m venv .venv
Activate it: source .venv/bin/activate
Install dependencies: pip install -e '.[dev]'
Export or edit the required env vars:
- EVALYNX_DATABASE_URL=postgresql+psycopg://evalynx:evalynx@localhost:5432/evalynx
- EVALYNX_REDIS_URL=redis://localhost:6379/0
- EVALYNX_SOLO_WARGAME_REPO_PATH=/absolute/path/to/solo-wargame-ai
- EVALYNX_SOLO_WARGAME_PYTHON_COMMAND=/absolute/path/to/solo-wargame-ai/.venv/bin/python
- EVALYNX_ARTIFACT_ROOT=./artifacts
Apply migrations: alembic upgrade head
Start the API: uvicorn app.main:app --reload
Start the worker in a second shell: python -m app.workers.entrypoint
Run tests: python -m pytest

For the first real runner family, POST /runs accepts runner_type: "solo_wargame" with a logical config shaped like:

{
  "project_id": 1,
  "runner_type": "solo_wargame",
  "config": {
    "mission_path": "configs/missions/mission_01_secure_the_woods_1.toml",
    "policy": {
      "kind": "builtin",
      "name": "heuristic"
    },
    "seed_spec": {
      "kind": "range",
      "start": 0,
      "stop": 4
    },
    "write_episode_rows": true
  }
}

Evalynx materializes the upstream request file, allocates the artifact directory, invokes the external CLI through a subprocess adapter, and persists summary, metrics, execution metadata, artifacts, warnings, and structured error details back onto the run record and its current attempt snapshot.

Failed runs can be retried through POST /runs/{id}/retry. Run detail responses keep the latest run snapshot easy to inspect while also exposing bounded attempt history, and solo_wargame artifacts are now written into per-attempt directories under each run.

Current Status

Evalynx is at a finished MVP milestone.

A reviewer can now:

run the Docker Compose stack locally
create a project and submit runs through the API
observe asynchronous execution through Redis + RQ worker processing
inspect persisted summaries, metrics, artifacts, and attempt history
retry failed runs without overwriting prior failure context
exercise the real solo_wargame integration from a host-based app and worker when the external repository is available

The public docs, Compose workflow, and CI checks are aligned around that reviewer-facing backend story.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
.github/workflows		.github/workflows
alembic		alembic
app		app
docs		docs
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
ROADMAP.md		ROADMAP.md
alembic.ini		alembic.ini
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Evalynx

Why Evalynx Exists

What This Project Demonstrates

Tech Stack

Backend Signals

MVP Scope

Planned User Flow

Documentation Map

Run It Locally

Reviewer Quickstart

Host Development With The Real Runner

Current Status

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Evalynx

Why Evalynx Exists

What This Project Demonstrates

Tech Stack

Backend Signals

MVP Scope

Planned User Flow

Documentation Map

Run It Locally

Reviewer Quickstart

Host Development With The Real Runner

Current Status

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages