Hypothesis Lab — Automated What-If Analysis Engine

An autonomous research tool that generates hypotheses from trading data, replays historical trades with modifications, and validates results with statistical rigor.

Built for: Answering "what if we changed X?" with data, not opinion.

How It Works

Trade Data (135+ closed trades)
        │
        ▼
┌─────────────────────┐
│  Hypothesis Engine   │
│                     │
│  Generates what-if  │
│  scenarios:         │
│  • Remove a setup   │
│  • Filter by symbol │
│  • Change thresholds│
│  • Conditional rules│
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│   Trade Replay      │
│                     │
│  Replays all trades │
│  with modification  │
│  applied. Computes  │
│  new P&L, WR, Sharpe│
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│  Statistical Tests  │
│                     │
│  • Permutation test │
│    (p-value)        │
│  • Bootstrap CI     │
│    (95% confidence) │
│  • Monte Carlo sim  │
│    (ruin probability│
│     + wealth dist)  │
└─────────┬───────────┘
          │
          ▼
   Validated / Rejected
   with confidence scores
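
The first two stages of the pipeline can be sketched in a few lines. This is a minimal illustration, not the project's code: the `Trade` fields, the function names, and the two hypothesis families shown (remove a setup, keep a single symbol) are assumptions made for the example, and the Sharpe figure is a simple per-trade ratio that is not annualized. It uses only the standard library so it runs anywhere; the same logic maps naturally onto pandas.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Callable


@dataclass(frozen=True)
class Trade:
    symbol: str
    setup: str
    pnl: float  # realized profit or loss in dollars


# A modification decides whether a trade survives the what-if scenario.
Modification = Callable[[Trade], bool]


def summarize(trades: list[Trade]) -> dict[str, float]:
    """Total P&L, win rate, and a per-trade (non-annualized) Sharpe ratio."""
    pnls = [t.pnl for t in trades]
    if not pnls:
        return {"total_pnl": 0.0, "win_rate": 0.0, "sharpe": 0.0}
    spread = stdev(pnls) if len(pnls) > 1 else 0.0
    return {
        "total_pnl": sum(pnls),
        "win_rate": sum(p > 0 for p in pnls) / len(pnls),
        "sharpe": mean(pnls) / spread if spread > 0 else 0.0,
    }


def generate_hypotheses(trades: list[Trade]) -> dict[str, Modification]:
    """One 'remove setup' and one 'only symbol' scenario per value in the data."""
    hypotheses: dict[str, Modification] = {}
    for setup in sorted({t.setup for t in trades}):
        # Bind the loop value as a default argument so each lambda keeps its own.
        hypotheses[f"Remove setup {setup}"] = lambda t, s=setup: t.setup != s
    for symbol in sorted({t.symbol for t in trades}):
        hypotheses[f"Only {symbol}"] = lambda t, s=symbol: t.symbol == s
    return hypotheses


def replay(trades: list[Trade], keep: Modification) -> dict[str, float]:
    """Re-run the history with the modification applied and report the change."""
    baseline = summarize(trades)
    modified = summarize([t for t in trades if keep(t)])
    modified["improvement"] = modified["total_pnl"] - baseline["total_pnl"]
    return modified


if __name__ == "__main__":
    history = [
        Trade("BTC", "A", 100.0),
        Trade("BTC", "B", -50.0),
        Trade("ETH", "A", 30.0),
    ]
    for label, keep in generate_hypotheses(history).items():
        print(label, replay(history, keep))
```

Modelling each scenario as a keep/drop predicate keeps the replay step generic: any new hypothesis family only has to produce another predicate.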

What Makes This Different

Most backtesting tools test the ONE hypothesis you come up with. This tool generates hypotheses automatically from your data, tests all of them, and ranks them by statistical significance.

Example output:

"Remove Setup X" → +$840 improvement, Monte Carlo confidence 87.7%
"Crypto only" → +$797 improvement, Monte Carlo confidence 85.2%
"Add condition Y" → -$120, rejected (p=0.43)

Statistical Validation

Every hypothesis is validated three ways:

Method	What It Measures
Permutation test	P-value: is the improvement statistically significant, or could it be random?
Bootstrap CI	95% confidence interval: what's the realistic range of improvement?
Monte Carlo	10,000 portfolio simulations: what's the ruin probability and wealth distribution?

A hypothesis must pass ALL THREE to be considered "validated."
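
As an illustration of the first check, here is one way a permutation test could be set up for a "remove these trades" hypothesis. The README does not say which statistic is permuted, so this sketch makes an assumption: the null hypothesis is that the removed trades are no different from a random set of the same size, and the statistic is the gain in total P&L from removing them. The function name and arguments are hypothetical.

```python
import random


def permutation_p_value(
    pnls: list[float],
    removed: list[bool],
    iterations: int = 1000,
    seed: int = 0,
) -> float:
    """One-sided p-value for the P&L gained by removing the flagged trades.

    pnls[i] is the P&L of trade i; removed[i] is True if the hypothesis
    drops that trade. Removing a trade changes total P&L by -pnl.
    """
    rng = random.Random(seed)
    k = sum(removed)  # number of trades the hypothesis removes
    observed = -sum(p for p, r in zip(pnls, removed) if r)

    hits = 0
    for _ in range(iterations):
        # Remove k trades chosen at random instead of the flagged ones.
        shuffled_gain = -sum(rng.sample(pnls, k))
        if shuffled_gain >= observed:
            hits += 1

    # The +1 terms count the observed arrangement itself, so p is never 0.
    return (hits + 1) / (iterations + 1)


if __name__ == "__main__":
    pnls = [-100.0] * 10 + [10.0] * 90
    removed = [True] * 10 + [False] * 90
    print(permutation_p_value(pnls, removed))
```

A small p-value here means a random removal of the same size rarely does as well as the proposed one, which is the sense of "statistically significant" used in the table above.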

Features

Auto-generates hypotheses from trade performance data
Replays historical trades with modifications applied
Permutation testing with configurable iterations (default: 1,000)
Bootstrap confidence intervals (default: 95%)
Monte Carlo portfolio simulation (default: 10,000 runs)
Ranked output by statistical significance
Runs autonomously on schedule (weekly via launchd)
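
The bootstrap interval listed above can be sketched as a percentile bootstrap over whole trades. This is an assumption about the method, not a description of the project's implementation: the function name, the resample count of 2,000, and the choice of statistic (P&L gained by dropping the flagged trades) are all illustrative.

```python
import random


def bootstrap_ci(
    pnls: list[float],
    removed: list[bool],
    confidence: float = 0.95,
    resamples: int = 2000,
    seed: int = 0,
) -> tuple[float, float]:
    """Percentile bootstrap interval for the improvement in total P&L."""
    rng = random.Random(seed)
    pairs = list(zip(pnls, removed))
    n = len(pairs)

    gains = []
    for _ in range(resamples):
        # Resample trades with replacement, keeping each P&L with its flag.
        sample = rng.choices(pairs, k=n)
        gains.append(-sum(p for p, r in sample if r))
    gains.sort()

    # Read the interval bounds off the sorted resampled statistics.
    alpha = (1.0 - confidence) / 2.0
    lower = gains[int(alpha * (resamples - 1))]
    upper = gains[int((1.0 - alpha) * (resamples - 1))]
    return lower, upper


if __name__ == "__main__":
    pnls = [-100.0] * 10 + [10.0] * 90
    removed = [True] * 10 + [False] * 90
    print(bootstrap_ci(pnls, removed))
```

An interval that stays above zero suggests the improvement does not depend on a handful of trades; an interval that straddles zero is a reason to treat the headline number with caution.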

Tech Stack

Python 3.11, NumPy, pandas, SciPy (statistics)

Use Cases

Post-soak analysis: which setups to keep, modify, or remove
Parameter sensitivity: how robust is each threshold
Strategy optimization: data-driven improvements, not gut feel
Risk assessment: Monte Carlo ruin probability before going live

Part of an autonomous trading system with 54 services and 3,778 tests. Full system details at portfolio.
