glitchymagic/hypothesis-lab

Hypothesis Lab — Automated What-If Analysis Engine

An autonomous research tool that generates hypotheses from trading data, replays historical trades with modifications, and validates results with statistical rigor.

Built for: Answering "what if we changed X?" with data, not opinion.

How It Works

Trade Data (135+ closed trades)
        │
        ▼
┌─────────────────────┐
│  Hypothesis Engine   │
│                     │
│  Generates what-if  │
│  scenarios:         │
│  • Remove a setup   │
│  • Filter by symbol │
│  • Change thresholds│
│  • Conditional rules│
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│   Trade Replay      │
│                     │
│  Replays all trades │
│  with modification  │
│  applied. Computes  │
│  new P&L, WR, Sharpe│
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│  Statistical Tests  │
│                     │
│  • Permutation test │
│    (p-value)        │
│  • Bootstrap CI     │
│    (95% confidence) │
│  • Monte Carlo sim  │
│    (ruin probability│
│     + wealth dist)  │
└─────────┬───────────┘
          │
          ▼
   Validated / Rejected
   with confidence scores
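
The replay stage in the diagram can be sketched as below. This is a minimal illustration, not the project's code: the `Trade` fields, `summarize`, `replay`, and the sample trades are all hypothetical, and it uses only the standard library rather than pandas so that it runs on its own. A modification is modeled as a predicate that decides which historical trades the changed rules would still have taken.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Callable


@dataclass(frozen=True)
class Trade:
    # Hypothetical schema; the real trade records may carry more fields.
    symbol: str
    setup: str
    pnl: float  # realized profit or loss in dollars


def summarize(trades: list[Trade]) -> dict[str, float]:
    """Total P&L, win rate, and a per-trade (unannualized) Sharpe ratio."""
    pnls = [t.pnl for t in trades]
    if not pnls:
        return {"pnl": 0.0, "win_rate": 0.0, "sharpe": 0.0}
    spread = stdev(pnls) if len(pnls) > 1 else 0.0
    return {
        "pnl": sum(pnls),
        "win_rate": sum(p > 0 for p in pnls) / len(pnls),
        "sharpe": mean(pnls) / spread if spread > 0 else 0.0,
    }


def replay(trades: list[Trade], keep: Callable[[Trade], bool]) -> dict[str, float]:
    """Replay the history, keeping only trades the modified rules would take."""
    return summarize([t for t in trades if keep(t)])


# Made-up sample data, for illustration only.
trades = [
    Trade("BTC", "breakout", 120.0),
    Trade("ETH", "breakout", -40.0),
    Trade("AAPL", "fade", -90.0),
    Trade("BTC", "fade", -30.0),
    Trade("ETH", "trend", 60.0),
]

baseline = summarize(trades)
no_fade = replay(trades, lambda t: t.setup != "fade")
improvement = no_fade["pnl"] - baseline["pnl"]
print(f"Remove setup 'fade': {improvement:+.2f}")
```

Modeling each scenario as a predicate keeps the replay loop identical for "remove a setup", "filter by symbol", and threshold changes; only the predicate differs.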

What Makes This Different

Most backtesting tools test a single hypothesis that you supply. This tool generates hypotheses automatically from your data, tests all of them, and ranks them by statistical significance.

Example output:

  • "Remove Setup X" → +$840 improvement, Monte Carlo confidence 87.7%
  • "Crypto only" → +$797 improvement, Monte Carlo confidence 85.2%
  • "Add condition Y" → -$120, rejected (p=0.43)
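
One way to picture the automatic generation step is to enumerate candidate scenarios directly from the values present in the data. The sketch below is an assumption about how that might look, not the tool's actual generator: `generate_hypotheses`, `rank_by_improvement`, the dictionary keys, and the sample trades are hypothetical. For brevity it ranks by raw P&L change only; the tool itself ranks by statistical significance.

```python
def generate_hypotheses(trades):
    """Yield (label, keep_predicate) pairs derived from the data itself."""
    # One "remove this setup" scenario per setup seen in the history.
    for setup in sorted({t["setup"] for t in trades}):
        yield f"Remove setup {setup}", (lambda t, s=setup: t["setup"] != s)
    # One "trade only this symbol" scenario per symbol seen in the history.
    for symbol in sorted({t["symbol"] for t in trades}):
        yield f"{symbol} only", (lambda t, s=symbol: t["symbol"] == s)


def rank_by_improvement(trades):
    """Replay every generated scenario and sort by P&L change vs. baseline."""
    baseline = sum(t["pnl"] for t in trades)
    results = []
    for label, keep in generate_hypotheses(trades):
        new_pnl = sum(t["pnl"] for t in trades if keep(t))
        results.append((label, new_pnl - baseline))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Made-up sample data, for illustration only.
trades = [
    {"symbol": "BTC", "setup": "breakout", "pnl": 120.0},
    {"symbol": "ETH", "setup": "breakout", "pnl": -40.0},
    {"symbol": "AAPL", "setup": "fade", "pnl": -90.0},
    {"symbol": "BTC", "setup": "fade", "pnl": -30.0},
    {"symbol": "ETH", "setup": "trend", "pnl": 60.0},
]

ranked = rank_by_improvement(trades)
for label, delta in ranked:
    print(f"{label}: {delta:+.2f}")
```

The default arguments in the lambdas (`s=setup`, `s=symbol`) bind the loop value at creation time, which avoids the usual late-binding closure bug.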

Statistical Validation

Every hypothesis is validated three ways:

Method           | What It Measures
Permutation test | P-value: is the improvement statistically significant, or could it be random?
Bootstrap CI     | 95% confidence interval: what's the realistic range of improvement?
Monte Carlo      | 10,000 portfolio simulations: what's the ruin probability and wealth distribution?

A hypothesis must pass ALL THREE to be considered "validated."
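
To make the permutation test concrete, here is a minimal sketch under stated assumptions. It treats a "remove a setup" hypothesis as a two-sample question: do the kept trades have a higher mean P&L than the removed ones, beyond what random relabeling would produce? The function name `permutation_p_value`, the one-sided formulation, and the sample numbers are illustrative choices, not the project's confirmed method.

```python
import random
from statistics import mean


def permutation_p_value(kept, removed, iterations=1_000, seed=0):
    """One-sided p-value for mean(kept) > mean(removed) under label shuffling."""
    rng = random.Random(seed)
    observed = mean(kept) - mean(removed)
    pooled = list(kept) + list(removed)
    n_kept = len(kept)
    hits = 0
    for _ in range(iterations):
        # Under the null hypothesis the kept/removed labels are exchangeable.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_kept]) - mean(pooled[n_kept:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (iterations + 1)


# Made-up P&L values, for illustration only.
kept_pnls = [50, 60, 55, 70, 65, 58, 62, 66]
removed_pnls = [-40, -55, -35, -60, -45, -50]
p_value = permutation_p_value(kept_pnls, removed_pnls)
print(f"p = {p_value:.4f}")
```

A small p-value means few random relabelings matched the observed gap, so the improvement is unlikely to be chance alone. The `iterations=1_000` default mirrors the default listed under Features.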

Features

  • Auto-generates hypotheses from trade performance data
  • Replays historical trades with modifications applied
  • Permutation testing with configurable iterations (default: 1,000)
  • Bootstrap confidence intervals (default: 95%)
  • Monte Carlo portfolio simulation (default: 10,000 runs)
  • Ranked output by statistical significance
  • Runs autonomously on schedule (weekly via launchd)
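
The bootstrap confidence interval listed above can be sketched with a percentile bootstrap on mean per-trade P&L. This is an assumption about the approach, not the project's implementation: `bootstrap_ci`, its parameters, the `resamples` default, and the sample values are hypothetical, and the project may use a different bootstrap variant or statistic.

```python
import random
from statistics import mean


def bootstrap_ci(values, confidence=0.95, resamples=2_000, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(resamples))
    tail = (1.0 - confidence) / 2.0
    lower = means[int(tail * resamples)]
    upper = means[min(resamples - 1, int((1.0 - tail) * resamples))]
    return lower, upper


# Made-up per-trade P&L values, for illustration only.
pnls = [120.0, -40.0, 60.0, 35.0, -15.0, 80.0, -25.0, 45.0, 10.0, -5.0]
low, high = bootstrap_ci(pnls)
print(f"95% CI for mean P&L per trade: [{low:.2f}, {high:.2f}]")
```

If the interval for a hypothesis's improvement excludes zero, that is one reasonable reading of "passing" this check; the tool's exact pass criterion is not stated here.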

Tech Stack

Python 3.11, NumPy, pandas, SciPy (statistics)

Use Cases

  • Post-soak analysis: which setups to keep, modify, or remove
  • Parameter sensitivity: how robust is each threshold
  • Strategy optimization: data-driven improvements, not gut feel
  • Risk assessment: Monte Carlo ruin probability before going live
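
For the risk-assessment use case, a Monte Carlo run can be sketched by resampling historical trade outcomes into many synthetic equity paths. Everything here is a hedged illustration: `monte_carlo`, `start_equity`, `ruin_level`, `horizon`, and the sample P&L values are hypothetical, and sampling trades independently with replacement is a simplifying assumption that ignores any serial dependence between trades.

```python
import random


def monte_carlo(pnls, start_equity=1_000.0, ruin_level=0.0,
                runs=10_000, horizon=None, seed=0):
    """Simulate equity paths by resampling trade P&Ls with replacement."""
    rng = random.Random(seed)
    horizon = horizon or len(pnls)
    ruined = 0
    finals = []
    for _ in range(runs):
        equity = start_equity
        hit_ruin = False
        for pnl in rng.choices(pnls, k=horizon):
            equity += pnl
            if equity <= ruin_level:
                # The path stops at the first breach of the ruin level.
                hit_ruin = True
                break
        ruined += hit_ruin
        finals.append(equity)
    finals.sort()
    return {
        "ruin_probability": ruined / runs,
        "p5_final_equity": finals[int(0.05 * runs)],
        "median_final_equity": finals[runs // 2],
    }


# Made-up per-trade P&L values, for illustration only.
pnls = [120.0, -40.0, 60.0, -90.0, -30.0, 35.0, 80.0, -25.0]
result = monte_carlo(pnls, start_equity=500.0, horizon=50, runs=2_000)
print(result)
```

The 5th percentile and median of final equity give a coarse view of the wealth distribution; the `runs=10_000` default mirrors the default listed under Features.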

Part of an autonomous trading system with 54 services and 3,778 tests. Full system details at portfolio.
