
Evolution

sylvain legland edited this page Feb 26, 2026 · 1 revision

🧬 Workflow Evolution — Genetic Algorithm + RL

Software Factory continuously improves its workflow templates and runtime decisions using two AI optimization layers.

Architecture

                    COLD START
                        │
              ┌─────────▼──────────┐
              │  Mission Simulator  │
              │  N synthetic runs   │
              │  P(success)=f(...)  │
              └────┬──────────┬─────┘
                   │          │
          ┌────────▼─┐   ┌────▼──────────┐
          │ GA Fitness│   │ RL Experience │
          │ function  │   │ replay buffer │
          └────┬──────┘   └────┬──────────┘
               │               │
     ┌─────────▼────┐    ┌──────▼──────────┐
     │ GA Engine    │    │ RL Q-Learning   │
     │ 40 pop       │    │ offline batch   │
      │ 30 gen/night │    │ nightly retrain │
     └─────┬────────┘    └──────┬──────────┘
           │                    │
    ┌──────▼──────┐      ┌──────▼──────────┐
    │ Proposals   │      │ engine.py hook  │
    │ UI review   │      │ mid-mission     │
    └─────────────┘      └─────────────────┘

Genetic Algorithm

The GA runs nightly at 02:00 UTC. It takes existing workflows and evolves better versions by:

  1. Encoding each workflow as a genome (phases × agents × patterns × gates)
  2. Evaluating fitness using historical mission outcomes
  3. Selecting top performers via tournament selection
  4. Breeding new workflows via crossover + mutation
  5. Proposing the top-3 evolved workflows for human review
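The five steps above can be sketched as a minimal GA loop. Everything here is illustrative: the gene pools, the fitness stand-in, and the population/generation parameters (40 and 30, from the diagram) are assumptions, not the production encoding.

```python
import random

# Hypothetical gene pools -- the real genome spans phases x agents x patterns x gates.
PATTERNS = ["sequential", "parallel", "hierarchical"]
GATES = ["none", "review", "veto"]

def random_genome(n_phases=4):
    """Encode a workflow as one (pattern, gate) gene per phase."""
    return [(random.choice(PATTERNS), random.choice(GATES)) for _ in range(n_phases)]

def fitness(genome):
    """Stand-in for the historical-outcome fitness; rewards parallel phases and review gates."""
    return sum((g[0] == "parallel") + 0.5 * (g[1] == "review") for g in genome)

def tournament(pop, k=3):
    """Tournament selection: fittest of k randomly drawn genomes."""
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    """Single-point crossover between two parent genomes."""
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.1):
    """Re-roll each gene with probability `rate`."""
    return [(random.choice(PATTERNS), random.choice(GATES)) if random.random() < rate else g
            for g in genome]

def evolve(pop_size=40, generations=30):
    pop = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        pop = [mutate(crossover(tournament(pop), tournament(pop))) for _ in range(pop_size)]
    # Propose the top-3 evolved workflows for human review
    return sorted(pop, key=fitness, reverse=True)[:3]

proposals = evolve()
```

Note that the loop never auto-deploys anything: its output is only the top-3 candidate list, matching the human-review gate in step 5.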

See Darwin Teams for full details.

Reinforcement Learning

The RL policy observes mission state in real-time and recommends pattern adjustments:

  • Q-table: maps (state, action) → expected_reward
  • Training: nightly batch on accumulated experience
  • Inference: at each phase start, if confidence > 0.7
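A tabular sketch of that policy, under stated assumptions: the learning rate, discount, and the visit-count confidence proxy are all illustrative, and only the 0.7 threshold comes from the text above.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9       # learning rate and discount -- assumed values
CONFIDENCE_THRESHOLD = 0.7    # from the inference rule above

Q = defaultdict(float)        # (state, action) -> expected_reward
visits = defaultdict(int)     # visit counts back a crude confidence estimate

def batch_update(experience):
    """Nightly offline Q-learning pass over the accumulated replay buffer."""
    for state, action, reward, next_state, next_actions in experience:
        best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        visits[(state, action)] += 1

def recommend(state, actions):
    """At phase start: recommend a pattern adjustment only when confident enough."""
    best = max(actions, key=lambda a: Q[(state, a)])
    confidence = visits[(state, best)] / (visits[(state, best)] + 1)  # hypothetical proxy
    return best if confidence > CONFIDENCE_THRESHOLD else None
```

Returning `None` below the threshold keeps the mid-mission hook in `engine.py` a no-op until the policy has seen enough evidence for a state.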

See Darwin Teams for full details.

Mission Simulator

Since real mission data is scarce at cold start, the simulator generates synthetic training data:

P(phase_success) = (
    base_rate
    × pattern_modifier      # parallel+15%, hierarchical+10%
    × gate_modifier         # all_approved+8%, no_veto+5%
    × agent_seniority_bonus # from hierarchy_rank
    + gaussian_noise(σ=0.1)
)

The simulator generates N=200 synthetic runs per workflow by default.
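A runnable reading of that formula, with assumed placeholder values for `base_rate` and the seniority bonus; only the pattern/gate modifiers, the noise sigma, and N=200 come from the text above.

```python
import random

def phase_success_prob(pattern, gates_ok, seniority, base_rate=0.6):
    """Synthetic P(phase_success) mirroring the formula above (base_rate is assumed)."""
    pattern_modifier = {"parallel": 1.15, "hierarchical": 1.10}.get(pattern, 1.0)
    gate_modifier = 1.08 if gates_ok else 1.0   # all_approved +8%
    p = base_rate * pattern_modifier * gate_modifier * seniority
    p += random.gauss(0.0, 0.1)                 # gaussian_noise(sigma=0.1)
    return min(max(p, 0.0), 1.0)                # clamp to a valid probability

def simulate(workflow, n_runs=200):
    """Generate N synthetic runs; a run succeeds when every phase succeeds."""
    runs = []
    for _ in range(n_runs):
        runs.append(all(random.random() < phase_success_prob(*phase)
                        for phase in workflow))
    return runs

# Hypothetical two-phase workflow: (pattern, gates_ok, seniority_bonus) per phase
wf = [("parallel", True, 1.05), ("hierarchical", False, 1.0)]
runs = simulate(wf)
```

The resulting boolean run list is exactly the shape the GA fitness function and the RL replay buffer consume at cold start.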

API

Endpoint                                     Description
GET  /api/evolution/proposals                List pending evolution proposals
POST /api/evolution/proposals/{id}/approve   Approve and merge a proposal
POST /api/evolution/proposals/{id}/reject    Reject a proposal
POST /api/evolution/run/{wf_id}              Trigger a manual GA run
GET  /api/evolution/runs                     List past GA runs
GET  /api/rl/policy/stats                    Q-table coverage and confidence
GET  /api/rl/decisions/{mission_id}          RL decisions for a mission
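A stdlib-only client sketch for these endpoints. The base URL is an assumption (a local dev instance); the paths are the ones listed above, and no authentication scheme is shown because none is documented here.

```python
import json
import urllib.request

BASE = "http://localhost:8000"   # assumed dev address for the Software Factory API

def _request(method, path, payload=None):
    """Build a request against the evolution/RL endpoints listed above."""
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(f"{BASE}{path}", data=data, method=method,
                                  headers={"Content-Type": "application/json"})

def list_proposals():
    return _request("GET", "/api/evolution/proposals")

def approve_proposal(proposal_id):
    return _request("POST", f"/api/evolution/proposals/{proposal_id}/approve")

def trigger_ga_run(wf_id):
    return _request("POST", f"/api/evolution/run/{wf_id}")

# Sending a prepared request is one call away, e.g.:
# with urllib.request.urlopen(list_proposals()) as resp:
#     proposals = json.load(resp)
```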

UI

  • /workflows → Evolution tab: pending proposals with diff view
  • /metrics → Agent Intelligence: Thompson + GA + RL metrics
  • /art → Thompson Sampling tab: A/B dashboard (MiniMax vs Azure OpenAI)
