agent-failure-debugger

A deterministic pipeline that diagnoses, explains, and fixes failures in LLM-based agent systems.

Detection tells you WHAT failed.
This tool tells you WHY — through which causal path, starting from which root.
And then fixes it.

Related Repositories

Repository	Role
llm-failure-atlas	Failure pattern definitions, causal graph, matcher, evaluation, KPI
agent-pld-metrics (PLD)	Behavioral stability framework this tool applies to

What It Does

Input: Matcher output (detected failures with confidence scores) + causal graph

Output:

Root cause identification with causal ranking
Full causal path reconstruction
Deterministic fix generation with safety classification
Confidence-gated auto-apply with rollback
Learning-aware priority adjustment

Quickstart

git clone https://github.com/kiyoshisasano/agent-failure-debugger.git
cd agent-failure-debugger
pip install -r requirements.txt

Run with sample data:

# Diagnosis only
python main.py ../llm-failure-atlas/examples/simple/matcher_output.json

# Full pipeline (recommended)
python pipeline.py ../llm-failure-atlas/examples/simple/matcher_output.json --use-learning

Output:

=== PIPELINE RESULT ===
  Root cause:  premature_model_commitment (confidence: 0.85)
  Failures:    3
  Fixes:       1
  Gate:        auto_apply (score: 0.9218)
  Applied:     no

Use as an API

from pipeline import run_pipeline
import json

with open("matcher_output.json") as f:
    matcher_output = json.load(f)

result = run_pipeline(
    matcher_output,
    use_learning=True,   # adjust priority using learning data
    top_k=1,             # number of fixes to generate
    auto_apply=False,    # set True to auto-apply safe fixes
)

print(result["summary"]["root_cause"])   # "premature_model_commitment"
print(result["summary"]["gate_mode"])    # "auto_apply"

Individual steps are also available:

from pipeline import run_diagnosis, run_fix

diag = run_diagnosis(matcher_output)
fix_result = run_fix(diag, use_learning=True, top_k=2)

External Evaluation (Phase 25-lite)

You can plug in your own test environment for fix evaluation:

def my_staging_test(bundle):
    """Run fixes in your staging environment."""
    fixes = bundle["autofix"]["recommended_fixes"]
    # ... apply fixes in your own test/staging env ...
    return {
        "success": True,
        "failure_count": 0,
        "root": None,
        "has_hard_regression": False,
        "notes": f"applied {len(fixes)} fixes in staging"
    }

result = run_pipeline(
    matcher_output,
    use_learning=True,
    auto_apply=True,
    evaluation_runner=my_staging_test,
)

If evaluation_runner is not provided, the built-in counterfactual simulation evaluates the fix. If it is provided and the gate passes, your function is called in place of the built-in simulation.

Pipeline

flowchart LR
    Log[(Raw Agent Log)] --> Adapter[Adapter<br>Phase 24]
    Adapter --> Telemetry[Matcher Input<br>Telemetry]
    Telemetry --> Matcher[matcher.py<br>Detection]
    Matcher --> Graph[(Failure Graph<br>Atlas)]
    Graph --> Debugger[Debugger Pipeline<br>Fix & Auto-apply]
    
    classDef data fill:#e1f5fe,stroke:#0288d1,stroke-width:2px,color:#000;
    classDef process fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000;
    classDef atlas fill:#f3e5f5,stroke:#8e24aa,stroke-width:2px,color:#000;
    
    class Log,Telemetry data;
    class Adapter,Matcher process;
    class Graph,Debugger atlas;

Debugger internal steps:

matcher_output.json   (produced by llm-failure-atlas matcher)
  → main.py             causal resolution + root ranking
  → abstraction.py      top-k path selection + clustering
  → explainer.py        deterministic draft + optional LLM smoothing
  → decision_support.py priority scoring + action plan
  → autofix.py          fix selection + patch generation
  → auto_apply.py       confidence gate → apply / review / proposal
  → execute_fix.py      dependency ordering + staged apply
  → evaluate_fix.py     before/after simulation + regression detection

Prerequisite: Matcher

This tool expects matcher output as input. The matcher converts logs into detected failures:

log → signals → failure detection (matcher)

Pattern definitions are maintained in llm-failure-atlas under failures/*.yaml. Pre-generated matcher_output.json files are available in each examples/ directory for immediate use.

Input Format

Matcher output: a JSON array of failure results. Each entry must include failure_id, diagnosed, and confidence.

[
  {
    "failure_id": "premature_model_commitment",
    "diagnosed": true,
    "confidence": 0.7,
    "signals": {
      "ambiguity_without_clarification": true,
      "assumption_persistence_after_correction": true
    }
  }
]

Failures with "diagnosed": false are silently excluded.
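The exclusion rule can be illustrated with a small sketch. This is not the tool's own loader: the helper name active_failures, the decision to raise on a missing key, and the second sample entry (including its 0.2 confidence) are assumptions made for illustration.

```python
import json

# Keys the README says every matcher entry must include.
REQUIRED_KEYS = ("failure_id", "diagnosed", "confidence")


def active_failures(matcher_output):
    """Return only entries with diagnosed == true (hypothetical helper)."""
    active = []
    for index, entry in enumerate(matcher_output):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            # Assumption: a malformed entry is an error rather than skipped.
            raise ValueError(f"entry {index} is missing {missing}")
        if entry["diagnosed"]:
            active.append(entry)
    return active


# Illustrative data; the second entry is invented for this example.
sample = json.loads("""
[
  {"failure_id": "premature_model_commitment", "diagnosed": true, "confidence": 0.7},
  {"failure_id": "rag_retrieval_drift", "diagnosed": false, "confidence": 0.2}
]
""")

print([entry["failure_id"] for entry in active_failures(sample)])
```

Only premature_model_commitment survives the filter, because the other entry has "diagnosed": false.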

Output Format

{
  "root_candidates": ["premature_model_commitment"],
  "root_ranking": [{"id": "premature_model_commitment", "score": 0.85}],
  "failures": [
    {"id": "premature_model_commitment", "confidence": 0.7},
    {"id": "semantic_cache_intent_bleeding", "confidence": 0.7,
     "caused_by": ["premature_model_commitment"]},
    {"id": "rag_retrieval_drift", "confidence": 0.6,
     "caused_by": ["semantic_cache_intent_bleeding"]}
  ],
  "causal_paths": [
    ["premature_model_commitment", "semantic_cache_intent_bleeding", "rag_retrieval_drift"]
  ],
  "explanation": "..."
}
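The relationship between failures[].caused_by and causal_paths in the output above can be sketched as follows. This is a minimal illustration that assumes the caused_by links form an acyclic graph; the function name causal_paths is hypothetical and the actual resolver in causal_resolver.py may work differently.

```python
def causal_paths(failures):
    """Derive roots and root-to-leaf paths from caused_by links (sketch)."""
    children = {f["id"]: [] for f in failures}
    roots = []
    for f in failures:
        parents = f.get("caused_by", [])
        if not parents:
            roots.append(f["id"])  # no parent means root candidate
        for parent in parents:
            children.setdefault(parent, []).append(f["id"])

    paths = []

    def walk(node, prefix):
        path = prefix + [node]
        if not children[node]:
            paths.append(path)  # reached a leaf: record the full path
        for child in children[node]:
            walk(child, path)

    for root in roots:
        walk(root, [])
    return roots, paths


# The failures list from the Output Format example.
failures = [
    {"id": "premature_model_commitment", "confidence": 0.7},
    {"id": "semantic_cache_intent_bleeding", "confidence": 0.7,
     "caused_by": ["premature_model_commitment"]},
    {"id": "rag_retrieval_drift", "confidence": 0.6,
     "caused_by": ["semantic_cache_intent_bleeding"]},
]

roots, paths = causal_paths(failures)
print(roots)
print(paths)
```

For this input the sketch yields one root and one three-node path, the same values shown in root_candidates and causal_paths.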

Root Ranking

score = 0.5 × confidence + 0.3 × normalized_downstream + 0.2 × (1 - normalized_depth)

A failure with more downstream impact can outrank one with higher detection confidence, because downstream reach and depth together carry half of the weight. The ranking reflects causal priority, not detection confidence alone.
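A worked instance of the formula may help. The sketch below assumes that normalized_downstream and normalized_depth both lie in [0, 1], and that a root with the largest downstream reach gets normalized_downstream = 1.0 and normalized_depth = 0.0. Those normalization details are assumptions, although they are consistent with the sample output, where a confidence of 0.7 yields a score of 0.85. The leaf values in the second call are invented for comparison.

```python
def root_score(confidence, normalized_downstream, normalized_depth):
    """Root ranking score using the weights from the README formula."""
    return (
        0.5 * confidence
        + 0.3 * normalized_downstream
        + 0.2 * (1 - normalized_depth)
    )


# Root failure: moderate confidence, maximal downstream reach, depth 0.
root = root_score(confidence=0.7, normalized_downstream=1.0, normalized_depth=0.0)

# Hypothetical leaf failure: high confidence, nothing downstream, deepest level.
leaf = root_score(confidence=0.9, normalized_downstream=0.0, normalized_depth=1.0)

print(round(root, 2), round(leaf, 2))
```

The root scores 0.35 + 0.30 + 0.20 = 0.85, while the leaf scores 0.45 despite its higher confidence.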

Auto-Apply Gate

Fix application is controlled by a deterministic confidence gate:

Score	Mode	Behavior
≥ 0.85	`auto_apply`	Apply → evaluate → keep or rollback
0.65 to < 0.85	`staged_review`	Write to patches/, await human approval
< 0.65	`proposal_only`	Present fix proposal only

Hard blockers (override score, force proposal_only):

safety != "high"
review_required == true
fix_type == "workflow_patch"
Execution plan has conflicts or failed validation
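The thresholds and hard blockers above can be combined into one decision function. This is a sketch under stated assumptions, not the code in auto_apply.py: the function name gate_mode, the plan_ok flag (standing in for "no conflicts and validation passed"), and the placeholder fix_type value are all hypothetical.

```python
def gate_mode(score, safety="high", review_required=False,
              fix_type="example_patch", plan_ok=True):
    """Map a gate score and fix metadata to an application mode (sketch)."""
    # Hard blockers override the score and force proposal_only.
    blocked = (
        safety != "high"
        or review_required
        or fix_type == "workflow_patch"
        or not plan_ok
    )
    if blocked or score < 0.65:
        return "proposal_only"
    if score >= 0.85:
        return "auto_apply"
    return "staged_review"


print(gate_mode(0.9218))                    # score from the sample run
print(gate_mode(0.70))                      # middle band
print(gate_mode(0.9218, safety="medium"))   # hard blocker wins
```

Checking the blockers first keeps the rule easy to audit: a high score can never promote a fix that fails any of the four conditions.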

File Structure

Core pipeline:

File	Responsibility
`pipeline.py`	API entry point (recommended)
`main.py`	CLI entry point (diagnosis only)
`config.py`	Centralized paths, weights, thresholds
`graph_loader.py`	Load failure_graph.yaml, exclude planned nodes
`causal_resolver.py`	normalize → roots → paths → ranking
`formatter.py`	Path scoring + conflict resolution + evidence
`labels.py`	SIGNAL_MAP (28 entries) + FAILURE_MAP (15 entries)
`abstraction.py`	Top-k path selection + clustering
`explainer.py`	Deterministic draft + optional LLM smoothing
`decision_support.py`	Failure → action mapping + priority scoring
`autofix.py`	Fix selection + patch generation
`fix_templates.py`	12 failure × fix definitions
`execute_fix.py`	Dependency ordering + staged apply + rollback
`evaluate_fix.py`	Counterfactual simulation + regression detection
`auto_apply.py`	Confidence gate + auto-apply + rollback
`policy_loader.py`	Read-only access to learning stores

CLI wrappers:

File	Wraps
`explain.py`	`explainer.py`
`summarize.py`	`abstraction.py`
`advise.py`	`decision_support.py`
`apply_fix.py`	`execute_fix.py` (dry-run display)

Data:

File	Note
`failure_graph.yaml`	Causal graph (canonical source is Atlas; see below)
`templates/`	Prompt templates for LLM-based explanation

Graph Sync

failure_graph.yaml exists in both repositories. The canonical source is always llm-failure-atlas/failure_graph.yaml. The copy in this repository must be kept in sync manually.

To verify sync:

diff failure_graph.yaml ../llm-failure-atlas/failure_graph.yaml

If you set ATLAS_ROOT, config.py can resolve the Atlas graph path directly.

Configuration

Environment variables override defaults:

Variable	Default	Description
`ATLAS_ROOT`	`../llm-failure-atlas`	Path to Atlas repository
`DEBUGGER_ROOT`	`.` (this repository)	Path to this repository
`ATLAS_LEARNING_DIR`	`$ATLAS_ROOT/learning`	Learning store location

All settings (scoring weights, gate thresholds, KPI targets) are centralized in config.py.
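The override behavior in the table can be sketched as below. This is an illustration of the documented defaults, not the contents of config.py: the function name resolve_paths and the returned dictionary keys are assumptions.

```python
import os
from pathlib import Path


def resolve_paths(env=None):
    """Resolve repository paths from environment variables (sketch)."""
    env = os.environ if env is None else env
    atlas_root = Path(env.get("ATLAS_ROOT", "../llm-failure-atlas"))
    debugger_root = Path(env.get("DEBUGGER_ROOT", "."))
    # The learning store defaults to a directory under the Atlas root.
    learning_dir = Path(env.get("ATLAS_LEARNING_DIR", str(atlas_root / "learning")))
    return {
        "atlas_root": atlas_root,
        "debugger_root": debugger_root,
        "learning_dir": learning_dir,
        # Canonical graph location inside the Atlas repository.
        "atlas_graph": atlas_root / "failure_graph.yaml",
    }


defaults = resolve_paths({})
custom = resolve_paths({"ATLAS_ROOT": "/opt/atlas"})
print(defaults["learning_dir"])
print(custom["atlas_graph"])
```

Passing a plain dictionary instead of os.environ keeps the sketch testable without touching the real environment.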

What This Is

This tool is a deterministic causal debugging pipeline — not an ML-based anomaly detector.

Deterministic: Same matcher output always produces the same root cause, causal path, fix, and gate decision. The core pipeline uses no LLM inference (LLM is optional, for explanation smoothing only).
Causal, not statistical: Root ranking uses graph structure and confidence scores, not learned weights or embeddings.
Consistent over correct: The system produces a structurally consistent explanation under its scoring and resolution rules. It finds the best-supported root cause given the defined causal graph — not necessarily the "true" cause.

Key implications:

Root cause ranking is reproducible and auditable
Auto-apply decisions are governed by a deterministic confidence gate, not LLM judgment
The evaluate_fix stage applies a deterministic structural intervention model, not an empirical simulation: targeted failures and all their downstream descendants are removed from the causal graph, and the system state is recomputed
Learning adjusts weights, never structure (patterns, graph, and templates are never auto-modified)
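The structural intervention described for evaluate_fix can be sketched as a graph operation: remove the targeted failures and everything reachable from them, then treat what remains as the recomputed state. The function name simulate_fix is hypothetical, and the real stage may track more than the remaining failure set.

```python
def simulate_fix(failures, targets):
    """Drop targeted failures and all their descendants (sketch)."""
    # Build parent -> children edges from the caused_by links.
    children = {}
    for f in failures:
        for parent in f.get("caused_by", []):
            children.setdefault(parent, []).append(f["id"])

    # Depth-first traversal collecting every node downstream of a target.
    removed = set()
    stack = list(targets)
    while stack:
        node = stack.pop()
        if node in removed:
            continue
        removed.add(node)
        stack.extend(children.get(node, []))

    return [f for f in failures if f["id"] not in removed]


before = [
    {"id": "premature_model_commitment", "confidence": 0.7},
    {"id": "semantic_cache_intent_bleeding", "confidence": 0.7,
     "caused_by": ["premature_model_commitment"]},
    {"id": "rag_retrieval_drift", "confidence": 0.6,
     "caused_by": ["semantic_cache_intent_bleeding"]},
]

after_root_fix = simulate_fix(before, ["premature_model_commitment"])
after_mid_fix = simulate_fix(before, ["semantic_cache_intent_bleeding"])
print(len(before), len(after_root_fix), len(after_mid_fix))
```

Fixing the root clears all three failures, while fixing the middle node leaves only the root active. Because the result depends solely on graph structure, the same input always yields the same before/after comparison.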

Design Principles

The graph is not used to detect failures (that is the matcher's job), only for causal interpretation of the failures already detected
Signal names are system-wide contracts — no redefinition allowed
Adding a failure to the Atlas requires no changes to this tool
Learning is suggestion-only — patterns, graph, and templates are never auto-modified
Auto-apply safety hierarchy: high → auto candidate, medium → review, low → excluded

Relationship to Atlas

This tool depends on LLM Failure Atlas:

failure_graph.yaml is sourced from the Atlas
Node and edge definitions are maintained there
This tool does not define failures itself

Relationship to PLD

Phase Loop Dynamics (PLD) is a runtime governance layer that stabilizes multi-turn LLM agent execution through the loop: Drift → Repair → Reentry → Continue → Outcome.

This tool is not a PLD runtime. It implements a single control step spanning analysis, intervention, and evaluation within the PLD loop — the post-incident causal analysis and intervention decision.

How this pipeline maps to PLD concepts:

Drift: Root causes provide a structural explanation of drift after it has been detected. root_ranking identifies the most impactful failure, but does not directly measure real-time misalignment.
Repair: autofix generates fixes, and the auto_apply gate governs intervention decisions (decision_support → autofix → auto_apply). These stages produce structured proposals that PLD Repair strategies can consume.
Reentry / Continue: evaluate_fix provides a structural reentry check (before/after comparison, not full task-level reentry validation). Explicit re-verification and task resumption are external to this pipeline.
Outcome: Refers to intervention results (keep / review / rollback), not full session termination states as defined by PLD.

System state is defined by the set of active failures and their causal relationships. The pipeline transforms this state through a single pass, not a multi-turn loop.

KPIs (6 internal stability metrics) measure pipeline health and do not directly correspond to PLD operational metrics (PRDR, REI, VRL, MRBF, FR).

This system functions as a control layer that governs intervention decisions within the PLD loop. PLD provides the runtime governance framework; this tool provides the causal analysis and remediation that operates within one step of it.

Reproducible Examples

10 examples are maintained in llm-failure-atlas under examples/. Each contains log.json, matcher_output.json, and expected_debugger_output.json.

Run and compare:

python main.py ../llm-failure-atlas/examples/simple/matcher_output.json

Output should match expected_debugger_output.json exactly.
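When the output does not match, it helps to know where it diverges. The helper below is a hypothetical sketch, not part of this repository: it compares two parsed JSON values (for example, both loaded with json.load) and returns the path to the first mismatch. The 0.80 score in the example is an invented mismatch.

```python
def first_difference(actual, expected, path="$"):
    """Return the path of the first mismatch, or None if values are equal."""
    if type(actual) is not type(expected):
        return path
    if isinstance(actual, dict):
        for key in sorted(set(actual) | set(expected)):
            if key not in actual or key not in expected:
                return f"{path}.{key}"
            diff = first_difference(actual[key], expected[key], f"{path}.{key}")
            if diff:
                return diff
        return None
    if isinstance(actual, list):
        if len(actual) != len(expected):
            return path
        for index, (a, e) in enumerate(zip(actual, expected)):
            diff = first_difference(a, e, f"{path}[{index}]")
            if diff:
                return diff
        return None
    return None if actual == expected else path


expected = {
    "root_candidates": ["premature_model_commitment"],
    "root_ranking": [{"id": "premature_model_commitment", "score": 0.85}],
}
# Invented mismatch for illustration only.
actual = {
    "root_candidates": ["premature_model_commitment"],
    "root_ranking": [{"id": "premature_model_commitment", "score": 0.80}],
}

print(first_difference(expected, expected))
print(first_difference(actual, expected))
```

The first call prints None and the second prints $.root_ranking[0].score, which points directly at the differing field.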

License

MIT License. See LICENSE.
