Theory-to-experiment lab for search stability in long-running agents under finite context, with exact simulator tests and lightweight mechanistic probe tasks.
simulator ai reproducible-research ai-agents structured-output state-compression llm-agents agent-evaluation long-running-agents search-stability finite-context bounded-memory long-horizon-reasoning hypothesis-management reset-policy mechanistic-probes scientific-audit
Updated Mar 9, 2026 - Python