Description
`MockProvider` (`src/agentloom/providers/mock.py`) replays pre-recorded successful responses. There is no way to make it return a failure (timeout, 5xx, rate-limit response, malformed JSON, partial stream). #62 (chaos/fault injection testing mode) addresses fault injection at the gateway level: random, probabilistic failure injection for stress tests. That is complementary but does not solve the same problem.

The PhD's Simulator (per `agenttest-planteamiento.md`) includes a fault injection mode that simulates an agent failing (timeout, invalid response, crash) to measure cascading failures and recovery. For this to work in a reproducible test harness, faults must be deterministic and replayable, not probabilistic. A failed scenario must produce the same failure on every run, just as a successful scenario produces the same response on every run.

Today the only way to test failure handling deterministically is to mock at the httpx layer in each test, which sits outside AgentLoom, bypasses the gateway, and is brittle.
Proposal
Extend `MockProvider`'s response file format to support fault declarations, and add gateway-level integration so faults flow through the same code paths as real failures (circuit breaker, retry, fallback chain).

1. Extended response file format:
```json
{
  "step_classify": {
    "content": "question",
    "model": "gpt-4o-mini",
    "usage": {"prompt_tokens": 10, "completion_tokens": 1, "total_tokens": 11},
    "cost_usd": 0.0001
  },
  "step_answer": {
    "fault": {
      "type": "timeout",
      "after_ms": 5000
    }
  },
  "step_summarize": {
    "fault": {
      "type": "http_error",
      "status_code": 429,
      "headers": {"Retry-After": "30"},
      "body": "Rate limit exceeded"
    }
  },
  "step_explain": {
    "fault": {
      "type": "http_error",
      "status_code": 500,
      "body": "Internal server error"
    }
  },
  "step_translate": {
    "fault": {
      "type": "malformed_response",
      "raw": "not valid json"
    }
  },
  "step_stream_long": {
    "fault": {
      "type": "stream_truncate",
      "after_chunks": 3,
      "raise": "ConnectionError"
    }
  }
}
```

2. Fault types:
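| Fault type | Behavior |
| --- | --- |
| `timeout` | `await anyio.sleep(after_ms / 1000)` (`anyio.sleep` takes seconds), then raise `TimeoutError` |
| `http_error` | Raise `ProviderError(status_code=N)`, mimicking a provider HTTP error; `RateLimitError` when the status is 429 |
| `malformed_response` | Return invalid bytes that fail to parse; exercises adapter error paths |
| `stream_truncate` | Yield N chunks, then raise the specified exception mid-stream; exercises gateway stream cancellation logic (#106) |
| `connection_reset` | Raise `httpx.ConnectError` immediately |
| `partial_response` | Return content with `usage.completion_tokens=0` and `finish_reason="length"`; exercises usage parsing |

3. Determinism: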
Each step is keyed by `step_id` (same as success replay). The same fault fires every time the workflow runs that step. There is no probability involved; that is #62's domain. This issue covers cases such as "scenario X always times out at iteration 2 of the agent loop."

4. Composition with success replay:
A workflow can mix faults and successes in the same run by keying the response file by `step_id`:

```json
{
  "first_attempt": {"fault": {"type": "http_error", "status_code": 500}},
  "second_attempt": {"content": "success after retry"}
}
```

Combined with the workflow's retry policy, this exercises the full retry-after-failure path deterministically.
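The dispatch and retry composition described in sections 1 to 4 can be sketched roughly as follows. This is a minimal illustration and not AgentLoom's implementation: `ProviderError` and `RateLimitError` are local stand-ins for the real exception classes, `replay` and `run_with_retry` are hypothetical names, the retry loop is a simplified substitute for the workflow's retry policy, and `asyncio.sleep` is used in place of `anyio.sleep` so the sketch needs only the standard library.

```python
import asyncio
import json


# Hypothetical stand-ins for AgentLoom's real exception types.
class ProviderError(Exception):
    def __init__(self, status_code, body=""):
        super().__init__(f"HTTP {status_code}: {body}")
        self.status_code = status_code


class RateLimitError(ProviderError):
    pass


# A small response file in the proposed format (after_ms kept short here).
RESPONSES = json.loads("""
{
  "first_attempt": {"fault": {"type": "http_error", "status_code": 500}},
  "second_attempt": {"content": "success after retry"},
  "step_answer": {"fault": {"type": "timeout", "after_ms": 10}}
}
""")


async def replay(step_id):
    """Return the recorded response for step_id, or raise its declared fault."""
    entry = RESPONSES[step_id]
    fault = entry.get("fault")
    if fault is None:
        return entry  # plain success replay
    kind = fault["type"]
    if kind == "timeout":
        # after_ms is in milliseconds; the sleep call takes seconds.
        await asyncio.sleep(fault["after_ms"] / 1000)
        raise TimeoutError(f"{step_id} timed out")
    if kind == "http_error":
        status = fault["status_code"]
        error_cls = RateLimitError if status == 429 else ProviderError
        raise error_cls(status, fault.get("body", ""))
    raise ValueError(f"unsupported fault type: {kind}")


async def run_with_retry(step_ids):
    """Try each step id in order; return the first success and the failures seen."""
    failures = []
    for step_id in step_ids:
        try:
            response = await replay(step_id)
            return response["content"], failures
        except ProviderError as exc:
            failures.append(exc.status_code)
    raise RuntimeError("all attempts failed")


result, failures = asyncio.run(run_with_retry(["first_attempt", "second_attempt"]))
```

Because the fault is looked up by step id instead of being drawn at random, every run records the same 500 failure before the same successful response. Awaiting `replay("step_answer")` raises `TimeoutError` after the declared delay in the same repeatable way.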
5. Stream fault support:
For `stream_truncate`, `MockProvider`'s `stream()` method (missing today; see #107) yields N chunks from the recorded `content`, then raises. This validates that the gateway's `_wrapped_iter` correctly distinguishes consumer cancellation from provider failure (#106's fix).

6. Observability:
When a fault fires, `MockProvider` increments a counter and sets a span attribute `mock.fault_type`, so test runs are visible in traces.

Scope
- `src/agentloom/providers/mock.py`: extended response file format, fault dispatch.
- `src/agentloom/providers/mock.py::stream`: implement streaming with fault support (depends on / coordinates with #107, "fix record/replay: concurrent write race, streaming capture, hash key coverage").
- `src/agentloom/observability/metrics.py`: new mock fault counter.
- `tests/providers/test_mock.py`: comprehensive fault coverage.
- `examples/fault_replay.yaml`: example workflow + response file showing each fault type.
- `docs/`: fault scenarios chapter in record/replay docs.

Regression tests
- `test_mock_fault_timeout_raises_timeout_error`
- `test_mock_fault_http_error_429_raises_rate_limit_error`
- `test_mock_fault_http_error_500_raises_provider_error`
- `test_mock_fault_malformed_response_fails_adapter_parsing`
- `test_mock_fault_stream_truncate_raises_mid_stream`
- `test_mock_fault_connection_reset_immediate`
- `test_mock_fault_partial_response_finish_reason_length`
- `test_mock_fault_in_workflow_triggers_retry_policy`
- `test_mock_fault_propagates_to_circuit_breaker`
- `test_mock_fault_metric_recorded`

Notes
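- Complements #62 (add chaos/fault injection testing mode): that issue covers probabilistic chaos for stress testing; this one covers deterministic faults for reproducible regression testing.
- `tool_call_fault`: the model returns a tool call that the workflow can mock as failing.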
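As a rough illustration of what a regression test such as `test_mock_fault_stream_truncate_raises_mid_stream` might check, the sketch below models the `stream_truncate` behavior from section 5 as an async generator. It is not the real `MockProvider.stream()` (which does not exist yet, per #107): `truncated_stream`, `consume`, and the fixed `chunk_size` are assumptions made for this example only.

```python
import asyncio


async def truncated_stream(content, after_chunks, exc_type=ConnectionError, chunk_size=4):
    """Yield the first `after_chunks` chunks of `content`, then raise `exc_type`."""
    # Hypothetical chunking: split the recorded content into fixed-size pieces.
    chunks = [content[i:i + chunk_size] for i in range(0, len(content), chunk_size)]
    for chunk in chunks[:after_chunks]:
        yield chunk
    # The provider-side failure arrives mid-stream, after some data was delivered.
    raise exc_type(f"stream truncated after {after_chunks} chunks")


async def consume():
    """Collect chunks until the stream fails; return what arrived and the error text."""
    received = []
    try:
        async for chunk in truncated_stream("abcdefghijklmnopqrst", after_chunks=3):
            received.append(chunk)
    except ConnectionError as exc:
        return received, str(exc)
    return received, None


received, error = asyncio.run(consume())
```

The consumer sees exactly three chunks and then the declared exception on every run, which is the property a gateway test would rely on to tell a provider failure apart from a consumer-side cancellation.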