Description
`StateManager._state: dict[str, Any]` is the only state contract: workflows have no way to declare a schema for their state. Consequences:
- `{state.user_inpt}` (typo) renders as empty string per `SafeFormatDict` (issue #110, "fix state manager lock bypass, template rendering gaps, and approval gate flow"), and the LLM produces a confused answer. No error.
- No validation on writes. A step writing `state.classification = "complaint"` followed by another step reading it as `int` produces a runtime error far from the cause.
- […] `state.foo` is impossible; the LSP work in #66 ("add agentloom lint for semantic workflow validation", the semantic lint) hits this wall.
- Resume-from-checkpoint surprises. A schema change between v1 and v2 of a workflow can resurrect old state that no longer matches what the steps expect: silent corruption, hard-to-diagnose bugs.

Proposal
1. Optional Pydantic state schema per workflow:
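A minimal sketch of what such a schema might look like, assuming Pydantic is installed. `SupportTicketState` and its fields are hypothetical names chosen for illustration; they are not part of AgentLoom, and how the model would be attached to a workflow is not shown here.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class SupportTicketState(BaseModel):
    """Hypothetical state schema for a ticket-triage workflow."""

    user_input: str
    classification: Optional[str] = None  # e.g. "complaint"
    retry_count: int = 0


# Valid initial state: omitted fields fall back to their defaults.
state = SupportTicketState(user_input="My order never arrived")

# Invalid state: retry_count cannot be parsed as an integer.
try:
    SupportTicketState(user_input="hi", retry_count="many")
    rejected = False
except ValidationError:
    rejected = True
```

The point of the sketch is only that the field names and types live in one declared place, which is what the rest of the proposal builds on.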
Or inline JSON Schema:
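For illustration, the same kind of contract could be written as a JSON Schema document. The sketch below expresses it as a Python dict; the field names are hypothetical, and the way AgentLoom would embed such a schema in a workflow file is an open assumption.

```python
import json

# Hypothetical inline schema; field names are illustrative only.
STATE_SCHEMA = {
    "type": "object",
    "properties": {
        "user_input": {"type": "string"},
        "classification": {"type": ["string", "null"]},
        "retry_count": {"type": "integer", "default": 0},
    },
    "required": ["user_input"],
    "additionalProperties": False,
}

# The schema is plain data, so it serializes to JSON unchanged.
schema_text = json.dumps(STATE_SCHEMA, indent=2)
```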
2. Schema enforcement:
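When `state_schema` is declared:
- At workflow load: validate `state` initial values against the schema; reject the workflow if invalid.
- On every `StateManager.set(key, value)`: validate the resulting state against the schema; raise `StateSchemaError` on violation.
- On every `StateManager.get(key)`: type-check that the returned value matches the schema field; raise on mismatch (typically caught earlier by the set, but defends against direct dict manipulation).
- On checkpoint resume: validate that restored state matches the current schema; raise with a clear message if the schema has evolved (with optional `migration_callback`).

3. Strict mode for templates: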
When a state schema is declared, `SafeFormatDict` operates in strict mode (per #110): missing keys raise `TemplateError` instead of returning empty string. The schema declares which keys exist; anything else is a typo.

4. Pydantic-first ergonomics:
Workflow author writes the model once; AgentLoom uses it for validation, the lint pass uses it for static checks, the LSP/IDE uses it for autocomplete.
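To make the enforcement and strict-template behavior above concrete, here is a dependency-free sketch in which a single declared schema drives both. It is not AgentLoom's implementation: `TypedState` is a hypothetical stand-in for `StateManager`, a plain dict of field name to Python type stands in for the Pydantic model, and `render` only approximates what `SafeFormatDict` strict mode might do.

```python
from types import SimpleNamespace
from typing import Any


class StateSchemaError(Exception):
    """Raised when state does not match the declared schema."""


class TemplateError(Exception):
    """Raised when a template references an undeclared state key."""


class TypedState:
    """Toy stand-in for StateManager with a declared schema."""

    def __init__(self, schema: dict[str, type], initial: dict[str, Any]):
        self._schema = schema
        self._state: dict[str, Any] = {}
        for key, value in initial.items():  # initial values validated at load
            self.set(key, value)

    def _check(self, key: str, value: Any) -> None:
        if key not in self._schema:
            raise StateSchemaError(f"unknown state key: {key!r}")
        expected = self._schema[key]
        if not isinstance(value, expected):
            raise StateSchemaError(
                f"{key!r} expects {expected.__name__}, got {type(value).__name__}"
            )

    def set(self, key: str, value: Any) -> None:
        self._check(key, value)
        self._state[key] = value

    def get(self, key: str) -> Any:
        value = self._state[key]
        self._check(key, value)  # guards against direct dict manipulation
        return value

    def render(self, template: str) -> str:
        """Strict rendering: an unknown {state.X} raises instead of rendering empty."""
        try:
            return template.format(state=SimpleNamespace(**self._state))
        except AttributeError as exc:
            raise TemplateError(f"unknown template key: {exc}") from exc


state = TypedState({"user_input": str, "retry_count": int}, {"user_input": "hello"})
state.set("retry_count", 1)
```

With this sketch, `state.set("retry_count", "x")` raises `StateSchemaError` at the write site, and `state.render("{state.user_inpt}")` raises `TemplateError` rather than producing an empty string.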
5. Step output schemas:
Steps that write to state declare what they produce:
Combined with #117 (response schema for LLM calls), this makes the entire state flow type-checked end-to-end.
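One way to picture a step-level output declaration is sketched below. The `Step` dataclass, its `output_key` and `output_type` fields, and the `execute` helper are hypothetical; the actual shape of `StepDefinition.output_schema` is not specified in this issue.

```python
from dataclasses import dataclass
from typing import Any, Callable


class StateSchemaError(Exception):
    """Raised when a step writes a value that violates its declared output."""


@dataclass
class Step:
    """Hypothetical step that writes one typed value into state."""

    name: str
    run: Callable[[dict[str, Any]], Any]
    output_key: str
    output_type: type


def execute(step: Step, state: dict[str, Any]) -> None:
    """Run the step and write its result only if it matches the declaration."""
    value = step.run(state)
    if not isinstance(value, step.output_type):
        raise StateSchemaError(
            f"step {step.name!r} must write {step.output_type.__name__} to "
            f"{step.output_key!r}, got {type(value).__name__}"
        )
    state[step.output_key] = value


state: dict[str, Any] = {"user_input": "My order never arrived"}
classify = Step("classify", lambda s: "complaint", "classification", str)
execute(classify, state)
```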
6. Lint integration:
When #66 (`agentloom lint`) lands, it cross-references:
- `{state.X}` template reference against the schema.
- `output: state.Y` against the schema (Y must be a writable field).
- `state.X == ...` against the schema (X must exist; comparison type must match).

Today's silent runtime failures become validation errors at lint time.
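The template cross-reference can be sketched with nothing more than a regular expression over the template text. This is an illustration of the idea, not the `agentloom lint` implementation; `lint_template` and its message format are hypothetical.

```python
import re

# Matches {state.some_field} placeholders in a template string.
STATE_REF = re.compile(r"\{state\.([A-Za-z_][A-Za-z0-9_]*)\}")


def lint_template(template: str, schema_fields: set[str]) -> list[str]:
    """Return one message per {state.X} reference whose X is not in the schema."""
    return [
        f"unknown state field {name!r} in template"
        for name in STATE_REF.findall(template)
        if name not in schema_fields
    ]


fields = {"user_input", "classification"}
problems = lint_template("Classify this: {state.user_inpt}", fields)
```

Here the typo `user_inpt` is reported before the workflow ever runs, which is the behavior the lint integration is after.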
7. Migration support:
When resuming a checkpoint with `version=1.1` into a workflow at `version=1.2`, the migration callback runs to coerce old state to new shape. Without it, the resume fails clearly instead of corrupting silently.

Scope
- `src/agentloom/core/models.py`: `WorkflowDefinition.state_schema`, `StepDefinition.output_schema`.
- `src/agentloom/core/state.py`: schema validation on `set` / `get` / `from_checkpoint`.
- `src/agentloom/core/templates.py`: strict mode triggered by schema presence.
- `src/agentloom/core/parser.py`: load and resolve Pydantic model references at workflow load.
- `src/agentloom/exceptions.py`: `StateSchemaError`.
- `src/agentloom/cli/lint.py` (after #66): cross-reference schemas with template references.
- `src/agentloom/checkpointing/base.py`: version field + migration callback dispatch.
- `examples/typed_state/`: example workflow with Pydantic state.

Regression tests
- `test_state_initial_validates_against_schema`
- `test_state_set_invalid_value_raises`
- `test_state_get_after_external_mutation_raises_on_type_mismatch`
- `test_template_missing_key_raises_when_schema_present`
- `test_template_missing_key_silent_when_no_schema` (backward compat)
- `test_checkpoint_resume_with_matching_schema_succeeds`
- `test_checkpoint_resume_with_evolved_schema_runs_migration`
- `test_checkpoint_resume_with_evolved_schema_no_migration_raises`
- `test_step_output_schema_validates_written_value`

Notes
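- Workflows without `state_schema` keep working with current dict-based, lenient behavior.
- […] `output_schema`.
- Migrations are optional: the cheapest correct behavior is "raise on schema mismatch", and that alone prevents silent corruption. Migration callbacks are for users who choose to maintain compatibility actively.
- This is the largest single quality-of-life improvement for non-trivial workflows. It pays for itself the first time a typo gets caught at lint instead of as confused LLM output.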
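The resume behavior described under "7. Migration support" could be sketched as follows. This is a simplified illustration that compares version strings only; the `resume` function, the checkpoint layout, and the callback signature are assumptions, not the API of `src/agentloom/checkpointing/base.py`.

```python
from typing import Any, Callable, Optional

State = dict[str, Any]
Migration = Callable[[State, str, str], State]


class StateSchemaError(Exception):
    """Raised when a checkpoint cannot be resumed under the current schema."""


def resume(
    checkpoint: dict[str, Any],
    current_version: str,
    migration_callback: Optional[Migration] = None,
) -> State:
    """Return state usable by the current workflow version, or raise."""
    saved_version = checkpoint["version"]
    state = dict(checkpoint["state"])
    if saved_version == current_version:
        return state
    if migration_callback is None:
        # Cheapest correct behavior: fail clearly on a schema mismatch.
        raise StateSchemaError(
            f"checkpoint schema {saved_version} != workflow schema {current_version}"
        )
    return migration_callback(state, saved_version, current_version)


def add_retry_count(state: State, old: str, new: str) -> State:
    """Hypothetical 1.1 -> 1.2 migration: the new field gets a default."""
    return {**state, "retry_count": 0}


old_checkpoint = {"version": "1.1", "state": {"user_input": "hello"}}
migrated = resume(old_checkpoint, "1.2", add_retry_count)
```

Calling `resume(old_checkpoint, "1.2")` without a callback raises `StateSchemaError`, which matches the "raise on schema mismatch" default.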