Moderation provider adapters: start with OpenAI (free, easy). Azure Content Safety and AWS Comprehend are follow-ups.
Description

The current `StepType` enum (core/models.py:11-18) has five values: `LLM_CALL`, `TOOL`, `ROUTER`, `SUBWORKFLOW`, `APPROVAL_GATE`. With the additions tracked in #38 (loop), #39 (evaluator), #43 (map/reduce), and #118 (embed), there are still gaps that force users to write either custom tools or contorted YAML for routine operations:

- `transform`: pure data transformation without an LLM call. JSON parse, regex extract, jmespath query, base64 decode, CSV parse, XML parse, list slicing. Today every transformation requires either a custom Python tool or chaining a router + state mutations. Belongs as a first-class step.
- `moderate`: content moderation. OpenAI moderation endpoint, Azure Content Safety, AWS Comprehend toxicity. Critical for any workflow producing user-facing output (the AgentTest evaluator's `safety` metric in particular). Should be a primitive, not a tool.
- `delay`: explicit timed wait. Useful for polling external systems, rate-limit-aware pacing in batch workflows, simulating real-time conversation latencies in the Simulator. `await anyio.sleep(...)` should be a step, not require a custom tool.
- `embed`: embedding generation, depending on #118 (add embeddings API across providers). Listed here for completeness; details in #118.

Proposal
Add four new step types. Each is small, self-contained, and orthogonal — they can ship independently if scope-trimming is needed.
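A rough sketch of the enum change, assuming `StepType` is a string-valued `Enum`. The member values are guesses (the issue only names the members), and the members tracked in #38, #39, and #43 are left out:

```python
from enum import Enum


class StepType(str, Enum):
    """Sketch only: string values are assumed, not taken from core/models.py."""

    # Existing members named in the issue.
    LLM_CALL = "llm_call"
    TOOL = "tool"
    ROUTER = "router"
    SUBWORKFLOW = "subworkflow"
    APPROVAL_GATE = "approval_gate"

    # Proposed additions from this issue.
    TRANSFORM = "transform"
    MODERATE = "moderate"
    DELAY = "delay"
    EMBED = "embed"


# Convenience tuple (hypothetical) for code that treats the new types specially.
NEW_STEP_TYPES = (
    StepType.TRANSFORM,
    StepType.MODERATE,
    StepType.DELAY,
    StepType.EMBED,
)
```

Because the four members are independent of each other, any subset could be added without touching the rest, which matches the "ship independently" intent.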
1. `transform` step:

Built-in ops:

- `json_parse`, `json_serialize`
- `regex_extract` (with `pattern` and optional `group`)
- `regex_replace` (with `pattern`, `replacement`)
- `jmespath` (query language for JSON)
- `xml_parse`, `csv_parse`
- `base64_encode`, `base64_decode`
- `lower`, `upper`, `strip`, `truncate` (`max_chars`)
- `list_slice` (`start`, `end`), `list_filter` (`predicate`), `list_map` (`template`)

Ops are composable in order. Each op operates on the output of the previous one. The whole chain is pure Python, with no LLM and no I/O: fast, cheap, deterministic.
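The in-order composition described above could look roughly like the following. This is a minimal sketch, not the proposed `TransformStep`: the `OPS` registry, `register_op`, `validate_chain`, and `run_chain` names are hypothetical, the chain format (a list of dicts with an `op` key) is an assumption, and only a few stdlib-backed ops are shown.

```python
import base64
import json
import re
from typing import Any, Callable, Dict, List, Optional

# Hypothetical registry: op name -> callable(value, **params).
OPS: Dict[str, Callable[..., Any]] = {}


def register_op(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that adds a function to the op registry under `name`."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        OPS[name] = fn
        return fn
    return decorator


@register_op("json_parse")
def json_parse(value: str) -> Any:
    return json.loads(value)


@register_op("regex_extract")
def regex_extract(value: str, pattern: str, group: int = 0) -> Optional[str]:
    match = re.search(pattern, value)
    return match.group(group) if match else None


@register_op("base64_decode")
def base64_decode(value: str) -> str:
    return base64.b64decode(value).decode("utf-8")


@register_op("strip")
def strip(value: str) -> str:
    return value.strip()


@register_op("truncate")
def truncate(value: str, max_chars: int) -> str:
    return value[:max_chars]


def validate_chain(chain: List[Dict[str, Any]]) -> None:
    """Reject unknown op names before anything runs."""
    unknown = [spec["op"] for spec in chain if spec["op"] not in OPS]
    if unknown:
        raise ValueError(f"unknown transform op(s): {unknown}")


def run_chain(value: Any, chain: List[Dict[str, Any]]) -> Any:
    """Apply each op to the output of the previous one, in order."""
    validate_chain(chain)
    for spec in chain:
        params = {k: v for k, v in spec.items() if k != "op"}
        value = OPS[spec["op"]](value, **params)
    return value


# Usage: decode, trim, then parse a base64-wrapped JSON payload.
encoded = base64.b64encode(b'  {"id": "order-42"}  ').decode("ascii")
parsed = run_chain(
    encoded,
    [{"op": "base64_decode"}, {"op": "strip"}, {"op": "json_parse"}],
)
```

Validating the whole chain before running it is one way to get the behavior named by `test_transform_unknown_op_raises_at_validation`. A real implementation would presumably validate when the workflow is loaded rather than at run time.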
2. `moderate` step:

Output shape:

If `on_violation: fail` and any category exceeds threshold, the step fails (workflow follows error semantics). If `redact`, the input is replaced with a redaction marker before being written to state.

3. `delay` step:

Trivial wrapper over `await anyio.sleep(seconds)`. Cancellable per the engine's normal cancellation semantics.

4. `embed` step (per #118):

Scope
- `src/agentloom/core/models.py`: extend `StepType` enum; per-step config models.
- `src/agentloom/steps/transform.py`: new `TransformStep` with the op registry.
- `src/agentloom/steps/moderate.py`: new `ModerateStep`; provider adapters for OpenAI moderation, Azure Content Safety, AWS Comprehend.
- `src/agentloom/steps/delay.py`: new `DelayStep`.
- `src/agentloom/steps/embed.py`: new `EmbedStep` (after #118).
- `src/agentloom/steps/registry.py`: register the new types.
- `examples/`: one example per new step.

Regression tests

- `test_transform_jmespath_extracts_correctly`
- `test_transform_op_chain_composes_in_order`
- `test_transform_unknown_op_raises_at_validation`
- `test_moderate_openai_flags_violation_above_threshold`
- `test_moderate_redact_replaces_input`
- `test_moderate_fail_aborts_workflow`
- `test_delay_blocks_for_specified_duration`
- `test_delay_cancellable_via_workflow_cancel` (after cancellation API exists)
- `test_embed_step_writes_vectors_to_state` (covered in #118 if it lands first)

Notes

- `embed` blocks on #118; the others have no prereqs.
- `transform` op registry is intentionally extensible: users can register custom ops via the same plugin mechanism that lives outside this issue (or via the existing tool decorator pattern).
- `delay` may seem trivial but it's commonly requested for polling patterns; keeping it as a step makes the workflow's intent explicit and keeps timing visible in traces.
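For the `moderate` step, the `on_violation` handling described in the proposal can be sketched without any provider call. Everything here is an assumption for illustration: the `ModerationResult` fields, the `ModerationViolation` exception, the `[REDACTED]` marker, and the default threshold are hypothetical, and the per-category scores are passed in directly rather than fetched from OpenAI, Azure, or AWS.

```python
from dataclasses import dataclass
from typing import Dict, List

REDACTION_MARKER = "[REDACTED]"  # hypothetical marker text


class ModerationViolation(Exception):
    """Raised when on_violation is 'fail' and a category exceeds the threshold."""


@dataclass
class ModerationResult:
    flagged: bool
    categories: List[str]  # categories whose score exceeded the threshold
    text: str              # text that would be written to workflow state


def apply_policy(
    text: str,
    scores: Dict[str, float],
    threshold: float = 0.5,
    on_violation: str = "fail",
) -> ModerationResult:
    """Decide what to do with `text` given per-category scores from a provider."""
    violated = sorted(name for name, score in scores.items() if score > threshold)
    if not violated:
        return ModerationResult(flagged=False, categories=[], text=text)
    if on_violation == "fail":
        # The step fails; the engine's error semantics would take over from here.
        raise ModerationViolation(f"categories over threshold: {violated}")
    if on_violation == "redact":
        # Replace the input before anything reaches state.
        return ModerationResult(flagged=True, categories=violated, text=REDACTION_MARKER)
    raise ValueError(f"unsupported on_violation: {on_violation!r}")


# Usage: a redacting policy never lets the flagged input through.
result = apply_policy(
    "some flagged text",
    {"hate": 0.91, "violence": 0.10},
    threshold=0.5,
    on_violation="redact",
)
```

Keeping the policy separate from the provider adapter means the same `fail`/`redact` logic would apply unchanged whichever backend produced the scores.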