Feature Request: Semantic Assertions (LLM-as-a-judge) in Behavior Specs #31

## Summary
The Tactus website ([specifications page](https://tactus.anth.us/specifications/)) describes behavior specifications that can verify "invariants" rather than just exact string matches (e.g., "text must be polite" or "text must contain the answer").

While Tactus supports LLM-as-a-judge concepts in evaluations (potentially via DSPy integration), the Gherkin-style behavior specification language itself does not yet have direct, first-class support for semantic assertions.

## Proposed Feature
Add a "Semantic Assertion" step type to the spec language that allows users to define a natural language invariant that is verified by an LLM.

Example syntax concept:
```gherkin
Then the output should be polite
And the output should contain the answer to the user's question
```
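To make the syntax idea concrete, here is a minimal sketch of how a spec runner might recognize such steps and pull out the natural-language invariant. Everything here is an assumption for illustration: the `the output should ...` pattern, the `SemanticAssertion` type, and `parse_semantic_step` are hypothetical names, not part of Tactus today. A real grammar might prefer a dedicated keyword so that semantic steps cannot be confused with deterministic ones.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern: any "Then/And/But the output should <invariant>" step
# is treated as a semantic assertion.
_SEMANTIC_STEP = re.compile(
    r"^\s*(?:Then|And|But)\s+the output should\s+(?P<invariant>.+?)\s*$"
)


@dataclass(frozen=True)
class SemanticAssertion:
    """A natural-language invariant to be checked by an LLM judge."""
    invariant: str  # e.g. "be polite"


def parse_semantic_step(line: str) -> Optional[SemanticAssertion]:
    """Return a SemanticAssertion if the line matches the pattern, else None."""
    match = _SEMANTIC_STEP.match(line)
    if match is None:
        return None
    return SemanticAssertion(invariant=match.group("invariant"))
```

Lines that do not match return `None`, so existing deterministic steps would keep going through whatever step matching they use now.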

## Implementation Notes
- We should investigate building this on top of **DSPy's assertion/teleprompter capabilities** since it is already a dependency for evaluations.
- Alternatively, we could implement this using Tactus's own agent primitives (spinning up an ephemeral "judge agent" to evaluate the assertion).
- The goal is to close the gap between the high-level promise of "verifying intent" and the current implementation reality of deterministic checks.
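
Whichever backend is chosen, the judge step could look roughly like the sketch below. It is backend-agnostic on purpose: `LLMCall` is a hypothetical stand-in for either a DSPy module or a Tactus judge agent, and `Verdict`, `build_judge_prompt`, `judge`, and the PASS/FAIL reply format are all assumptions made for illustration rather than an existing API.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that sends a prompt to an LLM and returns
# the text of its reply. DSPy or a Tactus agent could sit behind it.
LLMCall = Callable[[str], str]


@dataclass(frozen=True)
class Verdict:
    passed: bool
    reason: str


def build_judge_prompt(invariant: str, output: str) -> str:
    """Build the prompt for the judge (assumed format, not a Tactus convention)."""
    return (
        "You are judging whether a piece of text satisfies a requirement.\n"
        f"Requirement: the output should {invariant}\n"
        f"Output:\n{output}\n"
        "Reply with PASS or FAIL on the first line, then a one-sentence reason."
    )


def judge(invariant: str, output: str, llm: LLMCall) -> Verdict:
    """Ask the LLM whether the output satisfies the invariant and parse its reply."""
    reply = llm(build_judge_prompt(invariant, output)).strip()
    first_line, _, rest = reply.partition("\n")
    label = first_line.strip().upper()
    if label not in ("PASS", "FAIL"):
        # Treat an unparseable reply as a failure rather than silently passing.
        return Verdict(False, f"unparseable judge reply: {reply!r}")
    return Verdict(label == "PASS", rest.strip())
```

In tests of the spec runner itself, `llm` can be a stub such as `lambda prompt: "PASS\nThe tone is courteous."`, which keeps those tests deterministic even though the assertion is semantic. Failing closed on an unparseable reply is a design choice in this sketch; retrying the judge would be another reasonable option.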
