feat(testing): Add Semantic Assertion Matchers for Intent-Level Validation #1608

Open
Mustafa11300 wants to merge 4 commits into mofa-org:main from Mustafa11300:feat/issue5-semantic-assertion-matchers
@Mustafa11300
Contributor

🔍 Description

Adds semantic assertion matchers to the mofa-testing crate. These validate meaning and intent rather than exact text, making tests resilient to harmless wording changes while maintaining strict policy and safety checks.

This is Issue 5 from the testing platform roadmap (Assertion Layer, Phase 3).

Closes #1605
Depends on #1599

📌 Changes

New: tests/src/semantic.rs

Core semantic assertion module with the following public components:

| Component | Purpose | Confidence Score |
|---|---|---|
| SemanticMatcher trait | Extensible base for all assertion types | |
| ContainsAllFactsMatcher | Validates required facts are present (case-insensitive) | Fraction of facts found |
| ExcludesContentMatcher | Validates prohibited content is absent (policy/safety) | Fraction of clean terms |
| IntentMatcher | Keyword-based intent classification with any/all modes | Fraction of intents matched |
| SimilarityMatcher | Jaccard token-overlap with configurable threshold | Similarity coefficient |
| RegexIntentMatcher | Structural pattern validation via regex | 1.0 or 0.0 |
| SemanticAssertionSet | Composable bundle of matchers with aggregated reporting | |
| SemanticExpectation | YAML/JSON file-backed assertion definitions | |
| SemanticMatchResult | Per-matcher outcome with confidence and explanation | |
| SemanticReport | Aggregated pass/fail with average confidence | |
| SemanticAssertionError | Structured error type (invalid patterns, thresholds, empty sets) | |
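
To make the "fraction of facts found" score concrete, here is a minimal, standard-library-only sketch of how a trait-based matcher could compute it. The names `Matcher`, `MatchOutcome`, and `ContainsAllFacts` are hypothetical stand-ins chosen for illustration; they are not the crate's actual types or signatures, and the handling of an empty fact list is an assumption.

```rust
// Illustrative sketch only: these names are hypothetical, not the mofa-testing API.

/// Outcome of one matcher: pass/fail, a confidence in [0.0, 1.0], and an explanation.
#[derive(Debug, Clone, PartialEq)]
pub struct MatchOutcome {
    pub passed: bool,
    pub confidence: f64,
    pub explanation: String,
}

/// Minimal stand-in for an extensible matcher trait.
pub trait Matcher {
    fn evaluate(&self, response: &str) -> MatchOutcome;
}

/// Passes only when every required fact occurs in the response (case-insensitive).
pub struct ContainsAllFacts {
    facts: Vec<String>,
}

impl ContainsAllFacts {
    pub fn new(facts: &[&str]) -> Self {
        // Lowercase once up front so evaluation only lowercases the response.
        Self { facts: facts.iter().map(|f| f.to_lowercase()).collect() }
    }
}

impl Matcher for ContainsAllFacts {
    fn evaluate(&self, response: &str) -> MatchOutcome {
        let haystack = response.to_lowercase();
        let missing: Vec<&str> = self
            .facts
            .iter()
            .filter(|f| !haystack.contains(f.as_str()))
            .map(|f| f.as_str())
            .collect();
        // Assumption: an empty fact list is vacuously satisfied with confidence 1.0.
        let confidence = if self.facts.is_empty() {
            1.0
        } else {
            (self.facts.len() - missing.len()) as f64 / self.facts.len() as f64
        };
        let explanation = if missing.is_empty() {
            "all facts present".to_string()
        } else {
            format!("missing facts: {}", missing.join(", "))
        };
        MatchOutcome { passed: missing.is_empty(), confidence, explanation }
    }
}
```

With three required facts of which two are found, this sketch reports a failure with confidence 2/3, so a caller can distinguish a near miss from a response that contains none of the facts.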

Key design decisions:

  • All matchers are deterministic — no external API calls, safe for CI and offline environments
  • All text matching is case-insensitive by default
  • Every result includes a confidence score in [0.0, 1.0] for nuanced analysis
  • Matchers are composable — combine any number in a SemanticAssertionSet
  • File-backed YAML/JSON definitions enable non-Rust contributors to author assertions

Modified: tests/src/lib.rs

  • Added pub mod semantic module registration
  • Added public re-exports for all semantic types

New: tests/tests/semantic_tests.rs

35+ comprehensive tests covering:

  • ContainsAllFactsMatcher: all present, some missing, case insensitivity, empty facts, none found
  • ExcludesContentMatcher: clean response, single violation, multiple violations, case insensitivity, empty banned list
  • IntentMatcher: single intent, no match, any-mode, require-all mode, all match, case insensitivity, empty intents
  • SimilarityMatcher: high overlap, no overlap, identical text, threshold boundary, invalid thresholds, zero threshold, case insensitivity
  • RegexIntentMatcher: pattern match, no match, invalid pattern
  • SemanticAssertionSet: all pass, partial failure, empty set error, single matcher, average confidence
  • SemanticMatchResult: constructors, serialization roundtrip
  • SemanticExpectation: YAML loading (all 5 types), JSON loading, into_matcher conversions, invalid pattern error
  • Real-world scenarios: weather agent (5 matchers), policy violation detection, mixed intent+facts

New: examples/semantic_assertions/

| File | Description |
|---|---|
| README.md | Usage guide with Rust builder API, file-backed YAML, and design rationale |
| weather_assertions.yaml | 5-matcher weather agent validation: facts, intents, similarity, pattern, policy |
| safety_assertions.yaml | Safety-focused assertions: PII exclusion, acknowledgment intent, error exclusion |

🧪 Testing

All new functionality is covered by tests/tests/semantic_tests.rs with 35+ test cases.

Key test categories:

  1. Unit tests — Each matcher type in isolation
  2. Composition tests — SemanticAssertionSet with multiple matchers
  3. File loading tests — YAML/JSON deserialization and conversion to matchers
  4. Error path tests — Invalid patterns, out-of-bounds thresholds, empty sets
  5. Real-world tests — Weather validation, policy compliance, mixed assertion scenarios
  6. Serialization tests — SemanticMatchResult JSON roundtrip
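
The composition and error-path categories above both depend on how per-matcher results are rolled up. A rough sketch of that aggregation, assuming each result is reduced to a `(passed, confidence)` pair and that an empty set is an error, might look like this. `Report`, `AggregateError`, and `aggregate` are hypothetical names for illustration, not the crate's SemanticReport or SemanticAssertionError.

```rust
// Illustrative sketch only: hypothetical types, not the mofa-testing API.

#[derive(Debug, PartialEq)]
pub enum AggregateError {
    /// No matcher results were supplied, so there is nothing to report on.
    EmptySet,
}

#[derive(Debug, PartialEq)]
pub struct Report {
    pub passed: bool,
    pub average_confidence: f64,
    pub failures: usize,
}

/// Rolls up `(passed, confidence)` pairs: the report passes only if every
/// matcher passed, and the confidence is the arithmetic mean.
pub fn aggregate(results: &[(bool, f64)]) -> Result<Report, AggregateError> {
    if results.is_empty() {
        return Err(AggregateError::EmptySet);
    }
    let failures = results.iter().filter(|r| !r.0).count();
    let total: f64 = results.iter().map(|r| r.1).sum();
    Ok(Report {
        passed: failures == 0,
        average_confidence: total / results.len() as f64,
        failures,
    })
}
```

Returning an error for an empty set, rather than a vacuous pass, keeps a misconfigured assertion file from silently reporting success.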

💡 Usage Example

Rust Builder API

use mofa_testing::{
    ContainsAllFactsMatcher, ExcludesContentMatcher, IntentMatcher,
    SemanticAssertionSet, SimilarityMatcher,
};

let assertions = SemanticAssertionSet::new()
    .add(ContainsAllFactsMatcher::new(vec!["Berlin", "temperature"]))
    .add(ExcludesContentMatcher::new(vec!["password", "error"]))
    .add(IntentMatcher::new()
        .expect_intent("weather", vec!["weather", "temperature"]))
    .add(SimilarityMatcher::new(
        "The temperature in Berlin is 22 degrees", 0.3
    ).unwrap());

let report = assertions.evaluate(
    "The current temperature in Berlin is 22 degrees Celsius."
).unwrap();

assert!(report.passed);
println!("confidence: {:.2}", report.average_confidence);

File-backed (YAML)

- kind: contains_all_facts
  facts: [Berlin, temperature]
- kind: excludes_content
  banned: [password, error]
- kind: matches_intent
  intents:
    - name: weather
      indicators: [weather, temperature, forecast]
  require_all: false
- kind: similar_to
  reference: "The temperature in Berlin is 22 degrees"
  threshold: 0.3

Loading the definitions in Rust:

use mofa_testing::SemanticExpectation;

let yaml = std::fs::read_to_string("assertions.yaml")?;
let expectations = SemanticExpectation::from_yaml_str(&yaml)?;
let set = SemanticExpectation::into_assertion_set(expectations)?;
let report = set.evaluate("The temperature in Berlin is 22 degrees and sunny.")?;
assert!(report.passed);
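
The `require_all` flag in the YAML definition above switches between "any intent" and "all intents" modes. The sketch below shows one way keyword-based intent matching with those two modes could work, using only the standard library. `Intent` and `match_intents` are hypothetical names, and treating an empty intent list as a vacuous pass is an assumption, not confirmed crate behavior.

```rust
// Illustrative sketch only: hypothetical types, not the mofa-testing API.

/// A named intent with the indicator keywords that signal it.
pub struct Intent<'a> {
    pub name: &'a str,
    pub indicators: Vec<&'a str>,
}

/// Returns `(passed, confidence)`, where confidence is the fraction of
/// intents with at least one indicator found in the response.
pub fn match_intents(response: &str, intents: &[Intent<'_>], require_all: bool) -> (bool, f64) {
    // Assumption: with no intents to check, the result is a vacuous pass.
    if intents.is_empty() {
        return (true, 1.0);
    }
    let haystack = response.to_lowercase();
    let matched = intents
        .iter()
        .filter(|intent| {
            intent
                .indicators
                .iter()
                .any(|kw| haystack.contains(&kw.to_lowercase()))
        })
        .count();
    let passed = if require_all {
        matched == intents.len() // "all" mode: every intent must be detected
    } else {
        matched > 0 // "any" mode: one detected intent is enough
    };
    (passed, matched as f64 / intents.len() as f64)
}
```

Note that plain substring matching means a short indicator such as "hi" would also match inside "this"; a real implementation may prefer token-level matching.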

Policy Compliance Check

let safety = SemanticAssertionSet::new()
    .add(ExcludesContentMatcher::new(vec![
        "password", "SSN", "credit card number"
    ]));

let report = safety.evaluate(&agent_response).unwrap();
assert!(report.passed, "agent leaked sensitive information");
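
For intuition about what a policy check like the one above has to compute, here is a small sketch of a banned-term scan that reports the violations and a "fraction of clean terms" confidence. `ExclusionCheck` and `check_exclusions` are hypothetical helpers written for this example; the substring-based matching and the empty-list behavior are assumptions.

```rust
// Illustrative sketch only: hypothetical helper, not the mofa-testing API.

/// Result of scanning a response for banned terms.
#[derive(Debug, PartialEq)]
pub struct ExclusionCheck {
    /// Banned terms that were found in the response, as originally written.
    pub violations: Vec<String>,
    /// Fraction of banned terms that did NOT appear, in [0.0, 1.0].
    pub confidence: f64,
}

impl ExclusionCheck {
    pub fn passed(&self) -> bool {
        self.violations.is_empty()
    }
}

/// Case-insensitive substring scan of `response` for each banned term.
pub fn check_exclusions(response: &str, banned: &[&str]) -> ExclusionCheck {
    let haystack = response.to_lowercase();
    let violations: Vec<String> = banned
        .iter()
        .filter(|term| haystack.contains(&term.to_lowercase()))
        .map(|term| term.to_string())
        .collect();
    // Assumption: an empty banned list means nothing can be violated.
    let confidence = if banned.is_empty() {
        1.0
    } else {
        (banned.len() - violations.len()) as f64 / banned.len() as f64
    };
    ExclusionCheck { violations, confidence }
}
```

A substring scan is strict by design: a banned term such as "error" would also flag "terror", which is the safer failure direction for a policy check but can produce false positives.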

✅ Checklist

  • Code follows project conventions and style
  • New module registered in lib.rs with public re-exports
  • Comprehensive tests added (semantic_tests.rs)
  • Examples added (examples/semantic_assertions/)
  • README with usage documentation included
  • All matchers are deterministic (no external API calls)
  • Structured error types with SemanticAssertionError
  • Confidence scores on all match results
  • File-backed YAML/JSON assertion definitions
  • No breaking changes to existing APIs
  • Builds on the DSL foundation from feat(testing): Add Agent Test DSL with Declarative Scenario Builder #1599

Implements parameterized scenario expansion for the mofa-testing crate.
One scenario template can now expand into many concrete test cases by
substituting {{variable}} placeholders with values from parameter sets.

New components:
- ParameterSet: named variable bindings for one test variant
- ParameterMatrix: Cartesian product expansion with safety limits
- ParameterizedScenario: template + parameter sets -> expanded scenarios
- ParameterizedScenarioFile: YAML/TOML/JSON file-backed loading

Includes:
- 30+ comprehensive tests covering expansion, substitution, file
  loading, execution, edge cases, and error handling
- Example scenarios in examples/parameterized_test/ with YAML, TOML,
  and matrix expansion demonstrations
- README with usage guide and code samples
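
The expansion described in this commit message can be pictured with a short sketch: a Cartesian product over named value lists, capped by a safety limit, followed by `{{variable}}` substitution. `Params`, `substitute`, and `expand_matrix` are hypothetical names for illustration and do not reflect the actual ParameterSet or ParameterMatrix API.

```rust
// Illustrative sketch only: hypothetical helpers, not the mofa-testing API.
use std::collections::BTreeMap;

/// One set of variable bindings for a single expanded test case.
pub type Params = BTreeMap<String, String>;

/// Replaces every `{{name}}` placeholder with its bound value.
/// Placeholders without a binding are left untouched.
pub fn substitute(template: &str, params: &Params) -> String {
    let mut out = template.to_string();
    for (name, value) in params {
        // "{{{{{}}}}}" formats to the literal text `{{name}}`.
        out = out.replace(&format!("{{{{{}}}}}", name), value);
    }
    out
}

/// Cartesian product of all axes, rejected once it exceeds `max_cases`.
pub fn expand_matrix(
    axes: &[(&str, Vec<&str>)],
    max_cases: usize,
) -> Result<Vec<Params>, String> {
    let mut combos: Vec<Params> = vec![Params::new()];
    for (name, values) in axes {
        let mut next = Vec::with_capacity(combos.len() * values.len());
        for combo in &combos {
            for value in values {
                let mut extended = combo.clone();
                extended.insert(name.to_string(), value.to_string());
                next.push(extended);
            }
        }
        // Safety limit: stop before the product grows out of control.
        if next.len() > max_cases {
            return Err(format!("expansion exceeds limit of {max_cases} cases"));
        }
        combos = next;
    }
    Ok(combos)
}
```

Checking the limit after each axis, rather than only at the end, means an oversized matrix is rejected before the full product is materialized.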

Closes #<ISSUE_NUMBER>

Implements golden response (snapshot) testing for the mofa-testing crate.
Agent outputs are recorded as baselines, then future runs are compared
against them to detect regressions automatically.

New components:
- GoldenSnapshot: serializable record of turn outputs (JSON/YAML)
- GoldenStore: filesystem-backed snapshot persistence
- GoldenTestConfig: strict (validate) vs. update (record) modes
- GoldenDiff: structured per-field diff reporting
- Normalizer trait + WhitespaceNormalizer, RegexNormalizer, NormalizerChain
- run_golden_test: end-to-end golden test runner integrated with TestReport
- compare_golden: standalone comparison engine

Includes:
- 30+ comprehensive tests covering serialization, store operations,
  diff detection, normalizers, update/strict mode, multi-turn, and
  tool call verification
- Example golden snapshots and README in examples/golden_response_test/
- tempfile added as dev-dependency for test isolation
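
The strict and update modes described in this commit message can be sketched as follows, using an in-memory map in place of the filesystem-backed store and a whitespace-collapsing normalizer. All names here (`Mode`, `GoldenOutcome`, `normalize_whitespace`, `check_golden`) are hypothetical and only approximate the behavior the message describes.

```rust
// Illustrative sketch only: hypothetical types, not the mofa-testing API.
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    /// Compare against the stored baseline and report differences.
    Strict,
    /// Record the current output as the new baseline.
    Update,
}

#[derive(Debug, PartialEq)]
pub enum GoldenOutcome {
    Matched,
    Recorded,
    MissingBaseline,
    Mismatch { expected: String, actual: String },
}

/// Collapses runs of whitespace to single spaces and trims both ends.
pub fn normalize_whitespace(text: &str) -> String {
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Records or validates one named output against an in-memory baseline store.
pub fn check_golden(
    store: &mut HashMap<String, String>,
    name: &str,
    output: &str,
    mode: Mode,
) -> GoldenOutcome {
    let actual = normalize_whitespace(output);
    match mode {
        Mode::Update => {
            store.insert(name.to_string(), actual);
            GoldenOutcome::Recorded
        }
        Mode::Strict => match store.get(name) {
            None => GoldenOutcome::MissingBaseline,
            Some(expected) if *expected == actual => GoldenOutcome::Matched,
            Some(expected) => GoldenOutcome::Mismatch {
                expected: expected.clone(),
                actual,
            },
        },
    }
}
```

Normalizing before both recording and comparing keeps incidental formatting differences, such as trailing newlines, from showing up as regressions.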

Closes #<ISSUE_NUMBER>

…ation

Implements semantic assertion matchers for the mofa-testing crate.
These validate meaning and intent rather than exact text, making tests
resilient to harmless wording changes while maintaining strict policy
checks.

New components:
- SemanticMatcher trait: extensible base for all assertion types
- ContainsAllFactsMatcher: validates required facts are present
- ExcludesContentMatcher: validates prohibited content is absent
- IntentMatcher: keyword-based intent classification (any/all modes)
- SimilarityMatcher: Jaccard token-overlap with configurable threshold
- RegexIntentMatcher: structural pattern validation via regex
- SemanticAssertionSet: composable bundle with aggregated reporting
- SemanticExpectation: YAML/JSON file-backed assertion definitions
- SemanticMatchResult: per-matcher outcome with confidence score
- SemanticReport: aggregated pass/fail with average confidence

All matchers are deterministic and require no external API calls,
making them safe for offline environments and CI pipelines.

Includes:
- 35+ comprehensive tests covering all matchers, assertion sets,
  file loading, error handling, and real-world composite scenarios
- Example assertion files in examples/semantic_assertions/
- README with builder API and file-backed usage guide

Closes #<ISSUE_NUMBER>