Merged
Replace hand-rolled stemmer with Snowball (rust-stemmers), add synonym expansion for query tokens, graduate trigger/name scores from binary to fractional, soften structural scoring from all-or-nothing to proportional (10 per check), and unify trigger phrase definitions across linter, tester, and upgrade modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
This PR improves the semantics and documentation of the probe/score heuristics by upgrading token normalization (Snowball stemming), adding query-side synonym expansion, graduating trigger/name scoring from binary to proportional, softening structural scoring to be proportional per check, and centralizing trigger phrase detection across modules.
Changes:
- Replace the hand-rolled stemmer with `rust-stemmers` (Snowball) and add query-side synonym expansion for probe matching.
- Make trigger/name scoring proportional (fraction matched) and unify trigger phrase detection via shared `TRIGGER_PHRASES`.
- Change structural scoring from all-or-nothing 60 points to proportional per-check points; update CLI docs/examples and changelog accordingly.
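The structural scoring change can be illustrated with a minimal sketch. Only `STRUCTURAL_POINTS_PER_CHECK` and the "No unknown fields" check are named in this PR; the function names, the tuple shape for a check, and the other check labels are assumptions made for illustration, not the code in `src/scorer.rs`.

```rust
/// Points awarded per passing structural check (the value named in this PR).
const STRUCTURAL_POINTS_PER_CHECK: u32 = 10;

/// Old behavior as described: all 60 points or nothing.
/// A check is modeled here as a hypothetical (label, passed) pair.
fn structural_all_or_nothing(checks: &[(&str, bool)]) -> u32 {
    if checks.iter().all(|(_, passed)| *passed) {
        60
    } else {
        0
    }
}

/// New behavior as described: 10 points per passing check.
/// Returns (score, max), with max derived from the number of checks.
fn structural_proportional(checks: &[(&str, bool)]) -> (u32, u32) {
    let passed = checks.iter().filter(|(_, ok)| *ok).count() as u32;
    let max = checks.len() as u32 * STRUCTURAL_POINTS_PER_CHECK;
    (passed * STRUCTURAL_POINTS_PER_CHECK, max)
}

fn main() {
    // Apart from "No unknown fields", these labels are invented for the example.
    let checks = [
        ("No unknown fields", true),
        ("Hypothetical check A", true),
        ("Hypothetical check B", false),
    ];
    let (score, max) = structural_proportional(&checks);
    println!("proportional: {score}/{max}");
    println!("all-or-nothing: {}", structural_all_or_nothing(&checks));
}
```

With one failing check out of three, the proportional version still awards 20 of 30 points, whereas the all-or-nothing version awards 0.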
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/cli.rs | Updates a CLI test fixture description to include a trigger phrase line. |
| src/tester.rs | Adds Snowball stemming, synonym expansion, and graduated trigger/name scoring; reuses shared trigger phrases. |
| src/scorer.rs | Implements proportional structural scoring (10 points per passing structural check). |
| src/linter.rs | Exposes TRIGGER_PHRASES publicly for shared use across modules. |
| src/cli/upgrade.rs | Uses shared TRIGGER_PHRASES for trigger phrase detection. |
| docs/cli.md | Documents new probe/score limitations and updates scoring descriptions/examples. |
| Cargo.toml | Adds rust-stemmers dependency. |
| Cargo.lock | Locks rust-stemmers and transitive deps. |
| CHANGES.md | Notes breaking changes due to updated probe/score semantics. |
…ural max

- Fix synonym expansion inflating denominator: use original query_set.len() instead of expanded_query.len() so synonyms can only help, never hurt
- Cache Snowball stemmer in LazyLock instead of creating per call
- Derive structural max from checks.len() * STRUCTURAL_POINTS_PER_CHECK at runtime; remove STRUCTURAL_MAX from production code
- Fix W002 double-counting: "No unknown fields" now checks W001 specifically instead of any warning
- Add dedicated synonym expansion tests (group expansion, unmatched tokens, Snowball stem verification, score non-regression)
- Restore strict Strong assertion in test_skill_returns_result_for_valid_skill

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
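The denominator fix and the `LazyLock` caching pattern from this commit can be sketched as follows. This is an assumption-laden illustration, not the code in `src/tester.rs`: the `SYNONYMS` table, its contents, and `match_fraction` are hypothetical names, and a synonym table stands in for the cached Snowball stemmer so the example needs only the standard library.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::LazyLock;

/// Hypothetical synonym groups, built once on first use and then reused.
/// The commit applies the same LazyLock pattern to the Snowball stemmer.
static SYNONYMS: LazyLock<HashMap<&'static str, Vec<&'static str>>> = LazyLock::new(|| {
    HashMap::from([
        ("build", vec!["compile", "make"]),
        ("compile", vec!["build", "make"]),
    ])
});

/// Fraction of ORIGINAL query tokens that match the description,
/// either directly or through a synonym. Dividing by the original
/// token count means synonyms can raise the score but never lower it.
fn match_fraction(query: &HashSet<&str>, description: &HashSet<&str>) -> f64 {
    if query.is_empty() {
        return 0.0;
    }
    let matched = query
        .iter()
        .filter(|tok| {
            description.contains(**tok)
                || SYNONYMS
                    .get(**tok)
                    .is_some_and(|syns| syns.iter().any(|s| description.contains(*s)))
        })
        .count();
    matched as f64 / query.len() as f64
}

fn main() {
    let query: HashSet<&str> = HashSet::from(["build", "project"]);
    let description: HashSet<&str> = HashSet::from(["compile", "project", "rust"]);
    // "build" matches via its synonym "compile"; "project" matches directly.
    println!("{:.2}", match_fraction(&query, &description));
}
```

Had the denominator been the size of the expanded query (original tokens plus all their synonyms), adding synonyms for a token that already matched would have diluted the score.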
Review response (b6220bd)
532 lib + 144 CLI + 27 plugin + 64 integration tests passing. Clippy + fmt clean.
Summary
- Replace hand-rolled stemmer with Snowball (`rust-stemmers`), add synonym expansion, graduate trigger/name scores, soften structural scoring, unify trigger phrases across modules
- Closes [TASK] Improve probe/score matching semantics #168
Changes
Matching improvements (`src/tester.rs`):

- `TRIGGER_PHRASES` from linter with `contains` (was `starts_with` on 2 phrases)

Scoring improvements (`src/scorer.rs`):

- `STRUCTURAL_POINTS_PER_CHECK = 10` constant

Shared constants (`src/linter.rs`, `src/cli/upgrade.rs`):

- `TRIGGER_PHRASES` `pub` (was private) for cross-module use
- `TRIGGER_PHRASES` instead of hardcoded "use when"/"use this when"

Documentation (`docs/cli.md`):

Breaking changes: Probe scores and score totals change for existing skills.
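The move from prefix matching on two hardcoded phrases to substring matching against a shared list might look roughly like the sketch below. The list contents beyond "use when" and "use this when", the function names, and the lowercase normalization are all assumptions for illustration; the real `TRIGGER_PHRASES` lives in `src/linter.rs`.

```rust
/// Stand-in for the shared list. Only the first two phrases appear in this
/// PR's description; the third is a made-up entry for the example.
pub const TRIGGER_PHRASES: &[&str] = &["use when", "use this when", "invoke when"];

/// Old behavior as described: prefix match on two hardcoded phrases.
fn has_trigger_old(description: &str) -> bool {
    // Lowercasing is an assumption here, not confirmed by the PR.
    let d = description.to_lowercase();
    d.starts_with("use when") || d.starts_with("use this when")
}

/// New behavior as described: substring match against the shared list.
fn has_trigger(description: &str) -> bool {
    let d = description.to_lowercase();
    TRIGGER_PHRASES.iter().any(|p| d.contains(*p))
}

fn main() {
    let description = "Formats code. Use when editing Rust files.";
    println!("old: {}", has_trigger_old(description));
    println!("new: {}", has_trigger(description));
}
```

A description whose trigger phrase appears after the first sentence is missed by the prefix check but found by the substring check, which is one reason probe scores change for existing skills.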
Test plan
- `cargo test`: 527 lib + 144 CLI + 27 plugin + 1 doc = 699 passing
- `cargo clippy -- -D warnings`: clean
- `cargo fmt --check`: clean
- `cargo test --test anthropics_skills -- --ignored`: 64 integration tests pass

🤖 Generated with Claude Code