Skip to content

headline_voice:verify CONCRETE_LEAD_PATTERNS regex produces false negatives on specific headlines #91

@AndreRobitaille

Description

@AndreRobitaille

Observation

lib/tasks/headline_voice.rake's verify task reports a Criterion 4 "concrete lead" failure when a headline doesn't match any of its regex patterns in the first 40 characters. In the 2026-04-11 validation run against 56 homepage-eligible briefings, the rake task reported 26/56 concrete leads (below the 38/56 threshold) — but manual inspection of the 10 flagged headlines revealed that 7 of them are actually specific and well-written, and the regex is simply missing them.

Example false negatives

From the actual verification run:

  • "Two Rivers weighs ending two TIF districts early and putting up to $27,536 into motel/hotel upgrades" — the $27,536 appears past the 40-char window
  • "Harbor plan lists dredging, seawall rebuild, and a breakwater extension for 2027–2029 WisDOT funding" — "2027" appears past the 40-char window
  • "Construction signs: Council weighs a 30-day removal rule after work ends or before occupancy" — "30-day removal rule" is specific but the patterns only match years, ordinals, dollars, percentages
  • "Kay Koach honored; Tracey Koach appointed to the Plan Commission seat she held" — named people and named commission, but no pattern catches proper nouns
  • "Two PUD projects: St. Mark's Square amendment at 1110 Victory St, and Elite Builds housing plan at 3000 Forest Ave" — addresses and street names present, but "Victory" and "1110" aren't in the regex
  • "Council lined up votes to end TIF district 13 and 16 early; TIF district 16 value returns to tax rolls in 2027" — "Council lined up votes" doesn't match \bCouncil\s+(?:votes|picks) because "lined" is between them

Actual concrete-lead count on 2026-04-11 was closer to 53/56 (~95%), well above the 2/3 threshold. The verifier's threshold failure was misleading.

Root cause

The regex set is too narrow:

CONCRETE_LEAD_PATTERNS = [
  /\$[\d,.]+/,
  /\b20\d\d\b/,
  /\b\d+(?:st|nd|rd|th)\b/i,
  /\b\d+%/,
  /\b(?:Lincoln|Washington|Main|Memorial|Forest|Twin)/i,
  /\bCouncil\s+(?:votes|picks)/i,
  /\b(?:Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec|Jan|Feb|Mar)\b/
].freeze

Issues:

  1. 40-char window is too tight. A headline like "Two Rivers weighs ending two TIF districts early and putting up to \$27,536..." has its concrete detail at char 64+.
  2. Missing pattern families: generic numeric specifics (30-day, 2,200 feet, Bump importmap-rails from 2.2.2 to 2.2.3 #3), proper nouns (Hamilton, St. Mark's, Kay Koach), named actions beyond "votes/picks" (weighs, picks, honored, lined up, reviews), full street-name library (only 6 streets currently, Two Rivers has dozens).
  3. "Council lined up votes" regression: the pattern \bCouncil\s+(?:votes|picks) requires immediate adjacency, which misses any intervening verb or modifier.

Suggested fixes

Two paths, in order of preference:

Option A: Widen the regex set (cheap, low-risk)

  • Expand window from 40 chars to 60 or 80
  • Add number + unit patterns: \b\d+(?:-day|-year|-month|-week)\b, \b\d[\d,]*\s*(?:feet|acres|households?|lots?|blocks?|units?)\b
  • Add capital-case proper-noun detection: \b[A-Z][a-z]+\s+[A-Z][a-z]+ for two-word proper nouns (catches "Sandy Bay", "Kay Koach", "St. Mark's", etc.)
  • Replace the fixed street-name list with a broader pattern like \b(?:Lincoln|Washington|Main|Memorial|Forest|Twin|Hamilton|Jefferson|Victory|Jackson|Roosevelt|Pierce|Mishicot|Zlatnik|Emmet|Neshotah|Lighthouse|Cool City)\b OR auto-derive the list from city GIS data
  • Allow "Council" action to accept any verb within 2 words: \bCouncil\s+\w+(?:\s+\w+)?\s+(?:votes?|picks?|approves?|denies?|weighs?|reviews?)

Option B: Invert the logic — check for meta-commentary instead of specificity

Rather than asking "does this lead with a concrete detail?", ask "does this lead with a vague abstract noun or meta-commentary phrase?". That's a shorter banlist and easier to maintain:

  • Leading with abstract nouns: "Updates", "Discussion", "Review", "Activity", "Status"
  • Meta-commentary phrases: "keeps", "stays", "remained", "agendas list", "shows up", "lines up"
  • Bureaucratic openings: "The city", "Committee", "Board"

This is more aligned with how the prompt itself is structured (banned closers + banned phrases rather than required phrases).

Priority

Low. The verifier is a one-off dev tool used after prompt changes; it doesn't run in production. Real homepage quality is measured by reading the cards, not by the rake task output. This issue exists so the verifier gives more trustworthy automated signals on the next prompt iteration rather than requiring manual review every time.

Context

Found during: the 2026-04-11 prod backfill validation after the homepage headline voice prompt rewrite (spec: `docs/superpowers/specs/2026-04-10-homepage-headline-voice-design.md`). The verifier reported FAIL on Criterion 4 but manual inspection showed the actual pass rate was ~95%.

Related files:

  • `lib/tasks/headline_voice.rake` (the verify task)
  • `docs/superpowers/specs/2026-04-10-homepage-headline-voice-design.md` (the spec that defines the success criteria)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions