fix: screening/evaluating data quality and Cypher query bugs#53

Merged
sonesuke merged 3 commits into main from fix/screening-abstract-and-legal-status
Apr 5, 2026

@sonesuke sonesuke commented Apr 5, 2026

Summary

  • Screening: Separate legal_status from judgment in the schema, use fetch_patent.abstract_text instead of the search snippet, and add batch parallel fetch (up to 10 patents)
  • Evaluating: Change the Cypher query to MATCH (c:claims) RETURN c.number, c.text with no ORDER BY clause, because a parser bug makes c.text return null when ORDER BY is present
  • Tests: Add checks for legal_status_recorded, all_patents_screened, patent_fetch_invoked in screening/evaluating tests

Root Cause

Two Cypher parser bugs in google-patent-cli were identified:

  1. ORDER BY toInteger(c.number) causes c.text to return expression: null
  2. The relationship pattern (p:Patent)-[:claims]->(c:claims) also causes c.text to return null

Workaround: use MATCH (c:claims) RETURN c.number, c.text (no ORDER BY, direct node match)
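
Because the workaround drops ORDER BY, claims may no longer come back in numeric order. If order matters downstream, one option is to sort on the client. The following is a minimal sketch only: the row shape (a list of dicts keyed by `c.number` and `c.text`) and the helper name `sort_claims` are assumptions for illustration, not the actual output format of google-patent-cli or code from this PR.

```python
# Hypothetical rows, shaped like the result of
#   MATCH (c:claims) RETURN c.number, c.text
# The real CLI output format may differ; this shape is an assumption.
rows = [
    {"c.number": "10", "c.text": "The method of claim 1, further comprising ..."},
    {"c.number": "2", "c.text": "The method of claim 1, wherein ..."},
    {"c.number": "1", "c.text": "A method comprising ..."},
]


def sort_claims(rows):
    """Order claims numerically on the client.

    Stands in for the removed ORDER BY toInteger(c.number) clause.
    """
    return sorted(rows, key=lambda row: int(row["c.number"]))


ordered = sort_claims(rows)
print([row["c.number"] for row in ordered])
```

Sorting with `int(...)` rather than on the raw string keeps claim 10 after claim 2, which a plain string sort would not.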

Test plan

  • screening/functional-with-data — PASS (101s)
  • evaluating/functional — claims correctly retrieved with new query pattern
  • Full test suite via mise run test

🤖 Generated with Claude Code

web-flow and others added 3 commits April 5, 2026 08:35
…source in screening

The screening skill was storing search snippets as abstract_text instead
of the official abstract from fetch_patent. This was caused by:

1. No explicit instruction to use fetch_patent.abstract_text (not snippet)
2. No legal_status column in screened_patents table, causing judgment
   to conflate relevance assessment with legal status
3. Tests only verified fetch_patent was invoked, not that results were used

Changes:
- Add legal_status column to screened_patents table
- Change judgment CHECK constraint to only allow relevant/irrelevant
- Update screening SKILL.md to explicitly distinguish abstract_text
  from snippet and batch fetch patents (up to 10 in parallel)
- Add CRITICAL warnings against using snippet as abstract
- Update record-screening.md with legal_status parameter
- Add test checks: all_patents_screened, legal_status_recorded,
  patent_fetch_invoked for both screening and evaluating
- Update all test fixtures to include legal_status in INSERT statements
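
The schema change listed above might look roughly like the sketch below. It assumes a SQLite database and uses a hypothetical `patent_id` column and a hypothetical `record_screening` helper; only the table name, the `abstract_text`, `legal_status`, and `judgment` columns, and the relevant/irrelevant constraint come from this commit message.

```python
import sqlite3

# Assumption: SQLite is used here purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE screened_patents (
        patent_id     TEXT PRIMARY KEY,  -- hypothetical column name
        abstract_text TEXT,              -- official abstract, not the search snippet
        legal_status  TEXT,              -- recorded separately from judgment
        judgment      TEXT NOT NULL
            CHECK (judgment IN ('relevant', 'irrelevant'))
    )
    """
)


def record_screening(patent_id, abstract_text, legal_status, judgment):
    """Hypothetical helper: insert one screening result."""
    conn.execute(
        "INSERT INTO screened_patents VALUES (?, ?, ?, ?)",
        (patent_id, abstract_text, legal_status, judgment),
    )


# A relevance judgment with legal status stored in its own column.
record_screening("EXAMPLE-1", "An example abstract.", "Active", "relevant")

# A legal status written into judgment is rejected by the CHECK constraint.
try:
    record_screening("EXAMPLE-2", "Another abstract.", "Expired", "expired")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

Keeping the constraint on `judgment` narrow means a row that conflates relevance with legal status fails at insert time rather than passing silently into later stages.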

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Change from relationship pattern MATCH (p:Patent)-[:claims]->(c:claims)
  to direct node match MATCH (c:claims) to avoid c.text returning null
- Remove ORDER BY toInteger(c.number), which also causes c.text to return null
- Add batch parallel fetch (up to 10 patents) for performance
- Add CRITICAL warnings documenting the Cypher parser bugs
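
The batch parallel fetch mentioned above (up to 10 patents at a time) follows a common pattern, sketched here with a thread pool. This is an illustration only: in the PR the batching is described in the skill instructions, and `fetch_patent` below is a hypothetical stand-in that returns canned data, not the real tool call.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 10  # "up to 10 in parallel", per the commit message


def fetch_patent(patent_id):
    """Hypothetical stand-in for the real fetch_patent call."""
    return {"id": patent_id, "abstract_text": f"Official abstract for {patent_id}"}


def fetch_in_batches(patent_ids, batch_size=BATCH_SIZE):
    """Fetch patents in consecutive batches, each batch running in parallel."""
    results = []
    for start in range(0, len(patent_ids), batch_size):
        batch = patent_ids[start:start + batch_size]
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            # pool.map yields results in the same order as the input batch.
            results.extend(pool.map(fetch_patent, batch))
    return results


patents = fetch_in_batches([f"P{i}" for i in range(25)])
print(len(patents))
```

Capping each batch at a fixed size bounds the number of concurrent requests while still avoiding one-at-a-time fetching.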

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sonesuke sonesuke merged commit f3fcb62 into main Apr 5, 2026
3 checks passed
@sonesuke sonesuke deleted the fix/screening-abstract-and-legal-status branch April 5, 2026 13:54
2 participants