fix: bypass LLM generation for claims recording #54

Merged
sonesuke merged 10 commits into main from fix/screening-abstract-and-legal-status on Apr 6, 2026

Conversation

@sonesuke (Owner) commented Apr 5, 2026

Summary

  • Claims text recording now uses sqlite3 readfile() + json_each() + json_extract() to INSERT directly from fetch_patent output_file
  • Eliminates LLM text regeneration which was corrupting long repetitive claim structures

Problem

LLM was regenerating claim text when constructing SQL INSERT statements, causing:

  • Merged elements (2 independent means combined into 1)
  • Dropped modifiers ("アップロードされた" [uploaded], "収集した情報を基に" [based on the collected information])
  • Summarized repetitive structures

Solution

output_file → readfile() → json_extract('$.claims') → json_each() → INSERT

Claims text flows through the file system, not through LLM generation.
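
The flow above can be sketched in Python with the standard `sqlite3` module. This is a minimal sketch under stated assumptions, not the skill's actual SQL: the `claims` table layout, the `number` and `text` keys inside each claim object, and the `record_claims` helper are all hypothetical. `readfile()` is provided by the sqlite3 command-line shell but not by Python's `sqlite3` module, so the sketch registers a stand-in function with the same name.

```python
import sqlite3
from pathlib import Path


def record_claims(conn: sqlite3.Connection, patent_id: str, output_file: str) -> None:
    """Copy claims from a fetch_patent output file into the DB without retyping text."""
    # Stand-in for the sqlite3 shell's readfile(); returns the file content as text.
    conn.create_function("readfile", 1, lambda p: Path(p).read_text(encoding="utf-8"))
    # Hypothetical table layout, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claims ("
        "patent_id TEXT, claim_number INTEGER, claim_text TEXT)"
    )
    # Assumed JSON shape: {"claims": [{"number": ..., "text": ...}, ...]}
    conn.execute(
        """
        INSERT INTO claims (patent_id, claim_number, claim_text)
        SELECT ?,
               CAST(json_extract(j.value, '$.number') AS INTEGER),
               json_extract(j.value, '$.text')
        FROM json_each(json_extract(readfile(?), '$.claims')) AS j
        """,
        (patent_id, output_file),
    )
```

Only the patent ID and the file path are passed as parameters; the claim text itself is read and inserted by SQLite, so nothing upstream has a chance to paraphrase it.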

Test plan

  • Verified: 23 claims from US12231380B2 recorded with complete text (2331 chars for claim 1)
  • evaluating/functional test

🤖 Generated with Claude Code

web-flow and others added 7 commits April 5, 2026 08:35
…source in screening

The screening skill was storing search snippets as abstract_text instead
of the official abstract from fetch_patent. This was caused by:

1. No explicit instruction to use fetch_patent.abstract_text (not snippet)
2. No legal_status column in screened_patents table, causing judgment
   to conflate relevance assessment with legal status
3. Tests only verified fetch_patent was invoked, not that results were used

Changes:
- Add legal_status column to screened_patents table
- Change judgment CHECK constraint to only allow relevant/irrelevant
- Update screening SKILL.md to explicitly distinguish abstract_text
  from snippet and batch fetch patents (up to 10 in parallel)
- Add CRITICAL warnings against using snippet as abstract
- Update record-screening.md with legal_status parameter
- Add test checks: all_patents_screened, legal_status_recorded,
  patent_fetch_invoked for both screening and evaluating
- Update all test fixtures to include legal_status in INSERT statements
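
The schema change listed above can be illustrated with a minimal sketch. Only `screened_patents`, `abstract_text`, `legal_status`, `judgment`, and `reason` are named in this PR; the `patent_id` column, the column types, and the `try_insert` helper are assumptions made for the example.

```python
import sqlite3

# Hypothetical layout: judgment is restricted to relevance only, while
# legal status lives in its own column so the two are no longer conflated.
SCHEMA = """
CREATE TABLE screened_patents (
    patent_id     TEXT PRIMARY KEY,
    abstract_text TEXT,
    legal_status  TEXT,
    judgment      TEXT NOT NULL CHECK (judgment IN ('relevant', 'irrelevant')),
    reason        TEXT
)
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketched table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def try_insert(conn: sqlite3.Connection, patent_id: str,
               legal_status: str, judgment: str) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO screened_patents (patent_id, legal_status, judgment) "
            "VALUES (?, ?, ?)",
            (patent_id, legal_status, judgment),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this constraint, a legal-status value written into `judgment` by mistake is rejected at insert time instead of silently stored.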

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Change from relationship pattern MATCH (p:Patent)-[:claims]->(c:claims)
  to direct node match MATCH (c:claims) to avoid c.text returning null
- Remove ORDER BY toInteger(c.number) which also causes c.text null bug
- Add batch parallel fetch (up to 10 patents) for performance
- Add CRITICAL warnings documenting the Cypher parser bugs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nctions

Claims text was being corrupted by LLM regeneration during recording.
Now uses readfile() + json_each() + json_extract() to INSERT directly
from fetch_patent output_file, eliminating LLM text intermediary.

- Claims: output_file → sqlite3 JSON functions → DB (mechanical)
- Elements: still via investigation-recording skill (LLM interpretation)
- claim_type: initial INSERT defaults claim 1 to independent, then
  LLM reads from DB and UPDATEs correct independent claim numbers
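
The two-step `claim_type` handling described above might look like the following sketch. The table layout, the `'dependent'` value for non-independent claims, and both helper names are assumptions; for brevity the claims are passed in as Python values here, whereas the real flow inserts them from `output_file`. The point is only that the later correction touches `claim_type` and never the claim text.

```python
import sqlite3


def insert_with_default_type(conn: sqlite3.Connection, patent_id: str,
                             claims: list[tuple[int, str]]) -> None:
    """Initial INSERT: claim 1 defaults to independent, all others to dependent."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claims ("
        "patent_id TEXT, claim_number INTEGER, claim_text TEXT, claim_type TEXT)"
    )
    conn.executemany(
        """
        INSERT INTO claims (patent_id, claim_number, claim_text, claim_type)
        VALUES (:pid, :num, :text,
                CASE WHEN :num = 1 THEN 'independent' ELSE 'dependent' END)
        """,
        [{"pid": patent_id, "num": n, "text": t} for n, t in claims],
    )


def set_independent_claims(conn: sqlite3.Connection, patent_id: str,
                           independent_numbers: list[int]) -> None:
    """Follow-up UPDATE: apply the independent claim numbers chosen after review."""
    conn.execute(
        "UPDATE claims SET claim_type = 'dependent' WHERE patent_id = ?",
        (patent_id,),
    )
    conn.executemany(
        "UPDATE claims SET claim_type = 'independent' "
        "WHERE patent_id = ? AND claim_number = ?",
        [(patent_id, n) for n in independent_numbers],
    )
```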

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use sqlite3 readfile() + json_extract() to extract abstract_text and
legal_status directly from fetch_patent output_file, eliminating LLM
text regeneration that could corrupt patent abstracts.

- abstract_text and legal_status: output_file → readfile() → json_extract() → DB (mechanical)
- judgment and reason: LLM analysis (correct - these are interpretation tasks)
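
The split described above, with mechanical fields coming from the file and interpretive fields coming from the LLM, can be sketched as follows. The assumptions are that `abstract_text` and `legal_status` are top-level keys in `output_file`, that the table has a `patent_id` column, and that a helper named `record_screening` exists; none of these are confirmed by the PR. `readfile()` is registered as a stand-in because Python's `sqlite3` module does not ship the sqlite3 shell's function.

```python
import sqlite3
from pathlib import Path


def record_screening(conn: sqlite3.Connection, patent_id: str, output_file: str,
                     judgment: str, reason: str) -> None:
    """Insert one screening row; abstract and legal status come straight from the file."""
    # Stand-in for the sqlite3 shell's readfile(); returns the file content as text.
    conn.create_function("readfile", 1, lambda p: Path(p).read_text(encoding="utf-8"))
    # Hypothetical table layout, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS screened_patents ("
        "patent_id TEXT PRIMARY KEY, abstract_text TEXT, legal_status TEXT, "
        "judgment TEXT CHECK (judgment IN ('relevant', 'irrelevant')), reason TEXT)"
    )
    conn.execute(
        """
        INSERT INTO screened_patents
            (patent_id, abstract_text, legal_status, judgment, reason)
        SELECT ?,
               json_extract(doc.body, '$.abstract_text'),
               json_extract(doc.body, '$.legal_status'),
               ?, ?
        FROM (SELECT readfile(?) AS body) AS doc
        """,
        (patent_id, judgment, reason, output_file),
    )
```

Only `judgment` and `reason` are supplied by the caller, which matches the division of labor in the commit message: interpretation is the LLM's job, transcription is not.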

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
output_file already contains all needed data (abstract_text, legal_status).
No need to query the graph DB with execute_cypher.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…t-and-legal-status

# Conflicts:
#	plugin/skills/evaluating/SKILL.md
#	plugin/skills/screening/SKILL.md
@sonesuke force-pushed the fix/screening-abstract-and-legal-status branch from eea7c53 to 982028f on April 6, 2026 11:14
web-flow and others added 3 commits April 6, 2026 11:17
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Avoids version mismatch between CI cached versions and local npx.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sonesuke merged commit 7a143b8 into main on Apr 6, 2026
3 checks passed
@sonesuke deleted the fix/screening-abstract-and-legal-status branch on April 6, 2026 11:28