Fix case-sensitive search in PDF term matching by rootdevss · Pull Request #3 · jafrank88/CaseStrainer

rootdevss · 2026-02-09T21:01:09Z

Summary

This PR addresses a bug in the PDF search functionality where term matching was case-sensitive, potentially causing valid matches to be missed.

Problem

The current implementation in check_pdf.py uses case-sensitive string comparison:

if term in text:

This causes the search to fail when the case differs between the search term and the PDF content. For a tool designed to identify legal citations, this is particularly problematic because:

Legal citations may appear in various case formats (e.g., F.3d vs f.3d)
Case names can be written differently (e.g., Singh-Kaur vs singh-kaur vs SINGH-KAUR)
Federal Reporter citations may vary in capitalization
The tool could miss legitimate citations simply due to case differences

Solution

Implemented case-insensitive string matching by converting both the search term and extracted text to lowercase before comparison:

if term.lower() in text.lower():

Additionally updated the context extraction logic to use the same case-insensitive approach when finding the position of matches:

pos = text.lower().find(term.lower())
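
As a minimal sketch of how the match test and the position lookup fit together, the following lowercases the text once and reuses it for both steps. The function name `find_term_context`, the `width` parameter, and the sample string are hypothetical and not taken from check_pdf.py; the sketch also assumes that lowercasing does not change the string's length, which holds for ASCII text but not for every Unicode character.

```python
def find_term_context(text, term, width=40):
    """Return a snippet around the first case-insensitive match, or None.

    Hypothetical helper for illustration; not the actual check_pdf.py code.
    """
    lowered = text.lower()            # lowercase once, reuse for the lookup
    pos = lowered.find(term.lower())  # -1 when the term is absent
    if pos == -1:
        return None
    # Slice the ORIGINAL text so the snippet keeps its original casing.
    # Assumes lower() preserved string length (true for ASCII input).
    start = max(0, pos - width)
    end = min(len(text), pos + len(term) + width)
    return text[start:end]


# Made-up sample text, for illustration only.
sample = "Cited in SINGH-KAUR, 123 F.3d 456."
print(find_term_context(sample, "f.3d", width=4))
```

Slicing the original text rather than the lowercased copy keeps the reported context readable. If the PDFs may contain non-ASCII text, `str.casefold()` is the standard-library method intended for caseless comparison, though it can also change string length.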

Testing Recommendation

To verify this fix works correctly, test with PDFs containing:

Mixed case legal citations
All uppercase case names
All lowercase reporter citations
Variations in spacing and punctuation combined with different letter case
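
The checks above could be sketched as a small table-driven test on plain strings before moving to real PDFs. The helper `contains_term` is a hypothetical stand-in for the comparison in check_pdf.py, and the citation strings are made up. The last case records an assumption worth confirming: lowercasing alone does not bridge spacing differences.

```python
def contains_term(text, term):
    """Case-insensitive substring test (hypothetical stand-in for the PR's check)."""
    return term.lower() in text.lower()


# (text, term, expected) triples; all sample strings are illustrative only.
CASES = [
    ("See Singh-Kaur, 123 F.3d 456", "singh-kaur", True),  # mixed case
    ("SEE SINGH-KAUR, 123 F.3D 456", "Singh-Kaur", True),  # all uppercase
    ("see singh-kaur, 123 f.3d 456", "F.3d", True),        # all lowercase
    ("See Singh-Kaur, 123 F. 3d 456", "F.3d", False),      # spacing differs
]

for text, term, expected in CASES:
    assert contains_term(text, term) == expected, (text, term)
print("all cases passed")
```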

Impact

This change ensures that matches are no longer missed solely because of differences in letter case, improving the reliability and accuracy of the hallucination detection tool.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature
Breaking change
Documentation update

The previous implementation used case-sensitive string matching, which could miss valid matches when the case differed between the search term and the PDF content. This is particularly problematic for legal citations which may appear in various case formats (e.g., 'F.3d' vs 'f.3d', 'Singh-Kaur' vs 'singh-kaur'). This commit makes the search case-insensitive by converting both the search term and the extracted text to lowercase before comparison, ensuring all valid matches are found regardless of case.

rootdevss wants to merge 1 commit into jafrank88:main from rootdevss:fix-case-sensitive-search
