feat(windows): add hidden OCR context capture from frontmost window for post-processing by evrenesat · Pull Request #808 · cjpais/Handy

evrenesat · 2026-02-12T23:27:07Z

Summary

This PR adds Windows support for the hidden OCR context template variable (${OCR} / ${ocr}) used in post-processing prompts. It is the Windows counterpart of #770.

The captured text is read from the current foreground window and injected only into the prompt payload.

Why

This improves prompt quality for post-processing while keeping the app local-first and extensible:

Better context-aware rewriting/correction prompts
No cloud dependency added
Clear platform-specific implementation that is easy to fork/extend

What changed

Added Windows OCR capture module:
- src-tauri/src/windows_ocr.rs
Wired Windows OCR into prompt-building flow:
- src-tauri/src/actions.rs
Registered Windows OCR module:
- src-tauri/src/lib.rs
Added required Windows API feature flags:
- src-tauri/Cargo.toml

Behavior

OCR context is used only if the prompt contains ${OCR} or ${ocr}.
If OCR capture fails, processing continues with an empty OCR context.
On platforms not supported by this implementation path, the OCR context resolves to an empty string.
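
For readers skimming the behavior rules above, here is a minimal sketch of how they could fit together. It is not the code from this PR: `needs_ocr`, `substitute_ocr`, `build_prompt`, and `capture_frontmost_text` are hypothetical names, and the capture function is a stub that stands in for the real Windows OCR call.

```rust
/// Both placeholders happen to be the same length in bytes.
const PLACEHOLDERS: [&str; 2] = ["${OCR}", "${ocr}"];
const PLACEHOLDER_LEN: usize = 6;

/// True if the prompt template references the OCR variable at all.
fn needs_ocr(template: &str) -> bool {
    PLACEHOLDERS.iter().any(|p| template.contains(p))
}

/// Replace every placeholder in a single pass, so that captured text which
/// itself contains "${ocr}" is never substituted a second time.
fn substitute_ocr(template: &str, ocr: &str) -> String {
    let mut out = String::with_capacity(template.len() + ocr.len());
    let mut rest = template;
    // Find the earliest occurrence of either placeholder in the remainder.
    while let Some(i) = PLACEHOLDERS.iter().filter_map(|p| rest.find(p)).min() {
        out.push_str(&rest[..i]);
        out.push_str(ocr);
        rest = &rest[i + PLACEHOLDER_LEN..];
    }
    out.push_str(rest);
    out
}

/// Build the final prompt. `capture` runs only when the template asks for
/// OCR context, and any capture error degrades to an empty string.
fn build_prompt<F>(template: &str, capture: F) -> String
where
    F: FnOnce() -> Result<String, String>,
{
    if !needs_ocr(template) {
        return template.to_string();
    }
    let ocr = capture().unwrap_or_default();
    substitute_ocr(template, &ocr)
}

/// Hypothetical stub: a real implementation would capture and OCR the
/// foreground window here. This sketch always reports a failure.
#[cfg(windows)]
fn capture_frontmost_text() -> Result<String, String> {
    Err("OCR capture not implemented in this sketch".to_string())
}

/// On other platforms the OCR context resolves to an empty string.
#[cfg(not(windows))]
fn capture_frontmost_text() -> Result<String, String> {
    Ok(String::new())
}
```

Passing the capture step as a closure keeps the expensive screen capture lazy: a prompt without the placeholder never triggers it, and a failing capture cannot abort post-processing.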

Testing

Tested on Windows:

Configure a post-processing prompt containing ${OCR}.
Start transcription with post-processing enabled.
Verified prompt receives OCR text from frontmost window.
Verified fallback to empty OCR context when capture fails.
Verified normal transcription flow is unchanged.

Note: I built and manually tested this on a Windows 11 ARM guest running on macOS. Even with the help of Codex, the build took hours on arm64. I tested that post-processing works both with and without OCR prompts. In my limited testing with OCR, the model consistently used the correct symbols in the dictation result (e.g. createUpdaterArtifacts instead of "create-updater-artifacts", "Create Updater Artifacts", or "create updater artifacts").

Scope / Non-goals

This PR is Windows-only OCR.
No local machine build-workaround changes are included in this PR. (hopefully)
No updater/signing config changes included.

Breaking changes

None.

AI Assistance Disclosure

AI used: Yes
Tools used: GPT-5 Codex / ChatGPT
Extent: Assisted with implementation scaffolding, conflict resolution, and debugging workflow; final code reviewed and validated manually.

Note: I'm planning to try to create another PR for Linux as well, using Tesseract.

cjpais · 2026-02-16T14:42:52Z

@evrenesat would you mind merging this into the same PR as #770? I would rather have the feature come in all at once when I decide to merge it. Closing this PR for now; I'm trying to slim down the number of active PRs so I can keep some sanity and manage maintenance.

feat(windows): add OCR context capture for frontmost window

78de1d5

cjpais closed this Feb 16, 2026

evrenesat wants to merge 1 commit into cjpais:main from evrenesat:windows-ocr-pr-main