Skip to content

fix(ocr_client): default to 'Text Recognition:' prompt in ollama_generate mode#208

Merged
JaredforReal merged 2 commits intozai-org:mainfrom
mufradhossain:fix/ollama-empty-prompt
Apr 21, 2026
Merged

fix(ocr_client): default to 'Text Recognition:' prompt in ollama_generate mode#208
JaredforReal merged 2 commits intozai-org:mainfrom
mufradhossain:fix/ollama-empty-prompt

Conversation

@mufradhossain
Copy link
Copy Markdown
Contributor

Problem

When using api_mode: ollama_generate, the SDK sends requests to Ollama
with an empty prompt. Ollama requires a non-empty prompt to trigger the
model — without it, the model returns empty content and json_result is [[]].

Root Cause

The SDK builds the prompt from message content, but GLM-OCR's internal
pipeline sends image-only requests with no text. This works against the
native GLM-OCR API but Ollama silently ignores empty prompts.

Fix

Default to "Text Recognition:" when the prompt is empty, matching
Ollama's official documented usage for GLM-OCR.

Note: any non-empty prompt works (even "."), but "Text Recognition:"
is used as it matches the official Ollama docs for this model.

Fixes #207

@JaredforReal
Copy link
Copy Markdown
Collaborator

@mufradhossain unit tests failed, can u update the unit test for this scenario?

@mufradhossain
Copy link
Copy Markdown
Contributor Author

The test update is already in the second commit 7ddd841. It updates both test_empty_messages and test_no_messages_key to expect "Text Recognition:". The CI is pending your workflow approval to run.

Copy link
Copy Markdown
Collaborator

@JaredforReal JaredforReal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! @mufradhossain

@JaredforReal JaredforReal merged commit cef4d0e into zai-org:main Apr 21, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OCR not extracing text?

2 participants