fix(ocr_client): default to 'Text Recognition:' prompt in ollama_generate mode #208
Merged
JaredforReal merged 2 commits into zai-org:main on Apr 21, 2026
Conversation
Collaborator
@mufradhossain the unit tests failed; can you update the unit test for this scenario?
Contributor
Author
The test update is already in the second commit 7ddd841. It updates both test_empty_messages and test_no_messages_key to expect "Text Recognition:". The CI is pending your workflow approval to run.
JaredforReal approved these changes on Apr 21, 2026
Collaborator
JaredforReal
left a comment
LGTM, thanks! @mufradhossain
Problem
When using api_mode: ollama_generate, the SDK sends requests to Ollama with an empty prompt. Ollama requires a non-empty prompt to trigger the model; without one, the model returns empty content and json_result is [[]].
Root Cause
The SDK builds the prompt from message content, but GLM-OCR's internal
pipeline sends image-only requests with no text. This works against the
native GLM-OCR API but Ollama silently ignores empty prompts.
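As a rough illustration (field names follow Ollama's /api/generate API; the model tag and image bytes below are made up, not taken from the SDK), the image-only request the pipeline produces looks like this, and Ollama answers it with empty content rather than an error:

```python
import base64

# Hypothetical request body for Ollama's /api/generate endpoint.
# The model tag and image bytes are illustrative placeholders.
image_b64 = base64.b64encode(b"<png bytes here>").decode("ascii")

payload = {
    "model": "glm-ocr",     # assumed model tag
    "prompt": "",           # image-only request: no text content at all
    "images": [image_b64],  # Ollama accepts base64-encoded images
    "stream": False,
}

# POSTed to /api/generate, a payload with an empty "prompt" yields an
# empty "response" field instead of an error, which then surfaces in
# the SDK as json_result == [[]].
```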
Fix
Default to "Text Recognition:" when the prompt is empty, matching Ollama's officially documented usage for GLM-OCR.
Note: any non-empty prompt works (even "."), but "Text Recognition:" is used because it matches the official Ollama docs for this model.
Fixes #207