Fix: flush partial stop string when <EOG> is reached in /completion endpoint in streaming mode #15007

Open
matteoserva wants to merge 1 commit into master

matteoserva
Contributor

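In streaming mode, the /completion endpoint correctly withholds generated text that matches the beginning of a stop string until it can decide whether the full stop string follows.

If EOG (end of generation) is reached before the full stop string is found, the withheld text is part of the output and should be flushed to the user. Currently it is dropped.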
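
The hold-and-flush behavior described above can be illustrated with a small Python sketch. It is not the server's code: `partial_stop_len`, `stream_with_stops`, and the `flush_on_eog` switch are hypothetical names made up for this example, and the end of the `pieces` iterable stands in for EOG.

```python
def partial_stop_len(text: str, stop: str) -> int:
    """Length of the longest proper prefix of `stop` that is a suffix of `text`."""
    for n in range(min(len(stop) - 1, len(text)), 0, -1):
        if text.endswith(stop[:n]):
            return n
    return 0


def stream_with_stops(pieces, stops, flush_on_eog=True):
    """Yield text chunks, withholding anything that might begin a stop string.

    `pieces` is an iterable of decoded token strings (an assumption of this
    sketch); exhausting it stands in for reaching EOG.
    """
    stops = [s for s in stops if s]  # ignore empty stop strings
    pending = ""  # generated text not yet sent to the client
    for piece in pieces:
        pending += piece
        # Full match: send what precedes the earliest stop string, then end.
        hits = [i for i in (pending.find(s) for s in stops) if i != -1]
        if hits:
            if min(hits) > 0:
                yield pending[:min(hits)]
            return
        # Partial match: hold back the longest suffix that could start a stop.
        hold = max((partial_stop_len(pending, s) for s in stops), default=0)
        safe = pending[:len(pending) - hold]
        if safe:
            yield safe
        pending = pending[len(pending) - hold:]
    # EOG without a full stop string: the held text is real output.
    if flush_on_eog and pending:
        yield pending
```

Assuming the model emits the pieces `"UNO"`, `" DUE"`, `" TRE"`, the call `"".join(stream_with_stops(["UNO", " DUE", " TRE"], ["TRE\nAAAAAAAA"]))` gives `UNO DUE TRE`. With `flush_on_eog=False` the held `TRE` is lost, which mirrors the behavior this PR fixes.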

To reproduce:
query: Repeat exactly the following text: UNO DUE TRE
stop: ["TRE\nAAAAAAAA"]
stream: True

Previous result:
UNO DUE

After this PR:
UNO DUE TRE
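
A minimal Python sketch for sending the reproduction request, using only the standard library. Several details are assumptions rather than facts from this PR: that the server listens on `http://localhost:8080`, that the `query` above maps to the `prompt` field of the request body, and that streamed events arrive as `data: {...}` lines carrying a `content` field. `build_request` and `read_stream` are hypothetical helper names.

```python
import json
import urllib.request


def build_request(base_url="http://localhost:8080"):
    """Build the POST request for the reproduction case (URL is an assumption)."""
    body = {
        "prompt": "Repeat exactly the following text: UNO DUE TRE",
        "stop": ["TRE\nAAAAAAAA"],
        "stream": True,
    }
    return urllib.request.Request(
        base_url + "/completion",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def read_stream(lines):
    """Concatenate the `content` field of each `data: {...}` event line."""
    out = []
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            continue
        out.append(json.loads(payload).get("content", ""))
    return "".join(out)
```

Against a running server, `with urllib.request.urlopen(build_request()) as resp: print(read_stream(resp))` should print the streamed text, which can then be compared with the two results above.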

Note:
The OpenAI-compatible API has the opposite problem: the partial stop string is always flushed, even if the full stop string is found.
I don't know how to fix that code path.

@matteoserva matteoserva requested a review from ngxson as a code owner August 1, 2025 08:29
@matteoserva matteoserva changed the title flush partial stop string when <EOG> is reached in /completion endpoint in streaming mode Fix: flush partial stop string when <EOG> is reached in /completion endpoint in streaming mode Aug 2, 2025