
refactor: clean up reasoning implementation#5429

Open
robinnarsinghranabhat wants to merge 4 commits into ogx-ai:main from robinnarsinghranabhat:chore/reasoning-cleanup-followups

Conversation

@robinnarsinghranabhat
Contributor

Summary

Follow-up cleanups to #5206 (reasoning output in responses API) and #5407 (type fixes), addressing items from mattf's review.

Code changes

  • Extracted shared reasoning util (src/llama_stack/providers/utils/inference/reasoning.py): All three providers (vLLM, Ollama, Bedrock) had identical message mapping and chunk wrapping code. Pulled it into a shared reasoning.py util with no defaults -- each provider explicitly passes its field names.

  • Simplified the return type: openai_chat_completions_with_reasoning is streaming-only across all providers. Removed OpenAIChatCompletionWithReasoning from the return type in the API protocol, the router, and all provider implementations.

  • Removed dead code: Deleted no-op _prepare_reasoning_params stubs from vLLM and Bedrock providers. Inlined Ollama's version (2-line default for reasoning_effort).

  • Removed type ignores: Eliminated # type: ignore on _separate_tool_calls assignment (reordered if/else to assign base type first) and on tool_call.index usage in streaming chunk processing.
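To make the first bullet concrete, here is a rough sketch of the field-mapping piece of such a shared util. This is illustrative only: the actual reasoning.py signatures may differ, and the field names used below are assumptions. The point it demonstrates is the "no defaults" design -- each provider passes its field names explicitly.

```python
from typing import Any


def map_reasoning_field(
    message: dict[str, Any],
    provider_field: str,
    output_field: str,
) -> dict[str, Any]:
    """Rename a provider-specific reasoning field (e.g. "thinking") to the
    name the caller wants (e.g. "reasoning_content").

    There are deliberately no default field names: each provider must pass
    its own explicitly, so differing provider conventions stay visible at
    the call site.
    """
    if provider_field not in message:
        return message
    mapped = dict(message)
    mapped[output_field] = mapped.pop(provider_field)
    return mapped
```

A provider that emits "thinking" would call `map_reasoning_field(msg, "thinking", "reasoning_content")`, while one that already uses "reasoning_content" would pass the same name for both arguments.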

Test changes

  • Added test_reasoning_basic_streaming and test_reasoning_multi_turn_with_tool_call to the ollama-reasoning test suite.
  • Un-skipped vLLM reasoning tests -- the CI action.yaml now configures a reasoning parser, so vLLM actually returns reasoning tokens.
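For context, a minimal sketch of the delta-accumulation behavior that streaming reasoning tests like these typically assert. The chunk shape and the "reasoning_content" field name here are assumptions for illustration, not the project's actual API:

```python
from typing import Any


def accumulate_stream(chunks: list[dict[str, Any]]) -> tuple[str, str]:
    """Collect reasoning and content deltas from streamed chunks into
    two separate strings, preserving arrival order within each."""
    reasoning: list[str] = []
    content: list[str] = []
    for chunk in chunks:
        delta = chunk.get("delta", {})
        if "reasoning_content" in delta:
            reasoning.append(delta["reasoning_content"])
        if "content" in delta:
            content.append(delta["content"])
    return "".join(reasoning), "".join(content)
```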

Addressed from review

| Item | Status |
| --- | --- |
| `openai_chat_completions_with_reasoning` is streaming-only; remove the non-streaming return type | Done |
| Remove unused `_prepare_reasoning_params` from vLLM | Done |
| Remove unused `_prepare_reasoning_params` from Bedrock | Done |
| Inline `_prepare_reasoning_params` for Ollama | Done |
| Bedrock, vLLM, and Ollama need the same transformations; consider a util | Done |
| Put the `message = OpenAIAssistantMessageParam` case first to resolve the type ignore | Done |
| Figure out why there are new type ignores | Done (removed 5 type ignores) |
| Type of `completion_result` confused by storing both CC and CC-with-reasoning | Done (removed the non-streaming type from the union) |

TODO (needs design discussion)

| Item | Question |
| --- | --- |
| Non-adjacent reasoning items | `_get_preceding_reasoning` only checks `index - 1`, based on what I observed in official OpenAI response output. Is there an edge case we're missing? |
| Error vs. silent fallback | When a user asks for reasoning and the provider doesn't support it, should we return an error or silently proceed without reasoning? Currently it falls back silently with a critical log. |
| Move merge logic to provider | Suggestions appreciated. |
| Gemini implementation | Gemini is strict about input shape and may use different field names (e.g. "thinking"). The shared util now supports custom field names, so this is ready to implement. |
| `n != 1` simplification | Not sure what this means. |
| Reduce `model_copy()` calls | For safety, params are currently copied in multiple places. Need to trace mutation paths to determine which copies are redundant. |
| Complete types on `_separate_tool_calls` | Left untouched for now. |

Test plan

  • BFCL eval maintains accuracy
  • vLLM server renders the prompt correctly, with reasoning inside the analysis channel
  • uv run pre-commit run mypy-full --hook-stage manual --all-files # passes

Remove duplicated reasoning logic from vLLM, Ollama, and Bedrock providers
into a shared reasoning.py utility. Remove dead _prepare_reasoning_params
stubs, simplify return types to streaming-only, eliminate type ignores in
streaming.py, and add multi-turn reasoning + tool call integration test.

Signed-off-by: robinnarsinghranabhat <robinnarsingha123@gmail.com>
@meta-cla Bot added the CLA Signed label (managed by the Meta Open Source bot) on Apr 3, 2026
Signed-off-by: robinnarsinghranabhat <robinnarsingha123@gmail.com>
@robinnarsinghranabhat force-pushed the chore/reasoning-cleanup-followups branch from 0556918 to 8d69201 on April 3, 2026 at 15:50
@robinnarsinghranabhat changed the title from "Chore/reasoning cleanup followups" to "refactor: clean up reasoning implementation" on Apr 3, 2026
@mergify
Contributor

mergify Bot commented Apr 9, 2026

This pull request has merge conflicts that must be resolved before it can be merged. @robinnarsinghranabhat please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Apr 9, 2026
@cdoern
Collaborator

cdoern commented Apr 16, 2026

@robinnarsinghranabhat what is the status of this one?


@robinnarsinghranabhat
Contributor Author

@cdoern I might have to rebase, but I'm mainly awaiting review. It's mostly an effort to clean up the code and remove the type ignores introduced in the reasoning PR.
