fix(responses): drop reasoning_effort in fallback branch #5609

Open
asimurka wants to merge 1 commit into ogx-ai:main from asimurka:fix_reasoning_effort_drop_on_fallback

Conversation

@asimurka
Contributor

What does this PR do?

When the Responses path tries openai_chat_completions_with_reasoning and the provider rejects reasoning, it falls back to openai_chat_completion. The fallback reused the same params, so reasoning_effort was still sent, and the plain chat call failed with the same “unsupported reasoning”-style error instead of succeeding.

This change clears reasoning_effort on that fallback only, so the non-reasoning chat completion is a valid plain request.

Closes #5607
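A minimal, self-contained sketch of the behavior this PR fixes (the API class, method names, and params shape are stand-ins modeled on the discussion, not the actual implementation; `dataclasses.replace` stands in for Pydantic's `model_copy`):

```python
import asyncio
from dataclasses import dataclass, replace
from typing import Optional

@dataclass
class ChatParams:
    model: str
    reasoning_effort: Optional[str] = None

class ReasoningUnsupportedError(Exception):
    pass

class FakeInferenceAPI:
    """Stand-in provider that rejects any request carrying reasoning_effort."""

    async def openai_chat_completions_with_reasoning(self, params: ChatParams):
        raise ReasoningUnsupportedError("unsupported reasoning")

    async def openai_chat_completion(self, params: ChatParams):
        if params.reasoning_effort is not None:
            # Pre-fix failure mode: the plain call still saw the field and failed too.
            raise ReasoningUnsupportedError("unsupported reasoning")
        return {"model": params.model, "text": "Hello!"}

async def chat_with_fallback(api: FakeInferenceAPI, params: ChatParams):
    try:
        return await api.openai_chat_completions_with_reasoning(params)
    except ReasoningUnsupportedError:
        # The fix: clear reasoning_effort on this fallback only,
        # so the non-reasoning chat completion is a valid plain request.
        plain = replace(params, reasoning_effort=None)
        return await api.openai_chat_completion(plain)

result = asyncio.run(
    chat_with_fallback(FakeInferenceAPI(), ChatParams("openai/gpt-4o-mini", "low"))
)
print(result["text"])  # Hello!
```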

Test Plan

Request:

curl -sS -X POST "http://localhost:8321/v1/responses" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Say hello in one short sentence.",
    "model": "openai/gpt-4o-mini",
    "stream": false,
    "reasoning": {
      "effort": "low"
    }
  }'

Response:

{
  "background": false,
  "created_at": 1776950444,
  "completed_at": 1776950445,
  "error": null,
  "frequency_penalty": 0.0,
  "id": "resp_f892dba8-4f0a-452a-aa2f-90f5d73d97d7",
  "incomplete_details": null,
  "model": "openai/gpt-4o-mini",
  "object": "response",
  "output": [
    {
      "content": [
        {
          "text": "Hello!",
          "type": "output_text",
          "annotations": [],
          "logprobs": []
        }
      ],
      "role": "assistant",
      "type": "message",
      "id": "msg_e30172d5-5007-4298-8b0e-28aeac2def73",
      "status": "completed"
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": null,
  "prompt_cache_key": null,
  "prompt": null,
  "status": "completed",
  "temperature": 1.0,
  "text": {
    "format": {
      "type": "text"
    }
  },
  "top_p": 1.0,
  "top_logprobs": 0,
  "tools": [],
  "tool_choice": "auto",
  "truncation": "disabled",
  "usage": {
    "input_tokens": 14,
    "output_tokens": 2,
    "total_tokens": 16,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "instructions": null,
  "max_tool_calls": null,
  "reasoning": {
    "effort": "low",
    "summary": null
  },
  "max_output_tokens": null,
  "safety_identifier": null,
  "service_tier": "default",
  "metadata": null,
  "presence_penalty": 0.0,
  "store": true
}

@meta-cla Bot added the CLA Signed label (managed by the Meta Open Source bot) on Apr 23, 2026
Contributor

@skamenan7 skamenan7 left a comment


LGTM, thanks.

Suggested change:

-completion_result = await self.inference_api.openai_chat_completion(params)
+completion_result = await self.inference_api.openai_chat_completion(
+    params.model_copy(deep=True, update={"reasoning_effort": None})
+)
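The suggested `model_copy(deep=True, update=...)` returns a new params object with `reasoning_effort` overridden while the original is left untouched. The same copy-and-override pattern, sketched with a stdlib dataclass rather than Pydantic:

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Params:
    model: str
    reasoning_effort: Optional[str] = None

original = Params(model="openai/gpt-4o-mini", reasoning_effort="low")

# Mirrors params.model_copy(deep=True, update={"reasoning_effort": None}):
# a fresh object with one field overridden; the original is unchanged.
fallback = replace(original, reasoning_effort=None)
```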
Contributor


nit: line 563 does not use a deep copy; though harmless in this context, it is inconsistent.

@@ -567,7 +567,9 @@ async def create_response(self) -> AsyncIterator[OpenAIResponseObjectStream]:
Contributor


nit: pre-existing, but logger.critical on line 566 is used for a successful fallback. logger.warning fits better since the system recovers automatically. Not blocking for this PR.

@asimurka asimurka force-pushed the fix_reasoning_effort_drop_on_fallback branch from 2af46b8 to 6d81a10 Compare April 24, 2026 06:21
@asimurka
Contributor Author

asimurka commented Apr 24, 2026

@skamenan7 addressed your comments. However, tests are not passing. I suspect that the issue may be related to request hashing (my fix modifies the request) because it results in RuntimeError (testing/api_recorder.py:1194). Any chance someone more experienced with your test suite can take a look, please?

@skamenan7
Contributor

> @skamenan7 addressed your comments. However, tests are not passing. I suspect that the issue may be related to request hashing (my fix modifies the request) because it results in RuntimeError (testing/api_recorder.py:1194). Any chance someone more experienced with your test suite can take a look, please?

I dug into the CI failure a bit more. The Ollama AttributeError: 'dict' object has no attribute 'content' looks like an existing provider issue that this PR exposed, not something you introduced. I’ll track that separately.

For this PR, I think the main thing is to keep the fallback change targeted. Dropping reasoning_effort fixes the gpt-4o-mini fallback case, but doing it for every fallback also changes replay requests for reasoning-capable models like o4-mini. Could you make the stripping conditional, then update only the recordings for request bodies that are intentionally changed?
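One shape the conditional stripping could take (the capability check and model-name prefixes here are illustrative assumptions; a real check would consult provider/model metadata):

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass
class Params:
    model: str
    reasoning_effort: Optional[str] = None

# Hypothetical allow-list of reasoning-capable model prefixes.
REASONING_CAPABLE_PREFIXES = ("o1", "o3", "o4")

def supports_reasoning(model: str) -> bool:
    name = model.split("/")[-1]          # "openai/o4-mini" -> "o4-mini"
    return name.startswith(REASONING_CAPABLE_PREFIXES)

def params_for_fallback(params: Params) -> Params:
    """Strip reasoning_effort only when the model cannot use it."""
    if supports_reasoning(params.model):
        return params                    # request unchanged; replay recordings still match
    return replace(params, reasoning_effort=None)
```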

@skamenan7
Contributor

FYI, I created this bug: #5636

@asimurka asimurka force-pushed the fix_reasoning_effort_drop_on_fallback branch 3 times, most recently from 3a98995 to f0bd8e6 Compare April 28, 2026 08:32
@asimurka asimurka force-pushed the fix_reasoning_effort_drop_on_fallback branch from f0bd8e6 to 15ab2f6 Compare April 28, 2026 08:39
@asimurka
Contributor Author

@skamenan7 Thanks for the help. I managed to find a fix with a temporary workaround for Ollama. WDYT?


Labels

CLA Signed This label is managed by the Meta Open Source bot.


Development

Successfully merging this pull request may close these issues.

Fallback inference does not drop reasoning_effort attribute

2 participants