
feat: Add DeepSeek thinking mode support for OpenAI compatibility layer#206

Merged
claude-code-best merged 2 commits into claude-code-best:main from guunergooner:main
Apr 8, 2026

Conversation

guunergooner (Contributor) commented Apr 8, 2026

Summary

  • Add DeepSeek reasoning model support (deepseek-reasoner and DeepSeek-V3.2) in the OpenAI-compatible API layer
  • Automatic thinking mode detection based on model name
  • Inject thinking parameters in request body (both official API and vLLM formats)
  • Preserve reasoning_content in message conversion for tool call iterations
  • Comprehensive test coverage

Details

Thinking Mode Detection

Thinking mode is enabled when:

  1. OPENAI_ENABLE_THINKING=1 is set (explicit enable), OR
  2. Model name contains deepseek-reasoner or deepseek-v3.2 (auto-detect, case-insensitive)

Explicit disable (OPENAI_ENABLE_THINKING=0/false/no/off) overrides model auto-detect.
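
This precedence can be sketched as follows. The helper name `isOpenAIThinkingEnabled` comes from the PR; the body below is an assumption based on the rules above, treating an unset or empty env var as "fall through to model auto-detect" (which matches the empty-string test case later in this thread):

```typescript
// Sketch only: helper name is from the PR, implementation details are assumed.
const THINKING_MODEL_PATTERNS = ['deepseek-reasoner', 'deepseek-v3.2']

function isOpenAIThinkingEnabled(model: string): boolean {
  const env = (process.env.OPENAI_ENABLE_THINKING ?? '').toLowerCase()
  // Explicit disable always wins, even for reasoning models.
  if (['0', 'false', 'no', 'off'].includes(env)) return false
  // Explicit enable (only '1' is documented in the PR description).
  if (env === '1') return true
  // Otherwise auto-detect by model name, case-insensitive.
  const m = model.toLowerCase()
  return THINKING_MODEL_PATTERNS.some(pattern => m.includes(pattern))
}
```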

Request Body Parameters

Two formats are added simultaneously to support different deployments:

  • Official DeepSeek API: thinking: { type: 'enabled' }
  • Self-hosted DeepSeek-V3.2/vLLM: enable_thinking: true + chat_template_kwargs: { thinking: true }
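
The dual-format injection can be illustrated with a small sketch. The field names (`thinking`, `enable_thinking`, `chat_template_kwargs`) are from the PR description; the helper shown here is hypothetical (the PR's actual helper is `buildOpenAIRequestBody`), and dropping `temperature` when thinking is on follows the review walkthrough:

```typescript
// Hypothetical helper illustrating the dual-format injection described above.
function withThinkingFields(
  body: Record<string, unknown>,
  enableThinking: boolean,
): Record<string, unknown> {
  if (!enableThinking) return body
  const rest: Record<string, unknown> = { ...body }
  // Per the review walkthrough, temperature is omitted when thinking is enabled.
  delete rest.temperature
  return {
    ...rest,
    thinking: { type: 'enabled' },            // official DeepSeek API format
    enable_thinking: true,                    // self-hosted DeepSeek-V3.2 / vLLM
    chat_template_kwargs: { thinking: true }, // vLLM chat-template flag
  }
}
```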

Message Conversion

  • Preserve reasoning_content on assistant messages for tool call iterations
  • Strip reasoning_content from previous turns to save bandwidth (DeepSeek ignores it anyway)
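
A minimal sketch of this keep-current/strip-previous rule, simplifying the PR's turn-boundary detection to "everything before the last user message is a previous turn" (the real implementation also handles tool_result-only user messages, which do not start a new turn):

```typescript
// Simplified sketch; the PR's actual turn-boundary logic is more nuanced.
type OpenAIMsg = { role: string; content: string; reasoning_content?: string }

function stripOldReasoning(messages: OpenAIMsg[]): OpenAIMsg[] {
  // Assistant messages before the last user turn belong to previous turns.
  const lastUser = messages.map(m => m.role).lastIndexOf('user')
  return messages.map((m, i) => {
    if (m.role === 'assistant' && i < lastUser && m.reasoning_content !== undefined) {
      // Strip stale reasoning to save bandwidth; DeepSeek ignores it anyway.
      const { reasoning_content, ...rest } = m
      return rest
    }
    return m
  })
}
```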

Test Plan

  • Unit tests for isOpenAIThinkingEnabled() function
  • Tests for env var enable/disable priority
  • Tests for model name auto-detect (deepseek-reasoner, deepseek-v3.2)
  • Tests for thinking request parameters format
  • All 88 OpenAI tests pass

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added DeepSeek "thinking" mode support with automatic model detection and opt-in via configuration.
    • Messages can now preserve assistant reasoning content alongside visible replies when thinking mode is enabled.
  • Tests

    • Added thorough tests covering thinking-mode toggles, model detection, request shaping, and message-conversion behaviors.

coderabbitai bot commented Apr 8, 2026

📝 Walkthrough

Walkthrough

Adds DeepSeek "thinking" support to OpenAI integration: detects enablement via env/model heuristics, preserves Anthropic thinking blocks as OpenAI reasoning_content when enabled, extends message conversion and request-body construction, and adds tests covering detection, payload shapes, and conversion behavior.

Changes

Cohort / File(s) Summary
Core Implementation
src/services/api/openai/convertMessages.ts, src/services/api/openai/index.ts
Added ConvertMessagesOptions and enableThinking propagation; turn-boundary detection to decide when to preserve Anthropic thinking blocks as reasoning_content; new helpers isOpenAIThinkingEnabled and buildOpenAIRequestBody; request payload now conditionally includes DeepSeek thinking fields and omits temperature when thinking is enabled.
Tests
src/services/api/openai/__tests__/convertMessages.test.ts, src/services/api/openai/__tests__/thinking.test.ts
New test suites validating: conversion preserving/stripping thinking blocks across turns and tool calls; concatenation/skipping of thinking blocks; behavior when assistant message contains only thinking+tool_use; isOpenAIThinkingEnabled env parsing and model-name heuristics; buildOpenAIRequestBody shapes for thinking on/off and tool/temperature handling.

Sequence Diagram

sequenceDiagram
    participant Client
    participant QueryOpenAI as queryModelOpenAI()
    participant ThinkingCheck as isOpenAIThinkingEnabled()
    participant MsgConverter as anthropicMessagesToOpenAI()
    participant OpenAIAPI as OpenAI API

    Client->>QueryOpenAI: queryModelOpenAI(messages, model)
    QueryOpenAI->>ThinkingCheck: isOpenAIThinkingEnabled(model)
    ThinkingCheck-->>QueryOpenAI: enableThinking (bool)
    QueryOpenAI->>MsgConverter: anthropicMessagesToOpenAI(messages, systemPrompt, {enableThinking})
    Note over MsgConverter: detect turn boundaries\npreserve `thinking` ⇒ `reasoning_content` when enabled
    MsgConverter-->>QueryOpenAI: converted OpenAI messages
    QueryOpenAI->>QueryOpenAI: buildOpenAIRequestBody(..., enableThinking)
    QueryOpenAI->>OpenAIAPI: POST chat.completions.create (with/without thinking fields)
    OpenAIAPI-->>QueryOpenAI: streamed response
    QueryOpenAI-->>Client: stream output

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Poem

🐰
I nibble logic, stitch each thought,
Turn boundaries found, no pondering lost.
Thinking kept clear, tool calls still play,
A hop, a patch — DeepSeek leads the way! 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: Add DeepSeek thinking mode support for OpenAI compatibility layer' clearly and accurately summarizes the main change: adding support for DeepSeek thinking/reasoning mode to the OpenAI API layer.


coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/services/api/openai/__tests__/thinking.test.ts (1)

143-191: Assert the payload built by queryModelOpenAI().

These cases only assert local constants, so they'll still pass if the request builder stops sending thinking, enable_thinking, or chat_template_kwargs. Please mock the OpenAI client and inspect the actual body for both enabled and disabled paths.

As per coding guidelines, "For mock modules with heavy dependencies, use mock.module() with await import() inline in test files, not in shared helpers."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/api/openai/__tests__/thinking.test.ts` around lines 143 - 191,
Replace the current tests that assert only local constants with ones that call
queryModelOpenAI and inspect the actual request body sent to the OpenAI client:
mock the OpenAI client used by queryModelOpenAI (using mock.module() with an
inline await import() inside the test file per guidelines) and capture the
request payload for both "thinking enabled" and "thinking disabled" code paths,
asserting that the payload includes thinking: { type: 'enabled' },
enable_thinking: true and chat_template_kwargs.thinking when enabled and that
none of those keys exist when disabled; ensure the tests reference
queryModelOpenAI and the parameter names thinking, enable_thinking, and
chat_template_kwargs so failures point to the request builder.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/services/api/openai/__tests__/thinking.test.ts`:
- Around line 14-17: The afterEach teardown currently assigns
process.env.OPENAI_ENABLE_THINKING = originalEnv.OPENAI_ENABLE_THINKING which
leaves the env key present when the original was undefined; update the afterEach
block (referencing originalEnv.OPENAI_ENABLE_THINKING and the afterEach
teardown) to check whether originalEnv.OPENAI_ENABLE_THINKING is undefined and,
if so, delete process.env.OPENAI_ENABLE_THINKING, otherwise restore the original
value, so the environment key is not leaked into subsequent tests.
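
The restore-vs-delete distinction described above can be captured in a small framework-free helper (the helper name is illustrative, not from the PR):

```typescript
// Illustrative helper: restore an env var to its pre-test state.
function restoreEnv(key: string, original: string | undefined): void {
  if (original === undefined) {
    // Assigning undefined would leave the key set to the string "undefined";
    // delete removes the key entirely, matching the pre-test state.
    delete process.env[key]
  } else {
    process.env[key] = original
  }
}
```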

In `@src/services/api/openai/convertMessages.ts`:
- Around line 60-68: The current check in convertMessages.ts only treats a user
message as a new turn if it contains text (variables: msg.type === 'user',
hasSeenAssistant, content, hasText) which misses multimodal-only inputs (e.g.,
images) and allows prior turn reasoning to persist (reasoning_content); update
the condition that decides to call turnBoundaries.add(i) so that it also treats
non-text user inputs as new turns — detect when content is an array and contains
any non-text blocks (e.g., block.type === 'image' or other media) or when text
is empty but there are attachments, and then add the boundary; ensure the
modified logic still respects hasSeenAssistant and continues to call
turnBoundaries.add(i) when either textual content exists or multimodal/non-text
content is present.

---

Nitpick comments:
In `@src/services/api/openai/__tests__/thinking.test.ts`:
- Around line 143-191: Replace the current tests that assert only local
constants with ones that call queryModelOpenAI and inspect the actual request
body sent to the OpenAI client: mock the OpenAI client used by queryModelOpenAI
(using mock.module() with an inline await import() inside the test file per
guidelines) and capture the request payload for both "thinking enabled" and
"thinking disabled" code paths, asserting that the payload includes thinking: {
type: 'enabled' }, enable_thinking: true and chat_template_kwargs.thinking when
enabled and that none of those keys exist when disabled; ensure the tests
reference queryModelOpenAI and the parameter names thinking, enable_thinking,
and chat_template_kwargs so failures point to the request builder.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4569cf2b-0f38-4630-9056-ab72ce164f67

📥 Commits

Reviewing files that changed from the base of the PR and between 73a18c3 and 72dfb8f.

⛔ Files ignored due to path filters (1)
  • contributors.svg is excluded by !**/*.svg
📒 Files selected for processing (4)
  • src/services/api/openai/__tests__/convertMessages.test.ts
  • src/services/api/openai/__tests__/thinking.test.ts
  • src/services/api/openai/convertMessages.ts
  • src/services/api/openai/index.ts

guunergooner added a commit to guunergooner/claude-code-best that referenced this pull request Apr 8, 2026
…ing mode

- Fix afterEach env var leak: use delete instead of assigning undefined
- Fix turn boundary detection: treat multimodal inputs (e.g. images)
  as new turns, not just text content
- Extract buildOpenAIRequestBody() for testability, replace constant-
  assertion tests with real function output verification

Signed-off-by: guunergooner <tongchao0923@gmail.com>
- Add DeepSeek reasoning models support (deepseek-reasoner and DeepSeek-V3.2)
- Automatic thinking mode detection based on model name
- Inject thinking parameters in request body (both official API and vLLM formats)
- Preserve reasoning_content in message conversion for tool call iterations
- Extract buildOpenAIRequestBody() for testability
- Treat multimodal inputs (e.g. images) as new turn boundaries
- Fix env var cleanup in tests to prevent state leak

Signed-off-by: guunergooner <tongchao0923@gmail.com>
guunergooner (Contributor, Author) commented:

Hi @coderabbitai[bot], all review suggestions from your feedback have been addressed in this PR:

  1. afterEach env var leak — Use delete instead of assigning undefined when original env var was undefined.
  2. Turn boundary detection for multimodal inputs — Treat non-text user inputs (e.g. images) as new turns, not just text content.
  3. Improved test coverage — Extracted buildOpenAIRequestBody() from queryModelOpenAI for testability; replaced constant-assertion tests with real function output verification.

All 94 OpenAI tests pass.

coderabbitai bot commented Apr 8, 2026

🧩 Analysis chain

Let me verify the changes you've made to ensure everything looks good.


🏁 Script executed:

#!/bin/bash
# Check the current state of thinking.test.ts for env var cleanup
fd "thinking.test.ts" --exec cat {}

# Check convertMessages.ts for multimodal turn boundary detection
fd "convertMessages.ts" src/services/api/openai --exec cat {}

# Check index.ts for buildOpenAIRequestBody extraction
fd "index.ts" src/services/api/openai --exec cat {}

Unrecoverable error: Failed to clone repository


@guunergooner, thanks for the detailed update! Unfortunately the repository couldn't be cloned for me to inspect the code directly, but the three fixes you've described all look like solid, correct improvements:

  1. delete for env var cleanup — Using delete process.env.OPENAI_ENABLE_THINKING (instead of assigning undefined) is the proper pattern for restoring env state in tests, ensuring no cross-test leakage.

  2. Multimodal turn boundary detection — Treating non-text user content (e.g. image blocks) as turn boundaries is semantically correct. The previous text-only heuristic would have incorrectly preserved reasoning_content across what are logically new conversation turns.

  3. buildOpenAIRequestBody() extraction — Extracting the request body construction into a testable function and replacing constant-assertion tests with real function output verification significantly improves test confidence and maintainability.

The jump from 88 → 94 passing tests is a good sign that the new coverage is solid. Let me trigger a fresh review to check the updated diff.


🧠 Learnings used
Learnt from: CR
Repo: claude-code-best/claude-code PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T06:23:32.787Z
Learning: Applies to src/services/api/openai/**/*.{ts,tsx} : OpenAI compatibility layer (CLAUDE_CODE_USE_OPENAI env var): convert Anthropic request format to OpenAI format and adapt SSE stream responses in `src/services/api/openai/`
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/services/api/openai/__tests__/thinking.test.ts`:
- Around line 50-63: Add parity tests to cover the remaining hard-disable
environment values for isOpenAIThinkingEnabled: add two tests in the same test
block in src/services/api/openai/__tests__/thinking.test.ts that set
process.env.OPENAI_ENABLE_THINKING = 'no' and process.env.OPENAI_ENABLE_THINKING
= 'off' respectively and assert
expect(isOpenAIThinkingEnabled('deepseek-reasoner' or 'gpt-4o')).toBe(false);
follow the existing test style (same test name pattern) so the new cases run
alongside the existing '0', 'false', and empty-string tests.

In `@src/services/api/openai/convertMessages.ts`:
- Around line 60-76: The current logic in convertMessages that sets
startsNewUserTurn and adds i to turnBoundaries (when msg.type === 'user' &&
hasSeenAssistant) erroneously marks a new turn if any non-tool_result block
exists even when the message is a mix of tool_result plus user content; change
the predicate so a boundary is added only when the message contains
user-authored content and contains no 'tool_result' blocks (i.e., detect
presenceOfToolResult and presenceOfUserContent separately and require
presenceOfUserContent && !presenceOfToolResult before calling
turnBoundaries.add(i)), and add a regression test that sends a user message with
content [tool_result, "some text"] to ensure it does not start a new turn.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 883724b5-d88b-45fe-9285-0fc7c058d58f

📥 Commits

Reviewing files that changed from the base of the PR and between 72dfb8f and 8184677.

⛔ Files ignored due to path filters (1)
  • contributors.svg is excluded by !**/*.svg
📒 Files selected for processing (4)
  • src/services/api/openai/__tests__/convertMessages.test.ts
  • src/services/api/openai/__tests__/thinking.test.ts
  • src/services/api/openai/convertMessages.ts
  • src/services/api/openai/index.ts
✅ Files skipped from review due to trivial changes (1)
  • src/services/api/openai/__tests__/convertMessages.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/services/api/openai/index.ts

Comment on lines +50 to +63
test('returns false when OPENAI_ENABLE_THINKING=0', () => {
  process.env.OPENAI_ENABLE_THINKING = '0'
  expect(isOpenAIThinkingEnabled('deepseek-reasoner')).toBe(false)
})

test('returns false when OPENAI_ENABLE_THINKING=false', () => {
  process.env.OPENAI_ENABLE_THINKING = 'false'
  expect(isOpenAIThinkingEnabled('deepseek-reasoner')).toBe(false)
})

test('returns false when OPENAI_ENABLE_THINKING is empty', () => {
  process.env.OPENAI_ENABLE_THINKING = ''
  expect(isOpenAIThinkingEnabled('gpt-4o')).toBe(false)
})

⚠️ Potential issue | 🟡 Minor

Add the remaining hard-disable env cases.

The new contract includes no and off, but this block only locks down 0 and false. A couple of parity tests here will keep those disable spellings from regressing silently.

Suggested tests
     test('returns false when OPENAI_ENABLE_THINKING=false', () => {
       process.env.OPENAI_ENABLE_THINKING = 'false'
       expect(isOpenAIThinkingEnabled('deepseek-reasoner')).toBe(false)
     })
+
+    test('returns false when OPENAI_ENABLE_THINKING=no', () => {
+      process.env.OPENAI_ENABLE_THINKING = 'no'
+      expect(isOpenAIThinkingEnabled('deepseek-reasoner')).toBe(false)
+    })
+
+    test('returns false when OPENAI_ENABLE_THINKING=off', () => {
+      process.env.OPENAI_ENABLE_THINKING = 'off'
+      expect(isOpenAIThinkingEnabled('deepseek-reasoner')).toBe(false)
+    })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/api/openai/__tests__/thinking.test.ts` around lines 50 - 63, Add
parity tests to cover the remaining hard-disable environment values for
isOpenAIThinkingEnabled: add two tests in the same test block in
src/services/api/openai/__tests__/thinking.test.ts that set
process.env.OPENAI_ENABLE_THINKING = 'no' and process.env.OPENAI_ENABLE_THINKING
= 'off' respectively and assert
expect(isOpenAIThinkingEnabled('deepseek-reasoner' or 'gpt-4o')).toBe(false);
follow the existing test style (same test name pattern) so the new cases run
alongside the existing '0', 'false', and empty-string tests.

Comment on lines +60 to +76
if (msg.type === 'user' && hasSeenAssistant) {
  const content = msg.message.content
  // A user message starts a new turn if it contains any non-tool_result content
  // (text, image, or other media). Tool results alone do NOT start a new turn
  // because they are continuations of the previous assistant tool call.
  const startsNewUserTurn = typeof content === 'string'
    ? content.length > 0
    : Array.isArray(content) && content.some(
        (b: any) =>
          typeof b === 'string' ||
          (b &&
            typeof b === 'object' &&
            'type' in b &&
            b.type !== 'tool_result'),
      )
  if (startsNewUserTurn) {
    turnBoundaries.add(i)

⚠️ Potential issue | 🟠 Major

Don't split mixed tool_result user messages into a new turn.

This boundary flips on as soon as any non-tool_result block is present, but the converter below already supports a single user message that contains tool_result plus fresh text/images. In that mixed case, the preceding assistant message is still part of the same tool-call iteration, so stripping its reasoning_content here can break the follow-up. Only mark a boundary when the message has user-authored content and no tool_result blocks, and add a regression case for [tool_result, text].

Suggested fix
-        const startsNewUserTurn = typeof content === 'string'
-          ? content.length > 0
-          : Array.isArray(content) && content.some(
-              (b: any) =>
-                typeof b === 'string' ||
-                (b &&
-                  typeof b === 'object' &&
-                  'type' in b &&
-                  b.type !== 'tool_result'),
-            )
+        const hasToolResult = Array.isArray(content) && content.some(
+          (b: any) =>
+            b &&
+            typeof b === 'object' &&
+            'type' in b &&
+            b.type === 'tool_result',
+        )
+        const hasUserContent = typeof content === 'string'
+          ? content.length > 0
+          : Array.isArray(content) && content.some(
+              (b: any) =>
+                typeof b === 'string' ||
+                (b &&
+                  typeof b === 'object' &&
+                  'type' in b &&
+                  b.type !== 'tool_result'),
+            )
+        const startsNewUserTurn = hasUserContent && !hasToolResult
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/api/openai/convertMessages.ts` around lines 60 - 76, The current
logic in convertMessages that sets startsNewUserTurn and adds i to
turnBoundaries (when msg.type === 'user' && hasSeenAssistant) erroneously marks
a new turn if any non-tool_result block exists even when the message is a mix of
tool_result plus user content; change the predicate so a boundary is added only
when the message contains user-authored content and contains no 'tool_result'
blocks (i.e., detect presenceOfToolResult and presenceOfUserContent separately
and require presenceOfUserContent && !presenceOfToolResult before calling
turnBoundaries.add(i)), and add a regression test that sends a user message with
content [tool_result, "some text"] to ensure it does not start a new turn.

@claude-code-best claude-code-best merged commit a3505ae into claude-code-best:main Apr 8, 2026
2 checks passed
