Phase 0: default inference parameters — backend #441
turisanapo wants to merge 12 commits into main from …
Conversation
Add a `parameters` field to `ModelConfigSchema` that allows teams to configure default inference parameters (temperature, top_p, max_tokens, service_tier, reasoning) on model aliases. Parameters are injected into request bodies using `??=` semantics, so client values always win.

Resolves #425 (Phase 0 — backend)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
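For illustration, a model alias entry with the new field might look like the sketch below; the exact schema shape, the alias and model names, and the `reasoning` object layout are assumptions, not taken from this PR.

```typescript
// Hypothetical alias entry: only the parameter names follow the PR description,
// the overall shape and the values are illustrative assumptions.
const exampleAlias = {
  alias: "fast-default",              // assumed alias name
  model: "some-provider/some-model",  // assumed model identifier
  parameters: {
    temperature: 0.2,
    top_p: 0.9,
    max_tokens: 1024,
    service_tier: "default",
    reasoning: { effort: "medium" },  // reasoning shape is an assumption
  },
};
```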
📝 Walkthrough

This pull request introduces model parameter defaults configuration across the backend API and gateway. Changes include defining a `ModelParameters` schema on the model config and injecting the configured defaults into request bodies in the gateway.
Sequence Diagram

sequenceDiagram
participant Client
participant Gateway as Gateway<br/>(hooks.ts)
participant ModelCatalog as Model Catalog
participant ModelConfig as Model Config<br/>(types.ts)
Client->>Gateway: Request with body
Gateway->>ModelCatalog: Resolve model alias
ModelCatalog-->>Gateway: Model config + defaults
Note over Gateway: readMeta() extracts<br/>defaults from additionalProperties
Gateway->>ModelConfig: Get ModelParameters<br/>from config
ModelConfig-->>Gateway: ModelParameters schema
Note over Gateway: injectModelParameters()<br/>merges defaults with request body<br/>using ??= semantics
Gateway->>Gateway: Translate parameters<br/>by operation type<br/>(chat/messages/responses)
Note over Gateway: max_tokens → specific fields<br/>reasoning → reasoning_effort/thinking<br/>stop → stop_sequences
Gateway-->>Client: Enhanced request body
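In rough terms, the `??=` merge performed by `injectModelParameters` could look like the following sketch; the function signature and the `ModelParameters` fields shown here are assumptions based on the walkthrough, not the actual implementation.

```typescript
// Minimal sketch of the ??= merge; signature and field list are assumed.
type ModelParameters = {
  temperature?: number;
  top_p?: number;
  max_tokens?: number;
  stop?: string[];
};

function injectModelParameters(
  body: Record<string, unknown>,
  defaults: ModelParameters,
): Record<string, unknown> {
  // ??= assigns only when the client left the field null/undefined,
  // so values already present in the request body always win.
  body.temperature ??= defaults.temperature;
  body.top_p ??= defaults.top_p;
  body.max_tokens ??= defaults.max_tokens;
  body.stop ??= defaults.stop;
  return body;
}
```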
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
@CodeRabbit please review
I am a bit confused by the implemented approach. What's the purpose of tying defaults to a configured provider that cannot be changed at runtime? This should live on model aliases instead, shouldn't it? (minor: I don't see a reason to include …)
Defaults are introduced at two levels:

1. Model config: configurable per alias, resolved at runtime. These are "user defaults" and will be editable from the console in a different PR (Phase 1 from the issue design).
2. Catalog model default: system-level. These provide out-of-the-box defaults for selected models.

The resolution chain is: client request → model config → catalog default, where the client always wins. In this PR, I'm introducing the backend support for both levels. Do we agree on having system-level defaults? Alternatively, we should drop them and only support user-configurable defaults.
I don't see a requirement for catalogue / system-level defaults right now. On the contrary, it can be quite confusing, since consumers will expect defaults similar to those of other gateways or the upstream providers themselves. Correct me if you have a clear customer requirement here. One more setting to add is caching.
Plus: we should validate whether preset parameters can be overridden by request parameters in OpenRouter.
This has already been explored. In OpenRouter, the client wins. We have investigated other providers too; you can find more details in the parent issue #425, in the Industry context section.
Has the OpenRouter behavior been validated or is it just theoretical research? |
Summary
Phase 0 (backend) of #425 — default inference parameters on model configuration.
- Add a `parameters` field to `ModelConfigSchema` with temperature, top_p, max_tokens, frequency_penalty, presence_penalty, seed, stop, service_tier, and reasoning (including `thinking` translation for Anthropic Messages with display/summary/exclude support)
- Inject defaults in `resolveModelAlias` using `??=` semantics; client values always win
- Catalog-level defaults in `additionalProperties` for key models (temperature, reasoning effort)

Resolution chain

Client request value → model alias `parameters` → catalog model default (the client always wins).
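Illustratively, for a single field the fallback order could be expressed as below; the variable names are placeholders for this sketch, not identifiers from the PR.

```typescript
// Placeholder names; shows the intended precedence only.
declare const requestBody: { temperature?: number };
declare const aliasConfig: { parameters?: { temperature?: number } };
declare const catalogDefaults: { temperature?: number } | undefined;

const effectiveTemperature =
  requestBody.temperature ??              // 1. value sent by the client always wins
  aliasConfig.parameters?.temperature ??  // 2. per-alias default from the model config
  catalogDefaults?.temperature;           // 3. system-level default from the catalog
```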
Parameter support
| Chat Completions | Messages | Responses |
| --- | --- | --- |
| `max_completion_tokens` | `max_tokens` | `max_output_tokens` |
| `stop` | `stop_sequences` (array) | |
| `reasoning` + `reasoning_effort` | `thinking` (with display) | `reasoning` + `reasoning_effort` |
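As a rough illustration of the translation step, a default max_tokens could be mapped per operation type as sketched below; the function name, operation labels, and mapping are assumptions drawn from the table above, not the gateway's actual code.

```typescript
// Sketch only: maps a default max_tokens onto the field each API expects.
// Operation labels and field names are assumptions based on the table above.
type Operation = "chat" | "messages" | "responses";

function applyMaxTokensDefault(
  op: Operation,
  body: Record<string, unknown>,
  maxTokens: number,
): void {
  switch (op) {
    case "chat":
      body.max_completion_tokens ??= maxTokens; // Chat Completions
      break;
    case "messages":
      body.max_tokens ??= maxTokens;            // Anthropic Messages
      break;
    case "responses":
      body.max_output_tokens ??= maxTokens;     // Responses API
      break;
  }
  // stop → stop_sequences and reasoning → reasoning_effort / thinking
  // would follow the same per-operation pattern.
}
```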
Test plan

Closes #454
🤖 Generated with Claude Code