Add model profiles and Gemma provider routing#42
Conversation
There was a problem hiding this comment.
Pull request overview
Adds provider-agnostic model profiles with capability-aware selection (including route/session preferences), integrates Gemma-family models via existing provider seams (Ollama + OpenAI-compatible), and exposes model status/evaluation via admin + CLI surfaces.
Changes:
- Introduces model profile configuration/types, a configured registry, and a default selection policy based on capabilities/tags/fallbacks.
- Updates gateway execution + OpenAI/admin endpoints to use profile-aware routing and to surface model profile status.
- Adds a built-in model evaluation runner with new admin endpoints, CLI commands, tests, and documentation.
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| src/OpenClaw.Tests/ModelProfileSelectionTests.cs | Adds tests for implicit default profiles, capability fallback behavior, config binding, and evaluation report persistence. |
| src/OpenClaw.MicrosoftAgentFrameworkAdapter/MafAgentRuntime.cs | Handles ModelSelectionException for non-streaming and streaming MAF turns. |
| src/OpenClaw.Gateway/RuntimeOperationsState.cs | Adds ModelProfiles to runtime operations state with an empty default implementation. |
| src/OpenClaw.Gateway/Models/ModelEvaluationRunner.cs | Adds evaluation harness with scenarios + report persistence (JSON/Markdown). |
| src/OpenClaw.Gateway/Models/DefaultModelSelectionPolicy.cs | Implements capability/tag-aware model profile selection and explicit-profile fallback logic. |
| src/OpenClaw.Gateway/Models/ConfiguredModelProfileRegistry.cs | Builds model profiles from config (incl. implicit default), validates, and instantiates IChatClients. |
| src/OpenClaw.Gateway/GatewayLlmExecutionService.cs | Refactors execution to be profile-aware; records profile metadata and selection explanation. |
| src/OpenClaw.Gateway/Extensions/GatewayWorkers.cs | Applies per-route model profile/tag/fallback/requirements onto sessions; clears them when no route match. |
| src/OpenClaw.Gateway/Endpoints/OpenAiEndpoints.cs | Interprets OpenAI model as profile-id when it matches a registered profile; otherwise keeps legacy model override. |
| src/OpenClaw.Gateway/Endpoints/AdminEndpoints.cs | Adds /admin/models, /admin/models/doctor, /admin/models/evaluations endpoints and includes model profiles in provider admin response. |
| src/OpenClaw.Gateway/Composition/RuntimeInitializationExtensions.cs | Wires ConfiguredModelProfileRegistry into runtime services/operations. |
| src/OpenClaw.Gateway/Composition/IntegrationApiFacade.cs | Extends integration providers response with model profile status. |
| src/OpenClaw.Gateway/Composition/CoreServicesExtensions.cs | Registers model profile registry, selection policy, and evaluation runner in DI. |
| src/OpenClaw.Core/Validation/DoctorCheck.cs | Adds a doctor-mode check intended to validate profile config consistency. |
| src/OpenClaw.Core/Validation/ConfigValidator.cs | Validates model profile configuration and route references to profile IDs. |
| src/OpenClaw.Core/Models/Session.cs | Adds session-level model profile selection fields + JSON source-gen registrations for new model types. |
| src/OpenClaw.Core/Models/OperatorApiModels.cs | Extends route health snapshot/admin response to include profile metadata (tags/validation issues). |
| src/OpenClaw.Core/Models/ModelSelectionException.cs | Adds a dedicated exception type for model selection failures. |
| src/OpenClaw.Core/Models/ModelProfiles.cs | Adds core model profile/config/capabilities/evaluation DTOs. |
| src/OpenClaw.Core/Models/IntegrationApiModels.cs | Extends integration providers response with model profiles status. |
| src/OpenClaw.Core/Models/GatewayConfig.cs | Adds Models config root and route-level profile preferences/requirements. |
| src/OpenClaw.Core/Abstractions/IModelProfiles.cs | Defines profile registry + selection policy abstractions and DTOs. |
| src/OpenClaw.Client/OpenClawHttpClient.cs | Adds client methods for new model profile admin endpoints and evaluations. |
| src/OpenClaw.Cli/Program.cs | Adds models and eval CLI commands (list/doctor/run/compare). |
| src/OpenClaw.Cli/OpenClawHttpClient.cs | Exposes the new model admin/eval client APIs through the CLI wrapper. |
| src/OpenClaw.Agent/ILlmExecutionService.cs | Extends execution results with ProfileId + selection explanation fields. |
| src/OpenClaw.Agent/AgentRuntime.cs | Handles ModelSelectionException for non-streaming and streaming turns. |
| README.md | Documents model profiles/Gemma usage and new CLI commands. |
| docs/MODEL_PROFILES.md | Adds detailed documentation for model profiles, Gemma configuration, routing, and evaluation harness. |
Comments suppressed due to low confidence (1)
src/OpenClaw.Gateway/GatewayLlmExecutionService.cs:111
ResetProvider()matches route keys viaContains($":{providerId}:")on the composite{profileId}:{providerId}:{modelId}key. This is brittle if profile/model IDs can contain:or ifproviderIdis a substring in those segments. Consider using a structured key (e.g., a value tuple) or splitting once and comparing the provider segment explicitly.
public void ResetProvider(string providerId)
{
foreach (var key in _routes.Keys.Where(key => key.Contains($":{providerId}:", StringComparison.OrdinalIgnoreCase)).ToArray())
{
if (_routes.TryRemove(key, out var state))
state.CircuitBreaker.Reset();
}
| public GatewayLlmExecutionService( | ||
| GatewayConfig config, | ||
| LlmProviderRegistry registry, | ||
| ProviderPolicyService policyService, | ||
| RuntimeEventStore eventStore, | ||
| RuntimeMetrics runtimeMetrics, | ||
| ProviderUsageTracker providerUsage, | ||
| ILogger<GatewayLlmExecutionService> logger) | ||
| : this( | ||
| config, | ||
| new ConfiguredModelProfileRegistry(config, NullLogger<ConfiguredModelProfileRegistry>.Instance), | ||
| new DefaultModelSelectionPolicy(new ConfiguredModelProfileRegistry(config, NullLogger<ConfiguredModelProfileRegistry>.Instance)), | ||
| policyService, |
There was a problem hiding this comment.
The compatibility constructor takes an LlmProviderRegistry registry but never uses it, and it constructs two different ConfiguredModelProfileRegistry instances (one for _modelProfiles, another for the DefaultModelSelectionPolicy). This breaks callers/tests that inject a fake provider via providerRegistry.RegisterDefault(...) because the execution service will ignore that registry and instead create new clients from config. Consider either removing this overload (and updating call sites) or wiring it to reuse the provided registry / a single shared ConfiguredModelProfileRegistry instance so injected clients and selection agree.
| public CircuitState DefaultCircuitState | ||
| => GetRouteState(_config.Llm.Provider, _config.Llm.Model).CircuitBreaker.State; | ||
| => GetRouteState( | ||
| _modelProfiles.DefaultProfileId ?? "default", | ||
| _config.Llm.Provider, | ||
| _config.Llm.Model).CircuitBreaker.State; |
There was a problem hiding this comment.
DefaultCircuitState is computed using _config.Llm.Provider/_config.Llm.Model, which may not match the default model profile when OpenClaw:Models:DefaultProfile points at a different provider/model. This can report an unrelated circuit breaker state. Consider resolving the default profile registration and using its provider/model when computing the default circuit state.
| public IReadOnlyList<ProviderRouteHealthSnapshot> SnapshotRoutes() | ||
| => _registry.Snapshot() | ||
| .SelectMany(registration => | ||
| => _modelProfiles.ListStatuses() | ||
| .Select(profile => | ||
| { | ||
| var models = registration.Models.Length > 0 ? registration.Models : [_config.Llm.Model]; | ||
| return models.Distinct(StringComparer.OrdinalIgnoreCase).Select(modelId => | ||
| var state = GetRouteState(profile.Id, profile.ProviderId, profile.ModelId); | ||
| return new ProviderRouteHealthSnapshot | ||
| { | ||
| var state = GetRouteState(registration.ProviderId, modelId); | ||
| return new ProviderRouteHealthSnapshot | ||
| { | ||
| ProviderId = registration.ProviderId, | ||
| ModelId = modelId, | ||
| IsDefaultRoute = registration.IsDefault && string.Equals(modelId, _config.Llm.Model, StringComparison.OrdinalIgnoreCase), | ||
| IsDynamic = registration.IsDynamic, | ||
| OwnerId = registration.OwnerId, | ||
| CircuitState = state.CircuitBreaker.State.ToString(), | ||
| Requests = Interlocked.Read(ref state.Requests), | ||
| Retries = Interlocked.Read(ref state.Retries), | ||
| Errors = Interlocked.Read(ref state.Errors), | ||
| LastError = state.LastError, | ||
| LastErrorAtUtc = state.LastErrorAtUtc | ||
| }; | ||
| }); | ||
| ProfileId = profile.Id, | ||
| ProviderId = profile.ProviderId, | ||
| ModelId = profile.ModelId, | ||
| IsDefaultRoute = profile.IsDefault, | ||
| CircuitState = state.CircuitBreaker.State.ToString(), | ||
| Requests = Interlocked.Read(ref state.Requests), | ||
| Retries = Interlocked.Read(ref state.Retries), | ||
| Errors = Interlocked.Read(ref state.Errors), | ||
| LastError = state.LastError, | ||
| LastErrorAtUtc = state.LastErrorAtUtc, | ||
| Tags = profile.Tags, | ||
| ValidationIssues = profile.ValidationIssues | ||
| }; | ||
| }) | ||
| .OrderBy(static item => item.ProviderId, StringComparer.OrdinalIgnoreCase) | ||
| .ThenBy(static item => item.ModelId, StringComparer.OrdinalIgnoreCase) | ||
| .OrderBy(static item => item.ProfileId, StringComparer.OrdinalIgnoreCase) | ||
| .ToArray(); |
There was a problem hiding this comment.
SnapshotRoutes() only reports one route per profile (profile.ModelId). The execution path can create route states for fallback model IDs (and Session.ModelOverride model IDs), so those circuit breaker stats/errors won’t appear in snapshots. Consider including profile.FallbackModels (and possibly any observed override models) when building the snapshot list so operator/admin views reflect all active routes.
| private static ModelProfile ToProfile(GatewayConfig config, ModelProfileConfig model) | ||
| => new() | ||
| { | ||
| Id = Normalize(model.Id) ?? "default", | ||
| ProviderId = Normalize(model.Provider) ?? config.Llm.Provider, | ||
| ModelId = Normalize(model.Model) ?? config.Llm.Model, | ||
| BaseUrl = Normalize(model.BaseUrl), | ||
| ApiKey = Normalize(model.ApiKey), | ||
| Tags = NormalizeDistinct(model.Tags), | ||
| FallbackProfileIds = NormalizeDistinct(model.FallbackProfileIds), | ||
| FallbackModels = NormalizeDistinct(model.FallbackModels), | ||
| Capabilities = model.Capabilities ?? GuessCapabilities(model.Provider), | ||
| IsImplicit = string.Equals(model.Id, "default", StringComparison.OrdinalIgnoreCase) |
There was a problem hiding this comment.
ModelProfileConfig.Capabilities is non-nullable and default-initialized, so model.Capabilities ?? GuessCapabilities(...) will never call GuessCapabilities when the config omits Capabilities. This makes capability-based selection behave as if most capabilities are false by default. Consider making ModelProfileConfig.Capabilities nullable (or adding a sentinel) so provider-based guessing can be applied when capabilities aren’t specified.
| ProviderId = Normalize(model.Provider) ?? config.Llm.Provider, | ||
| ModelId = Normalize(model.Model) ?? config.Llm.Model, | ||
| BaseUrl = Normalize(model.BaseUrl), | ||
| ApiKey = Normalize(model.ApiKey), | ||
| Tags = NormalizeDistinct(model.Tags), | ||
| FallbackProfileIds = NormalizeDistinct(model.FallbackProfileIds), | ||
| FallbackModels = NormalizeDistinct(model.FallbackModels), | ||
| Capabilities = model.Capabilities ?? GuessCapabilities(model.Provider), |
There was a problem hiding this comment.
Profile ApiKey values aren’t resolved through SecretResolver (e.g., env:MODEL_PROVIDER_KEY), but docs/examples use that format. As written, ApiKey="env:..." will be passed literally to the provider client and fail auth at runtime. Consider resolving secret refs for ModelProfileConfig.ApiKey (and possibly BaseUrl) when building ModelProfile / LlmProviderConfig, consistent with how OpenClaw:Llm:ApiKey is handled in bootstrap.
| config.Models.Profiles.Any(profile => string.Equals(profile.Id, config.Models.DefaultProfile, StringComparison.OrdinalIgnoreCase)))); | ||
| }, | ||
| warnOnly: false, | ||
| detail: "Check Models.DefaultProfile, duplicate profile ids, and route profile references."); |
There was a problem hiding this comment.
The doctor check’s detail says it validates “route profile references”, but the predicate only checks for duplicate profile IDs and that Models.DefaultProfile exists. Either extend the check to also validate Routing.Routes.*.ModelProfileId/FallbackModelProfileIds against Models.Profiles, or adjust the detail string so it matches what is actually being checked.
| detail: "Check Models.DefaultProfile, duplicate profile ids, and route profile references."); | |
| detail: "Check Models.DefaultProfile and duplicate profile ids."); |
| private static void ValidateModelProfiles(GatewayConfig config, List<string> errors, bool pluginBackedProvidersPossible) | ||
| { | ||
| if (config.Models.Profiles.Count == 0) | ||
| return; |
There was a problem hiding this comment.
ValidateModelProfiles returns early when Models.Profiles is empty, so route fields like Routing.Routes.*.ModelProfileId / FallbackModelProfileIds won’t be validated in that (common) implicit-default mode. This can lead to runtime ModelSelectionException for misconfigured routes that the config validator should catch. Consider validating route profile references regardless of whether Models.Profiles is empty (e.g., flag any non-empty route ModelProfileId when there are zero profiles, or treat the implicit default profile as the only valid id).
🤖 Augment PR SummarySummary: This PR introduces provider-agnostic model profiles and profile-aware routing/selection, enabling Gemma-family models (incl. Gemma 4) to run via existing Ollama or OpenAI-compatible provider paths. Changes:
Technical Notes: Gemma is treated as a backend reachable through existing transports; routing now supports profile IDs plus capability/tag constraints to fail fast or fall back when unsupported. 🤖 Was this summary useful? React with 👍 or 👎 |
|
|
||
| Exception? lastError = null; | ||
| for (var modelIndex = 0; modelIndex < modelsToTry.Length; modelIndex++) | ||
| foreach (var candidate in selection.Candidates) |
There was a problem hiding this comment.
src/OpenClaw.Gateway/GatewayLlmExecutionService.cs:136: Executing by iterating selection.Candidates can run against profiles that were merely “attempted” (e.g., the explicit profile even when the policy selected a fallback), which can violate the resolved SelectedProfileId/Explanation and the capability guarantees.
Other locations where this applies: src/OpenClaw.Gateway/GatewayLlmExecutionService.cs:247.
Severity: high
Other Locations
src/OpenClaw.Gateway/GatewayLlmExecutionService.cs:247
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| : this( | ||
| config, | ||
| new ConfiguredModelProfileRegistry(config, NullLogger<ConfiguredModelProfileRegistry>.Instance), | ||
| new DefaultModelSelectionPolicy(new ConfiguredModelProfileRegistry(config, NullLogger<ConfiguredModelProfileRegistry>.Instance)), |
There was a problem hiding this comment.
src/OpenClaw.Gateway/GatewayLlmExecutionService.cs:66: This overload constructs two separate ConfiguredModelProfileRegistry instances (one stored in _modelProfiles, another inside DefaultModelSelectionPolicy), so selection vs execution can diverge and profile initialization work is duplicated. Also, the LlmProviderRegistry registry parameter in this overload is unused, which suggests a wiring mismatch.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| FallbackModels = explicitRegistration.ProviderConfig.FallbackModels | ||
| }); | ||
|
|
||
| if (Satisfies(explicitRegistration.Profile, requirements)) |
There was a problem hiding this comment.
src/OpenClaw.Gateway/Models/DefaultModelSelectionPolicy.cs:35: Resolve can return a profile as “selected” based on capabilities without checking registration availability (e.g., Client is null / validation issues), which can later fail execution even when other configured profiles satisfy the same requirements.
Severity: medium
Other Locations
src/OpenClaw.Gateway/Models/DefaultModelSelectionPolicy.cs:66
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| return; | ||
|
|
||
| var profileIds = new HashSet<string>(StringComparer.OrdinalIgnoreCase); | ||
| foreach (var profile in config.Models.Profiles) |
There was a problem hiding this comment.
src/OpenClaw.Core/Validation/ConfigValidator.cs:520: ValidateModelProfiles validates route fallback profile IDs, but it doesn’t validate Models.Profiles[*].FallbackProfileIds, so a typo there won’t be caught at config-validate time and will only surface as runtime selection skipping that fallback.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| Tags = NormalizeDistinct(model.Tags), | ||
| FallbackProfileIds = NormalizeDistinct(model.FallbackProfileIds), | ||
| FallbackModels = NormalizeDistinct(model.FallbackModels), | ||
| Capabilities = model.Capabilities ?? GuessCapabilities(model.Provider), |
There was a problem hiding this comment.
src/OpenClaw.Gateway/Models/ConfiguredModelProfileRegistry.cs:172: Capabilities = model.Capabilities ?? GuessCapabilities(...) will always take model.Capabilities because ModelProfileConfig.Capabilities is non-null by default, so profiles that omit a Capabilities section may silently advertise incorrect (mostly-false) capabilities instead of using the provider-based guess.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| if (profile.Capabilities.MaxContextTokens > 0 && estimate.EstimatedInputTokens > profile.Capabilities.MaxContextTokens) | ||
| { | ||
| throw new ModelSelectionException( | ||
| $"Selected model profile '{profile.Id}' cannot satisfy this request because estimated input tokens ({estimate.EstimatedInputTokens}) exceed MaxContextTokens ({profile.Capabilities.MaxContextTokens})."); |
There was a problem hiding this comment.
src/OpenClaw.Gateway/GatewayLlmExecutionService.cs:425: Throwing ModelSelectionException here aborts the request immediately, so a too-small MaxContextTokens on the first attempted profile can prevent trying other candidate profiles/fallbacks that might satisfy the request.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Summary
Testing