Add provider-aware prompt caching and diagnostics #45
Pull request overview
Adds a provider-aware prompt caching layer to the OpenClaw gateway/runtime (request shaping, normalized cache usage accounting, diagnostics/tracing, and keep-warm), while also extending observability and tightening a few memory-store behaviors (retention protection, recall scoping, note search ranking, sqlite corruption handling).
Changes:
- Introduces prompt caching configuration (global + per-profile), request shaping via ChatOptions.AdditionalProperties, cache usage normalization, and keep-warm background sweeping.
- Surfaces prompt-cache read/write usage in session status, provider metrics, diagnostics output, and command responses (with recent-turn fallback when session totals are missing).
- Expands memory subsystem robustness/UX: sqlite corruption exceptions, retention protection via metadata (starred/tags), project-scoped memory recall preference, and a faster/scored file-note search index.
Reviewed changes
Copilot reviewed 40 out of 40 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/OpenClaw.Tests/SqliteSessionSearchTests.cs | Adds coverage for corrupt sqlite session rows throwing a dedicated corruption exception. |
| src/OpenClaw.Tests/PromptCachingTests.cs | New tests for prompt caching validation, profile-merge behavior, fingerprint determinism, keep-warm eligibility, and cache usage extraction. |
| src/OpenClaw.Tests/MemoryRetentionSweeperServiceTests.cs | Tests retention protection using session metadata (starred sessions). |
| src/OpenClaw.Tests/MemoryRecallInjectionTests.cs | Tests preference for project-scoped recall prefix when configured. |
| src/OpenClaw.Tests/FileMemoryStoreTests.cs | Tests note search ordering (score + recency) after index/search changes. |
| src/OpenClaw.Tests/FeatureParityTests.cs | Updates default compaction threshold expectation. |
| src/OpenClaw.Tests/ConfigValidatorTests.cs | Adds coverage for invalid memory provider validation. |
| src/OpenClaw.Tests/ChatCommandProcessorTests.cs | Adds coverage for /status prompt-cache usage fallback behavior. |
| src/OpenClaw.MicrosoftAgentFrameworkAdapter/MafExecutionServiceChatClient.cs | Records normalized prompt cache usage into session/metrics/provider usage during MAF execution (incl. streaming). |
| src/OpenClaw.MicrosoftAgentFrameworkAdapter/MafAgentRuntime.cs | Adds project-scoped recall prefix + memory recall/compaction metrics; records prompt cache usage for summaries. |
| src/OpenClaw.Gateway/wwwroot/admin.html | Updates default compaction threshold in admin UI. |
| src/OpenClaw.Gateway/Tools/SessionStatusTool.cs | Displays prompt cache totals with fallback to provider usage history. |
| src/OpenClaw.Gateway/PromptCaching/PromptCacheWarmService.cs | New background service that keeps eligible provider caches warm for active sessions. |
| src/OpenClaw.Gateway/PromptCaching/PromptCacheTraceWriter.cs | New JSONL tracing for cache request/response diagnostics. |
| src/OpenClaw.Gateway/PromptCaching/PromptCacheCoordinator.cs | New coordinator for cache fingerprinting, request shaping, warm-candidate tracking, and trace hooks. |
| src/OpenClaw.Gateway/Models/ConfiguredModelProfileRegistry.cs | Adds prompt caching config to profile statuses; merges global/profile caching defaults; expands capability guessing. |
| src/OpenClaw.Gateway/HeartbeatService.cs | Limits/truncates user-turn/note text considered for heartbeat suggestions. |
| src/OpenClaw.Gateway/GatewayLlmExecutionService.cs | Integrates prompt cache coordinator + warm registry into standard and streaming execution; normalizes cache usage from provider responses. |
| src/OpenClaw.Gateway/Extensions/MemoryRetentionSweeperService.cs | Protects retention sweep targets based on session metadata (starred/tags). |
| src/OpenClaw.Gateway/Extensions/LlmClientFactory.cs | Adds provider ids for anthropic-vertex and amazon-bedrock, with endpoint validation. |
| src/OpenClaw.Gateway/Endpoints/DiagnosticsEndpoints.cs | Adds prompt cache info to diagnostics output; includes cache usage in provider usage lines. |
| src/OpenClaw.Gateway/Composition/CoreServicesExtensions.cs | Wires prompt-cache services + trace/coordinator registries; ensures RuntimeMetrics is singleton; injects metadata snapshot provider into retention sweeper. |
| src/OpenClaw.Gateway/appsettings.Production.json | Switches production memory defaults to sqlite and enables retention defaults. |
| src/OpenClaw.Gateway/appsettings.json | Adds default Llm.PromptCaching configuration section. |
| src/OpenClaw.Core/Validation/DoctorCheck.cs | Adds a warn-only doctor check for provider-compatible prompt cache configuration. |
| src/OpenClaw.Core/Validation/ConfigValidator.cs | Adds prompt cache config validation + memory provider validation; extends built-in provider list. |
| src/OpenClaw.Core/Pipeline/ChatCommandProcessor.cs | Shows prompt cache usage in /status and /usage, with fallback to provider usage history. |
| src/OpenClaw.Core/Observability/RuntimeMetrics.cs | Adds counters for session cache hits/misses, recall searches/hits, compactions, and prompt cache usage + keep-warm stats. |
| src/OpenClaw.Core/Observability/ProviderUsageTracker.cs | Tracks cache read/write usage; records per-turn cache counters; provides latest-session cache totals. |
| src/OpenClaw.Core/Observability/PromptCacheUsage.cs | New normalized prompt-cache usage model + extraction helpers. |
| src/OpenClaw.Core/Models/Session.cs | Tracks per-session prompt cache read/write totals; expands JSON source-gen types. |
| src/OpenClaw.Core/Models/OperatorApiModels.cs | Extends provider turn usage entries with cache read/write tokens. |
| src/OpenClaw.Core/Models/ModelProfiles.cs | Adds prompt caching config to profiles and capabilities; exposes in profile status. |
| src/OpenClaw.Core/Models/GatewayConfig.cs | Adds diagnostics + prompt caching configuration models; updates compaction threshold default. |
| src/OpenClaw.Core/Memory/SqliteMemoryStore.cs | Throws explicit exceptions for corrupt persisted session/branch JSON; improves embedding backfill batching/transactionality. |
| src/OpenClaw.Core/Memory/MemoryRetentionArchive.cs | Optimizes archive purge by parsing archive path date segments before reading JSON. |
| src/OpenClaw.Core/Memory/FileMemoryStore.cs | Adds note index + scored search; adds metrics hook for session cache hits/misses. |
| src/OpenClaw.Agent/AgentRuntime.cs | Adds project-scoped recall prefix and recall/compaction metrics; records prompt cache usage for non-gateway agent runtime. |
| README.md | Documents prompt caching at a high level and links to detailed doc. |
| docs/PROMPT_CACHING.md | New detailed prompt caching documentation (config, provider behavior, diagnostics, tracing, keep-warm). |

    var (stableSystem, volatileSuffix) = ExtractSystemPromptSegments(messages);
    var toolSignature = BuildToolSignature(options);
    var stableFingerprint = BuildStableFingerprint(profile.ProviderId, modelId, stableSystem, toolSignature, options.ResponseFormat);
    var preparedOptions = CloneOptions(options);

Prepare(...) receives the resolved modelId, but the cloned ChatOptions doesn’t get ModelId updated to that value. If a fallback model is selected, the executed request (and keep-warm request) can be sent with the wrong model id. Set preparedOptions.ModelId = modelId (or ensure the caller sets it before cloning).
Suggested change:

    - var preparedOptions = CloneOptions(options);
    + var preparedOptions = CloneOptions(options);
    + preparedOptions.ModelId = modelId;


    if (caching.Enabled == true && dialect != "none")
    {
        preparedOptions.AdditionalProperties ??= new AdditionalPropertiesDictionary();
        preparedOptions.AdditionalProperties["openclaw_prompt_cache_enabled"] = true;
        preparedOptions.AdditionalProperties["openclaw_prompt_cache_dialect"] = dialect;

Cache hint additional-properties are added based on PromptCaching.Enabled/dialect without checking profile.Capabilities.SupportsPromptCaching. That can send provider-specific cache parameters to providers/models explicitly marked as not supporting prompt caching. Gate this request-shaping block on profile.Capabilities.SupportsPromptCaching so capabilities are authoritative.
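One way to express that gate is sketched below. This is only an illustration under stated assumptions: `Capabilities`, `CachingConfig`, and `CacheHintGate` are hypothetical stand-ins rather than the PR's actual types, and a plain dictionary replaces `ChatOptions.AdditionalProperties`. Only the `SupportsPromptCaching` flag, the `Enabled`/dialect check, and the two property keys come from the excerpt above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the PR's profile and caching types; only the
// member names quoted in the review comment are taken from the source.
public sealed record Capabilities(bool SupportsPromptCaching);
public sealed record CachingConfig(bool? Enabled);

public static class CacheHintGate
{
    // True only when config, dialect, and capabilities all allow caching.
    public static bool ShouldShape(CachingConfig caching, string dialect, Capabilities capabilities) =>
        caching.Enabled == true
        && dialect != "none"
        && capabilities.SupportsPromptCaching;

    // Adds the two hint keys from the excerpt only when the gate passes.
    public static void Apply(
        IDictionary<string, object?> additionalProperties,
        CachingConfig caching,
        string dialect,
        Capabilities capabilities)
    {
        if (!ShouldShape(caching, dialect, capabilities))
            return;

        additionalProperties["openclaw_prompt_cache_enabled"] = true;
        additionalProperties["openclaw_prompt_cache_dialect"] = dialect;
    }
}
```

With this shape, a profile whose capabilities say caching is unsupported never receives cache hints, regardless of the global or per-profile `Enabled` value.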

    var key = BuildKey(request.Descriptor.SessionId, request.Descriptor.ProfileId);
    _entries[key] = new PromptCacheWarmCandidate
    {
        Descriptor = request.Descriptor,
        WarmMessages = [new ChatMessage(ChatRole.System, request.Descriptor.StableSystemPrompt)],

Entries are only ever added/overwritten in _entries; there’s no eviction path. Over long uptimes, ended sessions will leave behind warm candidates forever (the warm sweep just skips them). Add a bounded eviction strategy (e.g., remove when session inactive or when LastSeenAtUtc is older than a cutoff).
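A cutoff-based sweep along those lines could look roughly like the following. It is a simplified sketch, not the PR's registry: `WarmCandidate`, `WarmRegistry`, and `EvictStale` are hypothetical names, and it assumes the entries live in a `ConcurrentDictionary` and carry a `LastSeenAtUtc` timestamp as the comment implies.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical, simplified candidate: the real PromptCacheWarmCandidate carries
// more state; only LastSeenAtUtc (named in the review comment) is modeled here.
public sealed record WarmCandidate(string Key, DateTimeOffset LastSeenAtUtc);

public sealed class WarmRegistry
{
    // Assumption: the registry is backed by a ConcurrentDictionary.
    private readonly ConcurrentDictionary<string, WarmCandidate> _entries = new();

    public int Count => _entries.Count;

    // Adds or overwrites the candidate for a key, refreshing its timestamp.
    public void Record(string key, DateTimeOffset seenAtUtc) =>
        _entries[key] = new WarmCandidate(key, seenAtUtc);

    // Removes every entry last seen before (now - maxIdle); returns the number evicted.
    public int EvictStale(DateTimeOffset now, TimeSpan maxIdle)
    {
        var cutoff = now - maxIdle;
        var evicted = 0;

        // Enumerating a ConcurrentDictionary while removing from it is safe.
        foreach (var pair in _entries)
        {
            // TryRemove(pair) succeeds only if the stored value still equals the
            // one we inspected, so a concurrently refreshed entry is kept.
            if (pair.Value.LastSeenAtUtc < cutoff && _entries.TryRemove(pair))
                evicted++;
        }

        return evicted;
    }
}
```

Calling `EvictStale` at the start of each warm sweep would bound the registry to recently active sessions; the idle cutoff itself would need to be chosen relative to the keep-warm interval.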

    var prepared = _promptCacheCoordinator.Prepare(session, candidate.Profile, modelId, messages, effectiveOptions);
    _promptCacheWarmRegistry.Record(prepared);

    var routeState = GetOrAddRouteState(candidate.Profile.Id, candidate.Profile.ProviderId, modelId);

Prepare(...) is called before effectiveOptions.ModelId is set to modelId, and the prepared options are what get executed. With fallback models, this can issue the request against the profile default model instead of the selected fallback. Ensure ModelId is set before cloning/preparing (or have Prepare override it).

    if (provider.Equals("anthropic", StringComparison.OrdinalIgnoreCase) ||
        provider.Equals("claude", StringComparison.OrdinalIgnoreCase) ||
        provider.Equals("anthropic-vertex", StringComparison.OrdinalIgnoreCase))
        return true;

    if (provider.Equals("amazon-bedrock", StringComparison.OrdinalIgnoreCase))
        return string.Equals(dialect, "anthropic", StringComparison.OrdinalIgnoreCase) || string.Equals(dialect, "auto", StringComparison.OrdinalIgnoreCase);

SupportsExplicitCacheTtl doesn’t consider gemini/google, so ValidatePromptCaching will reject KeepWarmEnabled=true for Gemini even though the coordinator/docs treat Gemini as eligible. Update this helper (and/or the keep-warm validation rule) to allow gemini/google when dialect resolves to gemini so validation matches runtime behavior.
Suggested change:

    - if (provider.Equals("anthropic", StringComparison.OrdinalIgnoreCase) ||
    -     provider.Equals("claude", StringComparison.OrdinalIgnoreCase) ||
    -     provider.Equals("anthropic-vertex", StringComparison.OrdinalIgnoreCase))
    -     return true;
    - if (provider.Equals("amazon-bedrock", StringComparison.OrdinalIgnoreCase))
    -     return string.Equals(dialect, "anthropic", StringComparison.OrdinalIgnoreCase) || string.Equals(dialect, "auto", StringComparison.OrdinalIgnoreCase);
    + var normalizedDialect = (dialect ?? "auto").Trim();
    + if (provider.Equals("anthropic", StringComparison.OrdinalIgnoreCase) ||
    +     provider.Equals("claude", StringComparison.OrdinalIgnoreCase) ||
    +     provider.Equals("anthropic-vertex", StringComparison.OrdinalIgnoreCase))
    +     return true;
    + if (provider.Equals("amazon-bedrock", StringComparison.OrdinalIgnoreCase))
    +     return string.Equals(normalizedDialect, "anthropic", StringComparison.OrdinalIgnoreCase) || string.Equals(normalizedDialect, "auto", StringComparison.OrdinalIgnoreCase);
    + if (provider.Equals("gemini", StringComparison.OrdinalIgnoreCase) ||
    +     provider.Equals("google", StringComparison.OrdinalIgnoreCase))
    +     return string.Equals(normalizedDialect, "gemini", StringComparison.OrdinalIgnoreCase) || string.Equals(normalizedDialect, "auto", StringComparison.OrdinalIgnoreCase);


    public bool IncludeMessages { get; set; } = true;
    public bool IncludePrompt { get; set; } = true;
    public bool IncludeSystem { get; set; } = true;

IncludeMessages is defined in PromptCacheTraceConfig but isn’t referenced by the trace writer (or elsewhere). Either implement it (e.g., to control logging request message bodies / options metadata) or remove it to avoid a misleading, dead configuration knob.
Suggested change:

    - public bool IncludeMessages { get; set; } = true;
    - public bool IncludePrompt { get; set; } = true;
    - public bool IncludeSystem { get; set; } = true;
    + public bool IncludePrompt { get; set; } = true;


            AdditionalProperties = options.AdditionalProperties?.ToDictionary(static kvp => kvp.Key, static kvp => kvp.Value?.ToString())
        });
    }

WriteRequest persists all ChatOptions.AdditionalProperties by stringifying arbitrary objects. Since these properties can contain sensitive values depending on upstream clients/adapters, consider redacting/whitelisting keys (or gating this behind a dedicated opt-in flag) to avoid leaking secrets into the JSONL trace file.
Suggested change:

    -         AdditionalProperties = options.AdditionalProperties?.ToDictionary(static kvp => kvp.Key, static kvp => kvp.Value?.ToString())
    -     });
    - }
    +         AdditionalProperties = SanitizeAdditionalProperties(options.AdditionalProperties)
    +     });
    + }
    + private static Dictionary<string, string?>? SanitizeAdditionalProperties(IReadOnlyDictionary<string, object?>? additionalProperties)
    + {
    +     if (additionalProperties is null || additionalProperties.Count == 0)
    +         return null;
    +     Dictionary<string, string?>? sanitized = null;
    +     foreach (KeyValuePair<string, object?> kvp in additionalProperties)
    +     {
    +         string key = kvp.Key;
    +         string? value = IsSensitiveAdditionalPropertyKey(key)
    +             ? "[REDACTED]"
    +             : FormatTraceAdditionalPropertyValue(kvp.Value);
    +         if (value is null)
    +             continue;
    +         sanitized ??= new Dictionary<string, string?>(StringComparer.Ordinal);
    +         sanitized[key] = value;
    +     }
    +     return sanitized;
    + }
    + private static bool IsSensitiveAdditionalPropertyKey(string key)
    + {
    +     if (string.IsNullOrWhiteSpace(key))
    +         return false;
    +     return key.Contains("authorization", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("token", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("secret", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("password", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("passwd", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("pwd", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("apikey", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("api-key", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("api_key", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("accesskey", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("access-key", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("access_key", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("clientsecret", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("client-secret", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("client_secret", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("privatekey", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("private-key", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("private_key", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("credential", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("session", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("cookie", StringComparison.OrdinalIgnoreCase)
    +         || key.Contains("set-cookie", StringComparison.OrdinalIgnoreCase);
    + }
    + private static string? FormatTraceAdditionalPropertyValue(object? value)
    + {
    +     return value switch
    +     {
    +         null => null,
    +         string stringValue => stringValue,
    +         bool boolValue => boolValue ? "true" : "false",
    +         char charValue => charValue.ToString(),
    +         byte byteValue => byteValue.ToString(),
    +         sbyte sbyteValue => sbyteValue.ToString(),
    +         short shortValue => shortValue.ToString(),
    +         ushort ushortValue => ushortValue.ToString(),
    +         int intValue => intValue.ToString(),
    +         uint uintValue => uintValue.ToString(),
    +         long longValue => longValue.ToString(),
    +         ulong ulongValue => ulongValue.ToString(),
    +         float floatValue => floatValue.ToString(),
    +         double doubleValue => doubleValue.ToString(),
    +         decimal decimalValue => decimalValue.ToString(),
    +         Guid guidValue => guidValue.ToString(),
    +         DateTime dateTimeValue => dateTimeValue.ToString("O"),
    +         DateTimeOffset dateTimeOffsetValue => dateTimeOffsetValue.ToString("O"),
    +         TimeSpan timeSpanValue => timeSpanValue.ToString(),
    +         Uri uriValue => uriValue.ToString(),
    +         _ => "[OMITTED]"
    +     };
    + }

Summary
Validation

    dotnet build src/OpenClaw.Tests/OpenClaw.Tests.csproj --no-restore -v minimal
    dotnet test src/OpenClaw.Tests/OpenClaw.Tests.csproj --no-build -v minimal