Add provider-aware prompt caching and diagnostics #45

Merged: Telli merged 4 commits into main from codex/prompt-caching on Apr 8, 2026

Telli commented Apr 8, 2026

Summary

  • add provider-aware prompt caching configuration, request shaping, cache accounting, diagnostics, and selective keep-warm services
  • surface normalized cache read/write usage in session status, provider metrics, and command output with transcript fallback when live counters are missing
  • document prompt caching and add targeted coverage for config validation, profile merge behavior, cache fingerprints, keep-warm eligibility, and cache usage extraction
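
The fallback described in the second bullet can be pictured roughly as follows. This is a minimal sketch, not the PR's code: `CacheTotals` and `CacheUsageFallback.Resolve` are hypothetical names, and it assumes the fallback simply sums per-turn cache counters when the live session totals are absent or zero.

```csharp
using System.Collections.Generic;

// Hypothetical shape for illustration; the PR's real usage types are not shown in this thread.
public readonly record struct CacheTotals(long ReadTokens, long WriteTokens)
{
    public bool IsEmpty => ReadTokens == 0 && WriteTokens == 0;
}

public static class CacheUsageFallback
{
    // Prefer live session counters; if they are missing or empty,
    // fall back to the sum of the recent per-turn entries.
    public static CacheTotals Resolve(CacheTotals? live, IEnumerable<CacheTotals> recentTurns)
    {
        if (live is { IsEmpty: false } totals)
            return totals;

        long read = 0, write = 0;
        foreach (var turn in recentTurns)
        {
            read += turn.ReadTokens;
            write += turn.WriteTokens;
        }
        return new CacheTotals(read, write);
    }
}
```

With live totals present, the recent turns are ignored; with `null` or all-zero live totals, the result is whatever the recorded turns add up to.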

Validation

  • dotnet build src/OpenClaw.Tests/OpenClaw.Tests.csproj --no-restore -v minimal
  • dotnet test src/OpenClaw.Tests/OpenClaw.Tests.csproj --no-build -v minimal

Telli marked this pull request as ready for review April 8, 2026 05:11
Copilot AI review requested due to automatic review settings April 8, 2026 05:11

Copilot AI left a comment

Pull request overview

Adds a provider-aware prompt caching layer to the OpenClaw gateway/runtime (request shaping, normalized cache usage accounting, diagnostics/tracing, and keep-warm), while also extending observability and tightening a few memory-store behaviors (retention protection, recall scoping, note search ranking, sqlite corruption handling).

Changes:

  • Introduces prompt caching configuration (global + per-profile), request shaping via ChatOptions.AdditionalProperties, cache usage normalization, and keep-warm background sweeping.
  • Surfaces prompt-cache read/write usage in session status, provider metrics, diagnostics output, and command responses (with recent-turn fallback when session totals are missing).
  • Expands memory subsystem robustness/UX: sqlite corruption exceptions, retention protection via metadata (starred/tags), project-scoped memory recall preference, and a faster/scored file-note search index.

Reviewed changes

Copilot reviewed 40 out of 40 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/OpenClaw.Tests/SqliteSessionSearchTests.cs | Adds coverage for corrupt sqlite session rows throwing a dedicated corruption exception. |
| src/OpenClaw.Tests/PromptCachingTests.cs | New tests for prompt caching validation, profile-merge behavior, fingerprint determinism, keep-warm eligibility, and cache usage extraction. |
| src/OpenClaw.Tests/MemoryRetentionSweeperServiceTests.cs | Tests retention protection using session metadata (starred sessions). |
| src/OpenClaw.Tests/MemoryRecallInjectionTests.cs | Tests preference for project-scoped recall prefix when configured. |
| src/OpenClaw.Tests/FileMemoryStoreTests.cs | Tests note search ordering (score + recency) after index/search changes. |
| src/OpenClaw.Tests/FeatureParityTests.cs | Updates default compaction threshold expectation. |
| src/OpenClaw.Tests/ConfigValidatorTests.cs | Adds coverage for invalid memory provider validation. |
| src/OpenClaw.Tests/ChatCommandProcessorTests.cs | Adds coverage for /status prompt-cache usage fallback behavior. |
| src/OpenClaw.MicrosoftAgentFrameworkAdapter/MafExecutionServiceChatClient.cs | Records normalized prompt cache usage into session/metrics/provider usage during MAF execution (incl. streaming). |
| src/OpenClaw.MicrosoftAgentFrameworkAdapter/MafAgentRuntime.cs | Adds project-scoped recall prefix + memory recall/compaction metrics; records prompt cache usage for summaries. |
| src/OpenClaw.Gateway/wwwroot/admin.html | Updates default compaction threshold in admin UI. |
| src/OpenClaw.Gateway/Tools/SessionStatusTool.cs | Displays prompt cache totals with fallback to provider usage history. |
| src/OpenClaw.Gateway/PromptCaching/PromptCacheWarmService.cs | New background service that keeps eligible provider caches warm for active sessions. |
| src/OpenClaw.Gateway/PromptCaching/PromptCacheTraceWriter.cs | New JSONL tracing for cache request/response diagnostics. |
| src/OpenClaw.Gateway/PromptCaching/PromptCacheCoordinator.cs | New coordinator for cache fingerprinting, request shaping, warm-candidate tracking, and trace hooks. |
| src/OpenClaw.Gateway/Models/ConfiguredModelProfileRegistry.cs | Adds prompt caching config to profile statuses; merges global/profile caching defaults; expands capability guessing. |
| src/OpenClaw.Gateway/HeartbeatService.cs | Limits/truncates user-turn/note text considered for heartbeat suggestions. |
| src/OpenClaw.Gateway/GatewayLlmExecutionService.cs | Integrates prompt cache coordinator + warm registry into standard and streaming execution; normalizes cache usage from provider responses. |
| src/OpenClaw.Gateway/Extensions/MemoryRetentionSweeperService.cs | Protects retention sweep targets based on session metadata (starred/tags). |
| src/OpenClaw.Gateway/Extensions/LlmClientFactory.cs | Adds provider ids for anthropic-vertex and amazon-bedrock, with endpoint validation. |
| src/OpenClaw.Gateway/Endpoints/DiagnosticsEndpoints.cs | Adds prompt cache info to diagnostics output; includes cache usage in provider usage lines. |
| src/OpenClaw.Gateway/Composition/CoreServicesExtensions.cs | Wires prompt-cache services + trace/coordinator registries; ensures RuntimeMetrics is singleton; injects metadata snapshot provider into retention sweeper. |
| src/OpenClaw.Gateway/appsettings.Production.json | Switches production memory defaults to sqlite and enables retention defaults. |
| src/OpenClaw.Gateway/appsettings.json | Adds default Llm.PromptCaching configuration section. |
| src/OpenClaw.Core/Validation/DoctorCheck.cs | Adds a warn-only doctor check for provider-compatible prompt cache configuration. |
| src/OpenClaw.Core/Validation/ConfigValidator.cs | Adds prompt cache config validation + memory provider validation; extends built-in provider list. |
| src/OpenClaw.Core/Pipeline/ChatCommandProcessor.cs | Shows prompt cache usage in /status and /usage, with fallback to provider usage history. |
| src/OpenClaw.Core/Observability/RuntimeMetrics.cs | Adds counters for session cache hits/misses, recall searches/hits, compactions, and prompt cache usage + keep-warm stats. |
| src/OpenClaw.Core/Observability/ProviderUsageTracker.cs | Tracks cache read/write usage; records per-turn cache counters; provides latest-session cache totals. |
| src/OpenClaw.Core/Observability/PromptCacheUsage.cs | New normalized prompt-cache usage model + extraction helpers. |
| src/OpenClaw.Core/Models/Session.cs | Tracks per-session prompt cache read/write totals; expands JSON source-gen types. |
| src/OpenClaw.Core/Models/OperatorApiModels.cs | Extends provider turn usage entries with cache read/write tokens. |
| src/OpenClaw.Core/Models/ModelProfiles.cs | Adds prompt caching config to profiles and capabilities; exposes in profile status. |
| src/OpenClaw.Core/Models/GatewayConfig.cs | Adds diagnostics + prompt caching configuration models; updates compaction threshold default. |
| src/OpenClaw.Core/Memory/SqliteMemoryStore.cs | Throws explicit exceptions for corrupt persisted session/branch JSON; improves embedding backfill batching/transactionality. |
| src/OpenClaw.Core/Memory/MemoryRetentionArchive.cs | Optimizes archive purge by parsing archive path date segments before reading JSON. |
| src/OpenClaw.Core/Memory/FileMemoryStore.cs | Adds note index + scored search; adds metrics hook for session cache hits/misses. |
| src/OpenClaw.Agent/AgentRuntime.cs | Adds project-scoped recall prefix and recall/compaction metrics; records prompt cache usage for non-gateway agent runtime. |
| README.md | Documents prompt caching at a high level and links to detailed doc. |
| docs/PROMPT_CACHING.md | New detailed prompt caching documentation (config, provider behavior, diagnostics, tracing, keep-warm). |

var (stableSystem, volatileSuffix) = ExtractSystemPromptSegments(messages);
var toolSignature = BuildToolSignature(options);
var stableFingerprint = BuildStableFingerprint(profile.ProviderId, modelId, stableSystem, toolSignature, options.ResponseFormat);
var preparedOptions = CloneOptions(options);

Copilot AI Apr 8, 2026

Prepare(...) receives the resolved modelId, but the cloned ChatOptions doesn’t get ModelId updated to that value. If a fallback model is selected, the executed request (and keep-warm request) can be sent with the wrong model id. Set preparedOptions.ModelId = modelId (or ensure the caller sets it before cloning).

Suggested change
- var preparedOptions = CloneOptions(options);
+ var preparedOptions = CloneOptions(options);
+ preparedOptions.ModelId = modelId;

Copilot uses AI. Check for mistakes.
Comment on lines +109 to +113
if (caching.Enabled == true && dialect != "none")
{
    preparedOptions.AdditionalProperties ??= new AdditionalPropertiesDictionary();
    preparedOptions.AdditionalProperties["openclaw_prompt_cache_enabled"] = true;
    preparedOptions.AdditionalProperties["openclaw_prompt_cache_dialect"] = dialect;

Copilot AI Apr 8, 2026

The cache-hint entries in AdditionalProperties are added based on PromptCaching.Enabled and the dialect, without checking profile.Capabilities.SupportsPromptCaching. That can send provider-specific cache parameters to providers/models explicitly marked as not supporting prompt caching. Gate this request-shaping block on profile.Capabilities.SupportsPromptCaching so capabilities are authoritative.
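
One way to read this suggestion, as a minimal sketch rather than the coordinator's actual code: `CachingConfig`, `ProfileCapabilities`, and `CacheHintShaper.TryApplyHints` are hypothetical stand-ins, and only the two `openclaw_prompt_cache_*` keys and the `SupportsPromptCaching` flag come from the excerpt and comment above.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for the PR's caching config and profile capability types.
public sealed record CachingConfig(bool? Enabled, string Dialect);
public sealed record ProfileCapabilities(bool SupportsPromptCaching);

public static class CacheHintShaper
{
    // Writes the hint keys only when config, dialect, and capabilities all allow it.
    // Returns true if the hints were applied.
    public static bool TryApplyHints(
        CachingConfig caching,
        ProfileCapabilities capabilities,
        IDictionary<string, object?> additionalProperties)
    {
        var allowed = caching.Enabled == true
            && caching.Dialect != "none"
            && capabilities.SupportsPromptCaching; // capabilities are authoritative

        if (!allowed)
            return false;

        additionalProperties["openclaw_prompt_cache_enabled"] = true;
        additionalProperties["openclaw_prompt_cache_dialect"] = caching.Dialect;
        return true;
    }
}
```

Putting the capability check in the same condition keeps a profile that opts out from ever receiving the hint keys, regardless of what the global caching section enables.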

Comment on lines +52 to +56
var key = BuildKey(request.Descriptor.SessionId, request.Descriptor.ProfileId);
_entries[key] = new PromptCacheWarmCandidate
{
    Descriptor = request.Descriptor,
    WarmMessages = [new ChatMessage(ChatRole.System, request.Descriptor.StableSystemPrompt)],

Copilot AI Apr 8, 2026

Entries are only ever added/overwritten in _entries; there’s no eviction path. Over long uptimes, ended sessions will leave behind warm candidates forever (the warm sweep just skips them). Add a bounded eviction strategy (e.g., remove when session inactive or when LastSeenAtUtc is older than a cutoff).
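
A cutoff-based eviction of the kind suggested here could look roughly like this. It is a sketch under assumptions: `WarmCandidate` and `WarmRegistry` are hypothetical simplifications of the PR's types, and only the `LastSeenAtUtc` name is taken from the comment above.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical candidate shape; only LastSeenAtUtc is named in the review comment.
public sealed record WarmCandidate(string SessionId, string ProfileId, DateTimeOffset LastSeenAtUtc);

public sealed class WarmRegistry
{
    private readonly ConcurrentDictionary<string, WarmCandidate> _entries = new(StringComparer.Ordinal);

    public int Count => _entries.Count;

    // Adds or refreshes the candidate for a session/profile pair.
    public void Record(WarmCandidate candidate) =>
        _entries[$"{candidate.SessionId}:{candidate.ProfileId}"] = candidate;

    // Removes entries not seen within maxIdle and returns how many were evicted.
    public int EvictStale(DateTimeOffset nowUtc, TimeSpan maxIdle)
    {
        var evicted = 0;
        foreach (var pair in _entries)
        {
            if (nowUtc - pair.Value.LastSeenAtUtc <= maxIdle)
                continue;

            // Remove only if the stored value is unchanged, so a concurrent
            // Record() that refreshed this key is not lost.
            if (_entries.TryRemove(pair))
                evicted++;
        }
        return evicted;
    }
}
```

Calling `EvictStale` from the existing warm sweep would bound the dictionary without a separate timer; whether the cutoff should follow the provider cache TTL or the session idle timeout is left open here.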

Comment on lines +247 to 251
var prepared = _promptCacheCoordinator.Prepare(session, candidate.Profile, modelId, messages, effectiveOptions);
_promptCacheWarmRegistry.Record(prepared);

var routeState = GetOrAddRouteState(candidate.Profile.Id, candidate.Profile.ProviderId, modelId);

Copilot AI Apr 8, 2026

Prepare(...) is called before effectiveOptions.ModelId is set to modelId, and the prepared options are what get executed. With fallback models, this can issue the request against the profile default model instead of the selected fallback. Ensure ModelId is set before cloning/preparing (or have Prepare override it).

Comment on lines +635 to +642
if (provider.Equals("anthropic", StringComparison.OrdinalIgnoreCase) ||
    provider.Equals("claude", StringComparison.OrdinalIgnoreCase) ||
    provider.Equals("anthropic-vertex", StringComparison.OrdinalIgnoreCase))
    return true;

if (provider.Equals("amazon-bedrock", StringComparison.OrdinalIgnoreCase))
    return string.Equals(dialect, "anthropic", StringComparison.OrdinalIgnoreCase) || string.Equals(dialect, "auto", StringComparison.OrdinalIgnoreCase);

Copilot AI Apr 8, 2026

SupportsExplicitCacheTtl doesn’t consider gemini/google, so ValidatePromptCaching will reject KeepWarmEnabled=true for Gemini even though the coordinator/docs treat Gemini as eligible. Update this helper (and/or the keep-warm validation rule) to allow gemini/google when dialect resolves to gemini so validation matches runtime behavior.

Suggested change
- if (provider.Equals("anthropic", StringComparison.OrdinalIgnoreCase) ||
-     provider.Equals("claude", StringComparison.OrdinalIgnoreCase) ||
-     provider.Equals("anthropic-vertex", StringComparison.OrdinalIgnoreCase))
-     return true;
- if (provider.Equals("amazon-bedrock", StringComparison.OrdinalIgnoreCase))
-     return string.Equals(dialect, "anthropic", StringComparison.OrdinalIgnoreCase) || string.Equals(dialect, "auto", StringComparison.OrdinalIgnoreCase);
+ var normalizedDialect = (dialect ?? "auto").Trim();
+ if (provider.Equals("anthropic", StringComparison.OrdinalIgnoreCase) ||
+     provider.Equals("claude", StringComparison.OrdinalIgnoreCase) ||
+     provider.Equals("anthropic-vertex", StringComparison.OrdinalIgnoreCase))
+     return true;
+ if (provider.Equals("amazon-bedrock", StringComparison.OrdinalIgnoreCase))
+     return string.Equals(normalizedDialect, "anthropic", StringComparison.OrdinalIgnoreCase) || string.Equals(normalizedDialect, "auto", StringComparison.OrdinalIgnoreCase);
+ if (provider.Equals("gemini", StringComparison.OrdinalIgnoreCase) ||
+     provider.Equals("google", StringComparison.OrdinalIgnoreCase))
+     return string.Equals(normalizedDialect, "gemini", StringComparison.OrdinalIgnoreCase) || string.Equals(normalizedDialect, "auto", StringComparison.OrdinalIgnoreCase);

Comment on lines +122 to +124
public bool IncludeMessages { get; set; } = true;
public bool IncludePrompt { get; set; } = true;
public bool IncludeSystem { get; set; } = true;

Copilot AI Apr 8, 2026

IncludeMessages is defined in PromptCacheTraceConfig but isn’t referenced by the trace writer (or elsewhere). Either implement it (e.g., to control logging request message bodies / options metadata) or remove it to avoid a misleading, dead configuration knob.

Suggested change
- public bool IncludeMessages { get; set; } = true;
- public bool IncludePrompt { get; set; } = true;
- public bool IncludeSystem { get; set; } = true;
+ public bool IncludePrompt { get; set; } = true;

Comment on lines +37 to +40
AdditionalProperties = options.AdditionalProperties?.ToDictionary(static kvp => kvp.Key, static kvp => kvp.Value?.ToString())
});
}

Copilot AI Apr 8, 2026

WriteRequest persists all ChatOptions.AdditionalProperties by stringifying arbitrary objects. Since these properties can contain sensitive values depending on upstream clients/adapters, consider redacting/whitelisting keys (or gating this behind a dedicated opt-in flag) to avoid leaking secrets into the JSONL trace file.

Suggested change
-     AdditionalProperties = options.AdditionalProperties?.ToDictionary(static kvp => kvp.Key, static kvp => kvp.Value?.ToString())
-     });
- }
+     AdditionalProperties = SanitizeAdditionalProperties(options.AdditionalProperties)
+     });
+ }
+ private static Dictionary<string, string?>? SanitizeAdditionalProperties(IReadOnlyDictionary<string, object?>? additionalProperties)
+ {
+     if (additionalProperties is null || additionalProperties.Count == 0)
+         return null;
+     Dictionary<string, string?>? sanitized = null;
+     foreach (KeyValuePair<string, object?> kvp in additionalProperties)
+     {
+         string key = kvp.Key;
+         string? value = IsSensitiveAdditionalPropertyKey(key)
+             ? "[REDACTED]"
+             : FormatTraceAdditionalPropertyValue(kvp.Value);
+         if (value is null)
+             continue;
+         sanitized ??= new Dictionary<string, string?>(StringComparer.Ordinal);
+         sanitized[key] = value;
+     }
+     return sanitized;
+ }
+ private static bool IsSensitiveAdditionalPropertyKey(string key)
+ {
+     if (string.IsNullOrWhiteSpace(key))
+         return false;
+     return key.Contains("authorization", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("token", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("secret", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("password", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("passwd", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("pwd", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("apikey", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("api-key", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("api_key", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("accesskey", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("access-key", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("access_key", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("clientsecret", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("client-secret", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("client_secret", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("privatekey", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("private-key", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("private_key", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("credential", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("session", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("cookie", StringComparison.OrdinalIgnoreCase)
+         || key.Contains("set-cookie", StringComparison.OrdinalIgnoreCase);
+ }
+ private static string? FormatTraceAdditionalPropertyValue(object? value)
+ {
+     return value switch
+     {
+         null => null,
+         string stringValue => stringValue,
+         bool boolValue => boolValue ? "true" : "false",
+         char charValue => charValue.ToString(),
+         byte byteValue => byteValue.ToString(),
+         sbyte sbyteValue => sbyteValue.ToString(),
+         short shortValue => shortValue.ToString(),
+         ushort ushortValue => ushortValue.ToString(),
+         int intValue => intValue.ToString(),
+         uint uintValue => uintValue.ToString(),
+         long longValue => longValue.ToString(),
+         ulong ulongValue => ulongValue.ToString(),
+         float floatValue => floatValue.ToString(),
+         double doubleValue => doubleValue.ToString(),
+         decimal decimalValue => decimalValue.ToString(),
+         Guid guidValue => guidValue.ToString(),
+         DateTime dateTimeValue => dateTimeValue.ToString("O"),
+         DateTimeOffset dateTimeOffsetValue => dateTimeOffsetValue.ToString("O"),
+         TimeSpan timeSpanValue => timeSpanValue.ToString(),
+         Uri uriValue => uriValue.ToString(),
+         _ => "[OMITTED]"
+     };
+ }

Telli merged commit e55a1ce into main on Apr 8, 2026
4 of 5 checks passed