Commit 5ff346a

Simplify Effect AI prompt-caching README to link to main docs
1 parent 8be9cfa commit 5ff346a

File tree

1 file changed (+30 −93 lines)
  • typescript/effect-ai/src/prompt-caching
Lines changed: 30 additions & 93 deletions
````diff
@@ -1,121 +1,58 @@
-# Anthropic Prompt Caching Examples (Effect AI)
+# Prompt Caching Examples (Effect AI)
 
-This directory contains examples demonstrating Anthropic's prompt caching feature via OpenRouter using @effect/ai and @effect/ai-openrouter.
+Examples demonstrating prompt caching with @effect/ai and @effect/ai-openrouter.
 
-## What is Prompt Caching?
+## Documentation
 
-Anthropic's prompt caching allows you to cache large portions of your prompts to:
-- **Reduce costs** - Cached tokens cost significantly less
-- **Improve latency** - Cached content is processed faster
-- **Enable larger contexts** - Use more context without proportional cost increases
+For full prompt caching documentation including all providers, pricing, and configuration details, see:
+- **[Prompt Caching Guide](../../../../docs/prompt-caching.md)**
 
-Cache TTL: 5 minutes for ephemeral caches
+## Examples in This Directory
 
-## Examples
+- `user-message-cache.ts` - Cache large context in user messages
+- `multi-message-cache.ts` - Cache system prompt across multi-turn conversations
+- `no-cache-control.ts` - Control scenario (validates methodology)
+
+## Quick Start
 
-### User Message Cache (`user-message-cache.ts`)
-Cache large context in user messages using Effect AI:
 ```bash
+# Run an example
 bun run typescript/effect-ai/src/prompt-caching/user-message-cache.ts
 ```
 
-**Pattern**: User message with `options.openrouter.cacheControl` using Effect.gen
-
-## How to Use with Effect AI
+## Effect AI Usage
 
 ```typescript
-import * as OpenRouterClient from '@effect/ai-openrouter/OpenRouterClient';
 import * as OpenRouterLanguageModel from '@effect/ai-openrouter/OpenRouterLanguageModel';
-import * as LanguageModel from '@effect/ai/LanguageModel';
-import * as Prompt from '@effect/ai/Prompt';
-import { Effect, Layer, Redacted } from 'effect';
-
-// Create OpenRouter client layer
-const OpenRouterClientLayer = OpenRouterClient.layer({
-  apiKey: Redacted.make(process.env.OPENROUTER_API_KEY!),
-}).pipe(Layer.provide(FetchHttpClient.layer));
 
-// Create language model layer with CRITICAL stream_options config
 const OpenRouterModelLayer = OpenRouterLanguageModel.layer({
   model: 'anthropic/claude-3.5-sonnet',
   config: {
-    stream_options: { include_usage: true }, // CRITICAL: Required!
+    stream_options: { include_usage: true }, // Required for cache metrics
   },
-}).pipe(Layer.provide(OpenRouterClientLayer));
+});
 
-// Use in Effect.gen program
 const program = Effect.gen(function* () {
   const response = yield* LanguageModel.generateText({
-    prompt: Prompt.make([
-      {
-        role: 'user',
-        content: [
-          {
-            type: 'text',
-            text: 'Large context here...',
-            options: {
-              openrouter: {
-                cacheControl: { type: 'ephemeral' }, // Cache this block
-              },
-            },
-          },
-          {
-            type: 'text',
-            text: 'Your question here',
-          },
-        ],
-      },
-    ]),
+    prompt: Prompt.make([{
+      role: 'user',
+      content: [{
+        type: 'text',
+        text: 'Large context...',
+        options: {
+          openrouter: { cacheControl: { type: 'ephemeral' } }
+        }
+      }]
+    }])
   });
 
   // Check cache metrics
-  const cachedTokens = response.usage.cachedInputTokens ?? 0;
+  const cached = response.usage.cachedInputTokens ?? 0;
 });
-
-// Run with dependencies
-await program.pipe(
-  Effect.provide(OpenRouterModelLayer),
-  Effect.runPromise,
-);
-```
-
-## Important Notes
-
-### Critical Configuration
-**MUST include `stream_options: { include_usage: true }` in model config**
-- Without this, usage.cachedInputTokens will be undefined
-- OpenRouterClient only sets this for streaming by default
-- Must be set explicitly in the layer configuration
-
-### Cache Metrics Location
-Cache metrics are in `response.usage`:
-```typescript
-{
-  inputTokens: number,
-  outputTokens: number,
-  cachedInputTokens: number // Number of tokens read from cache
-}
 ```
 
-### Requirements
-1. **stream_options.include_usage = true** - In model config layer
-2. **Minimum 2048+ tokens** - Smaller content may not be cached
-3. **options.openrouter.cacheControl** - On content items in Prompt
-4. **Exact match** - Cache only hits on identical content
-
-### Expected Behavior
-- **First call**: `cachedInputTokens = 0` (cache miss, creates cache)
-- **Second call**: `cachedInputTokens > 0` (cache hit, reads from cache)
-
-### Effect-Specific Patterns
-- Use `Effect.gen` for composable effect workflows
-- Layer-based dependency injection for client and model
-- Type-safe error handling via Effect type
-- Structured concurrency with Effect.sleep for delays
+## Effect-Specific Notes
 
-## Scientific Method
-All examples follow evidence-based verification:
-- **Hypothesis**: options.openrouter.cacheControl triggers caching
-- **Experiment**: Make identical calls twice
-- **Evidence**: Measure via response.usage.cachedInputTokens
-- **Analysis**: Compare cache miss vs cache hit
+- Use layer-based dependency injection for client and model configuration
+- `stream_options.include_usage` must be set in the model layer config
+- Cache metrics appear in `response.usage.cachedInputTokens`
````
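For reference, the snippet kept in the simplified README is not self-contained: it omits the imports and the OpenRouter client layer that the pre-commit README wired up. Below is a minimal end-to-end sketch stitched together from the two versions in the diff above, assuming `FetchHttpClient` comes from `@effect/platform` (the old snippet referenced it without an import) and that the module paths and layer signatures match the pre-commit version.

```typescript
// Minimal end-to-end sketch combining both README versions.
// Assumption: FetchHttpClient is exported by '@effect/platform';
// the old README used FetchHttpClient.layer without importing it.
import { FetchHttpClient } from '@effect/platform';
import * as OpenRouterClient from '@effect/ai-openrouter/OpenRouterClient';
import * as OpenRouterLanguageModel from '@effect/ai-openrouter/OpenRouterLanguageModel';
import * as LanguageModel from '@effect/ai/LanguageModel';
import * as Prompt from '@effect/ai/Prompt';
import { Effect, Layer, Redacted } from 'effect';

// HTTP client -> OpenRouter client -> language model, wired as layers
const OpenRouterClientLayer = OpenRouterClient.layer({
  apiKey: Redacted.make(process.env.OPENROUTER_API_KEY!),
}).pipe(Layer.provide(FetchHttpClient.layer));

const OpenRouterModelLayer = OpenRouterLanguageModel.layer({
  model: 'anthropic/claude-3.5-sonnet',
  config: {
    stream_options: { include_usage: true }, // required for cache metrics
  },
}).pipe(Layer.provide(OpenRouterClientLayer));

const program = Effect.gen(function* () {
  const response = yield* LanguageModel.generateText({
    prompt: Prompt.make([{
      role: 'user',
      content: [{
        type: 'text',
        text: 'Large context...',
        options: {
          openrouter: { cacheControl: { type: 'ephemeral' } }, // cache this block
        },
      }],
    }]),
  });
  // Adaptation for the check below: return the cache metric
  return response.usage.cachedInputTokens ?? 0;
});

await program.pipe(Effect.provide(OpenRouterModelLayer), Effect.runPromise);
```

The removed "Scientific Method" section verified caching by making identical calls twice and comparing `cachedInputTokens`; under the same assumptions, that check might look like:

```typescript
// Run the same prompt twice: first call misses (0), second should hit (> 0),
// provided the content is large enough to cache and the identical calls
// land within the 5-minute ephemeral TTL noted in the old README.
const runTwice = Effect.gen(function* () {
  const first = yield* program;      // expected: 0 (cache miss, cache created)
  yield* Effect.sleep('2 seconds');  // brief pause, well inside the TTL
  const second = yield* program;     // expected: > 0 (cache hit)
  return { first, second };
});

await runTwice.pipe(Effect.provide(OpenRouterModelLayer), Effect.runPromise);
```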
