Commit 86ca0e7 (1 parent: 723e1ee)

Simplify fetch prompt-caching README to link to main docs

1 file changed: 15 additions (+), 86 deletions (−)
@@ -1,98 +1,27 @@
-# Anthropic Prompt Caching Examples
+# Prompt Caching Examples (TypeScript + fetch)

-This directory contains examples demonstrating Anthropic's prompt caching feature via OpenRouter using the raw fetch API.
+Examples demonstrating prompt caching with the fetch API.

-## What is Prompt Caching?
+## Documentation

-Anthropic's prompt caching allows you to cache large portions of your prompts (like system messages or context documents) to:
-- **Reduce costs** - Cached tokens cost significantly less than regular tokens
-- **Improve latency** - Cached content is processed faster on subsequent requests
-- **Enable larger contexts** - Use more context without proportional cost increases
+For full prompt caching documentation including all providers, pricing, and configuration details, see:
+- **[Prompt Caching Guide](../../../../docs/prompt-caching.md)**

-Cache TTL: 5 minutes for ephemeral caches
+## Examples in This Directory

-## Examples
+- `user-message-cache.ts` - Cache large context in user messages
+- `multi-message-cache.ts` - Cache system prompt across multi-turn conversations
+- `no-cache-control.ts` - Control scenario (validates methodology)

-### 1. System Message Cache (`system-message-cache.ts`)
-The most common pattern - cache a large system prompt:
-```bash
-bun run typescript/fetch/src/prompt-caching/system-message-cache.ts
-```
-
-**Pattern**: System message with content-level `cache_control`
+## Quick Start

-### 2. User Message Cache (`user-message-cache.ts`)
-Cache large context in user messages (e.g., uploading documents):
 ```bash
+# Run an example
 bun run typescript/fetch/src/prompt-caching/user-message-cache.ts
 ```

-**Pattern**: User message with content-level `cache_control` on context block
-
-### 3. Multi-Message Cache (`multi-message-cache.ts`)
-Cache system prompt across multi-turn conversations:
-```bash
-bun run typescript/fetch/src/prompt-caching/multi-message-cache.ts
-```
-
-**Pattern**: System message cache persists through conversation history
-
-### 4. No Cache Control (`no-cache-control.ts`)
-Control scenario - no caching should occur:
-```bash
-bun run typescript/fetch/src/prompt-caching/no-cache-control.ts
-```
-
-**Pattern**: Same structure but NO `cache_control` markers (validates methodology)
-
-## How to Use Cache Control
-
-```typescript
-const requestBody = {
-  model: 'anthropic/claude-3.5-sonnet',
-  stream_options: {
-    include_usage: true, // CRITICAL: Required for cache metrics
-  },
-  messages: [
-    {
-      role: 'system',
-      content: [
-        {
-          type: 'text',
-          text: 'Your large system prompt here...',
-          cache_control: { type: 'ephemeral' }, // Cache this block
-        },
-      ],
-    },
-    {
-      role: 'user',
-      content: 'Your question here',
-    },
-  ],
-};
-```
-
-## Important Notes
-
-### OpenRouter Format Transformation
-OpenRouter transforms Anthropic's native response format to OpenAI-compatible format:
-- **Anthropic native**: `usage.cache_read_input_tokens`, `usage.cache_creation_input_tokens`
-- **OpenRouter returns**: `usage.prompt_tokens_details.cached_tokens` (OpenAI-compatible)
-
-### Requirements for Caching
-1. **stream_options.include_usage = true** - CRITICAL, otherwise no usage details
-2. **Minimum 2048+ tokens** - Smaller content may not be cached reliably
-3. **cache_control on content blocks** - Not on message level
-4. **Exact match** - Cache only hits on identical content
-
-### Expected Behavior
-- **First call**: `cached_tokens = 0` (cache miss, creates cache)
-- **Second call**: `cached_tokens > 0` (cache hit, reads from cache)
-- **Control**: `cached_tokens = 0` on both calls (no cache_control)
+## Key Requirements (Anthropic)

-## Scientific Method
-All examples follow scientific method principles:
-- **Hypothesis**: cache_control triggers Anthropic caching
-- **Experiment**: Make identical calls twice
-- **Evidence**: Measure via `usage.prompt_tokens_details.cached_tokens`
-- **Analysis**: Compare first call (miss) vs second call (hit)
+- `stream_options.include_usage = true` - Required for cache metrics
+- Minimum 2048+ tokens to cache reliably
+- `cache_control: {type: "ephemeral"}` on content blocks (not message-level)
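
The removed "How to Use Cache Control" snippet above shows only the request body. For reference, here is a minimal runnable sketch of the round trip it implies: send that body via fetch with streaming enabled, then read `usage.prompt_tokens_details.cached_tokens` from the final SSE chunk. The endpoint URL, the `OPENROUTER_API_KEY` environment variable, the filler prompt, and the SSE-parsing details are assumptions, not code from this repository.

```typescript
// Minimal sketch only: endpoint, env var, and filler prompt are assumptions.
const API_URL = 'https://openrouter.ai/api/v1/chat/completions';

// The removed notes say content below ~2048 tokens may not be cached
// reliably, so repeat filler text to get well past that threshold.
const largeSystemPrompt = 'You are a helpful assistant. '.repeat(500);

async function cachedTokensForCall(): Promise<number> {
  const response = await fetch(API_URL, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'anthropic/claude-3.5-sonnet',
      stream: true, // stream_options only applies to streaming requests
      stream_options: { include_usage: true }, // required for cache metrics
      messages: [
        {
          role: 'system',
          content: [
            {
              type: 'text',
              text: largeSystemPrompt,
              cache_control: { type: 'ephemeral' }, // cache this block
            },
          ],
        },
        { role: 'user', content: 'Summarize your instructions in one sentence.' },
      ],
    }),
  });

  // Scan the SSE stream; with include_usage, the final data chunk carries
  // the usage object, including prompt_tokens_details.cached_tokens.
  let cachedTokens = 0;
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep any partial line for the next read
    for (const raw of lines) {
      const line = raw.trim();
      if (!line.startsWith('data: ') || line === 'data: [DONE]') continue;
      const chunk = JSON.parse(line.slice('data: '.length));
      cachedTokens = chunk.usage?.prompt_tokens_details?.cached_tokens ?? cachedTokens;
    }
  }
  return cachedTokens;
}

// Expected per the removed notes: 0 on the first call (cache write),
// > 0 on an identical second call within the 5-minute ephemeral TTL.
console.log('first call  cached_tokens =', await cachedTokensForCall());
console.log('second call cached_tokens =', await cachedTokensForCall());
```

With Bun (as in the Quick Start), save this as, say, `cache-sketch.ts` (hypothetical filename) and run `bun run cache-sketch.ts` twice-in-one invocation is already handled by the two calls at the bottom.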

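One more reference point: the removed "OpenRouter Format Transformation" note names Anthropic's native usage fields and the OpenAI-compatible field OpenRouter maps them to. A hedged sketch of the two shapes as TypeScript interfaces follows; only the field names quoted in that note come from the source, while `prompt_tokens`/`completion_tokens` (standard OpenAI-style usage) and the optionality markers are assumptions.

```typescript
// What Anthropic reports natively (per the removed note; not a full schema).
interface AnthropicNativeUsage {
  cache_creation_input_tokens: number; // tokens written to the cache (first call)
  cache_read_input_tokens: number;     // tokens served from the cache (later calls)
}

// What OpenRouter returns after transforming to OpenAI-compatible format.
interface OpenRouterUsage {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: {
    cached_tokens: number; // > 0 indicates a cache hit
  };
}
```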