diff --git a/CACHING_TTL.md b/CACHING_TTL.md
new file mode 100644
index 00000000..e76a461b
--- /dev/null
+++ b/CACHING_TTL.md
@@ -0,0 +1,171 @@
+# Chproxy Caching TTL (Time To Live)
+
+## Overview
+
+Chproxy supports caching query responses to improve performance and reduce load on ClickHouse clusters. The Time To Live (TTL) determines how long cached responses remain valid before they expire and need to be refreshed.
+
+## TTL Configuration
+
+The caching TTL is configured using the `expire` parameter in the cache configuration section. This parameter accepts a duration value in the format `<number><unit>`, where the unit can be:
+- `ns` - nanoseconds
+- `µs` or `us` - microseconds
+- `ms` - milliseconds
+- `s` - seconds
+- `m` - minutes
+- `h` - hours
+- `d` - days
+- `w` - weeks
+
+### Required Configuration
+
+The `expire` parameter is **required** for all cache configurations. If it is not set to a positive value, cache initialization fails (see [Validation](#validation)).
+
+## Cache Modes
+
+Chproxy supports two cache modes, both using the same `expire` configuration:
+
+### 1. File System Cache
+
+```yaml
+caches:
+  - name: "my_cache"
+    mode: "file_system"
+    file_system:
+      dir: "/path/to/cache/dir"
+      max_size: 150Mb
+    expire: 1h  # Cached responses expire after 1 hour
+```
+
+### 2. Redis Cache
+
+```yaml
+caches:
+  - name: "redis_cache"
+    mode: "redis"
+    redis:
+      addresses:
+        - "localhost:6379"
+      username: "user"
+      password: "password"
+    expire: 30s  # Cached responses expire after 30 seconds
+```
+
+## Common TTL Examples
+
+### Short-term Caching (Fast-changing data)
+```yaml
+expire: 10s  # 10 seconds
+```
+
+### Medium-term Caching (Moderate refresh rate)
+```yaml
+expire: 5m   # 5 minutes
+expire: 30m  # 30 minutes
+```
+
+### Long-term Caching (Slowly-changing data)
+```yaml
+expire: 1h   # 1 hour
+expire: 24h  # 24 hours (1 day)
+```
+
+## How TTL Works
+
+1. **When a query is cached**: The current timestamp is recorded along with the cached response.
+
+2. **When retrieving from cache**:
+   - The age of the cached entry is calculated
+   - If `age <= expire`: The cached response is served immediately
+   - If `age > expire`: The cached entry is considered expired
+
+3. **Grace Time (deprecated)**:
+   - An additional `grace_time` parameter exists but is deprecated
+   - During grace time, expired entries can still be served while being refreshed in the background
+   - Default grace time is 5 seconds if not specified
+   - In future versions, this will be replaced by `max_execution_time`
+
+## TTL in HTTP Response Headers
+
+When serving cached responses, chproxy includes caching information in the HTTP response headers:
+
+```http
+Cache-Control: max-age=<seconds>
+```
+
+Where `<seconds>` is the number of seconds until the cached entry expires (calculated as `expire - age`).
+
+## Implementation Details
+
+### File System Cache
+- Location: `cache/filesystem_cache.go`
+- Expired files are removed during periodic cleanup operations
+- Cleanup runs at intervals of `expire/2` (minimum 1 minute)
+- Files older than `expire + grace_time` are permanently deleted
+
+### Redis Cache
+- Location: `cache/redis_cache.go`
+- Redis native TTL mechanism is used
+- Entries automatically expire in Redis after the configured TTL
+- For large payloads with low TTL (<15s), temporary files may be used to prevent data loss during streaming
+  - This threshold is defined as `minTTLForRedisStreamingReader = 15 * time.Second` in the source code
+  - Below this threshold, data is cached in temporary files to ensure complete delivery even if the Redis entry expires during transfer
+
+## Transaction Registry TTL
+
+In addition to cache entry TTL, chproxy uses transaction registries to prevent "thundering herd" problems (multiple identical queries hitting the backend simultaneously):
+
+- **Transaction deadline**: Set to `expire + grace_time`
+- **Transaction ended TTL**: 500 milliseconds (hardcoded in `cache/transaction_registry.go`)
+
+## Best Practices
+
+1. **Match TTL to data freshness requirements**:
+   - Real-time dashboards: 10s - 30s
+   - Analytics reports: 5m - 1h
+   - Historical data: 1h - 24h
+
+2. **Consider query execution time**:
+   - TTL should be significantly longer than query execution time
+   - Otherwise, cached entries may expire before being useful
+
+3. **Balance between freshness and performance**:
+   - Shorter TTL = fresher data but more backend load
+   - Longer TTL = better performance but potentially stale data
+
+4. **Use different caches for different use cases**:
+   ```yaml
+   caches:
+     - name: "realtime"
+       expire: 30s
+     - name: "reporting"
+       expire: 1h
+   ```
+
+## Validation
+
+To verify your TTL configuration is valid:
+
+```bash
+# chproxy will validate the configuration on startup
+./chproxy -config=config.yml
+```
+
+If `expire` is missing or zero, chproxy will fail when trying to initialize the cache with an error message like:
+```
+FATAL: error while applying config: `expire` must be positive
+```
+
+Note: The `expire` field is technically optional in the YAML syntax (it has the `omitempty` tag), but validation runs when cache instances are created, so it must be set to a positive value for the cache to function.
+
+## Related Configuration Files
+
+- `config/config.go` - Main configuration structure and validation
+- `config/README.md` - Complete configuration reference
+- `cache/filesystem_cache.go` - File system cache implementation
+- `cache/redis_cache.go` - Redis cache implementation
+
+## See Also
+
+- [Configuration Documentation](config/README.md)
+- [Cache Configuration Examples](config/examples/)
+- [Official Documentation](https://www.chproxy.org/)
diff --git a/CACHING_TTL_QUICK_REFERENCE.md b/CACHING_TTL_QUICK_REFERENCE.md
new file mode 100644
index 00000000..10bb485d
--- /dev/null
+++ b/CACHING_TTL_QUICK_REFERENCE.md
@@ -0,0 +1,142 @@
+# Chproxy Caching TTL - Quick Reference
+
+## Summary
+
+Chproxy's caching TTL (Time To Live) determines how long query responses are cached before expiring. This document provides a quick reference to the caching TTL configuration.
+
+## Key Information
+
+### Where TTL is Configured
+
+The caching TTL is configured in the `caches` section of the configuration file using the `expire` parameter:
+
+```yaml
+caches:
+  - name: "my_cache"
+    mode: "file_system"  # or "redis"
+    expire: 1h           # TTL value - REQUIRED
+    # ... other settings
+```
+
+### Default TTL
+
+**There is NO default TTL value** - the `expire` parameter is **mandatory** and must be explicitly specified. If omitted, chproxy will fail to start with an error.
+
+### Duration Format
+
+TTL values use Go-style duration strings in the form `<number><unit>`. The `d` and `w` units go beyond Go's standard `time.ParseDuration`, which does not accept them.
+
+| Unit | Description | Example |
+|------|-------------|---------|
+| `ns` | Nanoseconds | `1000ns` |
+| `µs` or `us` | Microseconds | `100µs` |
+| `ms` | Milliseconds | `500ms` |
+| `s` | Seconds | `30s` |
+| `m` | Minutes | `5m` |
+| `h` | Hours | `2h` |
+| `d` | Days | `7d` |
+| `w` | Weeks | `2w` |
+
+### Common TTL Values
+
+| Use Case | Recommended TTL | Example |
+|----------|----------------|---------|
+| Real-time dashboards | 5s - 30s | `expire: 10s` |
+| Live analytics | 1m - 5m | `expire: 5m` |
+| Hourly reports | 30m - 1h | `expire: 1h` |
+| Daily reports | 6h - 24h | `expire: 24h` |
+| Historical/static data | 24h - 7d | `expire: 168h` |
+
+## Quick Start
+
+### Minimal Configuration
+
+```yaml
+caches:
+  - name: "default_cache"
+    mode: "file_system"
+    file_system:
+      dir: "/tmp/chproxy-cache"
+      max_size: 1Gb
+    expire: 30s  # Cache for 30 seconds
+```
+
+### Multiple Caches with Different TTLs
+
+```yaml
+caches:
+  - name: "fast"
+    mode: "file_system"
+    file_system:
+      dir: "/tmp/cache/fast"
+      max_size: 1Gb
+    expire: 10s  # Short TTL for frequently changing data
+
+  - name: "slow"
+    mode: "file_system"
+    file_system:
+      dir: "/tmp/cache/slow"
+      max_size: 10Gb
+    expire: 1h  # Long TTL for stable data
+```
+
+## How It Works
+
+1. **First Request**: Query is executed on ClickHouse, result is cached with timestamp
+2. **Subsequent Requests**:
+   - If `age <= expire`: Cached result is served immediately
+   - If `age > expire`: Cache entry is expired, query is re-executed
+3. **HTTP Header**: Cached responses include `Cache-Control: max-age=<seconds>`
+
+## Code Locations
+
+| Component | File |
+|-----------|------|
+| Configuration structure | `config/config.go` (line 928) |
+| File system cache | `cache/filesystem_cache.go` |
+| Redis cache | `cache/redis_cache.go` |
+| Cache interface | `cache/cache.go` |
+
+## Documentation
+
+- **Detailed Guide**: [CACHING_TTL.md](./CACHING_TTL.md)
+- **Configuration Reference**: [config/README.md](./config/README.md)
+- **Example Configurations**:
+  - [config/examples/cache_ttl_examples.yml](./config/examples/cache_ttl_examples.yml)
+  - [config/examples/simple_cache_ttl_test.yml](./config/examples/simple_cache_ttl_test.yml)
+
+## Validation
+
+To validate your TTL configuration:
+
+```bash
+./chproxy -config=your_config.yml
+```
+
+If the configuration is valid, chproxy will start (or fail with a different error if services are unavailable).
+
+If `expire` is missing or invalid, you'll see an error like:
+```
+FATAL: error while applying config: `expire` must be positive
+```
+
+## Testing
+
+Use the provided example configuration to test TTL settings:
+
+```bash
+./chproxy -config=config/examples/simple_cache_ttl_test.yml
+```
+
+## Additional Notes
+
+- **Grace Time**: A deprecated `grace_time` parameter exists that extends the cache lifetime slightly. This will be removed in future versions.
+- **Redis TTL**: When using Redis cache mode, the native Redis TTL mechanism is used.
+- **Transaction Registry**: Internal transaction tracking uses a fixed 500ms TTL after completion.
+- **HTTP Streaming**: For Redis cache with TTL < 15s and large payloads, temporary files may be used to prevent data loss.
+
+## Need Help?
+
+- Read the [full documentation](./CACHING_TTL.md)
+- Check [configuration examples](./config/examples/)
+- Visit [chproxy.org](https://www.chproxy.org/)
diff --git a/README.md b/README.md
index 6f0ec862..b606c954 100644
--- a/README.md
+++ b/README.md
@@ -8,6 +8,15 @@ It is an open-source community project and not an official ClickHouse project.
 
 Full documentation is available on [the official website](https://www.chproxy.org/).
 
+## Key Features
+
+- **Query Caching**: Cache query responses with configurable TTL (Time To Live)
+  - See [CACHING_TTL.md](./CACHING_TTL.md) for detailed information about caching TTL configuration
+- **Load Balancing**: Distribute queries across multiple ClickHouse nodes
+- **Security**: User authentication, network restrictions, and HTTPS support
+- **Rate Limiting**: Control query execution and request rates per user
+- **High Availability**: Automatic failover and health checks
+
 ## Contributing
 
 See our [contributing guide](./CONTRIBUTING.md)
diff --git a/config/examples/cache_ttl_examples.yml b/config/examples/cache_ttl_examples.yml
new file mode 100644
index 00000000..f4c3987d
--- /dev/null
+++ b/config/examples/cache_ttl_examples.yml
@@ -0,0 +1,141 @@
+# Chproxy Cache TTL Configuration Examples
+# This file demonstrates various TTL (Time To Live) configurations for different use cases
+
+# Security note: Set hack_me_please to true for testing only
+hack_me_please: true
+
+server:
+  http:
+    listen_addr: ":9090"
+
+# Example 1: Multiple caches with different TTL values
+caches:
+  # Real-time cache - for dashboards and live metrics
+  - name: "realtime"
+    mode: "file_system"
+    file_system:
+      dir: "/tmp/chproxy-cache/realtime"
+      max_size: 1Gb
+    expire: 10s  # Expires after 10 seconds
+    max_payload_size: 10Mb
+
+  # Short-term cache - for frequently updated analytics
+  - name: "shortterm"
+    mode: "file_system"
+    file_system:
+      dir: "/tmp/chproxy-cache/shortterm"
+      max_size: 5Gb
+    expire: 5m  # Expires after 5 minutes
+    max_payload_size: 50Mb
+
+  # Medium-term cache - for hourly reports
+  - name: "hourly"
+    mode: "file_system"
+    file_system:
+      dir: "/tmp/chproxy-cache/hourly"
+      max_size: 10Gb
+    expire: 1h  # Expires after 1 hour
+    max_payload_size: 100Mb
+
+  # Long-term cache - for daily/historical reports
+  - name: "daily"
+    mode: "file_system"
+    file_system:
+      dir: "/tmp/chproxy-cache/daily"
+      max_size: 50Gb
+    expire: 24h  # Expires after 24 hours (1 day)
+    max_payload_size: 1Gb
+    shared_with_all_users: true
+
+  # Redis cache example - for distributed caching
+  - name: "redis_cache"
+    mode: "redis"
+    redis:
+      addresses:
+        - "localhost:6379"
+      password: "your_redis_password"
+      pool_size: 10
+    expire: 30s  # Expires after 30 seconds
+    max_payload_size: 100Mb
+    shared_with_all_users: true
+
+  # Very short TTL - for high-frequency updates
+  - name: "flash"
+    mode: "redis"
+    redis:
+      addresses:
+        - "localhost:6379"
+    expire: 5s  # Expires after 5 seconds
+    max_payload_size: 5Mb
+
+  # Extended cache - for rarely changing data
+  - name: "static"
+    mode: "file_system"
+    file_system:
+      dir: "/tmp/chproxy-cache/static"
+      max_size: 20Gb
+    expire: 7d  # Expires after 7 days
+    max_payload_size: 500Mb
+    shared_with_all_users: true
+
+clusters:
+  - name: "main_cluster"
+    nodes: ["localhost:8123"]
+    users:
+      - name: "default"
+        password: ""
+
+# Example user configurations using different cache TTLs
+users:
+  # Dashboard user - needs fresh data
+  - name: "dashboard_user"
+    password: "dash123"
+    to_cluster: "main_cluster"
+    to_user: "default"
+    cache: "realtime"  # Uses 10s TTL
+    max_execution_time: 30s
+
+  # Analytics user - moderate refresh rate
+  - name: "analytics_user"
+    password: "analytics123"
+    to_cluster: "main_cluster"
+    to_user: "default"
+    cache: "shortterm"  # Uses 5m TTL
+    max_execution_time: 2m
+
+  # Report user - longer cache
+  - name: "report_user"
+    password: "report123"
+    to_cluster: "main_cluster"
+    to_user: "default"
+    cache: "hourly"  # Uses 1h TTL
+    max_execution_time: 10m
+
+  # Historical data user - very long cache
+  - name: "history_user"
+    password: "history123"
+    to_cluster: "main_cluster"
+    to_user: "default"
+    cache: "daily"  # Uses 24h TTL
+    max_execution_time: 30m
+
+  # Redis cache user - distributed caching
+  - name: "distributed_user"
+    password: "redis123"
+    to_cluster: "main_cluster"
+    to_user: "default"
+    cache: "redis_cache"  # Uses 30s TTL with Redis
+    max_execution_time: 1m
+
+# TTL Duration Format Notes:
+# - s = seconds (e.g., 30s = 30 seconds)
+# - m = minutes (e.g., 5m = 5 minutes)
+# - h = hours (e.g., 1h = 1 hour)
+# - d = days (e.g., 7d = 7 days)
+# - w = weeks (e.g., 2w = 2 weeks)
+#
+# You can also use milliseconds (ms), microseconds (µs/us), or nanoseconds (ns)
+# Examples:
+# - 500ms = 500 milliseconds
+# - 100µs = 100 microseconds
+# - 1000ns = 1000 nanoseconds
diff --git a/config/examples/simple_cache_ttl_test.yml b/config/examples/simple_cache_ttl_test.yml
new file mode 100644
index 00000000..666bc7ab
--- /dev/null
+++ b/config/examples/simple_cache_ttl_test.yml
@@ -0,0 +1,33 @@
+# Simple Cache TTL Test Configuration
+# This configuration can be used to test TTL settings without requiring Redis
+
+hack_me_please: true
+
+server:
+  http:
+    listen_addr: ":9090"
+
+caches:
+  # Test cache with 30 second TTL
+  - name: "test_cache"
+    mode: "file_system"
+    file_system:
+      dir: "/tmp/chproxy-test-cache"
+      max_size: 100Mb
+    expire: 30s  # Cache expires after 30 seconds
+    max_payload_size: 10Mb
+
+clusters:
+  - name: "test_cluster"
+    nodes: ["localhost:8123"]
+    users:
+      - name: "default"
+        password: ""
+
+users:
+  - name: "test_user"
+    password: "test123"
+    to_cluster: "test_cluster"
+    to_user: "default"
+    cache: "test_cache"
+    max_execution_time: 1m