
feat(api): add cache #15

Merged
hspedro merged 2 commits into main from feat/cache on Mar 14, 2025

Conversation

@hspedro (Owner) commented Mar 14, 2025

🚀 Feature: Caching System for Translation and Detection Services

Overview

This PR introduces a robust caching system for Babeltron's translation and language detection services. The implementation significantly improves response times for repeated requests by storing previous results in a Valkey (Redis-compatible) cache with configurable TTL.

Key Features

1. Cache Infrastructure

  • Cache Interface: Created an abstract base class (CacheInterface) that defines the contract for all cache implementations with save, get, and delete methods.
  • Valkey Implementation: Implemented a Valkey-based cache client with:
    • Singleton pattern to ensure a single connection per application instance
    • Comprehensive error handling via decorators
    • Automatic serialization/deserialization of complex data types
    • Memory usage optimization (configured to use max 512MB with LRU eviction policy)
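The interface and singleton pattern described above can be sketched as follows. This is an illustrative stand-in, not the actual Babeltron code: the class names and an in-memory implementation (substituting for the real Valkey client) are assumptions made for the example.

```python
# Hypothetical sketch of the cache contract; names are illustrative,
# and InMemoryCache stands in for the PR's Valkey-backed client.
from abc import ABC, abstractmethod
from typing import Any, Optional


class CacheInterface(ABC):
    """Contract every cache implementation must fulfill."""

    @abstractmethod
    def save(self, key: str, value: Any, ttl_seconds: int = 3600) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[Any]: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...


class InMemoryCache(CacheInterface):
    """Stand-in implementation; the PR's real client talks to Valkey."""

    _instance = None  # singleton: one cache per application instance

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._store = {}
        return cls._instance

    def save(self, key: str, value: Any, ttl_seconds: int = 3600) -> None:
        self._store[key] = value  # a real backend would also set the TTL

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)

    def delete(self, key: str) -> None:
        self._store.pop(key, None)
```

The singleton `__new__` ensures every caller shares one connection (here, one dict), mirroring the single-connection-per-instance goal stated above.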

2. Cache Service Layer

  • Generic Cache Service: Implemented a type-safe cache service with specialized methods for translation and detection results
  • Intelligent Key Generation: Created utility functions to generate consistent cache keys based on:
    • Operation type (translate/detect)
    • Input text (normalized and sanitized)
    • Source and target languages (for translations)

3. API Integration

  • Translation Endpoint: Updated to check cache before performing translation
  • Detection Endpoint: Updated to check cache before performing language detection
  • Cache Flags: Added a boolean cached field to API responses to indicate cache hits
  • Telemetry: Added cache hit/miss tracking to OpenTelemetry spans
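The check-cache-first flow of the updated endpoints can be sketched as below. The function and field names are assumptions for illustration; only the key format and the cached flag come from this PR.

```python
# Illustrative request flow only; translate_with_cache and translate_fn
# are hypothetical names, not the actual endpoint code.
def translate_with_cache(cache, text, source_lang, target_lang, translate_fn):
    # Build the cache key (simplified normalization: trim + lowercase).
    key = f"translate:{source_lang}:{target_lang}:{text.strip().lower()}"

    cached = cache.get(key)
    if cached is not None:
        # Cache hit: skip the model entirely and flag the response.
        return {"translation": cached, "cached": True}

    # Cache miss: run the translation, store it, and flag the miss.
    result = translate_fn(text, source_lang, target_lang)
    cache.save(key, result, ttl_seconds=3600)
    return {"translation": result, "cached": False}
```

The detection endpoint follows the same shape with a `detect:` key and no language pair.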

4. Configuration

  • Added environment variables for cache configuration:
    • CACHE_HOST: Hostname for the cache server
    • CACHE_PORT: Port for the cache server
    • CACHE_TTL_SECONDS: Time-to-live for cached items (default: 3600s/1h)
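Reading these settings might look like the sketch below. The variable names match the PR; the helper name and the host/port defaults (6379 is the conventional Valkey/Redis port) are assumptions.

```python
# Hypothetical config loader; only the env var names come from the PR.
import os


def load_cache_config(env=os.environ):
    """Return (host, port, ttl_seconds) for the cache, with fallbacks."""
    return (
        env.get("CACHE_HOST", "localhost"),
        int(env.get("CACHE_PORT", "6379")),          # conventional default port
        int(env.get("CACHE_TTL_SECONDS", "3600")),   # 1 hour, per the PR
    )
```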

5. Infrastructure Updates

  • Added Valkey service to Docker Compose configuration
  • Updated OpenTelemetry collector configuration to use OTLP exporter instead of deprecated Jaeger exporter
  • Fixed NumPy compatibility issues by pinning to version <2.0.0

6. Testing

  • Added comprehensive unit tests for:
    • Cache interface implementation
    • Cache service layer
    • Key generation utilities
    • Integration with translation and detection endpoints

Technical Details

Cache Key Format

  • Translation: translate:{source_lang}:{target_lang}:{normalized_text}
  • Detection: detect:{normalized_text}

Text normalization includes:

  • Unicode normalization (NFC form)
  • Whitespace trimming
  • Lowercase conversion
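The key format and normalization steps above can be sketched with the standard library's unicodedata module; the helper names here are assumptions, not the PR's actual utilities.

```python
# Illustrative key-generation utilities matching the described format.
import unicodedata


def normalize_text(text: str) -> str:
    """NFC-normalize, trim whitespace, and lowercase the input."""
    return unicodedata.normalize("NFC", text).strip().lower()


def translation_key(source_lang: str, target_lang: str, text: str) -> str:
    return f"translate:{source_lang}:{target_lang}:{normalize_text(text)}"


def detection_key(text: str) -> str:
    return f"detect:{normalize_text(text)}"
```

NFC normalization matters because visually identical strings can differ at the byte level ("é" as one code point vs. "e" plus a combining accent); without it, equivalent inputs would miss the cache.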

Performance Considerations

  • The cache is configured with a memory limit of 512MB
  • LRU (Least Recently Used) eviction policy ensures optimal memory usage
  • Default TTL of 1 hour balances freshness with performance
  • JSON serialization for complex data types ensures compatibility
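The JSON round-trip for complex values might look like this minimal sketch; the wrapper names are assumptions, and the actual serialization code in the PR may differ.

```python
# Minimal JSON (de)serialization sketch for cached values.
import json


def to_cache(value) -> str:
    """Serialize a value (dict, list, etc.) to a JSON string for storage."""
    return json.dumps(value, ensure_ascii=False)


def from_cache(raw):
    """Deserialize a stored JSON string; pass through a cache miss (None)."""
    return None if raw is None else json.loads(raw)
```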

Breaking Changes

None. This is a non-breaking enhancement that maintains backward compatibility with existing API contracts.

Future Improvements

  • Add cache statistics endpoint for monitoring
  • Implement cache warming for common translations
  • Add support for additional cache backends (Memcached, Redis Cluster, etc.)
  • Implement cache invalidation strategies for model updates

Dependencies

  • Added valkey package (v6.1.0) - Redis-compatible Python client
  • Fixed NumPy compatibility by pinning to version <2.0.0

Testing Instructions

  1. Start the application with Docker Compose:

    docker-compose up -d
  2. Make a translation request:

    curl -X POST "http://localhost:8000/api/v1/translate" \
      -H "Content-Type: application/json" \
      -d '{"text": "Hello world", "source_language": "en", "target_language": "fr"}'
  3. Make the same request again and observe the cached: true flag in the response and faster response time.

  4. Similarly, test the detection endpoint:

    curl -X POST "http://localhost:8000/api/v1/detect" \
      -H "Content-Type: application/json" \
      -d '{"text": "Hello world"}'

hspedro added 2 commits March 14, 2025 14:56
The cache interface is a class that forces
implementations to provide save, get, and delete methods.
Currently, the only implementation is Valkey, which runs
as a Docker service with a memory limit. The application
connects to the cache instance via the CACHE_HOST and
CACHE_PORT variables, with CACHE_TTL_SECONDS set to 1h
by default
Now the translate and detect routes save responses to
the cache and read them back to reply faster to users.
The cache keys are:
- translate:src_lang:tgt_lang:text
- detect:text

Where text is a sanitized version of the input text.
Sanitization includes:
- Normalize Unicode characters (NFC form)
- Trim excess whitespace
- Convert to lowercase
@hspedro hspedro self-assigned this Mar 14, 2025
@hspedro hspedro merged commit b6c29de into main Mar 14, 2025
2 checks passed
@hspedro hspedro deleted the feat/cache branch March 14, 2025 19:47