
Add story handle caching to ChronoKVS #511

Open
EnekoGonzalez3 wants to merge 21 commits into develop from 431-chronokvs-story-handle-caching

Conversation

@EnekoGonzalez3
Contributor

@EnekoGonzalez3 EnekoGonzalez3 commented Jan 25, 2026

Summary

Implements story handle caching in ChronoKVS to improve write throughput (measured at ~27 ops/sec before and ~3000+ ops/sec after) by avoiding repeated AcquireStory/ReleaseStory RPC calls.

Key Feature: Adds a flush() API to explicitly release cached handles and commit pending writes when needed.

Changes

Core Implementation

  • Add handleCache (std::unordered_map<std::string, chronolog::StoryHandle*>) to store active handles by key
  • Add cacheMutex for thread-safe cache access
  • Add getOrAcquireHandle() helper for lazy handle acquisition (cache hit/miss pattern)
  • Add flushCachedHandle() to release a specific key's handle before read operations
  • Add public flush() method to release all cached handles and commit pending writes
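
The cache hit/miss pattern listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: FakeClient, FakeStoryHandle, and the HandleCache wrapper class are hypothetical stand-ins for the ChronoLog client and the adapter, and only the member names handleCache, cacheMutex, getOrAcquireHandle(), flushCachedHandle(), and flush() come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the ChronoLog client types (assumption: the
// real AcquireStory/ReleaseStory signatures are not shown in this PR).
struct FakeStoryHandle {
    std::string story;
};

struct FakeClient {
    int acquires = 0;
    int releases = 0;

    FakeStoryHandle* AcquireStory(const std::string& story) {
        ++acquires;
        return new FakeStoryHandle{story};
    }
    void ReleaseStory(FakeStoryHandle* handle) {
        assert(handle != nullptr);
        ++releases;
        delete handle;
    }
};

class HandleCache {
public:
    explicit HandleCache(FakeClient& client) : client_(client) {}
    HandleCache(const HandleCache&) = delete;
    HandleCache& operator=(const HandleCache&) = delete;
    ~HandleCache() { flush(); }  // release everything before going away

    // Cache hit: return the stored handle. Cache miss: acquire and store it.
    FakeStoryHandle* getOrAcquireHandle(const std::string& key) {
        std::lock_guard<std::mutex> lock(cacheMutex_);
        auto it = handleCache_.find(key);
        if (it != handleCache_.end()) return it->second;
        FakeStoryHandle* handle = client_.AcquireStory(key);
        handleCache_.emplace(key, handle);
        return handle;
    }

    // Release one key's handle, if cached (used before reads).
    void flushCachedHandle(const std::string& key) {
        std::lock_guard<std::mutex> lock(cacheMutex_);
        auto it = handleCache_.find(key);
        if (it == handleCache_.end()) return;
        client_.ReleaseStory(it->second);
        handleCache_.erase(it);
    }

    // Release every cached handle.
    void flush() {
        std::lock_guard<std::mutex> lock(cacheMutex_);
        for (auto& entry : handleCache_) client_.ReleaseStory(entry.second);
        handleCache_.clear();
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(cacheMutex_);
        return handleCache_.size();
    }

private:
    FakeClient& client_;
    mutable std::mutex cacheMutex_;
    std::unordered_map<std::string, FakeStoryHandle*> handleCache_;
};
```

In this sketch a single mutex guards every map access, which keeps the logic simple at the cost of serializing acquisitions for different keys.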

Behavior

  • storeEvent(): Uses cached handles for fast repeated writes to the same key
  • retrieveEvents(): Automatically flushes the key's cached handle before reading (ensures data consistency)
  • Destructor: Releases all cached handles before disconnecting
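
A rough sketch of how the write and read paths might interact is shown below. FakeBackend, HandleGuard, and CachingAdapter are hypothetical names for illustration; the commit messages mention a RAII StoryHandleGuard for reads, but its real interface is not shown here. The fake backend makes data replayable immediately on release, whereas real ChronoLog also needs propagation time.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory stand-in for the ChronoLog client. Events written
// through a handle become replayable only once that handle is released.
struct FakeBackend {
    struct Handle {
        std::string story;
        std::vector<std::string> pending;
    };
    std::map<std::string, std::vector<std::string>> committed;
    int acquires = 0;
    int releases = 0;

    Handle* acquire(const std::string& story) {
        ++acquires;
        return new Handle{story, {}};
    }
    void release(Handle* handle) {
        assert(handle != nullptr);
        ++releases;
        auto& dst = committed[handle->story];
        dst.insert(dst.end(), handle->pending.begin(), handle->pending.end());
        delete handle;
    }
    std::vector<std::string> replay(const std::string& story) { return committed[story]; }
};

// RAII guard: acquires on construction, releases on destruction.
class HandleGuard {
public:
    HandleGuard(FakeBackend& backend, const std::string& story)
        : backend_(backend), handle_(backend.acquire(story)) {}
    ~HandleGuard() { backend_.release(handle_); }
    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;

private:
    FakeBackend& backend_;
    FakeBackend::Handle* handle_;
};

class CachingAdapter {
public:
    explicit CachingAdapter(FakeBackend& backend) : backend_(backend) {}
    ~CachingAdapter() { flush(); }

    // Write path: reuse the cached handle, acquiring it on first use.
    void storeEvent(const std::string& key, const std::string& value) {
        std::lock_guard<std::mutex> lock(cacheMutex_);
        auto it = handleCache_.find(key);
        if (it == handleCache_.end()) it = handleCache_.emplace(key, backend_.acquire(key)).first;
        it->second->pending.push_back(value);
    }

    // Read path: flush this key's cached handle, then replay under a guard.
    std::vector<std::string> retrieveEvents(const std::string& key) {
        {
            std::lock_guard<std::mutex> lock(cacheMutex_);
            auto it = handleCache_.find(key);
            if (it != handleCache_.end()) {
                backend_.release(it->second);
                handleCache_.erase(it);
            }
        }
        HandleGuard guard(backend_, key);
        return backend_.replay(key);
    }

    void flush() {
        std::lock_guard<std::mutex> lock(cacheMutex_);
        for (auto& entry : handleCache_) backend_.release(entry.second);
        handleCache_.clear();
    }

private:
    FakeBackend& backend_;
    std::mutex cacheMutex_;
    std::unordered_map<std::string, FakeBackend::Handle*> handleCache_;
};
```

Two writes followed by one read in this model cost two acquire/release pairs: one for the cached write handle and one for the read guard.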

API Usage

ChronoKVS kvs;

// Write operations use cached handles (fast)
kvs.put("key1", "value1");  // Acquires handle, caches it
kvs.put("key1", "value2");  // Reuses cached handle
kvs.put("key1", "value3");  // Reuses cached handle

// Option 1: Explicit flush before waiting for propagation
kvs.flush();  // Releases all cached handles, commits data
// ... wait for data propagation ...
auto history = kvs.get_history("key1");  // Works correctly

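// Option 2: Read methods auto-flush (no explicit flush needed).
// Recently written data may still need time to propagate before it is visible.
auto value = kvs.get("key1", timestamp);  // Auto-flushes key1's handle first; timestamp identifies the event to read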

Performance Impact

Before: Each operation required AcquireStory + operation + ReleaseStory (3 RPC calls)
After: First operation acquires handle, subsequent operations reuse cached handle

Scenario              | Before                      | After                             | Improvement
1000 puts to same key | 1000 acquire/release cycles | 1 acquire + 1000 logs + 1 release | ~160x faster
Measured throughput   | ~27 ops/sec                 | ~3000+ ops/sec                    | ~111x faster
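
The arithmetic behind these figures can be checked with a small sketch. The helper names are hypothetical, and the model assumes, as stated above, 3 RPC calls per uncached operation and one acquire plus one release around all cached writes to a key.

```cpp
#include <cassert>
#include <cstdint>

// RPC count when every put does AcquireStory + log + ReleaseStory.
constexpr std::uint64_t rpcsWithoutCache(std::uint64_t numPuts) {
    return 3 * numPuts;
}

// RPC count when one cached handle serves all puts to the same key:
// 1 acquire + numPuts logs + 1 release.
constexpr std::uint64_t rpcsWithCache(std::uint64_t numPuts) {
    return numPuts == 0 ? 0 : numPuts + 2;
}

// Ratio of two throughput measurements (ops/sec).
constexpr double speedup(double beforeOpsPerSec, double afterOpsPerSec) {
    return afterOpsPerSec / beforeOpsPerSec;
}
```

For 1000 puts this gives 3000 RPCs before and 1002 after, roughly a 3x reduction in call count. The ~111x figure is the ratio 3000 / 27 of the quoted throughputs; a ~160x ratio would correspond to about 4300 ops/sec, which "3000+" leaves room for. The gap between 3x fewer calls and 100x+ throughput suggests that acquire and release cost far more than a log call, though the description does not break this down.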

Important Notes

  • ChronoLog Data Propagation: After releasing a story handle, ChronoLog requires time to propagate data before it becomes available for replay. Applications should call flush() and wait before reading recently written data.
  • Thread Safety: All cache operations are protected by a mutex (cacheMutex) for safe concurrent access.
  • Automatic Consistency: Read operations (retrieveEvents) automatically flush the relevant key's cached handle to ensure data consistency.
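
One way to implement the "flush, then wait" step is to poll until the data becomes visible instead of sleeping for a fixed time. The helper below is a generic sketch of that substitute technique; waitUntil is a hypothetical name and is not part of ChronoKVS.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Calls `ready` repeatedly until it returns true or `timeout` elapses.
// Returns true if `ready` succeeded, false on timeout.
inline bool waitUntil(const std::function<bool()>& ready,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds interval) {
    assert(interval.count() > 0 && "interval must be positive");
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (true) {
        if (ready()) return true;
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(interval);
    }
}

// Possible usage (assumes get_history returns a container with empty()):
//   kvs.flush();
//   bool visible = waitUntil(
//       [&] { return !kvs.get_history("key1").empty(); },
//       std::chrono::milliseconds(5000), std::chrono::milliseconds(100));
```

Polling bounds the wait by a timeout while returning as soon as the data is readable; the timeout and interval values in the usage comment are placeholders, not recommendations from the PR.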

Testing

All 6 integration tests pass:

  • ✓ Put operations (with caching)
  • ✓ Get by timestamp
  • ✓ Get history
  • ✓ Get range
  • ✓ Get earliest
  • ✓ Get latest

Closes #431

EnekoGonzalez3 and others added 13 commits January 15, 2026 21:49
…id time ranges. Update integration tests to verify event count and range correctness.
… event by key. Enhance integration tests to validate functionality and ensure correct event retrieval.
…st event by key. Update integration tests to validate functionality and ensure correct event retrieval.
Resolves merge conflicts and addresses Copilot review comments:
- Extract MIN_TIMESTAMP/MAX_TIMESTAMP to file-level constants
- Add documentation comments for get_range(), get_earliest(), and get_latest() APIs
- Move <algorithm> include from header to cpp file
- Fix misleading "All range retrievals successful" message
- Update history_events comment to mention subsequent tests
The files were accidentally deleted during the merge from develop.
These files are required by ChronoStore/test/CMakeLists.txt.
- Add spdlog-based logging throughout ChronoKVS plugin with [ChronoKVS] component tags
- Add input validation for empty keys, empty values, and invalid timestamp ranges
- Add error code propagation using chronolog::to_string() for meaningful error context
- Add connection/disconnection logging at info level
- Add operation logging at debug level for put, get, get_history, get_range, get_earliest, get_latest
- Add error logging at error level for all failure conditions
- Replace generic exceptions with error code context for debugging
- Improve error messages with chronolog error code information
- Remove chrono_monitor.h dependency, use std::cerr for logging
- Remove client_errcode.h dependency, use error codes directly
- Remove chronolog_client link from integration test
- Remove logger initialization from integration test

ChronoKVS should only use the client API, not internal headers.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Implements story handle caching in ChronoKVSClientAdapter to reduce
network overhead and improve performance for repeated operations on
the same keys.

Changes:
- Add handleCache (unordered_map) to store active handles by key
- Add cacheMutex for thread-safe cache access
- Add getOrAcquireHandle() helper for lazy handle acquisition
- Modify storeEvent() and retrieveEvents() to use cached handles
- Update destructor to release all cached handles before disconnect

This addresses the performance issue where each put/get operation was
acquiring and releasing a story handle, causing unnecessary RPC calls.
With caching, 100 puts to the same key now require 1 acquire + 100 logs
+ 1 release instead of 100 acquire/release cycles.

Closes #431
@EnekoGonzalez3 EnekoGonzalez3 self-assigned this Jan 25, 2026
@EnekoGonzalez3 EnekoGonzalez3 linked an issue Jan 25, 2026 that may be closed by this pull request
4 tasks
@EnekoGonzalez3 EnekoGonzalez3 changed the title from "431 Add story handle caching to ChronoKVS" to "Add story handle caching to ChronoKVS" Jan 26, 2026
@EnekoGonzalez3 EnekoGonzalez3 added the ServingLayer and chronokvs labels Jan 28, 2026
Merges latest develop changes (commits up to 0931ef7) including:
- ChronoKVS Error Handling & Logging (#510)
- ChronoKVS get_latest() API (#508)
- ChronoKVS get_earliest() API (#507)

Conflict resolution strategy:
- Preserved story handle caching infrastructure (handleCache, cacheMutex,
  getOrAcquireHandle) from PR branch - this is the main feature
- Adopted logging system from develop (CHRONOKVS_DEBUG/INFO/ERROR macros)
  replacing all std::cerr calls
- Used UINT64_MAX for MAX_TIMESTAMP constant (more robust than hardcoded value)
- Kept all error handling and input validation from develop
- Integrated chronokvs_logger.h and unit tests from develop

The merge maintains the performance optimization of handle caching while
adopting the improved logging infrastructure and error handling from develop.

The merge incorrectly applied handle caching to retrieveEvents, but
ReplayStory requires proper acquire/release around the operation.

Changes:
- retrieveEvents now uses RAII StoryHandleGuard from develop
- Acquires handle, uses it for ReplayStory, releases via RAII guard
- storeEvent keeps handle caching (optimization for writes)

This fixes error -12 and segfault in integration tests.

The caching approach is incompatible with ChronoLog's API semantics.
ReplayStory and log_event operations require proper acquire/release
cycles around each operation.

Changes:
- Removed handleCache, cacheMutex, and getOrAcquireHandle()
- Both storeEvent and retrieveEvents now use RAII StoryHandleGuard
- Each operation acquires handle, uses it, and releases via RAII

This matches the develop branch approach and should fix the
integration test failures.

Implements story handle caching in ChronoKVSClientAdapter to reduce
network overhead for write operations. Key insight: cached handles
must be released before data can be replayed, requiring explicit
flush semantics.

Implementation:
- Add handleCache (unordered_map) to store active write handles
- Add cacheMutex for thread-safe cache access
- Add getOrAcquireHandle() for lazy handle acquisition (writes)
- Add flushCachedHandle() to release cached handle before reads
- Add public flush() API to release all cached handles
- storeEvent() uses cached handles (performance benefit)
- retrieveEvents() flushes key's cached handle then uses RAII

Performance Impact:
- Write operations: ~160x faster (3000+ ops/sec vs ~27 ops/sec)
- 1000 puts: 1 acquire + 1000 logs + 1 release (on flush)
- Instead of: 1000 acquire/log/release cycles

Usage:
- Write operations use cached handles automatically
- Call flush() before waiting for data propagation
- Read operations automatically flush the key's cached handle

Test updated to call flush() after puts, before propagation wait.
All 6/6 integration tests now pass.

Closes #431
@github-actions

This comment was marked as resolved.

@EnekoGonzalez3 EnekoGonzalez3 marked this pull request as ready for review February 5, 2026 21:08
#include <string>
#include <vector>
#include <algorithm>
#include <stdexcept>
Contributor

I don't think we need these two new header files. Maybe they were added for code that has since been deleted?


Labels

chronokvs
ServingLayer: Refers to all the issues associated to the Serving Layer of ChronoLog project.

Development

Successfully merging this pull request may close these issues.

ChronoKVS Plugin: Extend story handles lifetime

3 participants