@gengliqi
Member

@gengliqi gengliqi commented Jan 23, 2026

close #171
close #223

Note that the TTL mechanism is disabled by default, as it may cause performance degradation when TiFlash runs in disaggregated mode. It will be enabled once #224 is resolved.

Summary by CodeRabbit

  • New Features

    • Added Time-To-Live (TTL) for region caching with configurable base, jitter, and runtime enable/disable; cached regions can now expire and be evicted or refreshed during lookups.
    • Exposed cluster ID through the client and RPC context so cache and RPCs are aware of cluster identity.
  • Tests

    • Added unit tests covering TTL expiration, refresh-on-access, and no-refresh scenarios.

Signed-off-by: gengliqi <gengliqiii@gmail.com>
u
Signed-off-by: gengliqi <gengliqiii@gmail.com>
@ti-chi-bot ti-chi-bot bot added dco-signoff: yes Indicates the PR's author has signed the dco. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 23, 2026
Signed-off-by: gengliqi <gengliqiii@gmail.com>
@coderabbitai

coderabbitai bot commented Jan 30, 2026

Walkthrough

Adds TTL-based expiration and cluster_id tracking to the region cache: Region gains an atomic ttl with init/check/update APIs and global TTL knobs; RegionCache uses TTL checks to evict stale regions during locateKey; RPC/PD interfaces and RPCContext gain cluster_id propagation; unit tests and build wiring added.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Region header: include/pingcap/kv/RegionCache.h | Adds std::atomic_int64_t ttl, initTTL(), nextTTL(int64_t), checkRegionCacheTTL(int64_t), setRegionCacheTTL(...), setRegionCacheTTLEnabled(...); constructors initialize and call initTTL(). |
| Region cache implementation: src/kv/RegionCache.cc | Introduces globals regionCacheTTLSec, regionCacheTTLJitterSec, enableRegionCacheTTL; implements TTL computation/validation and integrates TTL eviction into RegionCache::locateKey. Also initializes RegionCache::cluster_id from PD client. |
| RPC / RPCContext: include/pingcap/kv/Rpc.h, include/pingcap/kv/RegionCache.h | Adds cluster_id to RPCContext (constructor + member) and sets context->set_cluster_id(...) when preparing RPC calls. |
| PD client interface / implementations: include/pingcap/pd/IClient.h, include/pingcap/pd/Client.h, include/pingcap/pd/MockPDClient.h | Adds getClusterID() to IClient and implements it in Client and MockPDClient; used to initialize RegionCache::cluster_id. |
| Tests / Build: src/test/CMakeLists.txt, src/test/region_cache_test.cc | Adds region_cache_test.cc to tests and a unit test CheckRegionCacheTTL covering expired, refreshed, and far-ahead TTL scenarios. |
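
The TTL computation summarized in the table (a base plus a jitter) can be sketched as follows. This is a minimal illustration and not the PR's code: `nextTtlSketch`, its parameters, and the choice of a uniform jitter in `[0, jitter_sec)` are assumptions made for the example. The PR's actual `nextTTL(int64_t)` reads the globals `regionCacheTTLSec` and `regionCacheTTLJitterSec` and may differ in detail.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper (not from the PR): computes an expiry timestamp as
// now + base + random jitter. The jitter spreads expirations out so that
// regions cached at the same moment do not all expire together.
int64_t nextTtlSketch(int64_t now_sec, int64_t base_sec, int64_t jitter_sec, std::mt19937_64 & rng)
{
    int64_t jitter = 0;
    if (jitter_sec > 0)
    {
        // Assumed range: uniform in [0, jitter_sec). A jitter of 0 is handled
        // explicitly because the distribution requires min <= max.
        std::uniform_int_distribution<int64_t> dist(0, jitter_sec - 1);
        jitter = dist(rng);
    }
    return now_sec + base_sec + jitter;
}
```

With `base_sec = 600` and `jitter_sec = 60`, every result for `now_sec = 1000` falls in `[1600, 1660)`; with `jitter_sec = 0` the result is exactly `now_sec + base_sec`.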

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant RC as RegionCache
    participant R as Region
    participant Clock as Clock

    Client->>RC: locateKey(key)
    RC->>R: lookup cached Region for key
    RC->>Clock: now = current_time()
    RC->>R: checkRegionCacheTTL(now)
    alt TTL expired
        R-->>RC: false
        RC->>RC: evict region (remove cached entry)
        RC-->>Client: cache miss → trigger reload
    else TTL valid
        R-->>RC: true (ttl possibly bumped)
        RC-->>Client: return region bounds
    end
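
The check in the diagram (expired, valid, or valid with the TTL bumped) might look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: `TtlEntrySketch` is a hypothetical stand-in for `Region`, the base and jitter are per-object fields rather than the globals the PR uses, and the jitter is a fixed value instead of a random one so that the behavior is deterministic.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the TTL state on a cached region (not the PR's Region class).
struct TtlEntrySketch
{
    std::atomic<int64_t> ttl; // expiry timestamp in seconds
    int64_t base_sec;         // assumed per-object copy of the base TTL
    int64_t jitter_sec;       // fixed jitter here; the real mechanism uses a random one

    TtlEntrySketch(int64_t now, int64_t base, int64_t jitter)
        : ttl(now + base + jitter), base_sec(base), jitter_sec(jitter)
    {}

    // Returns false if the entry has expired (the caller would then evict it),
    // true otherwise. A valid entry whose TTL is within base_sec of `now` is refreshed.
    bool check(int64_t now)
    {
        int64_t current = ttl.load(std::memory_order_relaxed);
        while (true)
        {
            if (now > current)
                return false; // expired
            if (current > now + base_sec)
                return true; // still far ahead, skip the refresh
            int64_t next = now + base_sec + jitter_sec;
            if (ttl.compare_exchange_weak(current, next, std::memory_order_relaxed))
                return true; // refreshed
            // On failure `current` now holds the latest stored value, so the
            // loop re-evaluates both conditions against it.
        }
    }
};
```

For example, an entry created at `now = 100` with base 600 and jitter 60 expires at 760. A check at 100 leaves the TTL untouched, a check at 200 moves it to 860, and a check at 861 reports expiry.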

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇
A hop, a tick, a jittered beat,
Regions warm then cool their seat,
I guard the map, I chase the stale,
Time prunes the paths along the trail—
Hooray, the cache keeps fresh and neat.

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add TTL mechanism for region cache' clearly and concisely describes the main change in the pull request, which introduces TTL support throughout the region cache implementation. |
| Linked Issues check | ✅ Passed | The PR implements all coding requirements from issue #171: TTL initialization, TTL validation logic, configuration methods, and integration into region cache lookup operations. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing the TTL mechanism: Region TTL fields/methods, RPCContext cluster_id integration, PD client cluster ID exposure, and region cache TTL checks. |

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 30, 2026

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@include/pingcap/kv/RegionCache.h`:
- Line 10: The header RegionCache.h uses std::chrono (e.g., at the code
referencing time_point/durations) but only includes <atomic>, so add `#include`
<chrono> to the header to make it self-contained; update the include block where
<atomic> is present so translation units that include RegionCache.h directly
will compile without requiring incidental chrono includes.

u
Signed-off-by: gengliqi <gengliqiii@gmail.com>
}

// now we have ts <= ttl <= ts+regionCacheTTLSec <= newTTL = ts+regionCacheTTLSec+randomJitter
if (ttl.compare_exchange_weak(current_ttl, new_ttl, std::memory_order_relaxed))
Contributor

Since you don't update current_ttl, it looks to me that if ttl.compare_exchange_weak fails, it will never succeed on the following iterations? Of course, by default the loop will break at L55, but if regionCacheTTLJitterSec is 0, it seems it will hang in this while loop.

Member Author

If it fails, current_ttl will be reloaded with the newest value.
Ref: https://en.cppreference.com/w/cpp/atomic/atomic_compare_exchange.html

Atomically compares the object representation (until C++20) / value representation (since C++20) of the object pointed to by obj with that of the object pointed to by expected, and if those are bitwise-equal, replaces the former with desired (performs read-modify-write operation). Otherwise, loads the actual value pointed to by obj into *expected (performs load operation).
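
The reload-on-failure behavior quoted above can be observed with a small standalone sketch. The names `CasResultSketch` and `retryWithStaleExpected` are made up for this illustration and do not appear in the PR. The first call passes a deliberately stale expected value, so it must fail and overwrite `expected` with the stored value; the retry loop then succeeds, possibly after spurious failures, which `compare_exchange_weak` is allowed to have.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical result type used only for this illustration.
struct CasResultSketch
{
    bool first_try_ok;                // whether the first compare-exchange succeeded
    int64_t expected_after_first_try; // value left in `expected` after the first attempt
    int64_t final_value;              // value stored once the retry loop finishes
};

CasResultSketch retryWithStaleExpected()
{
    std::atomic<int64_t> ttl{100};
    int64_t expected = 50; // deliberately stale guess

    CasResultSketch r{};
    // 50 != 100, so this fails and loads the stored value (100) into `expected`.
    r.first_try_ok = ttl.compare_exchange_weak(expected, 200, std::memory_order_relaxed);
    r.expected_after_first_try = expected;

    // `expected` is now up to date, so the retry can succeed. A spurious failure
    // just reloads the same value and the loop tries again.
    while (!ttl.compare_exchange_weak(expected, 200, std::memory_order_relaxed))
    {
    }
    r.final_value = ttl.load(std::memory_order_relaxed);
    return r;
}
```

Because each failed attempt refreshes `expected`, a retry loop built this way does not keep comparing against the original stale value.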

#include <kvproto/metapb.pb.h>
#include <pingcap/kv/RegionCache.h>

#include "test_helper.h"
Contributor

format code

Signed-off-by: gengliqi <gengliqiii@gmail.com>
Contributor

@windtalker windtalker left a comment

lgtm

@ti-chi-bot

ti-chi-bot bot commented Jan 30, 2026

@windtalker: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot

ti-chi-bot bot commented Jan 30, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: windtalker
Once this PR has been reviewed and has the lgtm label, please assign little-wallace for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels

dco-signoff: yes Indicates the PR's author has signed the dco.
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Set cluster_id in Context
Add timeout mechanism for region cache

2 participants