-
Notifications
You must be signed in to change notification settings - Fork 47
issue/125 - cache interface #127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a cache interface abstraction (issue #125) to enable support for multiple cache implementations (Dynamic, Paged, etc.) through a common interface. The refactoring extracts the existing DynamicCache implementation into a separate directory and establishes a factory pattern for cache creation.
Key Changes:
- Introduced
CacheInterfaceabstract base class with factory method for polymorphic cache creation - Refactored
DynamicCachefrom a standalone class to implementCacheInterface - Updated cache references throughout the codebase to use the interface type instead of concrete types
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| csrc/cache/cache_interface.hpp | New abstract interface defining the contract for all cache implementations |
| csrc/cache/cache_factory.cpp | Factory implementation for creating cache instances based on configuration |
| csrc/cache/cache.hpp | Updated to include both interface and implementation headers |
| csrc/cache/dynamic_cache/dynamic_cache.hpp | Refactored DynamicCache header to inherit from CacheInterface |
| csrc/cache/dynamic_cache/dynamic_cache.cpp | Implementation moved from inline to separate compilation unit |
| csrc/cache/kv_cache.hpp | Removed (functionality moved to dynamic_cache/) |
| csrc/models/llama/llama_model.hpp | Updated external_cache_ member to use interface pointer |
| csrc/models/llama/llama_attention.hpp | Updated include to use new cache.hpp header |
| csrc/engine/rank_worker.hpp | Updated cache_ptr_ to use interface type |
| csrc/engine/rank_worker.cpp | Updated to use factory method with logging |
| csrc/pybind11/models/llama.hpp | Updated includes and reformatted method definitions |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| // Copy new k/v into cache at current position | ||
| bool all_equal = cache_positions.empty() || std::equal(cache_positions.begin() + 1, cache_positions.end(), cache_positions.begin()); | ||
| if (all_equal) { |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If cache_positions is empty after ensure_capacity (which could happen if initial_batch_size is 0), accessing cache_positions[0] on line 143 would cause undefined behavior. Consider adding a check to ensure cache_positions is not empty before accessing it, or validate that initial_batch_size is at least 1 in ensure_capacity.
| if (all_equal) { | |
| if (all_equal) { | |
| if (cache_positions.empty()) { | |
| throw std::runtime_error("KVCache update: cache_positions is empty after ensure_capacity, cannot update cache."); | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| */ | ||
| std::pair<infinicore::Tensor, infinicore::Tensor> update( | ||
| const infinicore::Tensor &k_new, | ||
| const infinicore::Tensor &v_new); |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The update method that takes two parameters should be marked with the override keyword for consistency and clarity. The base class CacheInterface provides a virtual implementation of this method, and marking overrides explicitly helps prevent errors and improves code maintainability.
| const infinicore::Tensor &v_new); | |
| const infinicore::Tensor &v_new) override; |
resolves #125