[LEADS-246] Global cache configuration option #221

xmican10 wants to merge 1 commit into lightspeed-core:main from
Conversation
Walkthrough

Centralizes caching under …

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant SystemConfig as SystemConfig (validator)
    participant LLMPool as LLMPool/LLMConfig
    participant Runner as EvaluationRunner
    participant FS as Filesystem
    User->>SystemConfig: provide config (may include core.cache_*)
    SystemConfig->>SystemConfig: global_cache_setup() computes per-component cache_enabled and cache_dir
    SystemConfig->>LLMPool: provide resolved LLM configs (with cache_dir)
    User->>Runner: trigger evaluation
    Runner->>LLMPool: query model configs (cache_dir)
    Runner->>FS: clear/dedupe resolved cache_dir paths
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 5 passed (5 checks)
Actionable comments posted: 4
🧹 Nitpick comments (2)
tests/unit/core/models/test_system.py (1)
568-710: Missing test case: `llm_pool` configured without `judge_panel`.

The new suite thoroughly covers the legacy-llm and judge-panel branches of `global_cache_setup()`, but there is no test that exercises `SystemConfig(llm_pool=LLMPoolConfig(...))` without a `judge_panel`. That code path is the one where `llm_pool.defaults.cache_dir` is currently left unset by `global_cache_setup()` (see the related comment on `system.py`), and a follow-up call to `get_llm_config(...)`/`resolve_llm_config(...)` raises with a confusing message. Adding a test for this combination would have caught that behavior and will regression-guard whichever fix is adopted.

Suggested test sketch:

```python
def test_global_cache_setup_pool_without_panel(self) -> None:
    pool = LLMPoolConfig(
        models={"gpt-4o-mini": LLMProviderConfig(provider="openai")},
    )
    config = SystemConfig(llm_pool=pool)
    assert config.llm_pool is not None
    assert config.llm_pool.defaults.cache_enabled is True
    assert config.llm_pool.defaults.cache_dir == ".caches/llm"
    # Should not raise
    resolved = config.get_llm_config("gpt-4o-mini")
    assert resolved.cache_dir == ".caches/llm/gpt-4o-mini"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/core/models/test_system.py` around lines 568 - 710, add a unit test that constructs a SystemConfig with only llm_pool (LLMPoolConfig with a models entry like "gpt-4o-mini": LLMProviderConfig(provider="openai")) and no judge_panel, then assert that config.llm_pool is not None, config.llm_pool.defaults.cache_enabled is True and config.llm_pool.defaults.cache_dir == ".caches/llm", and finally call config.get_llm_config("gpt-4o-mini") (or resolve_llm_config equivalent) and assert the resolved.cache_dir == ".caches/llm/gpt-4o-mini" to ensure global_cache_setup populated llm_pool.defaults and that lookups do not raise.

src/lightspeed_evaluation/core/models/system.py (1)
1-1: File-level `# pylint: disable=too-many-lines` — prefer splitting over suppression.

Repo guidelines state: "Do not disable lint warnings with # noqa, # type: ignore, or # pylint: disable comments - fix the underlying issue instead". This file is now ~1000 lines and the warning is being suppressed rather than addressed. Consider splitting the module by responsibility (e.g. `llm.py` for `LLMConfig`/`LLMPoolConfig`/`LLMProviderConfig`/`LLMParametersConfig`/`LLMDefaultsConfig`, `cache.py`/`core.py` for `CoreConfig`, `geval.py` for `GEvalConfig` and rubric, keeping `system.py` for the orchestrating `SystemConfig`).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lightspeed_evaluation/core/models/system.py` at line 1, The file-level suppression "# pylint: disable=too-many-lines" must be removed and the large module should be split by responsibility: move LLM-related classes (LLMConfig, LLMPoolConfig, LLMProviderConfig, LLMParametersConfig, LLMDefaultsConfig) into a new llm.py, move caching/core classes (CoreConfig) into cache.py or core.py, move GEvalConfig and rubric-related code into geval.py, and keep orchestration classes (SystemConfig) in system.py; update imports where those classes are referenced, run tests/linters to ensure no circular imports, and delete the top-level pylint disable comment once the file is split and each new module is within size/lint limits.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/system.yaml`:
- Around line 29-31: The YAML comment references a non-existent field
core.cache_dir; update the documentation hints to match the actual config field
names defined in CoreConfig by replacing core.cache_dir with
core.cache_base_dir, and likewise replace any api.cache_dir hint with
api.cache_base_dir so the comments match the real configuration fields (refer to
CoreConfig and the api config section when making the changes).
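As an illustration, the corrected hints might read like this (a hedged sketch; the field names come from the review discussion, and the actual file layout may differ):

```yaml
core:
  cache_enabled: true        # referenced by the per-component override hints
  cache_base_dir: ".caches"  # note: the field is cache_base_dir, not cache_dir

api:
  # cache_enabled: false          # uncomment to override core.cache_enabled
  # cache_base_dir: "/var/caches" # uncomment to override core.cache_base_dir
```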
In `@src/lightspeed_evaluation/core/models/system.py`:
- Around line 936-938: The judge_panel + llm_pool branch currently
unconditionally overwrites legacy settings (self.llm.cache_enabled and
self.llm.cache_dir) with llm_pool.defaults, discarding any user-provided values;
change it to honor explicit legacy overrides the same way the non-panel branch
does: set self.llm.cache_dir = self.llm.cache_dir or llm_cache_path (or use the
existing llm_cache_path variable) and only override cache_enabled if
self.llm.cache_enabled is None (or not explicitly set), or alternatively emit a
clear warning via the logger when llm_pool.defaults replaces a non-default
self.llm.cache_dir; update the docstring for judge_panel/llm_pool behavior if
the override is intended.
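The precedence the comment asks for can be sketched with stand-in types (all names below are hypothetical stand-ins for illustration, not the actual SystemConfig fields):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LegacyLLM:
    """Stand-in for the legacy llm config (hypothetical)."""

    cache_enabled: Optional[bool] = None
    cache_dir: Optional[str] = None


def sync_legacy(llm: LegacyLLM, pool_enabled: bool, pool_dir: str) -> LegacyLLM:
    """Copy pool defaults into the legacy config without clobbering explicit values."""
    # explicit legacy cache_dir wins over the pool default
    llm.cache_dir = llm.cache_dir or pool_dir
    # only fill cache_enabled when the user left it unset
    if llm.cache_enabled is None:
        llm.cache_enabled = pool_enabled
    return llm


# an explicit legacy override survives the sync
kept = sync_legacy(LegacyLLM(cache_dir="/tmp/custom"), True, ".caches/llm")
assert kept.cache_dir == "/tmp/custom"

# unset fields fall back to the pool defaults
filled = sync_legacy(LegacyLLM(), True, ".caches/llm")
assert filled.cache_dir == ".caches/llm" and filled.cache_enabled is True
```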
- Around line 909-955: The global_cache_setup validator only sets
self.llm_pool.defaults.cache_dir when self.judge_panel is present, leaving
llm_pool.defaults.cache_dir None for valid SystemConfig(llm_pool=...) shapes and
causing resolve_llm_config()/get_llm_config() to raise; update
global_cache_setup so that whenever self.llm_pool exists you apply the cache
defaults (set self.llm_pool.defaults.cache_enabled using global_cache_enabled
and ensure self.llm_pool.defaults.cache_dir = self.llm_pool.defaults.cache_dir
or os.path.join(global_cache_base_dir, DEFAULT_LLM_CACHE_SUBDIR)), while
preserving the legacy sync to self.llm when judge_panel is present (keep the
existing branch that copies llm_pool defaults into self.llm for backward
compatibility).
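Under the suggested fix, the pool default resolution no longer depends on judge_panel presence; a minimal sketch (the constant value and default base dir are assumptions, not the real values from constants.py):

```python
import os
from typing import Optional

DEFAULT_LLM_CACHE_SUBDIR = "llm"  # assumed value; the real constant lives in constants.py


def resolve_pool_cache_dir(explicit_dir: Optional[str], base_dir: str = ".caches") -> str:
    """Fill llm_pool.defaults.cache_dir whether or not a judge_panel is configured."""
    return explicit_dir or os.path.join(base_dir, DEFAULT_LLM_CACHE_SUBDIR)


# an unset pool default falls back to the derived path
assert resolve_pool_cache_dir(None) == os.path.join(".caches", "llm")
# an explicit value is used as-is
assert resolve_pool_cache_dir("/var/caches/llm") == "/var/caches/llm"
```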
In `@src/lightspeed_evaluation/runner/evaluation.py`:
- Around line 30-42: The code is adding duplicate cache entries when
system_config.llm.cache_dir equals system_config.llm_pool.defaults.cache_dir;
update the logic around cache_dirs in the evaluation runner to de-duplicate by
resolved filesystem path (use pathlib.Path(...).resolve()) so the same directory
isn't added twice: when collecting entries from system_config.llm_pool
(pool.defaults.cache_dir) and system_config.llm.cache_dir, compare resolved
paths and only append if not already present (or build a set of resolved paths
and skip duplicates) while keeping the labels "LLM Judge (pool)" and "LLM Judge"
behavior intact; reference the cache_dirs list, system_config.llm_pool /
LLMPoolConfig, pool.defaults.cache_dir, and system_config.llm.cache_dir to
locate the change.
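The de-duplication by resolved path could look roughly like this (the labels mirror the review comment; `dedupe_cache_dirs` is a hypothetical helper, not the runner's actual code):

```python
from pathlib import Path


def dedupe_cache_dirs(entries: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep only (label, path) pairs whose resolved filesystem path is new."""
    seen: set[Path] = set()
    result: list[tuple[str, str]] = []
    for label, cache_dir in entries:
        resolved = Path(cache_dir).resolve()
        if resolved not in seen:
            seen.add(resolved)
            result.append((label, cache_dir))
    return result


dirs = dedupe_cache_dirs([
    ("LLM Judge (pool)", ".caches/llm"),
    ("LLM Judge", ".caches/llm"),  # same directory via the legacy field
    ("API", ".caches/api"),
])
assert [label for label, _ in dirs] == ["LLM Judge (pool)", "API"]
```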
---
Nitpick comments:
In `@src/lightspeed_evaluation/core/models/system.py`:
- Line 1: The file-level suppression "# pylint: disable=too-many-lines" must be
removed and the large module should be split by responsibility: move LLM-related
classes (LLMConfig, LLMPoolConfig, LLMProviderConfig, LLMParametersConfig,
LLMDefaultsConfig) into a new llm.py, move caching/core classes (CoreConfig)
into cache.py or core.py, move GEvalConfig and rubric-related code into
geval.py, and keep orchestration classes (SystemConfig) in system.py; update
imports where those classes are referenced, run tests/linters to ensure no
circular imports, and delete the top-level pylint disable comment once the file
is split and each new module is within size/lint limits.
In `@tests/unit/core/models/test_system.py`:
- Around line 568-710: Add a unit test that constructs a SystemConfig with only
llm_pool (LLMPoolConfig with a models entry like "gpt-4o-mini":
LLMProviderConfig(provider="openai")) and no judge_panel, then assert that
config.llm_pool is not None, config.llm_pool.defaults.cache_enabled is True and
config.llm_pool.defaults.cache_dir == ".caches/llm", and finally call
config.get_llm_config("gpt-4o-mini") (or resolve_llm_config equivalent) and
assert the resolved.cache_dir == ".caches/llm/gpt-4o-mini" to ensure
global_cache_setup populated llm_pool.defaults and that lookups do not raise.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a11cb9bd-36e9-4f64-a1d1-23021d139da2
📒 Files selected for processing (6)
- config/system.yaml
- src/lightspeed_evaluation/core/constants.py
- src/lightspeed_evaluation/core/models/system.py
- src/lightspeed_evaluation/runner/evaluation.py
- tests/unit/core/models/test_system.py
- tests/unit/runner/test_evaluation.py
💤 Files with no reviewable changes (1)
- tests/unit/runner/test_evaluation.py
Force-pushed from c0d35ce to a6b3df3
🧹 Nitpick comments (3)
tests/unit/runner/test_evaluation.py (1)
66-101: Helper still sets up an embedding cache directory that is never exercised.

`EmbeddingConfig` no longer has a `cache_dir` field, so `_clear_caches` cannot — and does not — touch `emb_dir` or `emb.db`. The helper still creates both (lines 75, 82), and the docstring claims it covers "pool, API, embedding." This makes the test assertion on line 511 (`emb_dir.is_dir()`) vacuous and the setup misleading.

Consider dropping `emb_dir`/`emb.db` and trimming the docstring, or document explicitly that embedding only contributes a `cache_enabled` toggle (no directory to clear).

♻️ Suggested cleanup

```diff
-    """Dirs + SystemConfig for _clear_caches covering pool, API, embedding.
+    """Dirs + SystemConfig for _clear_caches covering pool and API.
+
+    Embedding has no cache_dir field; its cache is shared with the LLM pool,
+    so only cache_enabled is exercised here.

     Note: When llm_pool is configured, llm.cache_dir is automatically synced
     to llm_pool.defaults.cache_dir by the SystemConfig validator.
     They share the same cache directory.
     """
     pool = tmp_path / "pool_llm"
     api_dir = tmp_path / "api_cache"
-    emb_dir = tmp_path / "emb_cache"
-    for d in (pool, api_dir, emb_dir):
+    for d in (pool, api_dir):
         d.mkdir()
     nested = pool / "nested"
     nested.mkdir()
     (nested / "pool.db").write_text("x")
     (api_dir / "api.db").write_text("x")
-    (emb_dir / "emb.db").write_text("x")
```

And matching update at lines 501-511 in `test_clear_caches_with_all_caches_enabled` to drop `emb_dir` from the directory list and assertion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/runner/test_evaluation.py` around lines 66 - 101, update the helper _system_config_all_caches_under_tmp to stop creating the unused emb_dir and emb.db and adjust its docstring to reflect that it only covers pool and API caches (EmbeddingConfig now only exposes cache_enabled, no cache_dir); remove the emb_dir and related write_text call, and then update the test test_clear_caches_with_all_caches_enabled to drop emb_dir from the list of directories checked/asserted (and any assertions referencing emb_dir or emb.db), keeping EmbeddingConfig(cache_enabled=True) in the returned SystemConfig so _clear_caches behavior remains tested for the embedding toggle.

config/system.yaml (1)
29-31: Clarify that `cache_dir` overrides are absolute paths, not appended to `core.cache_base_dir`.

The override comments hint that uncommenting overrides `core.cache_enabled`/`core.cache_base_dir`, but per `global_cache_setup()` the override `cache_dir` is used as-is (`self.api.cache_dir or api_cache_path`), not joined with the base dir. A user who customizes `core.cache_base_dir: /var/caches` and then uncomments `cache_dir: ".caches/api"` will end up with a path unrelated to the configured base. Consider phrasing it as "set an explicit cache directory (replaces the path derived from `core.cache_base_dir`)".

Also applies to: 96-98
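A possible rewording of the override comment (illustrative only; the keys mirror the review discussion, not the exact file content):

```yaml
api:
  # Optional: set an explicit cache directory. The value is used as-is and
  # REPLACES the path derived from core.cache_base_dir; it is not appended.
  # cache_dir: "/var/caches/api"
```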
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/system.yaml` around lines 29 - 31, update the comment for the LLM pool cache override so it clearly states that cache_dir is an explicit absolute path that replaces the path derived from core.cache_base_dir rather than being appended to it; reference global_cache_setup() behavior (self.api.cache_dir or api_cache_path) and mention core.cache_base_dir and cache_dir so readers know cache_dir overrides the base dir entirely. Apply the same wording change to the other occurrence around lines 96-98 so both comment blocks consistently explain that setting cache_dir replaces the derived path.

tests/unit/core/models/test_system.py (1)
568-723: Solid coverage of the new global cache propagation matrix.

Default + legacy override + pool-with-panel + pool-without-panel + global-disabled + per-component override are all exercised, mirroring each branch of `SystemConfig.global_cache_setup()`. Nice work.

One nit: the comments `# Test global_core_setup_propagates global core cache setup` (lines 579, 597, 625, 672, 712) reference a non-existent symbol — likely a leftover rename from `global_cache_setup`. Consider rewording to e.g. `# Verify global cache setup propagates to component defaults`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/core/models/test_system.py` around lines 568 - 723, The comments in TestSystemConfigGlobalCache refer to a non-existent symbol `global_core_setup_propagates` (leftover rename), so update those inline comments in the test methods (e.g., test_global_default_cache_setup_legacy_llm_support, test_global_cache_setup_legacy_llm_support_override, test_global_default_cache_setup_with_judge_panel, test_global_cache_turned_off_with_judge_panel, test_global_cache_setup_with_judge_panel_override) to a clearer phrase such as "Verify global cache setup propagates to component defaults" (or similar) so they no longer reference a nonexistent symbol and accurately describe the behavior of SystemConfig.global_cache_setup.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@config/system.yaml`:
- Around line 29-31: Update the comment for the LLM pool cache override so it
clearly states that cache_dir is an explicit absolute path that replaces the
path derived from core.cache_base_dir rather than being appended to it;
reference global_cache_setup() behavior (self.api.cache_dir or api_cache_path)
and mention core.cache_base_dir and cache_dir so readers know cache_dir
overrides the base dir entirely. Apply the same wording change to the other
occurrence around lines 96-98 so both comment blocks consistently explain that
setting cache_dir replaces the derived path.
In `@tests/unit/core/models/test_system.py`:
- Around line 568-723: The comments in TestSystemConfigGlobalCache refer to a
non-existent symbol `global_core_setup_propagates` (leftover rename), so update
those inline comments in the test methods (e.g.,
test_global_default_cache_setup_legacy_llm_support,
test_global_cache_setup_legacy_llm_support_override,
test_global_default_cache_setup_with_judge_panel,
test_global_cache_turned_off_with_judge_panel,
test_global_cache_setup_with_judge_panel_override) to a clearer phrase such as
"Verify global cache setup propagates to component defaults" (or similar) so
they no longer reference a nonexistent symbol and accurately describe the
behavior of SystemConfig.global_cache_setup.
In `@tests/unit/runner/test_evaluation.py`:
- Around line 66-101: Update the helper _system_config_all_caches_under_tmp to
stop creating the unused emb_dir and emb.db and adjust its docstring to reflect
that it only covers pool and API caches (EmbeddingConfig now only exposes
cache_enabled, no cache_dir); remove the emb_dir and related write_text call,
and then update the test test_clear_caches_with_all_caches_enabled to drop
emb_dir from the list of directories checked/asserted (and any assertions
referencing emb_dir or emb.db), keeping EmbeddingConfig(cache_enabled=True) in
the returned SystemConfig so _clear_caches behavior remains tested for the
embedding toggle.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 76c18a68-9e8b-4a53-8ae0-e4b09d324055
📒 Files selected for processing (6)
- config/system.yaml
- src/lightspeed_evaluation/core/constants.py
- src/lightspeed_evaluation/core/models/system.py
- src/lightspeed_evaluation/runner/evaluation.py
- tests/unit/core/models/test_system.py
- tests/unit/runner/test_evaluation.py
✅ Files skipped from review due to trivial changes (1)
- src/lightspeed_evaluation/core/models/system.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/lightspeed_evaluation/runner/evaluation.py
Force-pushed from a6b3df3 to 86ab226
asamal4 left a comment
Thank you !! LGTM. Few nits..
Also please document the new behavior: 1. deprecation warning, 2. embedding shares the LLM cache.
```diff
@@ -1,3 +1,4 @@
+# pylint: disable=too-many-lines
```
Hmm.. we will have to split the file/data class
```python
        },
    )
    panel = JudgePanelConfig(judges=["gpt-4o-mini", "gpt-4o"])
    embedding = EmbeddingConfig(cache_enabled=True)
```
We are adding common config for cache at system level, also for now LLM cache is used for embedding..
Why do we need a specific definition for embedding EmbeddingConfig(cache_enabled=True)
The embeddings.cache_enabled toggle allows/disallows caching of the embeddings into the LLM cache. Do you want me to remove this?
```python
suffix = cache_suffix if cache_suffix else model_id
if self.defaults.cache_dir is None:
    raise ConfigurationError(
        "cache_dir must be set by global_cache_setup() before resolving model configs"
    )
```
is this error message going to help the User ?
Probably not.. I will remove this
```python
# LEGACY support
# No llm_pool -> uses legacy llm config only
```
We should add a deprecation warning here
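A minimal sketch of such a warning (the function name and message are hypothetical, not the project's actual API):

```python
import warnings


def warn_legacy_llm_config() -> None:
    """Warn when only the legacy `llm` config is present (no `llm_pool`)."""
    warnings.warn(
        "Configuring `llm` without `llm_pool` is deprecated; "
        "migrate to `llm_pool` with per-model settings.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller's config-loading site
    )
```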
Description
- Adds a global cache option under `core` in the `system.yaml` config
- Removes `embeddings.cache_dir`, since embeddings and llm queries are cached together
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Bug Fixes
Tests