Skip to content

Conversation

hazai
Copy link
Collaborator

@hazai hazai commented Sep 18, 2025

Description

This PR contains a new nemoguardrails/cache folder with an LFU cache implementation (and interface).
The cache if integrated into the content_safety model.
It supports:

  • configuration
  • stats tracking
  • logging
  • persistence
  • thread-safety
  • (a very minimal) normalization of cache key

@Pouyanpi @tgasser-nv

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

@hazai hazai changed the title Feature/dynamic caching cs Dynamic Caching for Content-Safety Sep 18, 2025
@hazai hazai requested review from tgasser-nv, trebedea and Pouyanpi and removed request for tgasser-nv September 18, 2025 08:41

def setUp(self):
"""Set up test fixtures."""
self.test_file = tempfile.mktemp()

Check failure

Code scanning / CodeQL

Insecure temporary file High test

Call to deprecated function tempfile.mktemp may be insecure.

def setUp(self):
"""Set up test fixtures."""
self.test_file = tempfile.mktemp()

Check failure

Code scanning / CodeQL

Insecure temporary file High test

Call to deprecated function tempfile.mktemp may be insecure.
Copy link
Contributor

Documentation preview

https://nvidia.github.io/NeMo-Guardrails/review/pr-1404

@hazai hazai removed the request for review from trebedea September 18, 2025 08:42
@hazai hazai self-assigned this Sep 18, 2025
@codecov-commenter
Copy link

codecov-commenter commented Sep 18, 2025

Codecov Report

❌ Patch coverage is 76.37475% with 116 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.81%. Comparing base (d2450e2) to head (8195a06).
⚠️ Report is 4 commits behind head on develop.

Files with missing lines Patch % Lines
nemoguardrails/cache/lfu.py 78.46% 70 Missing ⚠️
nemoguardrails/cache/interface.py 51.06% 23 Missing ⚠️
nemoguardrails/library/content_safety/actions.py 42.85% 16 Missing ⚠️
nemoguardrails/rails/llm/llmrails.py 81.48% 5 Missing ⚠️
nemoguardrails/library/content_safety/manager.py 95.34% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1404      +/-   ##
===========================================
+ Coverage    71.64%   71.81%   +0.16%     
===========================================
  Files          171      175       +4     
  Lines        17011    17509     +498     
===========================================
+ Hits         12188    12574     +386     
- Misses        4823     4935     +112     
Flag Coverage Δ
python 71.81% <76.37%> (+0.16%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
nemoguardrails/cache/__init__.py 100.00% <100.00%> (ø)
nemoguardrails/rails/llm/config.py 91.02% <100.00%> (+0.27%) ⬆️
nemoguardrails/library/content_safety/manager.py 95.34% <95.34%> (ø)
nemoguardrails/rails/llm/llmrails.py 90.13% <81.48%> (-0.33%) ⬇️
nemoguardrails/library/content_safety/actions.py 78.49% <42.85%> (-15.54%) ⬇️
nemoguardrails/cache/interface.py 51.06% <51.06%> (ø)
nemoguardrails/cache/lfu.py 78.46% <78.46%> (ø)

... and 3 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Pouyanpi Pouyanpi requested a review from Copilot September 22, 2025 10:55
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a comprehensive dynamic caching system for content safety models to reduce redundant LLM calls and improve performance.

Key changes include:

  • New LFU (Least Frequently Used) cache implementation with thread safety, persistence, and statistics tracking
  • Integration of cache system into content safety manager for automatic cache management per model
  • Configuration support for cache capacity, persistence intervals, and statistics logging

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/test_cache_lfu.py Comprehensive test suite covering LFU cache functionality, persistence, threading, and statistics
nemoguardrails/rails/llm/llmrails.py Integration of content safety manager and cache cleanup lifecycle management
nemoguardrails/rails/llm/config.py Configuration classes for cache persistence, statistics, and model cache settings
nemoguardrails/library/content_safety/manager.py Content safety manager that creates and manages per-model caches
nemoguardrails/library/content_safety/actions.py Cache integration into content safety check functions
nemoguardrails/cache/lfu.py Core LFU cache implementation with persistence and thread safety
nemoguardrails/cache/interface.py Abstract cache interface defining required methods
nemoguardrails/cache/__init__.py Cache module initialization
nemoguardrails/cache/README.md Comprehensive documentation for cache system usage
examples/configs/content_safety/config.yml Example configuration with cache settings
examples/configs/content_safety/README.md Updated documentation with cache configuration examples

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +520 to +528
# Create cache without persistence
cache = LFUCache(5)

cache.put("key1", "value1")
cache.persist_now() # Should do nothing

# No file should be created
self.assertFalse(os.path.exists("lfu_cache.json"))

Copy link
Preview

Copilot AI Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test checks for a hardcoded file path 'lfu_cache.json' but the cache was created without specifying a persistence path. The test should check that no file is created at any path, or verify the default path behavior more explicitly.

Suggested change
# Create cache without persistence
cache = LFUCache(5)
cache.put("key1", "value1")
cache.persist_now() # Should do nothing
# No file should be created
self.assertFalse(os.path.exists("lfu_cache.json"))
# Run test in a temporary directory to check for file creation
with tempfile.TemporaryDirectory() as tmpdir:
cwd = os.getcwd()
try:
os.chdir(tmpdir)
cache = LFUCache(5)
cache.put("key1", "value1")
cache.persist_now() # Should do nothing
# No file should be created in the temp directory
self.assertEqual(os.listdir(tmpdir), [])
finally:
os.chdir(cwd)

Copilot uses AI. Check for mistakes.

Comment on lines +519 to +528
content_safety_config = self.config.rails.config.content_safety
self._content_safety_manager = ContentSafetyManager(content_safety_config)
self.runtime.register_action_param(
"content_safety_manager", self._content_safety_manager
)

log.info(
"Initialized ContentSafetyManager with cache %s",
"enabled" if content_safety_config.cache.enabled else "disabled",
)
Copy link
Preview

Copilot AI Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line accesses self.config.rails.config.content_safety without checking if these nested attributes exist. If any intermediate attribute is None or missing, this will raise an AttributeError.

Suggested change
content_safety_config = self.config.rails.config.content_safety
self._content_safety_manager = ContentSafetyManager(content_safety_config)
self.runtime.register_action_param(
"content_safety_manager", self._content_safety_manager
)
log.info(
"Initialized ContentSafetyManager with cache %s",
"enabled" if content_safety_config.cache.enabled else "disabled",
)
rails = getattr(self.config, "rails", None)
rails_config = getattr(rails, "config", None)
content_safety_config = getattr(rails_config, "content_safety", None)
if content_safety_config is not None:
self._content_safety_manager = ContentSafetyManager(content_safety_config)
self.runtime.register_action_param(
"content_safety_manager", self._content_safety_manager
)
log.info(
"Initialized ContentSafetyManager with cache %s",
"enabled" if getattr(getattr(content_safety_config, "cache", None), "enabled", False) else "disabled",
)

Copilot uses AI. Check for mistakes.


# Store in cache if available
if cache_key:
assert content_safety_manager is not None and model_name is not None
Copy link
Preview

Copilot AI Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using assert statements for runtime validation in production code is not recommended. These assertions can be disabled with the -O flag and should be replaced with proper error handling or if-checks.

Suggested change
assert content_safety_manager is not None and model_name is not None
if content_safety_manager is None or model_name is None:
raise RuntimeError("content_safety_manager and model_name must not be None when caching content safety results.")

Copilot uses AI. Check for mistakes.

Comment on lines +367 to +368
# Silently fail on persistence errors to not disrupt cache operations
pass
Copy link
Preview

Copilot AI Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception is caught but completely ignored without any logging. Consider logging the error at debug or warning level to help with debugging persistence issues while still maintaining the silent failure behavior.

Suggested change
# Silently fail on persistence errors to not disrupt cache operations
pass
# Silently fail on persistence errors to not disrupt cache operations, but log the error for debugging
log.warning("Failed to persist LFU cache to disk at %s: %s", self.persistence_path, e)

Copilot uses AI. Check for mistakes.

self.freq_map[node.freq].append(node)

except Exception as e:
# If loading fails, start with empty cache
Copy link
Preview

Copilot AI Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the persistence error handling, this exception is caught but the error details are lost. Consider logging the exception to help diagnose cache loading issues.

Suggested change
# If loading fails, start with empty cache
# If loading fails, log the error and start with empty cache
log.exception("Failed to load LFU cache from disk at %s. Starting with empty cache.", self.persistence_path)

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants