@AnkushMalaker AnkushMalaker commented Jan 12, 2026

Summary by CodeRabbit

  • New Features

    • Automated testing workflows with support for full tests with API keys and PR-triggered test runs based on labels.
    • Mock service providers enabling integration tests without external API dependencies.
  • Tests

    • New test execution paths that exclude API-dependent tests for faster local development cycles.
    • Improved test fixture management with automatic reuse of existing fixtures.
    • Test tagging system to identify and categorize API-dependent tests.


…xecution

- Introduced `MockLLMProvider` and `MockTranscriptionProvider` to facilitate testing without external API dependencies, allowing for consistent and controlled test environments.
- Created `run-no-api-tests.sh` script to execute tests that do not require API keys, ensuring separation of API-dependent and independent tests.
- Updated Robot Framework test configurations to utilize mock services, enhancing test reliability and reducing external dependencies.
- Modified existing test workflows to include new configurations and ensure proper handling of results for tests excluding API keys.
- Added `mock-services.yml` configuration to disable external API services while maintaining core functionality for testing purposes.
- Enhanced documentation to reflect the new tagging system for tests requiring API keys, improving clarity on test execution requirements.

coderabbitai bot commented Jan 12, 2026

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

This PR introduces testing infrastructure supporting both API-enabled and API-free testing modes. It adds two new GitHub Actions workflows for API-key-dependent tests, updates the existing workflow to clarify its "No API Keys" designation, implements mock provider classes for LLM and transcription services, creates a mock services configuration, and includes supporting scripts and metadata updates to enable parallel testing workflows.

Changes

• GitHub Actions Workflows
  Files: .github/workflows/full-tests-with-api.yml, .github/workflows/pr-tests-with-api.yml
  Two new workflows for API-enabled Robot Framework tests: one triggered on pushes to dev/main for specific paths, the other on the PR label "test-with-api-keys". Both verify the required secrets (DEEPGRAM_API_KEY, OPENAI_API_KEY) and the optional HF_TOKEN, set up Docker and Python environments, run the full Robot Framework suite with API keys injected, collect logs, upload artifacts, and publish HTML reports to GitHub Pages.
• No-API Workflow Update
  Files: .github/workflows/robot-tests.yml
  Workflow renamed to clarify its "No API Keys" mode. Secret verification removed, test config switched to mock-services.yml, run script renamed to run-no-api-tests.sh, all result paths updated from tests/results/ to tests/results-no-api/, and artifact names and PR comments updated to reflect the no-API test scope with requires-api-keys tag exclusion.
• Mock Provider Implementations
  Files: backends/advanced/src/advanced_omi_backend/services/memory/providers/mock_llm_provider.py, backends/advanced/src/advanced_omi_backend/services/transcription/mock_provider.py
  Two new mock provider classes for testing without external APIs. MockLLMProvider simulates LLM interactions with deterministic memory extraction, 384-dimension embeddings, and memory action proposals. MockTranscriptionProvider returns a deterministic in-memory transcript with optional word-level timestamps and speaker diarization segments. Both implement the required base class interfaces.
• Test Configuration & Execution
  Files: tests/configs/mock-services.yml, tests/run-no-api-tests.sh
  A new mock services configuration disables external API dependencies while keeping MongoDB, Redis, and Qdrant. A new bash script orchestrates Robot Framework execution without API keys: it validates the setup, manages the environment, starts containers, runs tests with the requires-api-keys tag excluded, collects logs on failure, parses results, and optionally cleans up.
• Test Infrastructure Updates
  Files: tests/setup/test_manager_keywords.robot, tests/integration/integration_test.robot, tests/tags.md
  Fixture reuse logic in the test keywords now checks for existing fixtures before creating new ones. The requires-api-keys tag is applied to the full pipeline integration test, and the tag documentation is extended to cover the new tag along with updated tag statistics and usage guidelines.
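To make the mock-provider idea concrete, here is a minimal sketch of a deterministic embedding mock. The class name and synchronous signature are illustrative only; the PR's actual MockLLMProvider is async and subclasses the backend's LLMProviderBase. The sketch seeds from hashlib rather than the built-in hash() so the same text maps to the same vector even across processes.

```python
import hashlib
import random
from typing import Any, Dict, List, Optional


class DeterministicEmbeddingMock:
    """Illustrative mock: the same input text always yields the same embedding."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self.embedding_dimension = 384  # matches the dimension stated in the PR

    def generate_embeddings(self, texts: List[str]) -> List[List[float]]:
        embeddings = []
        for text in texts:
            # Seed a *local* RNG from a stable hash of the text, so the
            # global random state is never touched and results are
            # reproducible across processes.
            seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
            rng = random.Random(seed)
            vec = [rng.gauss(0, 0.3) for _ in range(self.embedding_dimension)]
            magnitude = sum(x * x for x in vec) ** 0.5 or 1.0  # guard against zero
            embeddings.append([x / magnitude for x in vec])
        return embeddings
```

Unit-normalized, hash-seeded vectors let similarity queries against Qdrant behave consistently between test runs without any LLM call.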

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed

❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 78.57%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.
  • Title check (❓ Inconclusive): the title 'Feat/split tests' relates to the PR's main objective of splitting tests into API-dependent and API-independent workflows, but the phrasing is too generic to convey the nature of the split. Resolution: consider a more specific title such as 'Split tests into API-dependent and API-independent workflows' or 'Add mock providers and separate test configurations for API-independent testing'.

✅ Passed checks (1 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.



Comment @coderabbitai help to get the list of available commands and usage tips.

@AnkushMalaker AnkushMalaker changed the base branch from dev to fix/audio-pipe-mongo January 12, 2026 08:09
@AnkushMalaker

@coderabbitai review


coderabbitai bot commented Jan 12, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/tags.md (1)

210-215: Inconsistent tag count reference.

Line 212 still references "11 tags" but should be "12 tags" to match the updated count elsewhere in the document.

📝 Proposed fix
 ### Before Adding a New Tag
 
 **STOP!** Ask yourself:
-1. Can I use one of the existing 11 tags?
+1. Can I use one of the existing 12 tags?
 2. Is this tag really necessary for test organization?
 3. Have I checked with the team?
🤖 Fix all issues with AI agents
In @.github/workflows/pr-tests-with-api.yml:
- Around line 239-247: The "Upload logs on failure" step currently uploads
backends/advanced/.env and tests/setup/.env.test which can leak secrets; update
this step to avoid uploading raw .env files by generating sanitized copies
(e.g., create temporary sanitized files like backends/advanced/.env.sanitized
and tests/setup/.env.test.sanitized with secrets redacted) or only upload
specific non-sensitive log files, then change the step's paths to those
sanitized filenames (referencing the "Upload logs on failure" step and the two
.env paths) so secrets are not exposed in the uploaded artifacts.
🧹 Nitpick comments (5)
backends/advanced/src/advanced_omi_backend/services/memory/providers/mock_llm_provider.py (3)

72-87: Global random.seed() may cause side effects in concurrent or parallel tests.

Setting random.seed(seed) modifies the global random state, which could affect other code using the random module if tests run in parallel or if other code relies on random behavior within the same process.

Consider using a local random.Random instance instead:

♻️ Proposed fix using local Random instance
+import random as random_module
+
 async def generate_embeddings(self, texts: List[str]) -> List[List[float]]:
     ...
     embeddings = []

     for text in texts:
         # Generate deterministic embeddings based on text hash
-        # This ensures same text always gets same embedding
         seed = hash(text) % (2**32)
-        random.seed(seed)
+        rng = random_module.Random(seed)

         # Generate random normalized vector
-        embedding = [random.gauss(0, 0.3) for _ in range(self.embedding_dimension)]
+        embedding = [rng.gauss(0, 0.3) for _ in range(self.embedding_dimension)]

         # Normalize to unit length (standard for embeddings)
         magnitude = sum(x**2 for x in embedding) ** 0.5
+        if magnitude == 0:
+            magnitude = 1.0  # Avoid division by zero
         normalized_embedding = [x / magnitude for x in embedding]

         embeddings.append(normalized_embedding)

     return embeddings
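The difference this fix targets can be shown in isolation. The sketch below uses invented function names for the demo: reseeding the module-level RNG leaks determinism into unrelated code, while a local random.Random instance leaves the global stream intact. Note also that Python's built-in hash() on strings is salted per process unless PYTHONHASHSEED is pinned, so hash-based seeds are only stable within one process; hashlib (as above) gives cross-process stability.

```python
import random


def embed_global(text: str, dim: int = 4) -> list:
    # Anti-pattern: mutates the module-wide RNG that everyone else shares.
    random.seed(hash(text) % (2**32))
    return [random.gauss(0, 0.3) for _ in range(dim)]


def embed_local(text: str, dim: int = 4) -> list:
    # Preferred: an isolated generator with its own state.
    rng = random.Random(hash(text) % (2**32))
    return [rng.gauss(0, 0.3) for _ in range(dim)]


# The local variant is deterministic per text yet leaves the global
# random stream untouched:
random.seed(123)
expected = random.random()
random.seed(123)
embed_local("some text")          # does not disturb the global stream
assert random.random() == expected
```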

89-94: Type signature differs from base class.

The retrieved_old_memory parameter type List[Dict[str, str]] | List[str] is broader than the base class signature List[Dict[str, str]]. While this works for a mock (since the argument is ignored), it could mask issues if tests pass List[str] but the real provider only accepts List[Dict[str, str]].

Consider keeping the signature consistent with the base class for type safety.


22-29: Minor: Use explicit Optional type hint.

Per PEP 484, config: Dict[str, Any] = None should use explicit optional typing. The unused config parameter is fine for a mock implementing an interface.

-    def __init__(self, config: Dict[str, Any] = None):
+    def __init__(self, config: Optional[Dict[str, Any]] = None):
tests/run-no-api-tests.sh (1)

54-57: Edge case in path conversion when CONFIG_FILE has no directory component.

If CONFIG_FILE is set to just a filename (e.g., mock-services.yml), dirname returns ., which works but relies on implicit behavior. Consider adding a guard or using realpath for robustness.

♻️ Suggested improvement using realpath
-# Convert CONFIG_FILE to absolute path
-if [[ ! "$CONFIG_FILE" = /* ]]; then
-    CONFIG_FILE="$(cd "$(dirname "$CONFIG_FILE")" && pwd)/$(basename "$CONFIG_FILE")"
-fi
+# Convert CONFIG_FILE to absolute path
+if [[ ! "$CONFIG_FILE" = /* ]]; then
+    CONFIG_FILE="$(realpath "$CONFIG_FILE")"
+fi
backends/advanced/src/advanced_omi_backend/services/transcription/mock_provider.py (1)

41-42: Remove or use the unused audio_duration variable.

The variable audio_duration is calculated but never used. Either remove it or use it to make mock responses more realistic (e.g., scaling timestamps based on actual audio length).

♻️ Option 1: Remove unused variable
-        # Calculate audio duration from bytes (assuming 16-bit PCM)
-        audio_duration = len(audio_data) / (sample_rate * 2)  # 2 bytes per sample
-
         # Return a mock transcript with word-level timestamps
♻️ Option 2: Use for dynamic mock response
         # Calculate audio duration from bytes (assuming 16-bit PCM)
         audio_duration = len(audio_data) / (sample_rate * 2)  # 2 bytes per sample
+        
+        # Scale mock timestamps to match audio duration (for more realistic testing)
+        scale = max(1.0, audio_duration / 2.6)  # 2.6s is the mock transcript duration
 
         # Return a mock transcript with word-level timestamps
-        # This simulates a real transcription result
         mock_transcript = "This is a mock transcription for testing purposes."
 
         # Generate mock words with timestamps
         words = [
-            {"word": "This", "start": 0.0, "end": 0.3, "confidence": 0.99, "speaker": 0},
+            {"word": "This", "start": 0.0 * scale, "end": 0.3 * scale, "confidence": 0.99, "speaker": 0},
             # ... scale other timestamps similarly
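Option 2's scaling amounts to the following self-contained sketch (the word list and the 2.6 s base duration are illustrative, assuming 16-bit mono PCM input):

```python
from typing import Dict, List


def scaled_mock_words(audio_data: bytes, sample_rate: int = 16000) -> List[Dict]:
    """Stretch canned word timestamps to cover the actual audio length."""
    bytes_per_sample = 2  # 16-bit PCM, mono
    audio_duration = len(audio_data) / (sample_rate * bytes_per_sample)

    # Canned transcript spans 2.6 s; stretch (never shrink) to fit.
    base = [
        ("This", 0.0, 0.3), ("is", 0.3, 0.5), ("a", 0.5, 0.6),
        ("mock", 0.6, 1.0), ("transcription", 1.0, 1.8), ("for", 1.8, 2.0),
        ("testing", 2.0, 2.4), ("purposes.", 2.4, 2.6),
    ]
    scale = max(1.0, audio_duration / 2.6)
    return [
        {"word": w, "start": s * scale, "end": e * scale,
         "confidence": 0.99, "speaker": 0}
        for (w, s, e) in base
    ]
```

A one-second clip keeps the original timings, while a ten-second clip stretches the last word's end to 10 s, which makes duration-sensitive assertions in the Robot suites more realistic.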
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between d143fe7 and 126df5b.

📒 Files selected for processing (10)
  • .github/workflows/full-tests-with-api.yml
  • .github/workflows/pr-tests-with-api.yml
  • .github/workflows/robot-tests.yml
  • backends/advanced/src/advanced_omi_backend/services/memory/providers/mock_llm_provider.py
  • backends/advanced/src/advanced_omi_backend/services/transcription/mock_provider.py
  • tests/configs/mock-services.yml
  • tests/integration/integration_test.robot
  • tests/run-no-api-tests.sh
  • tests/setup/test_manager_keywords.robot
  • tests/tags.md
🧰 Additional context used
🧬 Code graph analysis (2)
backends/advanced/src/advanced_omi_backend/services/memory/providers/mock_llm_provider.py (1)
backends/advanced/src/advanced_omi_backend/services/memory/base.py (1)
  • LLMProviderBase (271-337)
backends/advanced/src/advanced_omi_backend/services/transcription/mock_provider.py (3)
backends/advanced/src/advanced_omi_backend/services/transcription/base.py (1)
  • BatchTranscriptionProvider (117-133)
extras/speaker-recognition/src/simple_speaker_recognition/core/audio_backend.py (1)
  • diarize (53-107)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
  • segments (213-215)
🪛 Ruff (0.14.10)
backends/advanced/src/advanced_omi_backend/services/memory/providers/mock_llm_provider.py

22-22: Unused method argument: config

(ARG002)


22-22: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


31-31: Unused method argument: prompt

(ARG002)


91-91: Unused method argument: retrieved_old_memory

(ARG002)


93-93: Unused method argument: custom_prompt

(ARG002)

backends/advanced/src/advanced_omi_backend/services/transcription/mock_provider.py

42-42: Local variable audio_duration is assigned to but never used

Remove assignment to unused variable audio_duration

(F841)


76-76: Unused method argument: client_id

(ARG002)

🔇 Additional comments (26)
tests/setup/test_manager_keywords.robot (1)

125-146: LGTM! Fixture reuse logic is well-implemented.

The pre-check to reuse existing fixtures is a good optimization that avoids redundant audio uploads. The sanity validations (checking transcript exists and is non-empty) before reusing the fixture are appropriate defensive checks.

One minor observation: if multiple fixtures exist, this always picks the first one (${fixtures}[0]). This is likely fine since fixtures should be equivalent, but worth noting if fixture ordering ever matters.

tests/integration/integration_test.robot (1)

26-26: LGTM! Appropriate tagging for API-dependent test.

Adding requires-api-keys tag correctly identifies this test as requiring external API services (Deepgram/OpenAI), enabling proper filtering in the no-API test workflow.

tests/tags.md (1)

91-98: Well-documented special tag section.

The documentation clearly explains the purpose, scope, and requirements for the requires-api-keys tag, including the specific environment variables needed and when these tests are executed in CI.

tests/configs/mock-services.yml (3)

28-39: LGTM! Qdrant configuration is well-structured.

The local Qdrant configuration uses appropriate defaults and environment variable substitution for flexibility across different test environments.


46-52: Excellent inline documentation.

The notes clearly delineate the intended test scope and direct users to appropriate alternative configurations when transcription or memory extraction is needed.


21-26: Chronicle memory provider is properly implemented and configured.

The configuration correctly references provider: chronicle which is the default memory provider implementation in the codebase (backends/advanced/src/advanced_omi_backend/services/memory/providers/chronicle.py). No issues found.

tests/run-no-api-tests.sh (5)

88-90: Variable expansion occurs at file creation time.

${COMPOSE_PROJECT_NAME:-advanced-backend-test} is expanded when the heredoc is written, meaning the value is baked into the file. If this is intentional (ensuring a consistent project name), the code is fine. If the intent is to allow runtime override via environment, escape the variable with \$.


106-116: LGTM!

Good pattern for capturing the test exit code while allowing subsequent log collection and cleanup steps to run. The timeout protection (30m) prevents runaway test execution.


131-163: LGTM!

The result parsing correctly uses os.getenv('OUTPUTDIR', 'results-no-api') to handle both explicit and default output directories. The XML parsing logic for Robot Framework output is well-structured.
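For context, the kind of parsing being reviewed usually boils down to reading the statistics block of Robot Framework's output.xml. A sketch (not the PR's actual script; it assumes the RF 4+ layout, where totals live in a single <statistics><total><stat> element):

```python
import os
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def parse_robot_results(output_dir: Optional[str] = None) -> Dict[str, int]:
    """Return pass/fail/skip totals from a Robot Framework output.xml."""
    output_dir = output_dir or os.getenv("OUTPUTDIR", "results-no-api")
    tree = ET.parse(os.path.join(output_dir, "output.xml"))
    # RF >= 4 writes: <statistics><total><stat pass=".." fail=".." skip="..">
    stat = tree.getroot().find("./statistics/total/stat")
    if stat is None:
        raise ValueError("output.xml has no <statistics><total><stat> element")
    passed = int(stat.get("pass", "0"))
    failed = int(stat.get("fail", "0"))
    skipped = int(stat.get("skip", "0"))
    return {"passed": passed, "failed": failed, "skipped": skipped,
            "total": passed + failed + skipped}
```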


165-180: LGTM!

Cleanup and exit code propagation are correctly implemented. The -v flag for volume removal ensures a clean slate for subsequent test runs.


32-36: The directory check is correct and appropriately matches the actual directory structure. Both Makefile and endpoints/ directory exist in the tests/ directory, confirming this validation is necessary and accurate for ensuring the script runs from the correct location.

Likely an incorrect or invalid review comment.

.github/workflows/full-tests-with-api.yml (4)

1-21: LGTM!

The workflow configuration is well-structured with appropriate triggers (dev/main branches, specific paths) and required permissions for artifact uploads and GitHub Pages deployment.


31-54: LGTM!

Secret verification correctly validates required secrets without exposing their values. The optional HF_TOKEN with a warning is appropriate since speaker recognition is a non-critical feature.


104-117: LGTM!

The deferred failure pattern (capture exit code, always exit 0, fail at end) allows artifact uploads and log collection even when tests fail. This is a good CI practice.


247-255: LGTM!

The deferred failure step correctly propagates the test exit code after all artifact uploads complete.

backends/advanced/src/advanced_omi_backend/services/transcription/mock_provider.py (3)

1-10: LGTM!

Clean module structure with appropriate imports. The docstring clearly explains the purpose of the mock provider.


70-74: LGTM!

The return structure correctly handles the diarize parameter, returning segments only when diarization is requested. This matches the expected contract from real transcription providers.


76-82: LGTM!

The connect and disconnect methods correctly implement the base class interface. The unused client_id parameter is acceptable as it maintains API compatibility with the base class.

.github/workflows/pr-tests-with-api.yml (3)

3-6: Consider the behavior of synchronize trigger with label check.

The workflow triggers on synchronize (any push to PR), but only runs if the label is present. This is correct, but be aware that every push to a labeled PR will trigger API-key tests, which may consume API quota. Consider documenting this behavior or adding a manual trigger option.


183-229: LGTM!

The PR comment generation provides excellent developer experience with clear status, metrics table, and direct links to reports. The formatting is clean and informative.


286-294: LGTM!

Deferred failure pattern correctly implemented, consistent with other workflows.

.github/workflows/robot-tests.yml (5)

1-1: LGTM!

Clear naming convention distinguishes this workflow from the API-enabled variants.


56-63: LGTM!

The configuration correctly uses mock-services.yml and provides clear messaging about disabled external API dependencies.


77-86: LGTM!

Correctly invokes the no-API test script with the established deferred failure pattern.


173-178: LGTM!

The PR comment clearly communicates that API-dependent tests are excluded and directs users to the appropriate branches for full test coverage.


102-124: LGTM!

All result paths and artifact names are consistently updated to reflect the no-API variant. The changes are coherent across result checking, HTML report upload, and pages artifact upload.

Comment on lines +197 to +205
- name: Upload logs on failure
if: failure()
uses: actions/upload-artifact@v4
with:
name: robot-test-logs-full
path: |
backends/advanced/.env
tests/setup/.env.test
retention-days: 7

⚠️ Potential issue | 🟠 Major

Potential secret exposure in uploaded logs.

Uploading .env files as artifacts on failure may expose API keys or other secrets. These files are accessible to anyone with repository read access on public repositories.

Consider either:

  1. Sanitizing the files before upload (redacting sensitive values)
  2. Uploading only non-sensitive configuration files
  3. Using if: failure() && github.event_name != 'pull_request' to restrict to non-PR workflows
🔒 Suggested fix to sanitize or exclude sensitive files
     - name: Upload logs on failure
       if: failure()
       uses: actions/upload-artifact@v4
       with:
         name: robot-test-logs-full
         path: |
-          backends/advanced/.env
-          tests/setup/.env.test
+          backends/advanced/docker-compose-test.yml
+          tests/setup/.env.test.example
         retention-days: 7

Or sanitize before upload:

    - name: Sanitize and upload logs on failure
      if: failure()
      run: |
        # Create sanitized copies
        sed 's/\(API_KEY=\).*/\1[REDACTED]/' backends/advanced/.env > backends/advanced/.env.sanitized || true
        sed 's/\(PASSWORD=\).*/\1[REDACTED]/' tests/setup/.env.test > tests/setup/.env.test.sanitized || true
    
    - name: Upload sanitized logs on failure
      if: failure()
      uses: actions/upload-artifact@v4
      with:
        name: robot-test-logs-full
        path: |
          backends/advanced/.env.sanitized
          tests/setup/.env.test.sanitized
        retention-days: 7

Comment on lines +239 to +247
- name: Upload logs on failure
if: failure()
uses: actions/upload-artifact@v4
with:
name: robot-test-logs-pr-labeled
path: |
backends/advanced/.env
tests/setup/.env.test
retention-days: 7

⚠️ Potential issue | 🟠 Major

Same secret exposure concern as in full-tests-with-api.yml.

Uploading .env files on failure may expose secrets. Apply the same sanitization fix recommended for the other workflow.


- Updated CLAUDE.md to clarify test execution modes, emphasizing the separation of tests requiring API keys from those that do not.
- Expanded the testing guidelines in TESTING_GUIDELINES.md to detail the organization of tests based on API dependencies, including tagging conventions and execution paths.
- Improved mock-services.yml to include dummy configurations for LLM and embedding services, ensuring tests can run without actual API calls.
- Added comprehensive documentation on GitHub workflows for different test scenarios, enhancing clarity for contributors and maintainers.
@github-actions

⚠️ Robot Framework Test Results (Label-Triggered Full Suite)

Status: ❌ Some tests failed

🏷️ Note: This run was triggered by the test-with-api-keys label.
All tests including API-dependent tests have been executed.

Metric | Count
✅ Passed | 112
❌ Failed | 3
📊 Total | 115

📊 View Reports

GitHub Pages (Live Reports):

Download Artifacts:


View full workflow run

- Modified `plugins.yml.template` to implement event subscriptions for the Home Assistant plugin, enhancing its event-driven capabilities.
- Revised `README.md` to clarify test execution processes, emphasizing the distinction between tests requiring API keys and those that do not.
- Updated `mock-services.yml` to streamline mock configurations, ensuring compatibility with the new testing workflows.
- Added `requires-api-keys` tags to relevant test cases across various test files, improving organization and clarity regarding API dependencies.
- Enhanced documentation for test scripts and configurations, providing clearer guidance for contributors on executing tests based on API key requirements.
@github-actions

🎉 Robot Framework Test Results (Label-Triggered Full Suite)

Status: ✅ All tests passed!

🏷️ Note: This run was triggered by the test-with-api-keys label.
All tests including API-dependent tests have been executed.

Metric | Count
✅ Passed | 115
❌ Failed | 0
📊 Total | 115

📊 View Reports

GitHub Pages (Live Reports):

Download Artifacts:


View full workflow run

@github-actions

⚠️ Robot Framework Test Results (Label-Triggered Full Suite)

Status: ❌ Some tests failed

🏷️ Note: This run was triggered by the test-with-api-keys label.
All tests including API-dependent tests have been executed.

Metric | Count
✅ Passed | 114
❌ Failed | 1
📊 Total | 115

📊 View Reports

GitHub Pages (Live Reports):

Download Artifacts:


View full workflow run

@github-actions

⚠️ Robot Framework Test Results (No API Keys)

Status: ❌ Some tests failed

ℹ️ Note: This run excludes tests requiring external API keys (Deepgram, OpenAI).
Tests tagged with requires-api-keys will run on dev/main branches.

Metric | Count
✅ Passed | 79
❌ Failed | 15
📊 Total | 94

📊 View Reports

GitHub Pages (Live Reports):

Download Artifacts:


View full workflow run
