Skip to content

chore: Move and run Python SDK integration tests#1164

Merged
edwinjosechittilappilly merged 53 commits intorelease-0.4.0from
integration-tests-sdk
Mar 18, 2026
Merged

chore: Move and run Python SDK integration tests#1164
edwinjosechittilappilly merged 53 commits intorelease-0.4.0from
integration-tests-sdk

Conversation

@edwinjosechittilappilly
Copy link
Collaborator

@edwinjosechittilappilly edwinjosechittilappilly commented Mar 17, 2026

This pull request introduces improvements to integration and SDK testing workflows, enhances test output clarity, and adds a comprehensive QA checklist for the Python SDK. The changes also refine error handling for conversation deletion and streamline onboarding logic for SDK tests. The most significant updates are grouped below.

Test Workflow & Output Enhancements

  • Integration and SDK tests are now grouped and visually separated in the Makefile using GitHub Actions-style output (::group::), making CI logs easier to navigate. [1] [2] [3] [4] [5]
  • SDK integration tests now run from tests/integration/sdk/ instead of a single test file, and use a dedicated install step for the Python SDK. [1] [2] [3]

SDK Integration Test Infrastructure

  • Added tests/integration/sdk/conftest.py to provide shared fixtures, API key management, onboarding, and test file creation for SDK tests.
  • Added a detailed QA checklist in tests/integration/sdk/README.md covering 51 tests across 8 domains for the Python SDK.
  • Introduced test_auth.py for authentication and API key behavior tests in the SDK integration suite.
  • Modified backend onboarding logic to skip in-process setup when SDK_TESTS_ONLY=true, preventing unwanted state resets during SDK integration tests.

Error Handling Improvements

  • Improved conversation deletion logic in the backend (chat_service.py and chat.py) to return False if the conversation does not exist, and to respond with HTTP 404 when appropriate. [1] [2] [3]

Test Scoping & Directory Structure

  • Integration tests for the core system are now scoped to tests/integration/core/ for both CI and local runs, separating them from SDK tests. [1] [2] [3]

These changes collectively improve test clarity, reliability, and coverage for both core and SDK integration workflows.

edwinjosechittilappilly and others added 30 commits March 9, 2026 10:47
Introduce a full Amazon S3 / S3-compatible connector and integrate it into the app. Backend: add new S3 connector implementation and auth helpers (src/connectors/aws_s3), register API routes for defaults/configure/list/bucket-status, wire S3 into connection manager and connector registry, and include AWS S3 in sync_all. Frontend: add S3 settings form/dialog, React Query hooks (defaults, configure, bucket status), connect S3 UI into connector cards, cloud picker, and upload flow with an S3 bucket view and direct-sync support. CLI/TUI: add S3-related env fields and config prompts. Misc: small UI icon usages and query invalidation added to keep state in sync.
Add env-var fallback getters for S3 credentials and clearer errors: implement get_client_id and get_client_secret to read from config or AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY and raise ValueError when missing. Frontend fixes: treat a connected connector as available in the connector card UI, and surface S3 bucket loading errors on the upload page by including the query error in the response and rendering a descriptive error message when bucket fetch fails.
Connector card: use isConnected alone for active state, treat a connector as configured when isConnected or connector.available, and show a "Configure" action (with Settings2 icon) when onConfigure is provided; also keep existing loading state.

S3 settings dialog: import useEffect and add an effect to sync buckets and selectedBuckets when defaults.bucket_names load asynchronously so defaults are applied after dialog mount.

AWS logo: replace fill="currentColor" with an explicit color (#232F3E) for consistent rendering.
…rough an exception

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…rough an exception

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…rough an exception

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…rough an exception

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Clean up IBM COS bucket listing and error handling: consolidate bucket enumeration to use the COS resource API, move logger.exception into the except blocks, and return consistent error responses. This removes unreachable/duplicated code paths and prevents leaking exception details in one handler. Also add /opensearch-data-new-lf to .gitignore.
Expose additional IBM COS and S3 env vars in docker-compose and tidy up connector code.

- docker-compose: added AWS_S3_ENDPOINT, AWS_REGION and multiple IBM_COS_* env vars for configuring IBM COS and custom S3 endpoints.
- api/connectors: import adjusted to use create_ibm_cos_resource only.
- aws_s3.auth: shortened debug messages to avoid logging endpoint/region details.
- aws_s3.connector & ibm_cos.connector: switched to printf-style logging (avoid f-strings) and normalized error/warning messages for listing objects and ACL fallbacks.
- ibm_cos.connector: prefer values from connector config (api_key, hmac_access_key, service_instance_id, hmac_secret_key) before falling back to environment variables.

These changes improve configuration flexibility, reduce accidental logging of potentially sensitive details, and standardize connector logging.
Move S3 and IBM COS-specific FastAPI route handlers out of the large api/connectors.py into dedicated modules under connectors/aws_s3 and connectors/ibm_cos. Add Pydantic models (models.py) and pure helper logic (support.py) for credential resolution and config construction, and factor common credential-testing logic into these new modules. Update package __init__.py exports to include the new models and API functions, and adjust main.py to register the new route handlers directly. This modularizes connector code, keeps route handlers thin, and centralizes validation/config-building for easier maintenance.
Replace returning raw exception details to clients with generic error messages and add logger.exception calls to capture stack traces. Changes affect aws_s3 (s3_configure, s3_list_buckets) and ibm_cos (ibm_cos_list_buckets, ibm_cos_bucket_status) handlers: removed f-string exception exposures, standardized response bodies, and added contextual logging (including connection_id) to aid server-side debugging without leaking internals to API consumers.
Introduce an optional ibm_cos_auth_endpoint in EnvConfig to allow overriding the IBM IAM token endpoint. Add IBM_COS_AUTH_ENDPOINT to the env-to-config mapping and include it in the list of environment variables written out. Also add AWS_S3_ENDPOINT and AWS_REGION to the environment write-out list so those settings are persisted.
@github-actions github-actions bot added backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) ignore-for-release labels Mar 18, 2026
@edwinjosechittilappilly edwinjosechittilappilly marked this pull request as ready for review March 18, 2026 07:05
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reorganizes and streamlines SDK integration testing by moving Python SDK integration tests into the main repo’s tests/ tree and updating CI/local Makefile targets to run core integration tests separately from SDK tests against an already-running stack. It also adjusts chat deletion semantics so “missing conversation” can be treated as a not-found case, and updates the Python SDK to return False on 404 for delete.

Changes:

  • Add a new tests/integration/sdk/ suite (split by domain) and remove the old sdks/python/tests/test_integration.py.
  • Update Makefile targets (test-integration, test-ci, test-ci-local, test-sdk) to run core integration tests separately and run SDK tests with SDK_TESTS_ONLY=true.
  • Update backend + Python SDK behavior so deleting a nonexistent conversation returns False (via 404 on the API + NotFound handling in the Python SDK).

Reviewed changes

Copilot reviewed 17 out of 21 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/integration/sdk/conftest.py Adds fixtures for SDK integration tests against an external OpenRAG instance.
tests/integration/sdk/test_*.py New Python SDK integration test coverage across auth/chat/docs/search/settings/models/filters/errors/e2e.
tests/integration/sdk/README.md Documents the SDK integration test checklist and how to run it.
sdks/python/tests/test_integration.py Removes the old SDK integration test module from the SDK package.
tests/conftest.py Skips in-process backend onboarding/cleanup when SDK_TESTS_ONLY=true.
src/services/chat_service.py Changes delete-session behavior to distinguish “not found” from success (local-first).
src/api/v1/chat.py Returns HTTP 404 for deleting a missing conversation.
sdks/python/openrag_sdk/chat.py Catches NotFoundError on delete and returns False.
Makefile Updates test targets to run core integration tests and SDK tests separately; installs SDK editable for SDK test runs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

try:
result = await chat_service.delete_session(user.user_id, chat_id)
if result.get("not_found"):
return JSONResponse({"error": "Conversation not found"}, status_code=404)
Comment on lines +783 to +785
uv pip install -e sdks/python; \
SDK_TESTS_ONLY=true OPENRAG_URL=http://localhost:3000 uv run pytest tests/integration/sdk/ -vv -s || TEST_RESULT=1; \
echo "::endgroup::"; \
Comment on lines +611 to +614
return {
"success": False,
"not_found": True,
"error": "Conversation not found",
Comment on lines +66 to +72
@pytest.fixture
def client():
"""OpenRAG client authenticated with a valid test API key."""
from openrag_sdk import OpenRAGClient

return OpenRAGClient(api_key=get_api_key(), base_url=_base_url)

Copy link
Collaborator

@mpawlow mpawlow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edwinjosechittilappilly

Code Review 1

  • See: PR comments: (1a) to (1n)

_onboarding_done = True


def get_api_key() -> str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1d) [Normal] Thread-unsafe global API key cache; pytest.skip called outside a fixture

  • Problem: get_api_key() uses a module-level _cached_api_key global with no lock.
  • Under any parallel fixture execution, two callers can both observe _cached_api_key is None simultaneously, each call httpx.post(...), and create two API keys (leaking credentials).
  • Proposed Solution: Use a threading.Lock around cache population, or convert this to a session-scoped async fixture that yields the key.

Copy link
Collaborator

@mpawlow mpawlow Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 Partially fixed.

  • The module-level _cached_api_key global still has no asyncio.Lock, so two coroutines racing through _fetch_api_key() before either completes can create two API keys

(2a) [Minor] TOCTOU race in _fetch_api_key() - two coroutines can both create API keys

  • Problem: In asyncio, two coroutines that both enter _fetch_api_key() before either reaches the await ac.post(...) checkpoint will both pass the if _cached_api_key is not None guard and both issue API key
    creation requests, producing two keys on the server. The second key is permanently orphaned.
  • Proposed Solution: Use an asyncio.Lock to guard the check-and-set:
_api_key_lock = asyncio.Lock()
async def _fetch_api_key():
    global _cached_api_key
    async with _api_key_lock:
        if _cached_api_key is not None:
            return _cached_api_key
        ...
  • Note: This issue is Minor severity. Please feel free to optionally fix or ignore

timeout=30.0,
)
if response.status_code == 401:
pytest.skip("Cannot create API key - authentication required")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1e) [Normal] pytest.skip called outside a fixture

  • Problem: pytest.skip() is called from a non-fixture helper function, which can cause obscure DID NOT RAISE errors.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝


import pytest

from .conftest import _base_url
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1j) [Minor] Test files import private _base_url directly instead of using the fixture

  • Also, see: tests/integration/sdk/test_errors.py:9
  • Problem: from .conftest import _base_url imports a module-level private variable directly, bypassing pytest's fixture mechanism.
  • If OPENRAG_URL is modified after module import, the test files will use a stale value.
  • Proposed Solution: Use the base_url fixture already defined in conftest.py:74–77.
  • Note: This issue is Minor severity. Please feel free to optionally fix or ignore

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝

"""Core chat operation tests."""

@pytest.mark.asyncio
async def test_chat_non_streaming(self, client):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1k) [Normal] Chat tests create conversations with no cleanup

  • Problem: Tests creating conversations via client.chat.create(...) never delete them. In a long-lived development environment these accumulate indefinitely.
  • Proposed Solution: Add await client.chat.delete(response.chat_id) in a finally block, or use a fixture for conversation lifecycle management.

"""Test knowledge filter create, read, update, delete and usage."""

@pytest.mark.asyncio
async def test_knowledge_filter_crud(self, client):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1l) [Minor] test_knowledge_filter_crud is very monolithic and any step failure leaks the filter resource

  • Problem: The full CRUD lifecycle is a single test. If create succeeds but a later step fails, the filter is never deleted.
  • i.e. There's no try/finally around the delete call.
  • Proposed Solution: Wrap the body in try/finally to ensure deletion, or split into separate tests with a session-scoped fixture managing filter lifecycle.
  • Note: This issue is Minor severity. Please feel free to optionally fix or ignore

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝

assert results.results is not None
assert isinstance(results.results, list)

@pytest.mark.asyncio
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1m) [Normal] test_search_returns_result_fields silently passes with zero results

  • Problem: For result in results.results: is a no-op if the list is empty.
  • The test passes green even though nothing about result fields was verified.
  • Proposed Solution: Add assert len(results.results) > 0 before the loop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝

@@ -0,0 +1,90 @@
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1n) [Minor] asyncio_mode not configured in root pyproject.toml

  • Note: Feedback for pyproject.toml file
  • Applicable to all SDK test files
  • Problem: Tests use @pytest.mark.asyncio on every async method.
  • The root pyproject.toml does not set asyncio_mode = "auto" (only sdks/python/pyproject.toml does).
  • Running pytest tests/integration/sdk/ from the project root may produce collection warnings or failures depending on the installed pytest-asyncio version.
  • Proposed Solution: Add asyncio_mode = "auto" to the [tool.pytest.ini_options] section in the root pyproject.toml.
  • Note: This issue is Minor severity. Please feel free to optionally fix or ignore

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝

The key fix in agent.py: removed the except Exception: return False wrapper that was conflating "not found" with storage errors. Storage errors now propagate up to delete_session's except Exception block which correctly returns a
  500, not a 404.
Three changes:

  1. ensure_onboarding — changed to async def decorated with @pytest_asyncio.fixture, uses httpx.AsyncClient with await. No longer blocks the event loop.
  2. get_api_key — replaced the sync helper with _fetch_api_key() as an async def that also uses httpx.AsyncClient. Cached in the same module-level variable, called with await from the client fixture.
  3. client — changed to @pytest_asyncio.fixture / async def with yield so it properly await c.close() after each test, and can await _fetch_api_key() without blocking.
All 7 asyncio.sleep calls removed across 3 files:

  - test_search.py — 4 sleeps removed (lines 22, 35, 45, 71); import asyncio dropped
  - test_chat.py — 1 sleep removed (line 120); import asyncio dropped
  - test_e2e.py — 2 sleeps removed (lines 33, 66); import asyncio dropped

  ingest() defaults to wait=True, which already calls wait_for_task() internally and only returns once the task reaches a terminal state. No external sleep needed.
@edwinjosechittilappilly
Copy link
Collaborator Author

Rebased to release branch.

Copy link
Collaborator

@mpawlow mpawlow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edwinjosechittilappilly

Code Review 2

  • LGTM / Approved (partial)
  • See PR Comments: (2a) to (2c)

Skips in-process backend setup when SDK_TESTS_ONLY=true (SDK tests talk to
an already-running external stack and must not wipe its state).
"""
if os.environ.get("SDK_TESTS_ONLY") == "true":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(2b) [Major] tests/conftest.py SDK_TESTS_ONLY guard is inside the fixture body; module-level imports already executed at collection time

  • Related Lines: ~18–20
  • Problem: The fix added a yield; return guard inside the onboard_system fixture body, but from config.settings import clients, from session_manager import SessionManager, and from main import generate_jwt_keys are module-level imports that execute during pytest's collection phase, before any fixture runs.
  • In a pure SDK-only environment without backend packages installed, pytest fails during collection with ModuleNotFoundError.
  • Proposed Solution: Move the imports inside the fixture body, guarded by the SDK_TESTS_ONLY check:
@pytest.fixture(scope="session", autouse=True)
def onboard_system():
    if os.environ.get("SDK_TESTS_ONLY") == "true":
        yield
        return
    from config.settings import clients
    from session_manager import SessionManager
    from main import generate_jwt_keys
    ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch I belive I already moved it inside onboard_system ?

page_token = result.get("next_page_token")
if not page_token:
break
finally:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(2c) [Normal] connector_sync mutation of connector.bucket_names is not concurrency-safe

  • Problem: If two concurrent sync requests arrive for the same connector with different bucket_filter values, both requests write to connector.bucket_names on what is likely a shared/cached connector instance.
  • The finally: connector.bucket_names = original_buckets is per-request; whichever finally runs last wins, meaning one request may use the other's bucket filter mid-flight.
  • Proposed Solution: Deep-copy the connector before mutating it, or use a per-request lock keyed on the connector's connection ID.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we create this as an ticket so we can have it in backlog?
Good Catch!

@github-actions github-actions bot added the lgtm label Mar 18, 2026
@edwinjosechittilappilly edwinjosechittilappilly merged commit c33012f into release-0.4.0 Mar 18, 2026
29 checks passed
@github-actions github-actions bot deleted the integration-tests-sdk branch March 18, 2026 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) ignore-for-release lgtm tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: GA Readiness of Python SDK

5 participants