feat: add IBM COS and AWS S3 #1168
Open
edwinjosechittilappilly wants to merge 35 commits into release-0.4.0 from
Introduce a full Amazon S3 / S3-compatible connector and integrate it into the app. Backend: add new S3 connector implementation and auth helpers (src/connectors/aws_s3), register API routes for defaults/configure/list/bucket-status, wire S3 into connection manager and connector registry, and include AWS S3 in sync_all. Frontend: add S3 settings form/dialog, React Query hooks (defaults, configure, bucket status), connect S3 UI into connector cards, cloud picker, and upload flow with an S3 bucket view and direct-sync support. CLI/TUI: add S3-related env fields and config prompts. Misc: small UI icon usages and query invalidation added to keep state in sync.
Add env-var fallback getters for S3 credentials and clearer errors: implement get_client_id and get_client_secret to read from config or AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY and raise ValueError when missing. Frontend fixes: treat a connected connector as available in the connector card UI, and surface S3 bucket loading errors on the upload page by including the query error in the response and rendering a descriptive error message when bucket fetch fails.
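The fallback described in this commit could look roughly like the sketch below. It is an illustration, not the PR's code: the class name `S3Credentials` and the config keys `access_key` / `secret_key` are assumptions. Only the getter names, the two environment variable names, and the `ValueError` come from the commit message.

```python
import os


class S3Credentials:
    """Hypothetical holder for connector config (class name is an assumption)."""

    def __init__(self, config: dict):
        self.config = config

    def _resolve(self, config_key: str, env_var: str) -> str:
        # Prefer the explicit config value, then fall back to the environment.
        value = self.config.get(config_key) or os.environ.get(env_var)
        if not value:
            raise ValueError(
                f"Missing S3 credential: set '{config_key}' in the connector "
                f"config or export {env_var}"
            )
        return value

    def get_client_id(self) -> str:
        # "access_key" is an assumed config key name.
        return self._resolve("access_key", "AWS_ACCESS_KEY_ID")

    def get_client_secret(self) -> str:
        # "secret_key" is an assumed config key name.
        return self._resolve("secret_key", "AWS_SECRET_ACCESS_KEY")
```

Raising `ValueError` with the name of the missing setting tells the caller which value to supply, without echoing any credential material.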
Connector card: use isConnected alone for active state, treat a connector as configured when isConnected or connector.available, and show a "Configure" action (with Settings2 icon) when onConfigure is provided; also keep existing loading state. S3 settings dialog: import useEffect and add an effect to sync buckets and selectedBuckets when defaults.bucket_names load asynchronously so defaults are applied after dialog mount. AWS logo: replace fill="currentColor" with an explicit color (#232F3E) for consistent rendering.
…rough an exception Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…rough an exception Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…rough an exception Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…rough an exception Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Clean up IBM COS bucket listing and error handling: consolidate bucket enumeration to use the COS resource API, move logger.exception into the except blocks, and return consistent error responses. This removes unreachable/duplicated code paths and prevents leaking exception details in one handler. Also add /opensearch-data-new-lf to .gitignore.
Expose additional IBM COS and S3 env vars in docker-compose and tidy up connector code.
- docker-compose: added AWS_S3_ENDPOINT, AWS_REGION and multiple IBM_COS_* env vars for configuring IBM COS and custom S3 endpoints.
- api/connectors: import adjusted to use create_ibm_cos_resource only.
- aws_s3.auth: shortened debug messages to avoid logging endpoint/region details.
- aws_s3.connector & ibm_cos.connector: switched to printf-style logging (avoid f-strings) and normalized error/warning messages for listing objects and ACL fallbacks.
- ibm_cos.connector: prefer values from connector config (api_key, hmac_access_key, service_instance_id, hmac_secret_key) before falling back to environment variables.
These changes improve configuration flexibility, reduce accidental logging of potentially sensitive details, and standardize connector logging.
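For readers unfamiliar with the printf-style logging mentioned in this commit, here is a minimal sketch using the standard library `logging` module. The logger name and the helper `log_list_failure` are hypothetical, and the project's actual logger may be a different library.

```python
import logging

logger = logging.getLogger("connectors.example")  # illustrative logger name


def log_list_failure(bucket: str, error: Exception) -> None:
    # The template and its arguments are passed separately. The logging
    # module interpolates them only when a handler formats the record,
    # instead of building the string eagerly as an f-string would.
    logger.warning("Failed to list objects in bucket %s: %s", bucket, error)
```

Keeping the template constant also makes log lines easier to group, since the variable parts stay in the record's arguments.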
Move S3 and IBM COS-specific FastAPI route handlers out of the large api/connectors.py into dedicated modules under connectors/aws_s3 and connectors/ibm_cos. Add Pydantic models (models.py) and pure helper logic (support.py) for credential resolution and config construction, and factor common credential-testing logic into these new modules. Update package __init__.py exports to include the new models and API functions, and adjust main.py to register the new route handlers directly. This modularizes connector code, keeps route handlers thin, and centralizes validation/config-building for easier maintenance.
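A pure config-building helper of the kind this commit describes might be sketched as below. A standard library dataclass stands in for the Pydantic model. The names `S3ConfigureBody` and `build_s3_config`, the field names, and the precedence order (request body, then saved config, then environment) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class S3ConfigureBody:
    """Stand-in for the request model; the PR uses Pydantic, fields are assumed."""

    access_key: Optional[str] = None
    secret_key: Optional[str] = None
    endpoint_url: Optional[str] = None
    region: Optional[str] = None


def build_s3_config(
    body: S3ConfigureBody,
    existing: Mapping[str, str],
    env: Mapping[str, str],
) -> dict:
    """Pure function: no I/O, so it can be tested without a running app."""

    def pick(body_value: Optional[str], key: str, env_var: str) -> Optional[str]:
        # Assumed precedence: request body, then saved config, then environment.
        return body_value or existing.get(key) or env.get(env_var)

    config = {
        "access_key": pick(body.access_key, "access_key", "AWS_ACCESS_KEY_ID"),
        "secret_key": pick(body.secret_key, "secret_key", "AWS_SECRET_ACCESS_KEY"),
        "endpoint_url": pick(body.endpoint_url, "endpoint_url", "AWS_S3_ENDPOINT"),
        "region": pick(body.region, "region", "AWS_REGION"),
    }
    # Drop unset values so the stored config only holds what was resolved.
    return {key: value for key, value in config.items() if value}
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the helper, is what keeps the function pure and lets the route handler stay thin.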
Replace returning raw exception details to clients with generic error messages and add logger.exception calls to capture stack traces. Changes affect aws_s3 (s3_configure, s3_list_buckets) and ibm_cos (ibm_cos_list_buckets, ibm_cos_bucket_status) handlers: removed f-string exception exposures, standardized response bodies, and added contextual logging (including connection_id) to aid server-side debugging without leaking internals to API consumers.
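The pattern in this commit can be sketched without any web framework. The function name `list_buckets_response`, the response shape, and the logger name are assumptions. The point is only that the exception text goes to the server log and not into the response body.

```python
import logging
from typing import Callable, List, Tuple

logger = logging.getLogger("connectors.example")  # illustrative logger name


def list_buckets_response(
    list_buckets: Callable[[], List[str]], connection_id: str
) -> Tuple[int, dict]:
    """Return (status_code, body) without exposing exception text to the client."""
    try:
        return 200, {"buckets": list_buckets()}
    except Exception:
        # logger.exception records the message plus the full traceback
        # server-side; connection_id gives context for debugging.
        logger.exception("Failed to list buckets for connection %s", connection_id)
        return 500, {"error": "Failed to list buckets"}
```

The client sees the same generic message for every failure, while the traceback and `connection_id` remain available in the server log.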
Introduce an optional ibm_cos_auth_endpoint in EnvConfig to allow overriding the IBM IAM token endpoint. Add IBM_COS_AUTH_ENDPOINT to the env-to-config mapping and include it in the list of environment variables written out. Also add AWS_S3_ENDPOINT and AWS_REGION to the environment write-out list so those settings are persisted.
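As a rough illustration of an env-to-config mapping with a write-out list, consider the sketch below. `EnvConfig`, `ibm_cos_auth_endpoint`, and the three environment variable names come from the commit message. The other two field names and both helper functions are hypothetical, and the real `EnvConfig` has more fields.

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class EnvConfig:
    # Only the three settings named in the commit are modelled here.
    ibm_cos_auth_endpoint: Optional[str] = None
    aws_s3_endpoint: Optional[str] = None  # assumed field name
    aws_region: Optional[str] = None  # assumed field name


# Environment variable name -> EnvConfig attribute.
ENV_TO_FIELD = {
    "IBM_COS_AUTH_ENDPOINT": "ibm_cos_auth_endpoint",
    "AWS_S3_ENDPOINT": "aws_s3_endpoint",
    "AWS_REGION": "aws_region",
}


def load_env_config(env: Mapping[str, str]) -> EnvConfig:
    """Populate EnvConfig from whichever mapped variables are set."""
    values = {attr: env[var] for var, attr in ENV_TO_FIELD.items() if env.get(var)}
    return EnvConfig(**values)


def render_env_lines(config: EnvConfig) -> List[str]:
    """Produce KEY=value lines for persistence; unset options are omitted."""
    lines = []
    for var, attr in ENV_TO_FIELD.items():
        value = getattr(config, attr)
        if value:
            lines.append(f"{var}={value}")
    return lines
```

Driving both loading and write-out from one mapping means a setting added to the mapping is persisted as well, which is the gap this commit closes for AWS_S3_ENDPOINT and AWS_REGION.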
…g user_id parameter
…o IBM_COS_S3_SPIKE
This reverts commit 9fee2e8.
Pull request overview
Adds first-class bucket-based connectors for IBM Cloud Object Storage (COS) and AWS S3 (incl. S3-compatible endpoints), including backend connector implementations, configuration endpoints, and frontend settings + bucket ingestion UX.
Changes:
- Introduces new backend connectors (ibm_cos, aws_s3) with credential resolution, bucket listing/status endpoints, and ingestion support (including bucket-filtered sync).
- Extends TUI/env configuration to support S3 endpoint/region and IBM COS credential options.
- Updates frontend auth/context + settings UI to configure these connectors and ingest by selecting buckets (direct-sync flow).
Reviewed changes
Copilot reviewed 50 out of 52 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| uv.lock | Locks new dependency set including IBM COS SDK. |
| pyproject.toml | Adds ibm-cos-sdk dependency. |
| docker-compose.yml | Exposes new AWS S3 + IBM COS environment variables to the backend container. |
| .gitignore | Adds an OpenSearch data directory ignore entry. |
| src/utils/file_utils.py | Extends MIME→extension mapping and changes unknown-type handling. |
| src/tui/managers/env_manager.py | Adds AWS S3 endpoint/region + IBM COS env var mapping and persistence. |
| src/tui/config_fields.py | Adds TUI prompts for AWS S3 endpoint/region and IBM COS fields. |
| src/services/auth_service.py | Adds “direct auth” connector path (no OAuth redirect) for IBM COS. |
| src/main.py | Registers IBM COS + AWS S3 connector-specific API routes. |
| src/connectors/langflow_connector_service.py | Skips folder-expansion logic for connectors that don’t use cfg. |
| src/connectors/connection_manager.py | Registers new connectors + adjusts connector availability evaluation signature. |
| src/connectors/__init__.py | Exports the new connector classes. |
| src/config/settings.py | Adds IBM_AUTH_ENABLED feature flag. |
| src/api/connectors.py | Adds sync_all and bucket_filter support in connector sync; includes new connector types in sync-all. |
| src/connectors/ibm_cos/__init__.py | Exposes IBM COS connector and IBM COS API helpers. |
| src/connectors/ibm_cos/auth.py | IBM COS IAM/HMAC credential resolution and client/resource factories. |
| src/connectors/ibm_cos/models.py | Pydantic models for IBM COS configure endpoint. |
| src/connectors/ibm_cos/support.py | Pure helper to construct IBM COS connection config from body/env/existing config. |
| src/connectors/ibm_cos/api.py | IBM COS defaults/configure/bucket list/bucket-status endpoints. |
| src/connectors/ibm_cos/connector.py | IBM COS connector implementation (list/download/ACL). |
| src/connectors/aws_s3/__init__.py | Exposes S3 connector and S3 API helpers. |
| src/connectors/aws_s3/auth.py | S3 credential resolution and boto3 client/resource factories. |
| src/connectors/aws_s3/models.py | Pydantic models for S3 configure endpoint. |
| src/connectors/aws_s3/support.py | Pure helper to construct S3 connection config from body/env/existing config. |
| src/connectors/aws_s3/api.py | S3 defaults/configure/bucket list/bucket-status endpoints. |
| src/connectors/aws_s3/connector.py | S3 connector implementation (list/download/ACL). |
| frontend/contexts/auth-context.tsx | Adds isIbmAuthMode to the auth context. |
| frontend/components/knowledge-dropdown.tsx | Replaces legacy “S3 bucket upload” dialog with navigation to direct bucket-ingest pages; adds IBM COS option. |
| frontend/components/icons/ibm-cos-icon.tsx | Adds IBM COS SVG icon component. |
| frontend/components/cloud-picker/file-item.tsx | Adds provider icons for ibm_cos and aws_s3. |
| frontend/app/upload/[provider]/page.tsx | Adds bucket-selection ingestion UI for direct-sync providers (ibm_cos, aws_s3). |
| frontend/app/settings/_components/watsonx-settings-dialog.tsx | Formatting-only dialog handler change. |
| frontend/app/settings/_components/openai-settings-dialog.tsx | Formatting-only dialog handler change. |
| frontend/app/settings/_components/ollama-settings-dialog.tsx | Formatting-only dialog handler change. |
| frontend/app/settings/_components/anthropic-settings-dialog.tsx | Formatting-only dialog handler change. |
| frontend/app/settings/_components/s3-settings-form.tsx | Adds S3 configuration form UI + bucket selector. |
| frontend/app/settings/_components/s3-settings-dialog.tsx | Adds S3 configuration dialog logic (test connection, save config). |
| frontend/app/settings/_components/ibm-cos-settings-form.tsx | Adds IBM COS configuration form UI with IAM/HMAC tabs + bucket selector. |
| frontend/app/settings/_components/ibm-cos-settings-dialog.tsx | Adds IBM COS configuration dialog logic (test connection, save config). |
| frontend/app/settings/_components/connector-cards.tsx | Adds IBM COS + S3 cards, gating behind IBM auth mode; wires settings dialogs. |
| frontend/app/settings/_components/connector-card.tsx | Adds optional “configure” action; adjusts icon active state; updates button icon. |
| frontend/app/settings/_components/card-icon.tsx | Adjusts active styling to enforce black icon color. |
| frontend/app/knowledge/page.tsx | Adds source icons for ibm_cos and aws_s3. |
| frontend/app/api/queries/useS3DefaultsQuery.ts | Adds query hook for S3 defaults endpoint. |
| frontend/app/api/queries/useS3BucketStatusQuery.ts | Adds query hook for S3 bucket ingestion status. |
| frontend/app/api/queries/useIBMCOSDefaultsQuery.ts | Adds query hook for IBM COS defaults endpoint. |
| frontend/app/api/queries/useIBMCOSBucketsQuery.ts | Adds query hook for IBM COS bucket list endpoint. |
| frontend/app/api/queries/useIBMCOSBucketStatusQuery.ts | Adds query hook for IBM COS bucket ingestion status. |
| frontend/app/api/mutations/useSyncConnector.ts | Extends sync payload to include sync_all and bucket_filter. |
| frontend/app/api/mutations/useS3ConfigureMutation.ts | Adds mutation hook for S3 configure endpoint. |
| frontend/app/api/mutations/useIBMCOSConfigureMutation.ts | Adds mutation hook for IBM COS configure endpoint. |
| frontend/app/api/mutations/useConnectConnectorMutation.ts | Handles direct-auth connectors by skipping OAuth redirect and refreshing connector status. |
    )
    connector = await connector_service.get_connector(working_connection.connection_id)
    if body.bucket_filter:
        # List only files from the requested buckets, then sync_specific_files

      onNavigateToKnowledge,
      onConfigure,
    }: ConnectorCardProps) {
      console.log(connector);

     </CardTitle>
     <CardDescription className="text-sm">
    -  {connector?.available
    +  {isConnected || connector?.available
Comment on lines +50 to +56

    // Bucket state
    const [buckets, setBuckets] = useState<string[] | null>(
      defaults?.bucket_names?.length ? defaults.bucket_names : null,
    );
    const [selectedBuckets, setSelectedBuckets] = useState<string[]>(
      defaults?.bucket_names ?? [],
    );
Comment on lines +161 to +165

    try:
        cfg = connection.config
        cos = create_ibm_cos_resource(cfg)
        all_buckets = [b.name for b in cos.buckets.all()]
    except Exception:
Comment on lines +198 to +204

    except Exception as exc:
        await self.connector_service.connection_manager.delete_connection(
            connection_id
        )
        raise ValueError(
            f"Failed to connect {connector_type}: {exc}"
        ) from exc
Comment on lines +280 to +296

    const {
      data: buckets,
      isLoading,
      refetch,
    } = useIBMCOSBucketStatusQuery(connector.connectionId, { enabled: true });
    return (
      <BucketView
        connector={connector}
        buckets={buckets}
        isLoading={isLoading}
        onRefetch={refetch}
        invalidateQueryKey={["ibm-cos-bucket-status", connector.connectionId]}
        syncMutation={syncMutation}
        addTask={addTask}
        onBack={onBack}
        onDone={onDone}
      />
Comment on lines +371 to +374

    if __name__ == "__main__":
        connector = IBMCOSConnector({})
        print(connector.authenticate())
        print(connector.list_files())
Comment on lines 63 to +64

     def get_file_extension(mimetype: str) -> str:
    -    """Get file extension based on MIME type"""
    +    """Get file extension based on MIME type. Returns None if the type is unknown."""
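In the snippet above the updated docstring says the function may return None while the annotation still reads `-> str`. One way to make the two agree is sketched below. The three-entry mapping is an illustrative subset and an assumption, not the contents of src/utils/file_utils.py.

```python
from typing import Optional

# Small illustrative subset; the real mapping in file_utils.py is larger.
_MIME_TO_EXTENSION = {
    "application/pdf": ".pdf",
    "text/plain": ".txt",
    "text/markdown": ".md",
}


def get_file_extension(mimetype: str) -> Optional[str]:
    """Get file extension based on MIME type. Returns None if the type is unknown."""
    # dict.get returns None for a missing key, matching the docstring.
    return _MIME_TO_EXTENSION.get(mimetype)
```

With an `Optional[str]` return type, a type checker will flag callers that use the result as a string without first handling the unknown-type case.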
        "Full connector ingest requested",
        connector_type=connector_type,
        bucket_filter=body.bucket_filter,
    )
Add IBM COS Fix from main to release 0.4.0