feat: add IBM COS and AWS S3 #1168

Open
edwinjosechittilappilly wants to merge 35 commits into release-0.4.0 from COS_MAIN_MERGE

Conversation

edwinjosechittilappilly (Collaborator) commented Mar 17, 2026

Add IBM COS Fix from main to release 0.4.0

edwinjosechittilappilly and others added 30 commits March 9, 2026 10:47
Introduce a full Amazon S3 / S3-compatible connector and integrate it into the app.

- Backend: add new S3 connector implementation and auth helpers (src/connectors/aws_s3), register API routes for defaults/configure/list/bucket-status, wire S3 into connection manager and connector registry, and include AWS S3 in sync_all.
- Frontend: add S3 settings form/dialog and React Query hooks (defaults, configure, bucket status); connect the S3 UI into connector cards, the cloud picker, and the upload flow with an S3 bucket view and direct-sync support.
- CLI/TUI: add S3-related env fields and config prompts.
- Misc: small UI icon usages and query invalidation added to keep state in sync.
Add env-var fallback getters for S3 credentials and clearer errors: implement get_client_id and get_client_secret to read from config or AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY and raise ValueError when missing. Frontend fixes: treat a connected connector as available in the connector card UI, and surface S3 bucket loading errors on the upload page by including the query error in the response and rendering a descriptive error message when bucket fetch fails.
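A minimal sketch of the fallback this commit describes, assuming the connector config is a plain dict. Only the function names and the `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` variables come from the commit message; the `access_key` / `secret_key` config keys, the `_resolve` helper, and the injectable `env` parameter are hypothetical and may not match the PR.

```python
import os
from typing import Mapping, Optional


def _resolve(
    config: Mapping[str, str],
    config_key: str,
    env_var: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return config[config_key] if set, else the environment variable.

    Raises ValueError when neither source provides a non-empty value.
    """
    env = os.environ if env is None else env
    value = config.get(config_key) or env.get(env_var)
    if not value:
        raise ValueError(
            f"Missing credential: set '{config_key}' in the connector config "
            f"or export {env_var}"
        )
    return value


def get_client_id(config: Mapping[str, str], env: Optional[Mapping[str, str]] = None) -> str:
    # "access_key" is an assumed config key name, not taken from the PR.
    return _resolve(config, "access_key", "AWS_ACCESS_KEY_ID", env)


def get_client_secret(config: Mapping[str, str], env: Optional[Mapping[str, str]] = None) -> str:
    # "secret_key" is an assumed config key name, not taken from the PR.
    return _resolve(config, "secret_key", "AWS_SECRET_ACCESS_KEY", env)
```

Passing `env` explicitly keeps the sketch testable without mutating `os.environ`; the real getters may simply read the process environment.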
Connector card: use isConnected alone for active state, treat a connector as configured when isConnected or connector.available, and show a "Configure" action (with Settings2 icon) when onConfigure is provided; also keep existing loading state.

S3 settings dialog: import useEffect and add an effect to sync buckets and selectedBuckets when defaults.bucket_names load asynchronously so defaults are applied after dialog mount.

AWS logo: replace fill="currentColor" with an explicit color (#232F3E) for consistent rendering.
…rough an exception

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Clean up IBM COS bucket listing and error handling: consolidate bucket enumeration to use the COS resource API, move logger.exception into the except blocks, and return consistent error responses. This removes unreachable/duplicated code paths and prevents leaking exception details in one handler. Also add /opensearch-data-new-lf to .gitignore.
Expose additional IBM COS and S3 env vars in docker-compose and tidy up connector code.

- docker-compose: added AWS_S3_ENDPOINT, AWS_REGION and multiple IBM_COS_* env vars for configuring IBM COS and custom S3 endpoints.
- api/connectors: import adjusted to use create_ibm_cos_resource only.
- aws_s3.auth: shortened debug messages to avoid logging endpoint/region details.
- aws_s3.connector & ibm_cos.connector: switched to printf-style logging (avoid f-strings) and normalized error/warning messages for listing objects and ACL fallbacks.
- ibm_cos.connector: prefer values from connector config (api_key, hmac_access_key, service_instance_id, hmac_secret_key) before falling back to environment variables.

These changes improve configuration flexibility, reduce accidental logging of potentially sensitive details, and standardize connector logging.
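For illustration, printf-style logging of the kind this commit mentions could look like the sketch below. The logger name, the message text, and the `ListHandler` helper are assumptions made for the example; the PR's actual messages are not shown on this page.

```python
import logging

logger = logging.getLogger("connectors.sketch")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Collects log records in memory so the example can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def warn_list_failure(bucket: str, error: Exception) -> None:
    # The template and its arguments are passed separately, so the string is
    # only formatted if a handler actually needs the message.
    logger.warning("Failed to list objects in bucket %s: %s", bucket, error)
```

Because the template stays constant across calls, records for the same failure can be grouped by message template, which an f-string would prevent.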
Move S3 and IBM COS-specific FastAPI route handlers out of the large api/connectors.py into dedicated modules under connectors/aws_s3 and connectors/ibm_cos. Add Pydantic models (models.py) and pure helper logic (support.py) for credential resolution and config construction, and factor common credential-testing logic into these new modules. Update package __init__.py exports to include the new models and API functions, and adjust main.py to register the new route handlers directly. This modularizes connector code, keeps route handlers thin, and centralizes validation/config-building for easier maintenance.
Replace returning raw exception details to clients with generic error messages and add logger.exception calls to capture stack traces. Changes affect aws_s3 (s3_configure, s3_list_buckets) and ibm_cos (ibm_cos_list_buckets, ibm_cos_bucket_status) handlers: removed f-string exception exposures, standardized response bodies, and added contextual logging (including connection_id) to aid server-side debugging without leaking internals to API consumers.
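The pattern described here can be sketched without FastAPI as a function returning a status code and body. The function name, the response shape, and the message text are hypothetical; the real handlers in the PR return framework responses.

```python
import logging
from typing import Callable, List, Tuple

logger = logging.getLogger("connectors.api.sketch")  # hypothetical logger name


def list_buckets_safely(
    list_buckets: Callable[[], List[str]], connection_id: str
) -> Tuple[int, dict]:
    """Return (status_code, body) and never echo exception text to the caller."""
    try:
        return 200, {"buckets": list_buckets()}
    except Exception:
        # logger.exception logs at ERROR level and attaches the traceback,
        # so the details stay in server logs only.
        logger.exception("Failed to list buckets for connection %s", connection_id)
        return 500, {"error": "Failed to list buckets"}
```

The caller sees only the generic message, while `connection_id` in the log line gives operators enough context to find the matching stack trace.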
Introduce an optional ibm_cos_auth_endpoint in EnvConfig to allow overriding the IBM IAM token endpoint. Add IBM_COS_AUTH_ENDPOINT to the env-to-config mapping and include it in the list of environment variables written out. Also add AWS_S3_ENDPOINT and AWS_REGION to the environment write-out list so those settings are persisted.
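A rough sketch of the env-to-config mapping and write-out described above, using a dataclass. Only `ibm_cos_auth_endpoint` and the three environment variable names come from the commit message; the other attribute names, the helper functions, and the choice to skip unset values on write-out are assumptions.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class EnvConfig:
    # Only ibm_cos_auth_endpoint is named in the commit; the rest are assumed.
    aws_s3_endpoint: str = ""
    aws_region: str = ""
    ibm_cos_auth_endpoint: str = ""  # optional override of the IAM token endpoint


# Environment variable name -> EnvConfig attribute name.
ENV_TO_ATTR = {
    "AWS_S3_ENDPOINT": "aws_s3_endpoint",
    "AWS_REGION": "aws_region",
    "IBM_COS_AUTH_ENDPOINT": "ibm_cos_auth_endpoint",
}


def load_env_config(env: Optional[Mapping[str, str]] = None) -> EnvConfig:
    """Populate EnvConfig from the environment, leaving unset values empty."""
    env = os.environ if env is None else env
    config = EnvConfig()
    for var, attr in ENV_TO_ATTR.items():
        if env.get(var):
            setattr(config, attr, env[var])
    return config


def render_env_lines(config: EnvConfig) -> List[str]:
    """Render KEY=value lines, skipping unset optional values (an assumption)."""
    return [
        f"{var}={getattr(config, attr)}"
        for var, attr in ENV_TO_ATTR.items()
        if getattr(config, attr)
    ]
```

Keeping one mapping for both reading and writing means a newly added variable is persisted as soon as it is mapped, which is the gap this commit closes for `AWS_S3_ENDPOINT` and `AWS_REGION`.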
github-actions bot added the frontend, backend, and docker labels Mar 17, 2026
edwinjosechittilappilly marked this pull request as ready for review March 17, 2026 19:56
Copilot AI review requested due to automatic review settings March 17, 2026 19:56
github-actions bot added the enhancement label Mar 17, 2026
Copilot AI (Contributor) left a comment

Pull request overview

Adds first-class bucket-based connectors for IBM Cloud Object Storage (COS) and AWS S3 (incl. S3-compatible endpoints), including backend connector implementations, configuration endpoints, and frontend settings + bucket ingestion UX.

Changes:

  • Introduces new backend connectors (ibm_cos, aws_s3) with credential resolution, bucket listing/status endpoints, and ingestion support (including bucket-filtered sync).
  • Extends TUI/env configuration to support S3 endpoint/region and IBM COS credential options.
  • Updates frontend auth/context + settings UI to configure these connectors and ingest by selecting buckets (direct-sync flow).
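To make the bucket-filtered sync concrete, here is a small sketch of the selection step. It assumes each listed file is a dict with a `bucket` key, and `select_files` is a hypothetical name; the PR's handler lists files from the requested buckets and then calls sync_specific_files, whereas this sketch filters an in-memory listing.

```python
from typing import Dict, Iterable, List, Optional


def select_files(
    files: Iterable[Dict[str, str]],
    bucket_filter: Optional[List[str]] = None,
) -> List[Dict[str, str]]:
    """Keep only files whose bucket is in bucket_filter.

    An empty or missing filter keeps every file, which corresponds to a
    full (sync_all-style) ingest in this sketch.
    """
    files = list(files)
    if not bucket_filter:
        return files
    wanted = set(bucket_filter)
    return [f for f in files if f["bucket"] in wanted]
```

For example, `select_files(listing, ["reports"])` would keep only objects from the `reports` bucket, while `select_files(listing)` returns the whole listing.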

Reviewed changes

Copilot reviewed 50 out of 52 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| `uv.lock` | Locks new dependency set including IBM COS SDK. |
| `pyproject.toml` | Adds ibm-cos-sdk dependency. |
| `docker-compose.yml` | Exposes new AWS S3 + IBM COS environment variables to the backend container. |
| `.gitignore` | Adds an OpenSearch data directory ignore entry. |
| `src/utils/file_utils.py` | Extends MIME→extension mapping and changes unknown-type handling. |
| `src/tui/managers/env_manager.py` | Adds AWS S3 endpoint/region + IBM COS env var mapping and persistence. |
| `src/tui/config_fields.py` | Adds TUI prompts for AWS S3 endpoint/region and IBM COS fields. |
| `src/services/auth_service.py` | Adds “direct auth” connector path (no OAuth redirect) for IBM COS. |
| `src/main.py` | Registers IBM COS + AWS S3 connector-specific API routes. |
| `src/connectors/langflow_connector_service.py` | Skips folder-expansion logic for connectors that don’t use cfg. |
| `src/connectors/connection_manager.py` | Registers new connectors + adjusts connector availability evaluation signature. |
| `src/connectors/__init__.py` | Exports the new connector classes. |
| `src/config/settings.py` | Adds IBM_AUTH_ENABLED feature flag. |
| `src/api/connectors.py` | Adds sync_all and bucket_filter support in connector sync; includes new connector types in sync-all. |
| `src/connectors/ibm_cos/__init__.py` | Exposes IBM COS connector and IBM COS API helpers. |
| `src/connectors/ibm_cos/auth.py` | IBM COS IAM/HMAC credential resolution and client/resource factories. |
| `src/connectors/ibm_cos/models.py` | Pydantic models for IBM COS configure endpoint. |
| `src/connectors/ibm_cos/support.py` | Pure helper to construct IBM COS connection config from body/env/existing config. |
| `src/connectors/ibm_cos/api.py` | IBM COS defaults/configure/bucket list/bucket-status endpoints. |
| `src/connectors/ibm_cos/connector.py` | IBM COS connector implementation (list/download/ACL). |
| `src/connectors/aws_s3/__init__.py` | Exposes S3 connector and S3 API helpers. |
| `src/connectors/aws_s3/auth.py` | S3 credential resolution and boto3 client/resource factories. |
| `src/connectors/aws_s3/models.py` | Pydantic models for S3 configure endpoint. |
| `src/connectors/aws_s3/support.py` | Pure helper to construct S3 connection config from body/env/existing config. |
| `src/connectors/aws_s3/api.py` | S3 defaults/configure/bucket list/bucket-status endpoints. |
| `src/connectors/aws_s3/connector.py` | S3 connector implementation (list/download/ACL). |
| `frontend/contexts/auth-context.tsx` | Adds isIbmAuthMode to the auth context. |
| `frontend/components/knowledge-dropdown.tsx` | Replaces legacy “S3 bucket upload” dialog with navigation to direct bucket-ingest pages; adds IBM COS option. |
| `frontend/components/icons/ibm-cos-icon.tsx` | Adds IBM COS SVG icon component. |
| `frontend/components/cloud-picker/file-item.tsx` | Adds provider icons for ibm_cos and aws_s3. |
| `frontend/app/upload/[provider]/page.tsx` | Adds bucket-selection ingestion UI for direct-sync providers (ibm_cos, aws_s3). |
| `frontend/app/settings/_components/watsonx-settings-dialog.tsx` | Formatting-only dialog handler change. |
| `frontend/app/settings/_components/openai-settings-dialog.tsx` | Formatting-only dialog handler change. |
| `frontend/app/settings/_components/ollama-settings-dialog.tsx` | Formatting-only dialog handler change. |
| `frontend/app/settings/_components/anthropic-settings-dialog.tsx` | Formatting-only dialog handler change. |
| `frontend/app/settings/_components/s3-settings-form.tsx` | Adds S3 configuration form UI + bucket selector. |
| `frontend/app/settings/_components/s3-settings-dialog.tsx` | Adds S3 configuration dialog logic (test connection, save config). |
| `frontend/app/settings/_components/ibm-cos-settings-form.tsx` | Adds IBM COS configuration form UI with IAM/HMAC tabs + bucket selector. |
| `frontend/app/settings/_components/ibm-cos-settings-dialog.tsx` | Adds IBM COS configuration dialog logic (test connection, save config). |
| `frontend/app/settings/_components/connector-cards.tsx` | Adds IBM COS + S3 cards, gating behind IBM auth mode; wires settings dialogs. |
| `frontend/app/settings/_components/connector-card.tsx` | Adds optional “configure” action; adjusts icon active state; updates button icon. |
| `frontend/app/settings/_components/card-icon.tsx` | Adjusts active styling to enforce black icon color. |
| `frontend/app/knowledge/page.tsx` | Adds source icons for ibm_cos and aws_s3. |
| `frontend/app/api/queries/useS3DefaultsQuery.ts` | Adds query hook for S3 defaults endpoint. |
| `frontend/app/api/queries/useS3BucketStatusQuery.ts` | Adds query hook for S3 bucket ingestion status. |
| `frontend/app/api/queries/useIBMCOSDefaultsQuery.ts` | Adds query hook for IBM COS defaults endpoint. |
| `frontend/app/api/queries/useIBMCOSBucketsQuery.ts` | Adds query hook for IBM COS bucket list endpoint. |
| `frontend/app/api/queries/useIBMCOSBucketStatusQuery.ts` | Adds query hook for IBM COS bucket ingestion status. |
| `frontend/app/api/mutations/useSyncConnector.ts` | Extends sync payload to include sync_all and bucket_filter. |
| `frontend/app/api/mutations/useS3ConfigureMutation.ts` | Adds mutation hook for S3 configure endpoint. |
| `frontend/app/api/mutations/useIBMCOSConfigureMutation.ts` | Adds mutation hook for IBM COS configure endpoint. |
| `frontend/app/api/mutations/useConnectConnectorMutation.ts` | Handles direct-auth connectors by skipping OAuth redirect and refreshing connector status. |

)
connector = await connector_service.get_connector(working_connection.connection_id)
if body.bucket_filter:
    # List only files from the requested buckets, then sync_specific_files

  onNavigateToKnowledge,
  onConfigure,
}: ConnectorCardProps) {
  console.log(connector);

  </CardTitle>
  <CardDescription className="text-sm">
-   {connector?.available
+   {isConnected || connector?.available
Comment on lines +50 to +56
  // Bucket state
  const [buckets, setBuckets] = useState<string[] | null>(
    defaults?.bucket_names?.length ? defaults.bucket_names : null,
  );
  const [selectedBuckets, setSelectedBuckets] = useState<string[]>(
    defaults?.bucket_names ?? [],
  );
Comment on lines +161 to +165
try:
    cfg = connection.config
    cos = create_ibm_cos_resource(cfg)
    all_buckets = [b.name for b in cos.buckets.all()]
except Exception:
Comment on lines +198 to +204
except Exception as exc:
    await self.connector_service.connection_manager.delete_connection(
        connection_id
    )
    raise ValueError(
        f"Failed to connect {connector_type}: {exc}"
    ) from exc
Comment on lines +280 to +296
  const {
    data: buckets,
    isLoading,
    refetch,
  } = useIBMCOSBucketStatusQuery(connector.connectionId, { enabled: true });
  return (
    <BucketView
      connector={connector}
      buckets={buckets}
      isLoading={isLoading}
      onRefetch={refetch}
      invalidateQueryKey={["ibm-cos-bucket-status", connector.connectionId]}
      syncMutation={syncMutation}
      addTask={addTask}
      onBack={onBack}
      onDone={onDone}
    />
Comment on lines +371 to +374
if __name__ == "__main__":
    connector = IBMCOSConnector({})
    print(connector.authenticate())
    print(connector.list_files())
Comment on lines 63 to +64
  def get_file_extension(mimetype: str) -> str:
-     """Get file extension based on MIME type"""
+     """Get file extension based on MIME type. Returns None if the type is unknown."""

    "Full connector ingest requested",
    connector_type=connector_type,
    bucket_filter=body.bucket_filter,
)
Labels

backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs)
docker
enhancement 🔵 New feature or request
frontend 🟨 Issues related to the UI/UX

4 participants