feat(api): add external archive downloads and retire datasink ingress#1818

Open
benjaminpaige wants to merge 27 commits into main from datasink

Collaborator

@benjaminpaige benjaminpaige commented Jan 22, 2026

This branch started as a broader datasink spike, but what it now brings back to main is a much tighter set of changes. The headline items are external archive downloads through the existing attachment URL endpoint, a locked-down checkIdentifierUsage route, removal of datasink-specific ingress/data-exchange pieces we are not promoting, and the CI/tooling updates needed to keep the current branch healthy under today’s Bun/Vite/Storybook stack.

Summary

  • Add external OAuth attachment access and extend POST /external/getAttachmentUrl to serve existing object downloads plus full-package and section archive ZIPs from the archive buckets.
  • Keep GET /checkIdentifierUsage, but change it from anonymous access to AWS_IAM.
  • Remove the datasink ingress/data-exchange path from the promoted stack shape while preserving the targeted archive-download work.
  • Refresh Storybook, Vitest, and Bun configuration so current GitHub Actions can run against this branch cleanly again.

What Changed

External download API

  • Added the external auth surface in lib/lambda/external-auth/*, lib/lambda/externalToken.ts, and lib/lambda/externalAttachmentAuthorizer.ts.
  • Extended lib/lambda/getExternalAttachmentUrl.ts so external consumers can request existing object downloads, full package archive downloads by packageId, and package section archive downloads by packageId plus sectionId.
  • Refactored lib/lambda/attachmentArchive-lib.ts and lib/lambda/presignedAttachmentUrl.ts so internal and external attachment flows share archive resolution, filenames, and content-disposition handling.
  • Added the corresponding lambda coverage for archive readiness, presigned URL generation, ACL handling, and mixed object/archive request behavior.
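
The three request modes described above can be sketched as a discriminated union. This is a minimal illustration, not the handler in lib/lambda/getExternalAttachmentUrl.ts: only packageId and sectionId are named in this PR, so the bucket and key fields, the mode labels, the classifyRequest helper, and the choice to reject mixed object/archive requests are all assumptions.

```typescript
// Hypothetical request shapes. Only packageId and sectionId come from the PR text.
type ExternalDownloadRequest =
  | { mode: "object"; bucket: string; key: string }
  | { mode: "packageArchive"; packageId: string }
  | { mode: "sectionArchive"; packageId: string; sectionId: string };

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Classify a parsed JSON body into exactly one download mode.
function classifyRequest(body: Record<string, unknown>): ExternalDownloadRequest {
  const { bucket, key, packageId, sectionId } = body;
  const wantsObject = bucket !== undefined || key !== undefined;
  const wantsArchive = packageId !== undefined || sectionId !== undefined;

  // Assumption: a request that mixes object and archive fields is rejected.
  if (wantsObject && wantsArchive) {
    throw new Error("Request mixes object and archive fields");
  }
  if (wantsObject) {
    if (!isNonEmptyString(bucket) || !isNonEmptyString(key)) {
      throw new Error("Object downloads need both bucket and key");
    }
    return { mode: "object", bucket, key };
  }
  if (!isNonEmptyString(packageId)) {
    throw new Error("Archive downloads need a packageId");
  }
  if (sectionId === undefined) {
    return { mode: "packageArchive", packageId };
  }
  if (!isNonEmptyString(sectionId)) {
    throw new Error("sectionId must be a non-empty string");
  }
  return { mode: "sectionArchive", packageId, sectionId };
}
```

Modelling the modes as a union lets the downstream archive-resolution code switch on mode without re-checking which fields are present.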

Auth, routing, and infrastructure

  • Added deployment config support for the external API auth secret in lib/config/deployment-config.ts.
  • Updated lib/stacks/api.ts to wire the external token endpoint, external attachment authorizer, dedicated IAM for archive downloads, and the archive bucket/rebuild settings.
  • Added checkIdentifierUsage Lambda/API support in lib/lambda/checkIdentifierUsage.ts and lib/libs/api/package/checkIdentifierUsage.ts.
  • Changed checkIdentifierUsage on API Gateway to AWS_IAM instead of anonymous access.
  • Updated stack ordering and auth wiring so Cognito and IAM permissions target the concrete API resources correctly.
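
For reviewers less familiar with the API Gateway side, the move from anonymous access to AWS_IAM usually comes down to one method option in CDK. The fragment below is a sketch under assumptions, not the code in lib/stacks/api.ts: the function name and the way the REST API and handler are passed in are hypothetical, and only the GET /checkIdentifierUsage route and the AWS_IAM authorization type come from this PR.

```typescript
import * as apigateway from "aws-cdk-lib/aws-apigateway";
import * as lambda from "aws-cdk-lib/aws-lambda";

// Hypothetical helper: attaches GET /checkIdentifierUsage with IAM (SigV4) auth.
export function addCheckIdentifierUsageRoute(
  api: apigateway.RestApi,
  handler: lambda.IFunction,
): apigateway.Method {
  const resource = api.root.addResource("checkIdentifierUsage");
  return resource.addMethod("GET", new apigateway.LambdaIntegration(handler), {
    // AuthorizationType.IAM renders as AWS_IAM on the method;
    // AuthorizationType.NONE would leave the route anonymously callable.
    authorizationType: apigateway.AuthorizationType.IAM,
  });
}
```

With this setting, callers must sign requests with SigV4 and hold execute-api:Invoke on the method, which is why the IAM wiring has to target the concrete API resources.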

Datasink cleanup

  • Removed the datasink-only ingress and outbound data-exchange pieces that are not intended for promotion.
  • Removed the datasink route and authorizer pathing, the datasink envelope/shared contract plumbing, and the outbound data-exchange secret and sink processing path.
  • Kept the pieces that are still relevant to external archive downloads and identifier usage.

Tooling and repo hygiene

  • Refreshed bun.lockb and aligned the branch with the current Bun dependency graph.
  • Updated Vite and Storybook config so Storybook no longer imports the ESM mocks workspace package directly from .storybook/main.ts.
  • Made the Storybook Vitest config await the merged async Vite config and inline the AWS packages that were breaking Storybook execution under GitHub Actions.
  • Prevented vite-plugin-checker from interfering with Vitest test mode.
  • Updated a few repo-level utilities and config edges: cdk.json now uses bunx tsx, the deploy CLI help text references the current AWS credential naming, detect-secrets ignores test files to reduce false positives, and local-only docs/local and test/local paths remain ignored.
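
As a rough illustration of the Storybook Vitest change, the fragment below awaits an async Vite config before merging in test options and inlines AWS packages. It is a sketch, not the config in this branch: the ./vite.config import path, the assumption that it default-exports an async config function, and the @aws-sdk pattern are all hypothetical.

```typescript
import { defineConfig, mergeConfig } from "vitest/config";
// Assumption: vite.config.ts default-exports an async config function.
import viteConfig from "./vite.config";

export default defineConfig(async (env) => {
  // Resolve the async Vite config first so mergeConfig receives a plain object.
  const base = await viteConfig(env);
  return mergeConfig(base, {
    test: {
      server: {
        deps: {
          // Assumed pattern: bundle AWS SDK packages instead of externalizing them.
          inline: [/@aws-sdk\//],
        },
      },
    },
  });
});
```

Keeping vite-plugin-checker out of test mode can be handled in the Vite config itself by skipping the plugin when process.env.VITEST is set.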

Related Context

Risks / Reviewer Notes

  • external/getAttachmentUrl is an additive contract change. Existing object-mode callers still work, but responses now include status and target, and archive callers may receive PENDING before a ZIP is ready.
  • checkIdentifierUsage is still exposed through the EDGE API, but it is no longer anonymously callable.
  • Datasink ingress and data-exchange resources are intentionally being removed from the promoted shape in this branch. Anything depending on /dataSink or outbound data-exchange wiring should be treated as out of scope for this promotion.
  • This PR includes CI and tooling fixes because the branch had drifted far enough from main that current Actions could no longer execute Storybook and coverage successfully without compatibility updates.
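
Callers that adopt the archive modes need to handle the PENDING state noted above. The sketch below shows one way a client might decide what to do with each response. Only the status and target field names and the PENDING value come from this PR; the READY value, the target values, the url field, and the backoff numbers are assumptions.

```typescript
// Assumed response shape. READY, the target values, and url are hypothetical.
interface DownloadResponse {
  status: "READY" | "PENDING";
  target: "object" | "archive";
  url?: string;
}

type ClientAction =
  | { kind: "download"; url: string }
  | { kind: "retry"; afterMs: number }
  | { kind: "giveUp"; reason: string };

// Decide what the caller should do after the given attempt (1-based).
function decideNextAction(
  response: DownloadResponse,
  attempt: number,
  maxAttempts = 5,
  baseDelayMs = 1000,
): ClientAction {
  if (response.status === "READY") {
    return response.url
      ? { kind: "download", url: response.url }
      : { kind: "giveUp", reason: "READY response carried no url" };
  }
  if (attempt >= maxAttempts) {
    return { kind: "giveUp", reason: `still PENDING after ${maxAttempts} attempts` };
  }
  // Exponential backoff: 1s, 2s, 4s, ... with the default base delay.
  return { kind: "retry", afterMs: baseDelayMs * 2 ** (attempt - 1) };
}
```

Existing object-mode callers that ignore status and target should keep working, since the change is additive.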

Deployment / Operations

  • Higher-environment promotion will require the external auth secret to include the correct archive bucket and prefix allowlists.
  • Reviewers should pay particular attention to IAM changes, removed datasink resources, and API Gateway auth changes.
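
To make the allowlist requirement concrete, here is a minimal sketch of how a bucket and prefix allowlist check could work. The secret's real schema is not shown in this PR, so the allowedBuckets and allowedPrefixes field names and both helper functions are hypothetical.

```typescript
// Hypothetical secret shape; the actual field names are not given in the PR.
interface ArchiveAccessConfig {
  allowedBuckets: string[];
  allowedPrefixes: string[];
}

function isNonEmptyStringArray(value: unknown): value is string[] {
  return (
    Array.isArray(value) &&
    value.length > 0 &&
    value.every((item) => typeof item === "string" && item.length > 0)
  );
}

// Parse the secret JSON and fail closed when either allowlist is missing or empty.
function parseArchiveAccessConfig(secretString: string): ArchiveAccessConfig {
  const parsed: unknown = JSON.parse(secretString);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Secret must be a JSON object");
  }
  const { allowedBuckets, allowedPrefixes } = parsed as Record<string, unknown>;
  // Empty prefixes are rejected because "" would match every key.
  if (!isNonEmptyStringArray(allowedBuckets) || !isNonEmptyStringArray(allowedPrefixes)) {
    throw new Error("Secret must list at least one bucket and one prefix");
  }
  return { allowedBuckets, allowedPrefixes };
}

// An object is downloadable only if its bucket and key prefix are both allowlisted.
function isObjectAllowed(config: ArchiveAccessConfig, bucket: string, key: string): boolean {
  return (
    config.allowedBuckets.includes(bucket) &&
    config.allowedPrefixes.some((prefix) => key.startsWith(prefix))
  );
}
```

Failing closed on a missing or empty allowlist means a misconfigured higher environment denies downloads rather than exposing the whole bucket.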

- Add dataSink Lambda function to handle incoming events from SMART.
- Implement idempotency checks using DynamoDB to prevent duplicate processing.
- Publish validated events to Kafka for further processing.
- Introduce environment variables for configuration, including idempotency table name and Kafka topic.
- Create associated tests for dataSink functionality.
- Update package dependencies to include AWS SDK for DynamoDB operations.
- Add necessary infrastructure in CDK for DynamoDB table and Lambda permissions.
Collaborator

macpro-snyk-service-account commented Jan 22, 2026

Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|---|
| | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| | Licenses | 0 | 0 | 0 | 0 | 0 issues |
| | Code Security | 0 | 0 | 0 | 0 | 0 issues |

Contributor

github-actions bot commented Jan 22, 2026

Coverage Report

| Status | Category | Percentage | Covered / Total |
|---|---|---|---|
| 🔵 | Lines | 83.87% (🎯 90%) ⬆️ +0.11% | 7678 / 9154 |
| 🔵 | Statements | 83.74% (🎯 90%) ⬆️ +0.12% | 8250 / 9851 |
| 🔵 | Functions | 81.97% (🎯 85%) ⬆️ +0.38% | 2105 / 2568 |
| 🔵 | Branches | 72.59% (🎯 80%) ⬆️ +0.30% | 4416 / 6083 |

File Coverage

Changed Files

| File | Stmts | Branches | Functions | Lines | Uncovered Lines |
|---|---|---|---|---|---|
| lib/config/bundling-config.ts | 100% 🟰 ±0% | 100% 🟰 ±0% | 100% 🟰 ±0% | 100% 🟰 ±0% | |
| lib/config/deployment-config.ts | 96.66% 🟰 ±0% | 93.93% ⬆️ +0.18% | 100% 🟰 ±0% | 96.66% 🟰 ±0% | 109-111 |
| lib/lambda/attachmentArchive-lib.ts | 79.58% ⬆️ +0.55% | 68.18% ⬆️ +1.00% | 88.23% ⬆️ +0.36% | 80% ⬆️ +0.55% | 107, 136, 145, 150-157, 164-168, 260, 269, 279, 297, 335, 455-458, 466-469, 473-476, 483-486, 507, 603-607, 611-615, 666-673, 677-684, 733-754, 891, 895, 905-907, 920-922, 934-938, 949-958, 1016-1019, 1029-1063 |
| lib/lambda/checkIdentifierUsage.ts | 100% | 100% | 100% | 100% | |
| lib/lambda/externalAttachmentAuthorizer.ts | 77.77% | 70% | 100% | 77.77% | 12, 17, 51, 66 |
| lib/lambda/externalToken.ts | 88.23% | 72.72% | 100% | 88.23% | 30, 55, 63, 67 |
| lib/lambda/getAttachmentUrl.ts | 95.78% ⬆️ +0.69% | 78.57% ⬆️ +0.99% | 94.73% ⬆️ +3.43% | 95.78% ⬆️ +0.69% | 237-247, 260-265, 451 |
| lib/lambda/getExternalAttachmentUrl.ts | 94.35% | 89.92% | 100% | 94.35% | 94, 118, 126, 192, 212, 241-244, 259-262 |
| lib/lambda/presignedAttachmentUrl.ts | 74.46% | 31.42% | 80% | 74.46% | 34-43, 59, 83-87, 102, 130 |
| lib/lambda/external-auth/acl.ts | 0% | 0% | 0% | 0% | 8-9 |
| lib/lambda/external-auth/config.ts | 77.58% | 73.46% | 100% | 77.19% | 21, 29, 34, 43, 48, 55, 60-62, 83, 88, 94, 112, 116, 122 |
| lib/lambda/external-auth/constants.ts | 100% | 100% | 100% | 100% | |
| lib/lambda/external-auth/jwt.ts | 78.04% | 78.12% | 100% | 78.04% | 19, 58, 64, 75, 79, 88, 96, 107, 111 |
| lib/lambda/external-auth/service.ts | 95.83% | 93.75% | 100% | 95.83% | 34 |
| lib/libs/api/package/checkIdentifierUsage.ts | 100% | 100% | 100% | 100% | |

Generated in workflow #4730 for commit 7caaf52 by the Vitest Coverage Report Action

…cy and event storage

- Replace DynamoDB idempotency checks with OpenSearch document management.
- Update dataSink Lambda to handle event storage and status updates in OpenSearch.
- Remove Kafka publishing logic and related dependencies.
- Introduce new OpenSearch document schema and update associated tests.
- Adjust infrastructure to support OpenSearch integration, including IAM permissions and index setup.
- Delete obsolete files related to previous DynamoDB implementation.
…h integration

- Add support for processing outbound events from OneMac to MuleSoft.
- Introduce new data structures and functions for creating and storing outbound envelopes.
- Update OpenSearch document schema to include outbound event fields.
- Implement error handling and logging for outbound event processing.
- Create unit tests for new data exchange functionality.
- Adjust infrastructure to support outbound event configuration via environment variables.
Update setupIndex and deleteIndex tests to include the datasink index
in expected indices arrays. The datasink index was added as part of the
outbound DataExchange integration feature.
- Add Lambda function to check if an identifier is in use, returning its status and origin.
- Create associated tests for the new functionality.
- Update package.json to reflect changes in dependencies.
- Introduce new API endpoint for public access without authentication.
- Ensure CORS headers are included in responses for cross-origin requests.
- Update bun.lockb to reflect changes in package dependencies.
- Modify deploy command description to clarify AWS credentials requirements.
- Refactor dataSink and checkIdentifierUsage modules for improved structure and readability.
- Enhance data exchange functionality with better error handling and configuration management.
- Add tests for new data exchange features and ensure existing tests are updated accordingly.
# Conflicts:
#	.gitignore
#	bun.lockb
#	lib/lambda/getAttachmentUrl.ts
#	lib/stacks/api.ts
#	lib/stacks/data.ts
#	lib/stacks/parent.ts
jdinh8124 and others added 13 commits April 7, 2026 07:29
- extend external/getAttachmentUrl with package and section archive downloads
- share archive resolution and content-disposition handling across download paths
- protect checkIdentifierUsage with IAM and remove datasink/data-exchange resources

Refs: OY2-37464
- mark literal field-name strings that trip detect-secrets during push
- keep the allowlists scoped to the exact reported lines
- move the allowlist pragma to the exact clientSecret guard the hook reports
- leave the token error message unchanged
- rename the local token request secret field to clientCredential
- keep the contract-facing client_secret message allowlisted on the exact literal
- stop setup/delete index flows from managing a datasink OpenSearch index
- remove the datasink base index type and restore the deploy CLI credential description
3 participants