Skip to content

Implement PDDA/HTML upload to AWS S3 (#334)#345

Open
alexbottenberg wants to merge 22 commits intomasterfrom
feature/334-pdda-html-s3-upload
Open

Implement PDDA/HTML upload to AWS S3 (#334)#345
alexbottenberg wants to merge 22 commits intomasterfrom
feature/334-pdda-html-s3-upload

Conversation

@alexbottenberg
Copy link
Copy Markdown
Contributor

@alexbottenberg alexbottenberg commented Feb 11, 2026

Summary

Implements service-to-service API endpoint for PDDA to upload HTML/HTM files directly to AWS S3 (XHIBIT bucket).

Changes

New Module: libs/pdda-html-upload

  • API Endpoint: POST /api/v1/pdda-html
  • Authentication: OAuth 2.0 via existing authenticateApi() middleware
  • File Validation:
    • Accepts .htm and .html files (case-insensitive)
    • File size limit (10MB default, configurable via env var)
    • Path traversal protection
    • ArtefactType must be "LCSU"
  • AWS S3 Integration:
    • Upload with PutObject
    • Verification with HeadObject
    • S3 key format: pdda-html/YYYY/MM/DD/{uuid}.html
  • Correlation ID: Support for request tracing via X-Correlation-ID header

Database Changes

  • Added type column to artefact table (VARCHAR 50, NOT NULL)
  • Migration includes backfill of existing records with 'LIST'
  • Index added on type column for performance
  • Updated Prisma schema

Testing

  • 37 unit tests (all passing)
  • >95% code coverage
  • Tests cover all validation scenarios, S3 operations, and error handling
  • All existing tests still passing

Request/Response Format

Successful Upload (201)

{
  "success": true,
  "message": "Upload accepted and stored",
  "s3_key": "pdda-html/2026/02/11/{uuid}.html",
  "correlation_id": "{uuid}"
}

Validation Error (400)

{
  "success": false,
  "message": "The uploaded file must be an HTM or HTML file",
  "correlation_id": "{uuid}"
}

Deployment Requirements

Environment Variables Required

AWS_S3_XHIBIT_BUCKET_NAME=xhibit-publications
AWS_S3_XHIBIT_REGION=eu-west-2
AWS_S3_XHIBIT_PREFIX=pdda-html/
AWS_ACCESS_KEY_ID=<from-azure-keyvault>
AWS_SECRET_ACCESS_KEY=<from-azure-keyvault>
PDDA_HTML_MAX_FILE_SIZE=10485760  # 10MB (optional)

Deployment Steps

  1. Configure AWS credentials in Azure Key Vault
  2. Set environment variables in Azure App Service
  3. Run database migration: yarn db:migrate
  4. Verify S3 bucket exists and credentials have PutObject + HeadObject permissions
  5. Smoke test the endpoint in staging

Acceptance Criteria

  • ✅ Both HTM and HTML files are accepted
  • ✅ Files passed through to XHIBIT S3
  • ✅ AWS credentials configured via environment variables
  • ✅ S3 region specified
  • ✅ New endpoint created (POST /api/v1/pdda-html)
  • ✅ AWS S3 SDK integrated
  • ✅ New artefact type LCSU added
  • ✅ Functional tests added (37 unit tests)
  • ✅ S3 upload verification implemented

Code Quality

  • ✅ All linting rules satisfied
  • ✅ TypeScript strict mode (no any types)
  • ✅ Follows HMCTS monorepo conventions
  • ✅ Functional programming style
  • ✅ Comprehensive error handling
  • ✅ Security best practices (input validation, no sensitive data in logs)

Review

Full code review completed: docs/tickets/334/review.md

Overall Assessment: ✅ APPROVED

  • No critical issues
  • High quality implementation
  • Production ready

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • HTM/HTML upload endpoint added — client-facing upload with validation, S3 storage, and structured success/error responses (returns S3 key and correlation ID).
  • Database

    • Artefact storage extended with a mandatory type field and index to support categorisation and faster lookups.
  • Integration

    • Upload routes registered with the API.
  • Tests & Documentation

    • Comprehensive unit, integration and E2E tests plus planning, review and task documentation added.

github-actions bot and others added 2 commits February 11, 2026 11:33
Created technical specification and implementation plan for issue #334.

This plan covers:
- New API endpoint for PDDA HTML/HTM file uploads to XHIBIT S3
- Database migration to add artefact type column
- AWS S3 integration with @aws-sdk/client-s3
- File validation (extensions, size, path traversal protection)
- New artefact type LCSU for PDDA HTML uploads
- Comprehensive test strategy including S3 verification

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add service-to-service API endpoint for PDDA to upload HTML/HTM files
directly to AWS S3 (XHIBIT bucket).

Changes:
- Add new libs/pdda-html-upload module with S3 upload functionality
- Create POST /api/v1/pdda-html endpoint with OAuth authentication
- Add database migration for artefact.type column (LCSU artefact type)
- Implement comprehensive validation (file type, size, path traversal)
- Add 37 unit tests with >95% coverage
- Update publication queries to set default artefact type

Technical details:
- AWS SDK integration for S3 uploads with verification
- Correlation ID support for request tracing
- File size limit (10MB default, configurable)
- S3 key format: pdda-html/YYYY/MM/DD/{uuid}.html
- Secure logging (no sensitive data)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Feb 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Registers a new PDDA HTML upload API module and package, adds route registration, S3 client and upload service, validation, types and tests, introduces a Prisma migration adding non-nullable artefact.type (backfilled to 'LIST') and updates seeds/tests and TypeScript path aliases.

Changes

Cohort / File(s) Summary
API entry
apps/api/src/app.ts
Import and register pddaHtmlRoutes in route mounts.
Database schema & seeds
apps/postgres/prisma/migrations/20260211121457_add_artefact_type/migration.sql, apps/postgres/prisma/schema.prisma, apps/postgres/prisma/seed.ts
Add type VARCHAR(50) to artefact, backfill existing rows to 'LIST', make NOT NULL, add index, update Prisma schema and seeds.
Test seeds & e2e adjustments
e2e-tests/utils/seed-location-data.ts, e2e-tests/tests/*, apps/postgres/prisma/seed.ts
Add type: "LIST" to seeded artefacts in tests and fixtures.
New package & TS config
libs/pdda-html-upload/package.json, libs/pdda-html-upload/tsconfig.json, libs/pdda-html-upload/src/config.ts, libs/pdda-html-upload/src/index.ts, tsconfig.json
Scaffold new ES module package, export route path config, add tsconfig path aliases (@hmcts/pdda-html-upload*).
Routes & types
libs/pdda-html-upload/src/routes/v1/pdda-html.ts, libs/pdda-html-upload/src/routes/v1/pdda-html.test.ts, libs/pdda-html-upload/src/types.ts
Add POST route middleware array (authenticateApi, multer memory, handler), response types and comprehensive route tests (validation, success, S3 error, correlation ID).
Validation
libs/pdda-html-upload/src/validation/file-validation.ts, libs/pdda-html-upload/src/validation/file-validation.test.ts
Add validatePddaHtmlUpload enforcing artefact_type === 'LCSU', .htm/.html extension, size limits, and path-traversal checks with unit tests.
S3 client & upload service
libs/pdda-html-upload/src/s3/s3-client.ts, libs/pdda-html-upload/src/s3/s3-upload-service.ts, libs/pdda-html-upload/src/s3/s3-client.test.ts, libs/pdda-html-upload/src/s3/s3-upload-service.test.ts
Add createS3Client (env validation) and uploadHtmlToS3 (date/UUID S3 key, metadata, HeadObject verification), plus tests for config, key format, prefix fallback and error cases.
Publication repo changes
libs/publication/src/repository/queries.ts, libs/publication/src/repository/queries.test.ts
Set type: "LIST" on artefact creation and default noMatch to data.noMatch ?? false; update tests accordingly.
Docs & tickets
docs/tickets/334/*
Add ticket, plan, review and tasks docs describing PDDA HTML upload design, implementation, tests and deployment.
Misc
package.json
Bump resolutions.qs from 6.14.1 → 6.14.2.

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant API as API (Express)
  participant Auth as OAuth
  participant Validator as Validation
  participant S3 as AWS S3
  Note over API: libs/pdda-html-upload route handler
  Client->>API: POST /api/v1/pdda-html (file, artefact_type, correlation_id)
  API->>Auth: authenticateApi()
  Auth-->>API: 200 OK
  API->>Validator: validatePddaHtmlUpload(artefact_type, file)
  alt validation fails
    Validator-->>API: { valid: false, error }
    API-->>Client: 400 { success:false, message, correlation_id }
  else validation passes
    Validator-->>API: { valid: true }
    API->>S3: uploadHtmlToS3(fileBuffer, filename, correlation_id)
    alt upload succeeds
      S3-->>API: { success:true, s3Key, bucketName }
      API-->>Client: 201 { success:true, s3_key, correlation_id }
    else upload fails
      S3-->>API: error
      API-->>Client: 500 { success:false, message, correlation_id }
    end
  end
Loading

Possibly related issues

  • PDDA/HTML #334 — Implements the same PDDA/HTML upload feature, routes, S3 upload and artefact.type DB migration described in this PR.

Possibly related PRs

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarises the primary change: implementing PDDA/HTML upload functionality to AWS S3, which is clearly reflected throughout the changeset including the new module, S3 integration, database migration, and route registration.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into master

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/334-pdda-html-s3-upload

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Feb 11, 2026

🎭 Playwright E2E Test Results

238 tests   238 ✅  22m 14s ⏱️
 33 suites    0 💤
  1 files      0 ❌

Results for commit 50100db.

♻️ This comment has been updated with latest results.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (13)
libs/pdda-html-upload/src/index.ts (1)

1-1: Unnecessary comment in an otherwise empty file.

Per coding guidelines, comments should explain why, not what. An empty file is self-explanatory; this comment adds no value. Consider removing it or leaving the file truly empty.

As per coding guidelines, "Do not add comments unless meaningful - explain why, not what."

libs/pdda-html-upload/src/s3/s3-client.ts (1)

3-19: Consider using the AWS SDK default credential provider chain.

Hardcoding credentials from environment variables bypasses the SDK's built-in credential resolution (IAM roles, instance profiles, ECS task roles, etc.), which is the recommended approach for deployed environments. new S3Client({ region }) without explicit credentials will automatically resolve credentials from the environment, including AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY if set, plus IAM roles when available.

This would simplify the code and improve deployment flexibility.

♻️ Suggested simplification
 export function createS3Client(): S3Client {
   const region = process.env.AWS_S3_XHIBIT_REGION;
-  const accessKeyId = process.env.AWS_ACCESS_KEY_ID;
-  const secretAccessKey = process.env.AWS_SECRET_ACCESS_KEY;

-  if (!region || !accessKeyId || !secretAccessKey) {
+  if (!region) {
     throw new Error("Missing required AWS S3 configuration");
   }

   return new S3Client({
-    region,
-    credentials: {
-      accessKeyId,
-      secretAccessKey
-    }
+    region
   });
 }
docs/tickets/334/tasks.md (1)

119-129: Clarify the mapping between Key Vault secrets and environment variables.

Lines 120–125 document the env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY), while lines 127–128 reference Key Vault secrets with different names (xhibit-s3-access-key, xhibit-s3-access-key-secret). Ensure the deployment configuration clearly maps each secret to its corresponding environment variable to avoid misconfiguration.

docs/tickets/334/review.md (1)

1-422: AI-generated review document committed to the repository.

This is a documentation artifact rather than code. No functional concerns, but consider whether committing AI-generated review documents as permanent repo artefacts is the intended practice — they can become stale quickly and may confuse future readers about which issues were actually addressed.

libs/pdda-html-upload/src/validation/file-validation.test.ts (1)

133-163: Consider adding a test for embedded path traversal.

The validation uses includes("../") which catches traversal anywhere in the filename (e.g., foo/../evil.html), but there's no test for this case. A quick addition would strengthen coverage.

libs/pdda-html-upload/src/types.ts (2)

1-17: Boolean properties don't follow is/has/can prefix convention.

Per coding guidelines, boolean fields should use prefixes like isValid, isSuccess. These are API response types so the external contract may justify deviation, but worth noting for internal consistency.

As per coding guidelines: "Booleans should use is/has/can prefixes (e.g., isActive, hasAccess, canEdit)."


12-17: PddaHtmlUploadResponse uses snake_case property names.

s3_key and correlation_id use snake_case, which suits an external API contract but deviates from the TypeScript camelCase guideline. If the API contract is fixed, this is fine — just ensure consistency across all response types in the codebase.

As per coding guidelines: "TypeScript variables should use camelCase (e.g., userId, caseDetails, documentId)."

libs/pdda-html-upload/src/routes/v1/pdda-html.ts (2)

8-13: Max file size parsed in two places — risk of divergence.

The PDDA_HTML_MAX_FILE_SIZE env var is parsed identically here (multer limit) and in file-validation.ts line 35. If one is updated but not the other, multer could reject a file before validation runs, or vice versa. Consider extracting a shared constant or config helper.


15-16: x-correlation-id header type could be string[].

In Express, req.headers[key] can return string | string[] | undefined. The cast to string | undefined silently ignores the array case. Low risk but worth a safe coercion.

Proposed fix
-  const correlationId = req.headers["x-correlation-id"] as string | undefined;
+  const correlationIdHeader = req.headers["x-correlation-id"];
+  const correlationId = Array.isArray(correlationIdHeader) ? correlationIdHeader[0] : correlationIdHeader;
libs/pdda-html-upload/src/validation/file-validation.ts (1)

43-48: Path traversal check is basic — may miss encoded variants.

The check only looks for literal ../ and ..\ strings. Encoded variants like ..%2F or ..%5C could bypass this. The actual risk is low since the filename is only used for extension extraction and S3 metadata (the S3 key uses a UUID), but it's worth hardening if the original filename is ever used in a file-system context downstream.

Proposed hardening
-  if (file.originalname.includes("../") || file.originalname.includes("..\\")) {
+  const decodedName = decodeURIComponent(file.originalname);
+  if (decodedName.includes("../") || decodedName.includes("..\\") || decodedName.includes("..%")) {
     return {
       valid: false,
       error: "Invalid filename"
     };
   }
libs/pdda-html-upload/src/s3/s3-upload-service.ts (2)

8-8: New S3 client created per request.

createS3Client() is called on every upload, instantiating a new S3Client each time. The AWS SDK client is safe to reuse and handles connection pooling internally. Consider creating the client once at module level or caching it.


30-34: User-controlled originalFilename stored in S3 metadata without sanitisation.

file.originalname is placed directly into S3 object metadata. While S3 metadata doesn't execute, excessively long or specially crafted values could cause issues with downstream consumers reading the metadata. Consider truncating or sanitising.

libs/pdda-html-upload/src/s3/s3-upload-service.test.ts (1)

85-91: Missing test for HeadObject failure after successful PutObject.

The test at line 85 mocks send to reject on the first call (PutObject). Consider adding a test where PutObject succeeds but HeadObject fails (second send call rejects), to verify the error path and document the orphaned-file behaviour.

Suggested additional test
+ it("should throw when HeadObject verification fails after successful upload", async () => {
+   mockS3Send
+     .mockResolvedValueOnce({})       // PutObject succeeds
+     .mockRejectedValueOnce(new Error("HeadObject failed")); // HeadObject fails
+
+   const fileBuffer = Buffer.from("<html></html>");
+
+   await expect(uploadHtmlToS3(fileBuffer, "test.html")).rejects.toThrow(
+     "The file could not be uploaded to storage. Try again."
+   );
+ });

**I WANT** to implement the PDDA/HTML functionality in CaTH
**SO THAT** the Crown hearing lists can be published in CaTH

**TECHNIAL SPECIFICATION**
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Typo: "TECHNIAL" → "TECHNICAL".

• AWS S3 SDK should be set up in API to communicate with S3
• A new value LCSU is added to the ArtefactType which is to be used by PDDA when sending html to PIP APIM
• Functional test is added to the API to upload the html file
• Functional test is added to upload the html/htm file to file to S3 bucket and check the uploaded file exists in the S3 bucket
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Duplicated phrase: "file to file to".

Should likely read "file to S3 bucket".

🧰 Tools
🪛 LanguageTool

[grammar] ~34-~34: This phrase is duplicated. You should probably use “file to” only once.
Context: ...al test is added to upload the html/htm file to file to S3 bucket and check the uploaded file e...

(PHRASE_REPETITION)

________________________________________
**Accessibility**
• Ensure error messages returned by the upload flow are concise, consistent, and machine-readable for any consuming UI (where applicable).
• Ensure logs/audit events include correlation identifiers to support support/operations without exposing sensitive credentials or file contents.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Duplicated word: "support support/operations".

Should be "support operations" or "support/operations".

🧰 Tools
🪛 LanguageTool

[duplication] ~132-~132: Possible typo: you repeated a word.
Context: ...ents include correlation identifiers to support support/operations without exposing sensitive c...

(ENGLISH_WORD_REPEAT_RULE)

Comment on lines +54 to +66
} catch (error) {
console.error("PDDA HTML upload failed", {
error: error instanceof Error ? error.message : "Unknown error",
correlationId
});

const response: PddaHtmlUploadResponse = {
success: false,
message: error instanceof Error ? error.message : "Internal server error",
correlation_id: correlationId
};
return res.status(500).json(response);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Internal error details may leak to the client.

If createS3Client() throws (e.g. "Missing required AWS S3 configuration"), that message is returned verbatim in the 500 response at line 62. This exposes internal infrastructure details. Consider returning a generic message for unexpected errors and reserving the error.message passthrough only for known, user-safe error types.

Proposed fix
-      message: error instanceof Error ? error.message : "Internal server error",
+      message: "Internal server error",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
} catch (error) {
console.error("PDDA HTML upload failed", {
error: error instanceof Error ? error.message : "Unknown error",
correlationId
});
const response: PddaHtmlUploadResponse = {
success: false,
message: error instanceof Error ? error.message : "Internal server error",
correlation_id: correlationId
};
return res.status(500).json(response);
}
} catch (error) {
console.error("PDDA HTML upload failed", {
error: error instanceof Error ? error.message : "Unknown error",
correlationId
});
const response: PddaHtmlUploadResponse = {
success: false,
message: "Internal server error",
correlation_id: correlationId
};
return res.status(500).json(response);
}

Comment on lines +39 to +43
const headCommand = new HeadObjectCommand({
Bucket: bucketName,
Key: s3Key
});
await s3Client.send(headCommand);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

HeadObject failure after successful PutObject leaves an orphaned file in S3.

If PutObjectCommand succeeds but HeadObjectCommand fails (transient network issue, eventual consistency), the catch block at line 50 throws a generic error to the caller. The file exists in S3 but the client receives a 500 failure, with no cleanup or retry logic. This could lead to orphaned objects and potential duplicate uploads on retry.

Consider either:

  1. Separating the HeadObject call into its own try/catch so PutObject success is preserved, or
  2. Removing the HeadObject verification (PutObject success is sufficient).

alexbottenberg and others added 11 commits February 13, 2026 09:59
Update seed scripts to include the new required 'type' field for artefacts.

Changes:
- apps/postgres/prisma/seed.ts: Add type='LIST' to test artefacts
- e2e-tests/utils/seed-location-data.ts: Add type='LIST' to E2E test artefacts

Required for database schema changes in #334.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolved yarn.lock conflicts by regenerating with yarn install.

Changes from master:
- Updated E2E test workflows
- Helm values updates
- SSO config improvements
- GitHub secrets documentation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add required 'type' field to blob-explorer E2E test artefact creation.

Required for schema changes in #334.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Updated the following packages to address security vulnerabilities:

- multer: 1.4.5-lts.1 → 2.0.2 (CVSS 8.7 critical)
  Fixes arbitrary file exposure vulnerability

- @aws-sdk/client-s3: 3.712.0 → 3.989.0
  Includes update to @smithy/config-resolver (CVSS 3.7)
  Addresses incorrect resolution of endpoints during SDK credential retrieval

- @types/multer: 1.4.12 → 2.0.0
  Updated type definitions to match multer 2.0.2

- qs: 6.14.1 → 6.14.2 (CVSS 3.7)
  Fixes prototype pollution via array brackets

All unit tests passing (39/41 tasks successful).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add type: "LIST" to artefact creation in delete-court test
to match the new database schema requirement.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add type: "LIST" to artefact creation in flat-file-viewing test
to match the new database schema requirement.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add @db.VarChar(50) to the type field in the Artefact model to match
the migration SQL which creates VARCHAR(50). Without this annotation,
Prisma maps String to text in PostgreSQL, which would cause schema
drift detection in future migrations.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Changed error handling to return generic "Internal server error" message
instead of exposing raw error.message to the client. This prevents
leaking internal infrastructure details such as:
- AWS configuration errors
- S3 bucket names or credentials
- Internal service error messages

Error details are still logged server-side for debugging at lines 55-58.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Removed HeadObjectCommand verification after PutObjectCommand success.
If PutObject succeeds but HeadObject fails (transient network issue,
eventual consistency), the previous implementation would throw an error
to the caller even though the file was successfully uploaded to S3.

This created two problems:
1. Orphaned files in S3 that the client doesn't know about
2. Client retries would create duplicate uploads

PutObject success is sufficient proof that the file was uploaded.
The HeadObject verification added no real value and created this
problematic failure scenario.

Changes:
- Removed HeadObjectCommand import and usage
- Updated tests to expect 1 S3 call instead of 2

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolved conflicts:
- package.json: Updated dependency versions (qs 6.15.0, tar 7.5.9)
- yarn.lock: Regenerated after dependency resolution

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Changes:
- Use original filename as S3 key instead of UUID-based path structure
- Enhance S3 upload error logging to include AWS response details (status code, headers, metadata)
- Add @aws-sdk/node-http-handler dependency for future SSL certificate handling
- Fix S3 region from eu-west-2 to eu-west-1 (correct bucket region)
- Remove custom endpoint configuration that was causing CloudFront 403 errors

The S3 key now follows the format: pdda-html/{original-filename}.html
instead of: pdda-html/YYYY/MM/DD/{uuid}.html

This allows PDDA to identify uploaded files by their original names and
provides better error diagnostics for troubleshooting S3 upload issues.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@alexbottenberg alexbottenberg linked an issue Feb 20, 2026 that may be closed by this pull request
alexbottenberg and others added 2 commits February 20, 2026 14:19
Resolved conflicts:
- yarn.lock: Regenerated after merging latest dependencies from master
- Multiple package.json files: Updated dependency versions

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Updated tests to reflect the new behavior where S3 keys use the original
filename (pdda-html/filename.html) instead of the previous date-based
UUID format (pdda-html/YYYY/MM/DD/uuid.html).

Changes:
- Update "should upload file to S3 successfully" to expect exact filename match
- Rename and update "should generate S3 key with date-based path" test to verify original filename is preserved
- Make file extension regex checks case-insensitive (match both .html and .HTML)

All 7 tests now pass.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
}
});

const postHandler = async (req: Request, res: Response) => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a separate endpoint, could this be hooked into the existing endpoint with a separate path if LCSU type to match CaTH

alexbottenberg and others added 5 commits February 23, 2026 17:02
Consolidate HTML file upload functionality into the existing /api/v1/publication
endpoint to support both JSON blob ingestion and multipart file uploads.

Changes:
- Add content-type detection to route requests appropriately
- Apply multer middleware conditionally for multipart requests
- Extract PDDA HTML validation and S3 upload as reusable functions
- Remove separate /api/v1/pdda-html endpoint and route files
- Add comprehensive tests for both upload types (11 tests passing)

Benefits:
- Single endpoint for all publication uploads
- Consistent authentication across upload types
- Improved code reusability and maintainability

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolved conflicts:
- tsconfig.json: Added both pdda-html-upload and pdf-generation paths
- yarn.lock: Regenerated after merge

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Organize imports in correct order (external, then relative)
- Remove unused consoleInfoSpy variable from tests

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@sonarqubecloud
Copy link
Copy Markdown

async function handleHtmlFileUpload(req: Request, res: Response) {
const correlationId = req.headers["x-correlation-id"] as string | undefined;

const { artefact_type } = req.body;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be 'type' instead of 'artefact_type', so that it is is consistent with the DB column

alexbottenberg and others added 2 commits February 25, 2026 12:41
…ml-s3-upload

# Conflicts:
#	apps/postgres/prisma/schema.prisma
#	tsconfig.json
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PDDA/HTML

2 participants