Skip to content

Expose Content-Length for CORS headers to send through hub#400

Merged
gschoeni merged 2 commits intomainfrom
gs/stream-content-length
Mar 28, 2026
Merged

Expose Content-Length for CORS headers to send through hub#400
gschoeni merged 2 commits intomainfrom
gs/stream-content-length

Conversation

@gschoeni
Copy link
Copy Markdown
Collaborator

No description provided.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 28, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d66f0a4b-9b15-496d-bcc9-3b74f9c83a45

📥 Commits

Reviewing files that changed from the base of the PR and between 6cd568b and 7c3df69.

📒 Files selected for processing (1)
  • crates/lib/src/storage/version_store.rs

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Downloads for files, versioned artifacts, workspace files, and resized images now include correct Content-Length headers, improving client compatibility and reliable streaming.
    • Content-Length is exposed via CORS so browsers and HTTP clients can read it.
  • Bug Fixes

    • Derived/cached assets now report accurate byte sizes for served responses.
  • Tests

    • Added tests asserting Content-Length presence and exposure.

Walkthrough

Adds a get_version_derived_size method to the VersionStore trait with Local and S3 implementations; updates image-resize utilities to return (stream, content_length); adds response helpers to expose Content-Length; and updates controllers/tests to include and expose Content-Length for streamed downloads.

Changes

Cohort / File(s) Summary
VersionStore Trait & Implementations
crates/lib/src/storage/version_store.rs, crates/lib/src/storage/local.rs, crates/lib/src/storage/s3.rs
Added get_version_derived_size(orig_hash, derived_filename) to the trait and implemented it for Local (fs::metadata) and S3 (head_object → content_length) with error mappings.
Image Resize & Streaming Utilities
crates/lib/src/util/fs.rs
Changed handle_image_resize and resize_cache_image_version_store signatures to return (stream, content_length); compute size from get_version_derived_size on cache hit or from buffer length on miss.
HTTP Response Helpers
crates/server/src/helpers.rs
Added expose_response_header, expose_content_length, and file_stream_response to apply Content-Length and CORS exposure; included unit tests.
File/Version/Workspace Controllers & Tests
crates/server/src/controllers/file.rs, crates/server/src/controllers/versions.rs, crates/server/src/controllers/workspaces/files.rs
Replaced manual HttpResponse construction with file_stream_response(...), pass Some(content_length) when available, updated image-resize handling to use returned size, and added integration tests asserting Content-Length and Access-Control-Expose-Headers.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client as Client
participant Server as Server (controllers/helpers)
participant Util as util/fs (resize)
participant VS as VersionStore (trait)
participant Storage as Storage (Local/S3)

Client->>Server: GET /file or /version?resize...
Server->>Util: handle_image_resize or stream request
Util->>VS: get_version_derived_size(orig_hash, filename) (on cache hit)
VS->>Storage: head_object / fs::metadata
Storage-->>VS: content_length
VS-->>Util: content_length
Util-->>Server: (stream, content_length)
Server-->>Server: file_stream_response(..., Some(content_length))
Server-->>Client: HTTP 200 with Content-Length and exposed header, streaming body

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • jcelliott
  • rpschoenburg
  • malcolmgreaves

Poem

🐇 I nibble bytes and count each one,

Streams sing true beneath the sun,
Content-Length now proudly shown,
From S3 burrow to local home,
A rabbit cheers: "Headers known!" 🎉

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 65.38% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive No description was provided by the author, making it impossible to assess relatedness to the changeset. Add a description explaining the motivation and impact of exposing Content-Length headers for CORS compatibility.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly describes the main change: exposing the Content-Length header for CORS communication.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch gs/stream-content-length

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
crates/server/src/controllers/versions.rs (1)

650-655: Assert presence, not exact equality, for Access-Control-Expose-Headers.

crates/server/src/helpers.rs is append-based, so these exact-match assertions will start failing as soon as another header is exposed. The same matcher is duplicated in the sibling controller tests; checking that Content-Length is present will be more stable.

🧪 Suggested assertion pattern
-        assert_eq!(
-            resp.headers()
-                .get(header::ACCESS_CONTROL_EXPOSE_HEADERS)
-                .unwrap(),
-            header::CONTENT_LENGTH.as_str()
-        );
+        let exposes_content_length = resp
+            .headers()
+            .get_all(header::ACCESS_CONTROL_EXPOSE_HEADERS)
+            .into_iter()
+            .any(|value| {
+                value
+                    .to_str()
+                    .unwrap()
+                    .split(',')
+                    .any(|name| name.trim().eq_ignore_ascii_case(header::CONTENT_LENGTH.as_str()))
+            });
+        assert!(exposes_content_length);

Also applies to: 706-711

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/server/src/controllers/versions.rs` around lines 650 - 655, The test
currently asserts exact equality of
resp.headers().get(header::ACCESS_CONTROL_EXPOSE_HEADERS).unwrap() to
header::CONTENT_LENGTH.as_str(), but helpers append exposed headers so the
assertion should instead check that the Access-Control-Expose-Headers header
contains Content-Length; update the assertion in the versions controller test
(and the duplicate in the sibling controller tests) to convert the header value
to a string (via to_str()/as_str()) and assert it contains
header::CONTENT_LENGTH.as_str() (or split on commas and assert any segment
equals Content-Length) rather than asserting exact equality.
crates/lib/src/util/fs.rs (1)

1595-1606: Collapse the cache-hit metadata lookups into one storage round-trip.

On S3 this branch is now HEAD (derived_version_exists) + HEAD (get_version_derived_size) + GET (get_version_derived_stream) for every cached resize. That adds latency/cost on the hot path and widens the TOCTOU window between the advertised length and the body fetch. Consider replacing the boolean existence check with a metadata lookup that returns Option<u64>, or having the store return (stream, content_length) together.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/lib/src/util/fs.rs` around lines 1595 - 1606, Replace the three-step
existence/size/stream round-trip (version_store.derived_version_exists,
get_version_derived_size, get_version_derived_stream) with a single
metadata/combined retrieval API on the store (e.g., add or use a method like
get_version_derived_metadata(img_hash, derived_filename) -> Option<u64> or
get_version_derived(img_hash, derived_filename) -> Option<(Stream, u64)>), then
change the resize-cache branch in the code that currently calls
derived_version_exists, get_version_derived_size and get_version_derived_stream
to call that single method and early-return when it yields
Some(metadata_or_pair); keep error handling the same for store errors and handle
the None case as a cache miss. This reduces S3 HEAD/GET calls and eliminates the
TOCTOU window between advertised length and body fetch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/lib/src/storage/version_store.rs`:
- Around line 205-214: The doc comment for get_version_derived_size incorrectly
describes a stream-returning API; update the documentation on the
get_version_derived_size method to state it returns the size in bytes (u64) of
the derived file (e.g., resized image, video thumbnail) for the given orig_hash
and derived_filename, mention any error cases (OxenError) and the semantics
(e.g., size on disk or content length), and remove references to streaming so
implementors and rustdoc reflect the actual return type.

---

Nitpick comments:
In `@crates/lib/src/util/fs.rs`:
- Around line 1595-1606: Replace the three-step existence/size/stream round-trip
(version_store.derived_version_exists, get_version_derived_size,
get_version_derived_stream) with a single metadata/combined retrieval API on the
store (e.g., add or use a method like get_version_derived_metadata(img_hash,
derived_filename) -> Option<u64> or get_version_derived(img_hash,
derived_filename) -> Option<(Stream, u64)>), then change the resize-cache branch
in the code that currently calls derived_version_exists,
get_version_derived_size and get_version_derived_stream to call that single
method and early-return when it yields Some(metadata_or_pair); keep error
handling the same for store errors and handle the None case as a cache miss.
This reduces S3 HEAD/GET calls and eliminates the TOCTOU window between
advertised length and body fetch.

In `@crates/server/src/controllers/versions.rs`:
- Around line 650-655: The test currently asserts exact equality of
resp.headers().get(header::ACCESS_CONTROL_EXPOSE_HEADERS).unwrap() to
header::CONTENT_LENGTH.as_str(), but helpers append exposed headers so the
assertion should instead check that the Access-Control-Expose-Headers header
contains Content-Length; update the assertion in the versions controller test
(and the duplicate in the sibling controller tests) to convert the header value
to a string (via to_str()/as_str()) and assert it contains
header::CONTENT_LENGTH.as_str() (or split on commas and assert any segment
equals Content-Length) rather than asserting exact equality.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e361c75b-79b1-4e7e-9b21-af1f740547db

📥 Commits

Reviewing files that changed from the base of the PR and between 37cedf0 and 6cd568b.

📒 Files selected for processing (8)
  • crates/lib/src/storage/local.rs
  • crates/lib/src/storage/s3.rs
  • crates/lib/src/storage/version_store.rs
  • crates/lib/src/util/fs.rs
  • crates/server/src/controllers/file.rs
  • crates/server/src/controllers/versions.rs
  • crates/server/src/controllers/workspaces/files.rs
  • crates/server/src/helpers.rs

@gschoeni gschoeni merged commit 06dee90 into main Mar 28, 2026
4 of 5 checks passed
@gschoeni gschoeni deleted the gs/stream-content-length branch March 28, 2026 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant