(im) support im oapi range download large file#283
(im) support im oapi range download large file#283chenxingtong-bytedance wants to merge 1 commit intolarksuite:mainfrom
Conversation
|
|
📝 WalkthroughWalkthroughRefactors IM resource download to use an initial ranged probe and subsequent chunked HTTP Range downloads with per-request retries/backoff, temp-file write+rename finalization, Content-Range parsing/validation, plus a header-merge change in shortcut streaming requests and expanded tests/docs. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Runner as Shortcut Runner
participant Remote as Remote Server
participant FS as File System
Client->>Runner: Request resource download
Runner->>Remote: Probe GET with Range: bytes=0-(probe-1)
Note over Runner,Remote: Probe may be retried with backoff
Remote-->>Runner: 200 OK or 206 Partial Content + Content-Range
alt 200 OK
Runner->>FS: Stream body -> temp file
else 206 Partial Content
Runner->>Runner: Parse Content-Range -> total size
Runner->>FS: Create temp file, write probe bytes at offset 0
loop For each remaining chunk
Runner->>Remote: GET chunk with Range header
Note over Runner,Remote: Per-chunk retries with backoff
Remote-->>Runner: 206 Partial Content
Runner->>FS: Write chunk at offset
end
Runner->>Runner: Validate temp file size == total
end
Runner->>FS: Sync, Close, Rename temp -> final path
Runner-->>Client: Return success or error
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Greptile SummaryAdds HTTP Range-request based chunked downloading for large IM resources: a 128KB probe determines total file size via Confidence Score: 4/5Functionally sound; two prior P1 threads (probe byte-count not asserted, no-op semaphore) remain unresolved. The chunked download implementation is well-structured with proper temp-file cleanup, integrity checks, and solid test coverage. The runner.go header-merge fix is correct and the new Range-header test confirms the regression is caught. Score is 4 rather than 5 because the prior-thread P1 finding — probe chunk written-byte count is silently discarded, leaving a potential sparse-file gap when the server delivers fewer bytes than requested — has not been addressed in this revision. shortcuts/im/im_messages_resources_download.go — probe chunk write validation (line 194) and no-op semaphore (lines 199-206) flagged in prior review threads remain open. Important Files Changed
Sequence DiagramsequenceDiagram
participant CLI as CLI (downloadIMResourceToPath)
participant API as Lark IM API
participant FS as Filesystem
CLI->>API: GET /resources/:file_key\n(Range: bytes=0-131071)
note over CLI,API: doIMResourceDownloadRequest\nretries up to 2× with backoff
alt 200 OK (server ignores Range)
API-->>CLI: 200 + full body
CLI->>FS: io.Copy → tmpFile
else 206 Partial Content
API-->>CLI: 206 + Content-Range: bytes 0-N/totalSize
CLI->>FS: WriteAt(offset=0) probe chunk
loop each 8MB chunk [probeChunkSize..totalSize)
CLI->>API: GET (Range: bytes=start-end)
note over CLI,API: retries up to 2× with backoff
API-->>CLI: 206 + chunk body
CLI->>FS: WriteAt(start) chunk\nvalidate written == expected
end
CLI->>FS: Stat() → assert size == totalSize
end
CLI->>FS: Sync + Close\nRename tmpFile → finalPath
Reviews (3): Last reviewed commit: "(im) support im oapi range download larg..." | Re-trigger Greptile |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/im/im_messages_resources_download.go`:
- Around line 272-284: The retry loop can start another DoAPIStream call after
sleepIMDownloadRetry observes cancellation; update the loops that call
runtime.DoAPIStream with imDownloadRequestRetries to check ctx.Err() immediately
after sleepIMDownloadRetry and return nil, ctx.Err() if canceled so no extra
attempt runs; specifically modify the loops that use runtime.DoAPIStream,
imDownloadRequestRetries and sleepIMDownloadRetry (also the similar loop at the
later block around lines 289–297) to return on ctx.Err() before the next
iteration.
- Around line 189-215: The 206 handling currently only calls parseTotalSize and
doesn't verify the Content-Range start/end or check the number of bytes actually
written; update the logic that handles http.StatusPartialContent (and the
similar blocks referenced by downloadAndWriteChunk and the other 206 handlers)
to parse start, end, and total from downloadResp.Header.Get("Content-Range")
(not just total), verify the returned start/end exactly match the requested task
range, and ensure the number of bytes written by writeChunkAt equals
(end-start+1) before accepting the chunk; if mismatched, return a network/error
immediately. Modify or overload writeChunkAt (or the call sites) so it can
validate and return the written length for this check, and keep using
buildChunkTasks, sem, and downloadAndWriteChunk to locate locations to change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 74c0c338-e7d6-4dc7-85c6-bcb98affb66a
📒 Files selected for processing (4)
shortcuts/common/runner.goshortcuts/im/helpers_network_test.goshortcuts/im/helpers_test.goshortcuts/im/im_messages_resources_download.go
5faf274 to
ea10aa6
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (3)
shortcuts/im/im_messages_resources_download.go (3)
369-393:⚠️ Potential issue | 🟠 MajorChunk download validates written bytes but not the response range.
downloadAndWriteChunkcorrectly validates thatwritten == expected(line 390-392), which catches short reads. However, similar to the probe, it doesn't validate that theContent-Rangeheader in the 206 response matches the requestedstart-end. If the server returns a different range, the data would be written at the wrong offset.🛠️ Suggested validation
if downloadResp.StatusCode != http.StatusPartialContent { return output.ErrNetwork("unexpected status code: %d", downloadResp.StatusCode) } + +// Validate Content-Range matches requested range +respStart, respEnd, _, err := parseContentRange(downloadResp.Header.Get("Content-Range")) +if err != nil { + return output.ErrNetwork("invalid Content-Range: %s", err) +} +if respStart != start || respEnd != end { + return output.ErrNetwork("response range %d-%d does not match requested %d-%d", respStart, respEnd, start, end) +} written, err := writeChunkAt(file, downloadResp.Body, start)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/im/im_messages_resources_download.go` around lines 369 - 393, In downloadAndWriteChunk, validate the HTTP 206 response's Content-Range header before writing: parse downloadResp.Header.Get("Content-Range") (in function downloadAndWriteChunk) to ensure the returned range start and end match the requested start and end and that the reported length equals end-start+1; if parsing fails or the range doesn't match, return an error (similar to downloadResponseError/output.ErrNetwork) and avoid calling writeChunkAt. This prevents writing data at the wrong offset when the server returns a different range.
272-284:⚠️ Potential issue | 🟡 MinorReturn immediately once backoff observes cancellation.
After
sleepIMDownloadRetry()returns due toctx.Done(), the loop still proceeds to the next iteration and callsDoAPIStreamagain. Add a context check after the sleep to avoid the extra attempt.🔧 Proposed fix
sleepIMDownloadRetry(ctx, attempt) + if ctx.Err() != nil { + return nil, ctx.Err() + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/im/im_messages_resources_download.go` around lines 272 - 284, The loop in the download retry logic (using imDownloadRequestRetries, runtime.DoAPIStream and sleepIMDownloadRetry) can perform an extra request after sleep if the context was canceled; add a context cancellation check immediately after sleepIMDownloadRetry(ctx, attempt) and return nil, ctx.Err() (or propagate ctx.Err()) when ctx.Err() != nil to avoid calling runtime.DoAPIStream again with a canceled context.
189-216:⚠️ Potential issue | 🟠 MajorValidate the full
Content-Rangebefore accepting a partial response.The probe chunk handling at line 194 writes the response body to offset 0 without validating that the server actually returned bytes
0-N. If the server returns a misaligned range (e.g.,bytes 1000-2000/total), the data will be written at offset 0, corrupting the file. TheparseTotalSize()function only extracts the total, not the start/end values.Consider parsing and validating the full
Content-Rangeheader to ensure the returned range matches the requested range before writing.🛠️ Suggested approach
Create a
parseContentRangefunction that returnsstart,end, andtotal, then validate:start, end, total, err := parseContentRange(downloadResp.Header.Get("Content-Range")) if err != nil { return "", 0, output.ErrNetwork("invalid Content-Range header: %s", err) } if start != 0 || end > probeChunkSize-1 { return "", 0, output.ErrNetwork("unexpected range in response: %d-%d", start, end) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/im/im_messages_resources_download.go` around lines 189 - 216, The probe chunk handling currently calls parseTotalSize and writes downloadResp.Body at offset 0 via writeChunkAt without validating the returned byte range; change this to parse the full Content-Range (create/use a parseContentRange that returns start, end, total) and validate that start == 0 and end is within the expected probe chunk size before calling writeChunkAt on tmpFile; if validation fails, return an output.ErrNetwork with a clear "unexpected range" message, otherwise use the parsed total for sizeBytes and proceed to buildChunkTasks/downloadAndWriteChunk as before.
🧹 Nitpick comments (1)
shortcuts/im/helpers_network_test.go (1)
321-354: Retry tests incur real wall-clock delays (~900ms total).The retry logic uses
sleepIMDownloadRetrywhich is not mocked, causing these tests to wait through actual exponential backoff delays (300ms + 600ms for 2 failed attempts). While this tests the real behavior, it slows down the test suite.Consider extracting the sleep function as a package-level variable that can be overridden in tests, or accept the delay as acceptable for integration-style testing.
💡 Optional: Make sleep mockable
In the implementation file:
var sleepFunc = sleepIMDownloadRetry func sleepIMDownloadRetry(ctx context.Context, attempt int) { // ... existing implementation }In tests:
func init() { sleepFunc = func(ctx context.Context, attempt int) {} // no-op for tests }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/im/helpers_network_test.go` around lines 321 - 354, The tests are incurring real backoff delays because downloadIMResourceToPath calls sleepIMDownloadRetry directly; refactor by introducing a package-level variable (e.g., sleepFunc) initialized to sleepIMDownloadRetry and change downloadIMResourceToPath (and any callers) to call sleepFunc(ctx, attempt) instead of sleepIMDownloadRetry; in tests override sleepFunc to a no-op (and optionally restore it) so retries run instantly. Ensure you reference and update the symbols sleepIMDownloadRetry, sleepFunc, and downloadIMResourceToPath so the test can set sleepFunc = func(ctx context.Context, attempt int) {} to avoid real sleeps.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@shortcuts/im/im_messages_resources_download.go`:
- Around line 369-393: In downloadAndWriteChunk, validate the HTTP 206
response's Content-Range header before writing: parse
downloadResp.Header.Get("Content-Range") (in function downloadAndWriteChunk) to
ensure the returned range start and end match the requested start and end and
that the reported length equals end-start+1; if parsing fails or the range
doesn't match, return an error (similar to
downloadResponseError/output.ErrNetwork) and avoid calling writeChunkAt. This
prevents writing data at the wrong offset when the server returns a different
range.
- Around line 272-284: The loop in the download retry logic (using
imDownloadRequestRetries, runtime.DoAPIStream and sleepIMDownloadRetry) can
perform an extra request after sleep if the context was canceled; add a context
cancellation check immediately after sleepIMDownloadRetry(ctx, attempt) and
return nil, ctx.Err() (or propagate ctx.Err()) when ctx.Err() != nil to avoid
calling runtime.DoAPIStream again with a canceled context.
- Around line 189-216: The probe chunk handling currently calls parseTotalSize
and writes downloadResp.Body at offset 0 via writeChunkAt without validating the
returned byte range; change this to parse the full Content-Range (create/use a
parseContentRange that returns start, end, total) and validate that start == 0
and end is within the expected probe chunk size before calling writeChunkAt on
tmpFile; if validation fails, return an output.ErrNetwork with a clear
"unexpected range" message, otherwise use the parsed total for sizeBytes and
proceed to buildChunkTasks/downloadAndWriteChunk as before.
---
Nitpick comments:
In `@shortcuts/im/helpers_network_test.go`:
- Around line 321-354: The tests are incurring real backoff delays because
downloadIMResourceToPath calls sleepIMDownloadRetry directly; refactor by
introducing a package-level variable (e.g., sleepFunc) initialized to
sleepIMDownloadRetry and change downloadIMResourceToPath (and any callers) to
call sleepFunc(ctx, attempt) instead of sleepIMDownloadRetry; in tests override
sleepFunc to a no-op (and optionally restore it) so retries run instantly.
Ensure you reference and update the symbols sleepIMDownloadRetry, sleepFunc, and
downloadIMResourceToPath so the test can set sleepFunc = func(ctx
context.Context, attempt int) {} to avoid real sleeps.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 04629503-8d6f-410e-8ee4-e540a022fef7
📒 Files selected for processing (6)
shortcuts/common/runner.goshortcuts/im/helpers_network_test.goshortcuts/im/helpers_test.goshortcuts/im/im_messages_resources_download.goskills/lark-im/SKILL.mdskills/lark-im/references/lark-im-messages-resources-download.md
✅ Files skipped from review due to trivial changes (3)
- shortcuts/common/runner.go
- skills/lark-im/references/lark-im-messages-resources-download.md
- shortcuts/im/helpers_test.go
Change-Id: I38e6f6f9cf8b8711dc40650d19c77503f4e44989
ea10aa6 to
baa8261
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
shortcuts/im/helpers_test.go (1)
492-527: Well-structured test with good coverage.The table-driven test comprehensively covers the
parseTotalSizefunction, including success cases and all major error paths. The substring matching for error assertions is appropriate for maintainability.One minor gap: the implementation checks for
totalSize < 0(line 355-356 inim_messages_resources_download.go), but there's no test case for a negative total like"bytes 0-15/-100". Consider adding this edge case for completeness.🧪 Optional: Add test case for negative total
{name: "unknown total size", contentRange: "bytes 0-99/*", wantErr: `unknown total size in content-range: "bytes 0-99/*"`}, {name: "invalid total", contentRange: "bytes 0-15/not-a-number", wantErr: "parse total size:"}, + {name: "negative total", contentRange: "bytes 0-15/-100", wantErr: "invalid total size:"}, }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/im/helpers_test.go` around lines 492 - 527, Add a table-driven test case to TestParseTotalSize that covers the negative-total branch in parseTotalSize: insert a test entry like {name: "negative total", contentRange: "bytes 0-15/-100", wantErr: "total size"} so the test calls parseTotalSize and asserts the returned error contains the "total size" substring (matching the implementation's totalSize < 0 check in im_messages_resources_download.go); this ensures the negative total-size path in parseTotalSize is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@shortcuts/im/helpers_test.go`:
- Around line 492-527: Add a table-driven test case to TestParseTotalSize that
covers the negative-total branch in parseTotalSize: insert a test entry like
{name: "negative total", contentRange: "bytes 0-15/-100", wantErr: "total size"}
so the test calls parseTotalSize and asserts the returned error contains the
"total size" substring (matching the implementation's totalSize < 0 check in
im_messages_resources_download.go); this ensures the negative total-size path in
parseTotalSize is exercised.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 028b515c-51cf-41b2-a7ef-7fb09176f88f
📒 Files selected for processing (6)
shortcuts/common/runner.goshortcuts/im/helpers_network_test.goshortcuts/im/helpers_test.goshortcuts/im/im_messages_resources_download.goskills/lark-im/SKILL.mdskills/lark-im/references/lark-im-messages-resources-download.md
✅ Files skipped from review due to trivial changes (1)
- skills/lark-im/references/lark-im-messages-resources-download.md
🚧 Files skipped from review as they are similar to previous changes (3)
- skills/lark-im/SKILL.md
- shortcuts/im/im_messages_resources_download.go
- shortcuts/common/runner.go
Change-Id: I38e6f6f9cf8b8711dc40650d19c77503f4e44989
Summary
Changes
Test Plan
lark xxxcommand works as expectedRelated Issues
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation