fix: handle corrupted/partial video files (closes #12)#21
fix: handle corrupted/partial video files (closes #12)#21
Conversation
Add pre-encoding file integrity validation that detects corruption patterns (NAL errors, partial files, missing moov atoms, truncation). Fatal corruption fails fast with clear messages. Recoverable corruption triggers error-tolerant ffmpeg flags (-err_detect ignore_err, -fflags +discardcorrupt+genpts) that salvage valid portions. Also adds corruption-specific error messages in the worker and post-encoding output validation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe changes implement comprehensive file corruption detection and error-tolerant encoding for video processing. A pre-validation step after download detects corruption using FFmpeg error detection, logs findings, and applies recovery flags for non-fatal issues. Post-encoding validation verifies output integrity with segment checks. Changes
Sequence DiagramsequenceDiagram
participant Client
participant VideoProcessor
participant FFmpeg
participant FileSystem
participant VideoEncodingWorker
Client->>VideoProcessor: processVideo(jobData)
VideoProcessor->>FileSystem: Download source video
FileSystem-->>VideoProcessor: Source file ready
VideoProcessor->>VideoProcessor: validateFileIntegrity()
activate VideoProcessor
VideoProcessor->>FFmpeg: Run error detection pass
FFmpeg->>FFmpeg: Parse for corruption patterns
FFmpeg-->>VideoProcessor: Corruption results
deactivate VideoProcessor
alt Fatal Corruption
VideoProcessor->>Client: Throw error with details
else Non-Fatal Corruption
VideoProcessor->>VideoProcessor: Merge into probe.issues
VideoProcessor->>VideoProcessor: determineEncodingStrategy()
VideoProcessor->>VideoProcessor: Add error-tolerant flags
end
VideoProcessor->>VideoEncodingWorker: Start encoding with strategy
VideoEncodingWorker->>FFmpeg: Execute encoding
FFmpeg->>FileSystem: Write segments (.ts files)
FFmpeg->>FileSystem: Write output playlist
FFmpeg-->>VideoEncodingWorker: Process complete
activate VideoEncodingWorker
VideoEncodingWorker->>FileSystem: Verify segment existence
VideoEncodingWorker->>FileSystem: Verify playlist non-empty
deactivate VideoEncodingWorker
alt Validation Passes
VideoEncodingWorker->>Client: Return success
else Validation Fails
VideoEncodingWorker->>Client: Throw integrity error
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes The changes introduce new validation logic with pattern matching and error detection across two files. The implementation involves straightforward FFmpeg stderr parsing and flag application, but requires careful verification of pattern matching correctness, error handling flow, and the interaction between pre-validation and post-validation checks. Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/services/VideoProcessor.ts`:
- Around line 435-438: The ffmpeg integrity check spawned with execFile (using
ffmpegPath and filePath) treats an ETIMEDOUT as a numeric exit code string and
uses a confusing message and a hardcoded 120000ms timeout that may be too short
for large/long videos; update the Promise callback in VideoProcessor to (1)
detect timeout explicitly (error && error.code === 'ETIMEDOUT' or error.killed)
and produce a clear message like "ffmpeg integrity check timed out" (include
filePath for context), (2) make the timeout configurable (e.g., use a parameter
or a constant like ffmpegValidateTimeout) or derive it from file size/duration
rather than a fixed 120000, and (3) ensure non-timeout ffmpeg failures still
surface their numeric exit code or stderr in the error message so execFile's
callback returns an informative error for both timeout and other failures.
- Around line 781-787: The current hasCorruption branch pushes FFmpeg flags as
separate array elements which is fragile; update the block that checks
probe.issues (the hasCorruption variable) to push each option as a single
combined string into strategy.inputOptions (e.g., combine flag and value into
one element for the err_detect option and keep -fflags with its combined value)
so that the options are atomic when later iterated; reference the hasCorruption
check and the strategy.inputOptions array when making this change.
In `@src/workers/VideoEncodingWorker.ts`:
- Around line 336-339: The current check uses errorMsg.includes('183') which is
too broad and yields false positives; update the condition in the FFmpeg error
handling (the block referencing errorMsg and errorDetails where you append
"FFmpeg exit code 183...") to match only explicit exit/code phrases using a
stricter pattern (e.g., test for whole-word sequences like "exit code 183" or
"code 183" via a regex such as a word-boundary match) so frame numbers or
timestamps like "1830" won't trigger the corruption message; keep the existing
append logic to errorDetails but only run it when the refined match against
errorMsg succeeds.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 04d918b1-2a24-45f2-b1f0-758429d7b0f5
📒 Files selected for processing (2)
src/services/VideoProcessor.tssrc/workers/VideoEncodingWorker.ts
| return new Promise((resolve) => { | ||
| const proc = execFile(ffmpegPath, ['-v', 'error', '-i', filePath, '-f', 'null', '-'], { | ||
| timeout: 120000, | ||
| }, (error, _stdout, stderr) => { |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Timeout handling produces confusing message; 120s may be insufficient for large files.
When timeout occurs, error.code is 'ETIMEDOUT' (string), resulting in the message "ffmpeg integrity check exited with code ETIMEDOUT" — which isn't user-friendly.
Additionally, for 2+ hour videos mentioned in PR objectives, a 120-second validation timeout may be too short when reading every frame.
♻️ Suggested improvements
return new Promise((resolve) => {
- const proc = execFile(ffmpegPath, ['-v', 'error', '-i', filePath, '-f', 'null', '-'], {
- timeout: 120000,
+ execFile(ffmpegPath, ['-v', 'error', '-i', filePath, '-f', 'null', '-'], {
+ timeout: 300000, // 5 minutes for large files
}, (error, _stdout, stderr) => {
// ... existing code ...
// Non-zero exit with no recognized patterns — flag as potentially corrupt
- if (error && !corrupted && !fatal && errors.length === 0) {
- const exitCode = (error as any).code;
- if (exitCode) {
- errors.push(`ffmpeg integrity check exited with code ${exitCode}`);
+ if (error && !corrupted && !fatal && errors.length === 0) {
+ const exitCode = (error as any).code;
+ if (exitCode === 'ETIMEDOUT') {
+ errors.push('Integrity check timed out — file may be too large or unreadable');
+ corrupted = true;
+ } else if (exitCode) {
+ errors.push(`ffmpeg integrity check exited with code ${exitCode}`);
corrupted = true;
}
}Also applies to: 480-487
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/services/VideoProcessor.ts` around lines 435 - 438, The ffmpeg integrity
check spawned with execFile (using ffmpegPath and filePath) treats an ETIMEDOUT
as a numeric exit code string and uses a confusing message and a hardcoded
120000ms timeout that may be too short for large/long videos; update the Promise
callback in VideoProcessor to (1) detect timeout explicitly (error && error.code
=== 'ETIMEDOUT' or error.killed) and produce a clear message like "ffmpeg
integrity check timed out" (include filePath for context), (2) make the timeout
configurable (e.g., use a parameter or a constant like ffmpegValidateTimeout) or
derive it from file size/duration rather than a fixed 120000, and (3) ensure
non-timeout ffmpeg failures still surface their numeric exit code or stderr in
the error message so execFile's callback returns an informative error for both
timeout and other failures.
| // 12. 🩹 FILE CORRUPTION: Add error-tolerant decoding flags | ||
| const hasCorruption = probe.issues.some(i => i.type === 'file_corruption'); | ||
| if (hasCorruption) { | ||
| strategy.inputOptions.push('-err_detect', 'ignore_err'); | ||
| strategy.inputOptions.push('-fflags', '+discardcorrupt+genpts'); | ||
| reasons.push('corruption detected - using error-tolerant decoding'); | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Input options pushed as separate elements — works but fragile.
Pushing -err_detect and ignore_err as separate array elements works because the worker iterates and FFmpeg will parse them correctly in sequence. However, this is inconsistent with typical fluent-ffmpeg patterns and could break if iteration order changes.
♻️ More robust approach
const hasCorruption = probe.issues.some(i => i.type === 'file_corruption');
if (hasCorruption) {
- strategy.inputOptions.push('-err_detect', 'ignore_err');
- strategy.inputOptions.push('-fflags', '+discardcorrupt+genpts');
+ strategy.inputOptions.push('-err_detect ignore_err');
+ strategy.inputOptions.push('-fflags +discardcorrupt+genpts');
reasons.push('corruption detected - using error-tolerant decoding');
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // 12. 🩹 FILE CORRUPTION: Add error-tolerant decoding flags | |
| const hasCorruption = probe.issues.some(i => i.type === 'file_corruption'); | |
| if (hasCorruption) { | |
| strategy.inputOptions.push('-err_detect', 'ignore_err'); | |
| strategy.inputOptions.push('-fflags', '+discardcorrupt+genpts'); | |
| reasons.push('corruption detected - using error-tolerant decoding'); | |
| } | |
| // 12. 🩹 FILE CORRUPTION: Add error-tolerant decoding flags | |
| const hasCorruption = probe.issues.some(i => i.type === 'file_corruption'); | |
| if (hasCorruption) { | |
| strategy.inputOptions.push('-err_detect ignore_err'); | |
| strategy.inputOptions.push('-fflags +discardcorrupt+genpts'); | |
| reasons.push('corruption detected - using error-tolerant decoding'); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/services/VideoProcessor.ts` around lines 781 - 787, The current
hasCorruption branch pushes FFmpeg flags as separate array elements which is
fragile; update the block that checks probe.issues (the hasCorruption variable)
to push each option as a single combined string into strategy.inputOptions
(e.g., combine flag and value into one element for the err_detect option and
keep -fflags with its combined value) so that the options are atomic when later
iterated; reference the hasCorruption check and the strategy.inputOptions array
when making this change.
| if (errorMsg.includes('183') || errorMsg.includes('exit code 183')) { | ||
| errorDetails = (errorDetails ? errorDetails + ' ' : '') + | ||
| 'FFmpeg exit code 183 indicates input file corruption. Try re-downloading the source.'; | ||
| } |
There was a problem hiding this comment.
Exit code 183 detection pattern is overly broad — risk of false positives.
errorMsg.includes('183') will match any string containing "183", including frame numbers (frame=1830), timestamps (183.5), or other numeric values. This could incorrectly append the corruption message to unrelated errors.
🔧 Suggested fix: use more specific pattern
-if (errorMsg.includes('183') || errorMsg.includes('exit code 183')) {
+if (/\b(exit\s+)?(code\s+)?183\b/i.test(errorMsg) || errorMsg.includes('exited with code 183')) {
errorDetails = (errorDetails ? errorDetails + ' ' : '') +
'FFmpeg exit code 183 indicates input file corruption. Try re-downloading the source.';
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/workers/VideoEncodingWorker.ts` around lines 336 - 339, The current check
uses errorMsg.includes('183') which is too broad and yields false positives;
update the condition in the FFmpeg error handling (the block referencing
errorMsg and errorDetails where you append "FFmpeg exit code 183...") to match
only explicit exit/code phrases using a stricter pattern (e.g., test for
whole-word sequences like "exit code 183" or "code 183" via a regex such as a
word-boundary match) so frame numbers or timestamps like "1830" won't trigger
the corruption message; keep the existing append logic to errorDetails but only
run it when the refined match against errorMsg succeeds.
Summary
ffmpeg -v error -i file -f null --err_detect ignore_err,-fflags +discardcorrupt+genpts) that salvage valid video portionsFiles changed
src/services/VideoProcessor.tsvalidateFileIntegrity()method, wired intoprocessVideo(), corruption flags indetermineEncodingStrategy()src/workers/VideoEncodingWorker.tsTest results
Tested with generated video files:
moov atom not found— fatalInvalid NAL unit,partial fileEnd of file,error reading headermoov atom not found— fatalTest plan
npm run buildcompiles cleanCloses #12
🤖 Generated with Claude Code
Summary by CodeRabbit