
Bugfix/fix race condition#41

Merged
bguise987 merged 3 commits into develop from bugfix/fix-race-condition
Mar 11, 2026

Conversation

@bguise987
Owner

Fix for issue #34

Peek ahead to determine if the next set of bytes is the end of the file, and pass this along sooner in the process of reading in the source file.

bguise987 and others added 3 commits January 22, 2026 17:34
The _process_chunk method checked _last_chunk to determine whether to
use Z_FINISH, but _last_chunk wasn't set until after the read thread
submitted the final chunk. This caused the last chunk to sometimes be
compressed with Z_SYNC_FLUSH instead of Z_FINISH, producing invalid
gzip files with unterminated deflate streams (00 00 FF FF marker).

Fix by peeking ahead in _read_file to determine is_last before
submitting to the pool, and passing the flag directly to _process_chunk.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@bguise987
Owner Author

Race Condition Bugfix Validation Results

Background

This branch fixes a race condition in pigz_python where the _process_chunk method checked _last_chunk to determine whether to use Z_FINISH, but _last_chunk wasn't set until after the read thread submitted the final chunk. This caused the last chunk to sometimes be compressed with Z_SYNC_FLUSH instead of Z_FINISH, producing invalid gzip files with unterminated deflate streams.

The fix passes an is_last flag directly to _process_chunk by peeking ahead in _read_file, eliminating the race.
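The peek-ahead approach can be sketched as follows. This is a minimal illustration, not the branch's actual code: `read_chunks_with_lookahead` and `process_chunk` are hypothetical names standing in for `_read_file` and `_process_chunk`, and the single sequential compressor is a simplification of pigz_python's worker-pool pipeline.

```python
import zlib


def read_chunks_with_lookahead(fileobj, chunk_size=128 * 1024):
    """Yield (chunk, is_last) pairs, reading one chunk ahead so the
    final chunk is identified *before* it is handed to a worker."""
    prev = fileobj.read(chunk_size)
    while prev:
        nxt = fileobj.read(chunk_size)
        yield prev, not nxt  # is_last is decided here, not via shared state
        prev = nxt


def process_chunk(compressor, chunk, is_last):
    """Compress one chunk; the flag travels with the chunk, so no
    racy shared attribute needs to be consulted."""
    out = compressor.compress(chunk)
    out += compressor.flush(zlib.Z_FINISH if is_last else zlib.Z_SYNC_FLUSH)
    return out
```

Because the flag is attached to each chunk at read time, the final flush is guaranteed to be Z_FINISH regardless of thread scheduling.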

Test Methodology

A validation test script (compression_validation_test.py) was used to repeatedly:

  1. Calculate the MD5 hash of an original file
  2. Compress it using pigz_python
  3. Decompress the resulting .gz file using Python's gzip module
  4. Calculate the MD5 hash of the decompressed file
  5. Compare the two hashes

Three test files of varying type and size were used per iteration:

  • text file
  • PDF
  • binary

The script runs in a continuous loop until a failure is detected or the process is manually stopped.
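The five per-iteration steps can be sketched like this. Names here are illustrative rather than the validation script's actual API; in particular, `compress` stands in for whatever callable invokes pigz_python and returns the path of the resulting `.gz` file.

```python
import gzip
import hashlib


def md5_of(path):
    """MD5 of a file, read in 1 MiB blocks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def validate_roundtrip(original_path, compress):
    """Steps 1-5: hash, compress, gunzip, hash again, compare."""
    before = md5_of(original_path)                  # 1. hash the original
    gz_path = compress(original_path)               # 2. compress it
    restored_path = original_path + ".restored"
    with gzip.open(gz_path, "rb") as src, open(restored_path, "wb") as dst:
        while block := src.read(1 << 20):           # 3. decompress with gzip
            dst.write(block)
    return md5_of(restored_path) == before          # 4-5. hash and compare
```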

For each trial, the test was run simultaneously against both the master branch and the bugfix branch using the same worker count. The trial was stopped for the bugfix branch once the master branch was observed to fail. Two configurations were tested: 20 worker threads and the default worker count (system CPU count).

How Master Branch Fails

When the race condition is triggered on the master branch, the compressed output contains an unterminated deflate stream. During decompression, Python's gzip module never sees the end-of-stream marker, so the read blocks and the trial hangs.
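This failure mode can be reproduced with zlib alone. The sketch below builds a raw deflate stream whose final flush is Z_SYNC_FLUSH instead of Z_FINISH, producing the 00 00 FF FF marker mentioned in the commit message and a stream that never signals end-of-data:

```python
import zlib

payload = b"example payload"
comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # raw deflate, no gzip header
unterminated = comp.compress(payload) + comp.flush(zlib.Z_SYNC_FLUSH)

# A sync flush ends with an empty stored block: the 00 00 FF FF marker.
assert unterminated.endswith(b"\x00\x00\xff\xff")

# The data is all recoverable, but the decompressor never reaches
# end-of-stream, so a consumer waiting for EOF blocks indefinitely.
decomp = zlib.decompressobj(-15)
assert decomp.decompress(unterminated) == payload
assert not decomp.eof
```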

Results: 20 Worker Threads

| Trial | Bugfix Iterations | Bugfix Failures | Master Iterations Before Hang | Master Failed |
|-------|-------------------|-----------------|-------------------------------|---------------|
| 1     | 463               | 0               | 401                           | Yes (hung)    |
| 2     | 366               | 0               | 41                            | Yes (hung)    |
| 3     | 67                | 0               | 17                            | Yes (hung)    |
| 4     | 96                | 0               | 57                            | Yes (hung)    |
| 5     | 1,181             | 0               | 146                           | Yes (hung)    |

Bugfix total (20 workers): 2,173 iterations, 6,514 file validations, 0 failures

Results: Default Worker Count

| Trial | Bugfix Iterations | Bugfix Failures | Master Iterations Before Hang | Master Failed |
|-------|-------------------|-----------------|-------------------------------|---------------|
| 1     | 37                | 0               | 37                            | Yes (hung)    |
| 2     | 2,209             | 0               | 1,734                         | Yes (hung)    |
| 3     | 805               | 0               | 241                           | Yes (hung)    |
| 4     | 4,602             | 0               | 3,910                         | Yes (hung)    |

Bugfix total (default workers): 7,653 iterations, 22,955 file validations, 0 failures

Summary

| Metric                 | Bugfix Branch | Master Branch                             |
|------------------------|---------------|-------------------------------------------|
| Total iterations run   | 9,826         | 6,584                                     |
| Total file validations | 29,469        | 19,730                                    |
| Failures               | 0             | 9 (all trials)                            |
| Failure mode           | N/A           | Hang during decompression of corrupt gzip |

Across 9 trial runs, the master branch hung due to gzip corruption in every single trial. The bugfix branch completed all 9,826 iterations (29,469 individual file compress/decompress/validate cycles) with zero failures. The race condition is reliably triggered under load, and the fix eliminates it.


@Pathfinder216 Pathfinder216 left a comment


This looks good. Thanks for figuring this out and fixing it!


@coreyhartley coreyhartley left a comment


lgtm

@bguise987 bguise987 merged commit 9a52847 into develop Mar 11, 2026
6 checks passed
bguise987 added a commit that referenced this pull request Mar 11, 2026
* Update github username (#39)

* Update Python versions (#40)

* Update tox.ini for newer file format and new Python versions

* Update GitHub Actions workflow to use newer Python versions

* Update GitHub Actions versions

* Bugfix/fix race condition (#41)

* Fix race condition causing intermittent gzip corruption

The _process_chunk method checked _last_chunk to determine whether to
use Z_FINISH, but _last_chunk wasn't set until after the read thread
submitted the final chunk. This caused the last chunk to sometimes be
compressed with Z_SYNC_FLUSH instead of Z_FINISH, producing invalid
gzip files with unterminated deflate streams (00 00 FF FF marker).

Fix by peeking ahead in _read_file to determine is_last before
submitting to the pool, and passing the flag directly to _process_chunk.

* Update tests to reflect passing in is_last to _process_chunk

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
bguise987 added a commit that referenced this pull request Apr 2, 2026
* Update github username (#39)

* Update Python versions (#40)

* Update tox.ini for newer file format and new Python versions

* Update GitHub Actions workflow to use newer Python versions

* Update GitHub Actions versions

* Specify GitHub actions Python versions as strings

* Bugfix/fix race condition (#41)

* Fix race condition causing intermittent gzip corruption

The _process_chunk method checked _last_chunk to determine whether to
use Z_FINISH, but _last_chunk wasn't set until after the read thread
submitted the final chunk. This caused the last chunk to sometimes be
compressed with Z_SYNC_FLUSH instead of Z_FINISH, producing invalid
gzip files with unterminated deflate streams (00 00 FF FF marker).

Fix by peeking ahead in _read_file to determine is_last before
submitting to the pool, and passing the flag directly to _process_chunk.

* Update tests to reflect passing in is_last to _process_chunk

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* Update how package gets version metadata (#43)

* Update how package gets version metadata

* Add pyproject.toml

* Delete setup.py

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
bguise987 added a commit that referenced this pull request Apr 2, 2026
* Update github username (#39)

* Update Python versions (#40)

* Update tox.ini for newer file format and new Python versions

* Update GitHub Actions workflow to use newer Python versions

* Update GitHub Actions versions

* Specify GitHub actions Python versions as strings

* Bugfix/fix race condition (#41)

* Fix race condition causing intermittent gzip corruption

The _process_chunk method checked _last_chunk to determine whether to
use Z_FINISH, but _last_chunk wasn't set until after the read thread
submitted the final chunk. This caused the last chunk to sometimes be
compressed with Z_SYNC_FLUSH instead of Z_FINISH, producing invalid
gzip files with unterminated deflate streams (00 00 FF FF marker).

Fix by peeking ahead in _read_file to determine is_last before
submitting to the pool, and passing the flag directly to _process_chunk.

* Update tests to reflect passing in is_last to _process_chunk

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* Update how package gets version metadata (#43)

* Update how package gets version metadata

* Add pyproject.toml

* Delete setup.py

* Version bump to 2.0.0

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>