Fast path for FileRegressionFixture.check when contents already match #241

Open
adamtheturtle wants to merge 1 commit into ESSS:master from adamtheturtle:adamtheturtle/fast-path-check

@adamtheturtle

Closes #240.

Summary

  • Adds an in-memory byte-exact short-circuit to FileRegressionFixture.check so the pass path avoids writing a .obtained file, re-reading both files, and constructing difflib.HtmlDiff machinery.
  • Threaded via a new optional fast_equal_fn parameter on perform_regression_check; no other fixtures change behaviour.
  • The short-circuit is disabled when --force-regen / --regen-all is set, when the user supplies a custom check_fn, or when the expected file does not yet exist. Mismatches fall through to the existing code path unchanged.
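
The gating conditions in the last bullet can be sketched as a single predicate. This is only an illustration, not the code in this PR: the helper name `fast_path_allowed` and its parameters are hypothetical, and the `check_fn` signature is assumed to take the obtained and expected paths.

```python
from pathlib import Path
from typing import Callable, Optional


def fast_path_allowed(
    expected_path: Path,
    *,
    force_regen: bool = False,
    regen_all: bool = False,
    check_fn: Optional[Callable[[Path, Path], None]] = None,
) -> bool:
    """Hypothetical gate: True only when the short-circuit may be attempted."""
    # Regeneration must always rewrite the expected file.
    if force_regen or regen_all:
        return False
    # A user-supplied check_fn expects a real obtained file on disk.
    if check_fn is not None:
        return False
    # Without an expected file there is nothing to compare against.
    return expected_path.is_file()
```

When the predicate returns False, or when the subsequent byte comparison fails, the caller would continue into the existing code path as before.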

Timings

Measured against master with a 200-line text golden, 1000 iterations per run, macOS / Python 3.13.9:

| Run | Upstream check | Byte-exact fast path | Speedup |
|-----|----------------|----------------------|---------|
| 1   | 86.8 us/call   | 11.9 us/call         | 7.3x    |
| 2   | 99.0 us/call   | 11.9 us/call         | 8.3x    |
| 3   | 93.9 us/call   | 12.9 us/call         | 7.3x    |

On the mismatch path, the fast path adds one read_bytes() and one encode() before falling through, which is within noise of the ~800 us the existing mismatch branch already spends on the .obtained write and HTML diff.
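
For anyone wanting to reproduce the shape of this comparison, a rough harness might look like the sketch below. It is not the script used for the numbers above and does not call pytest-regressions at all: `slow_roundtrip_check` and `fast_in_memory_check` are hypothetical stand-ins that only mimic "write .obtained, re-read both files" versus "compare bytes in memory".

```python
import tempfile
import timeit
from pathlib import Path


def slow_roundtrip_check(expected: Path, obtained: Path, text: str) -> bool:
    # Stand-in for the existing pass path: write the obtained file,
    # then read both files back and compare them.
    obtained.write_bytes(text.encode("utf-8"))
    return expected.read_bytes() == obtained.read_bytes()


def fast_in_memory_check(expected: Path, text: str) -> bool:
    # Stand-in for the fast path: one read, one encode, no write.
    return expected.read_bytes() == text.encode("utf-8")


def time_us(fn, iterations: int = 1000) -> float:
    # Average wall-clock time per call, in microseconds.
    return timeit.timeit(fn, number=iterations) / iterations * 1e6


def run_benchmark(iterations: int = 1000) -> dict:
    text = "".join(f"line {i}\n" for i in range(200))  # 200-line golden
    with tempfile.TemporaryDirectory() as tmp:
        expected = Path(tmp) / "golden.txt"
        obtained = Path(tmp) / "golden.obtained.txt"
        expected.write_bytes(text.encode("utf-8"))
        return {
            "slow_us": time_us(
                lambda: slow_roundtrip_check(expected, obtained, text), iterations
            ),
            "fast_us": time_us(
                lambda: fast_in_memory_check(expected, text), iterations
            ),
        }


if __name__ == "__main__":
    print(run_benchmark())
```

Absolute numbers will vary with filesystem and platform, so the harness is only useful for comparing the two paths on the same machine.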

Tests

  • test_skips_obtained_write_on_match and _binary — fast path does not write .obtained when contents match.
  • test_writes_obtained_on_mismatch — mismatch still writes .obtained and raises FILES DIFFER.
  • test_custom_check_fn_disables_fast_path — user-supplied check_fn always receives an obtained file.
  • All 82 existing tests still pass.

Happy to adjust API shape / docs / CHANGELOG wording as needed.

Short-circuit the pass path with a byte-exact in-memory comparison
so the .obtained file, file re-reads, and difflib machinery are only
triggered on a real mismatch. Behaviour is preserved for mismatches,
custom check_fn, --force-regen, and --regen-all.
Member

@nicoddemus nicoddemus left a comment

Thanks @adamtheturtle for the contribution. Please take a look at my comment. 👍

the basename.
:param obtained_filename: complete path to use to write the obtained file. By
default will prepend `.obtained` before the file extension.
:param fast_equal_fn: Optional function receiving the expected file path and returning

@nicoddemus nicoddemus Apr 23, 2026

Do we really need this extra complexity?

Seems like the short-circuit could just convert the input into bytes, and compare directly with the written file (as bytes).

This is an optimization: if for some reason the user has changed the line endings or the encoding, the short-circuit will fail and fall back to the standard path of doing the full comparison.
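
One possible reading of this suggestion, as a minimal sketch: the helper name `matches_expected_bytes` and the `encoding` default are assumptions for illustration, not part of the existing API.

```python
from pathlib import Path
from typing import Union


def matches_expected_bytes(
    expected_path: Path,
    contents: Union[str, bytes],
    encoding: str = "utf-8",
) -> bool:
    """Hypothetical helper: byte-exact comparison against the expected file."""
    # Normalise the input to bytes; text is encoded, binary passes through.
    obtained = contents.encode(encoding) if isinstance(contents, str) else contents
    try:
        return expected_path.read_bytes() == obtained
    except FileNotFoundError:
        # No expected file yet: let the standard path handle it.
        return False
```

If the expected file was saved with `\r\n` line endings and the input uses `\n`, the helper returns False and the caller would simply continue into the full comparison, so a false negative costs time but not correctness.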
