Improving performance when verifying 1000s of files in one history #159

Closed
ptrpfn wants to merge 1 commit into ascmitc:master from ptrpfn:dev/performance-many-files

Conversation

@ptrpfn
Collaborator

@ptrpfn ptrpfn commented Mar 10, 2025

Limiting a costly check to the renaming scenarios where it's needed.

The symptom was that I/O throughput on fast drives (for example, during the create command) degraded progressively once thousands of files had been verified. The costly check now runs only when the -dr option is set, which is the only case where it is needed.
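
For readers without the diff at hand, the idea can be sketched roughly as follows. This is an illustration under assumptions, not the code in ascmhl/commands.py: the function `collect_new_paths`, the `MediaHash` stand-in, the `relative_path_of` callable, and the `detect_renaming` parameter (assumed to mirror the -dr option) are all hypothetical names. What happens when the check finds a match is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaHash:
    """Hypothetical stand-in for a recorded hash entry; only `path` is used."""
    path: str


def collect_new_paths(file_paths, known_hashes, relative_path_of, detect_renaming):
    """Return the set of paths treated as new.

    When detect_renaming is False, the scan over known_hashes is skipped
    and every path is added directly. When it is True, a path is added
    only if no known hash already records its relative path (assumed
    behaviour of the costly check).
    """
    new_paths = set()
    for file_path in file_paths:
        if not detect_renaming:
            # Fast path: no lookup against the existing history.
            new_paths.add(file_path)
            continue
        # Slow path: compare against every recorded hash entry.
        relative_path = relative_path_of(file_path)
        if not any(h.path == relative_path for h in known_hashes):
            new_paths.add(file_path)
    return new_paths
```

In this sketch the slow path costs one scan of the known hashes per file, so total work grows with files times recorded entries, which would be consistent with the progressive slowdown described above. The fast path does constant work per file.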

No tests have been changed.

@pfn-asm pfn-asm requested a review from Copilot May 16, 2025 13:44
Copilot AI left a comment

Pull Request Overview

This PR improves performance when verifying a large number of files by limiting a costly I/O check to scenarios involving renaming.

  • Introduces a condition to execute the expensive check only when renaming is detected.
  • Bypasses the check otherwise, directly adding the file path to the new paths set.
Comments suppressed due to low confidence (1)

ascmhl/commands.py:257

  • Consider computing existing_history.get_relative_file_path(file_path) once before the loop to avoid redundant filesystem access inside the loop.
if media_hash.path == existing_history.get_relative_file_path(file_path):
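
The suggestion above amounts to hoisting a loop-invariant call out of the loop. A minimal sketch, assuming the comparison sits inside a loop over recorded hashes: `path_is_recorded` and `relative_path_of` are hypothetical names, with `relative_path_of` standing in for `existing_history.get_relative_file_path`, and the entries are simple placeholder objects with a `path` attribute.

```python
from types import SimpleNamespace


def path_is_recorded(file_path, media_hashes, relative_path_of):
    """Return True if any entry in media_hashes records file_path.

    The relative path does not depend on the loop variable, so it is
    computed once up front instead of once per comparison.
    """
    relative_path = relative_path_of(file_path)
    for media_hash in media_hashes:
        if media_hash.path == relative_path:
            return True
    return False


# Usage with placeholder data: three recorded entries, one lookup.
recorded = [SimpleNamespace(path=name) for name in ("a.mov", "b.mov", "c.mov")]
found = path_is_recorded("/vol/c.mov", recorded, lambda p: p.rsplit("/", 1)[-1])
```

Whether this matters in practice depends on how expensive the relative-path call really is, which the comment asserts but the conversation does not measure.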

@pfn-asm

pfn-asm commented Jul 4, 2025

covered by pull request #166

@pfn-asm pfn-asm closed this Jul 4, 2025

3 participants