@Sola-ris
Contributor

Summary

Call ripgrep with explicit UTF-8 encoding to prevent issues with multi-byte characters.

When subprocess.Popen (which is where silent_subprocess::silent_run eventually ends up) is called with text=True and no explicit encoding, Python uses the encoding returned by locale::getencoding.

On Linux this typically returns UTF-8, while on Windows it returns cp1252, which causes issues when the output contains multi-byte characters like em dashes or Japanese characters.
[Screenshots: linux, win]
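
The mismatch described above can be reproduced without ripgrep. The following is a minimal sketch, not the code from this PR: run_utf8, decodes_as_cp1252 and CHILD_CODE are hypothetical names, and a child Python process stands in for ripgrep by writing a UTF-8 encoded file name to stdout. It assumes only the standard subprocess module.

```python
import subprocess
import sys

# The Japanese file name mentioned in this PR, written with escapes so the
# snippet itself stays ASCII-only.
NAME = "\u3053\u3093\u306b\u3061\u306f.txt"

# Hypothetical stand-in for ripgrep: a child process that writes the file
# name to stdout as raw UTF-8 bytes.
CHILD_CODE = (
    "import sys; "
    f"sys.stdout.buffer.write({NAME.encode('utf-8')!r})"
)


def run_utf8(args):
    """Run a command and decode its stdout as UTF-8, regardless of locale."""
    completed = subprocess.run(
        args,
        capture_output=True,
        text=True,
        encoding="utf-8",  # explicit, so the locale default is never consulted
        check=True,
    )
    return completed.stdout


def decodes_as_cp1252(raw: bytes) -> bool:
    """Report whether the given bytes can be decoded with the cp1252 codec."""
    try:
        raw.decode("cp1252")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    output = run_utf8([sys.executable, "-c", CHILD_CODE])
    print(output == NAME)
    print(decodes_as_cp1252(NAME.encode("utf-8")))
```

With encoding="utf-8" the parent decodes the child's output the same way on every platform. The second helper shows why the locale default is fragile: the UTF-8 bytes of this file name include 0x81, which Python's cp1252 codec does not map to any character.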

Fixes #1195.

Before

[Screenshot: before]

I've omitted こんにちは.txt from the screenshot since it hangs the program with the error reported in #1195.

After

[Screenshot: after]

Tasks Completed

  • Platforms Tested:
    • Windows x86
    • Windows ARM
    • macOS x86
    • macOS ARM
    • Linux x86
    • Linux ARM
  • Tested For:
    • Basic functionality
    • PyInstaller executable

Collaborator

@Computerdores Computerdores left a comment

Code looks good. Also very nice that you wrote a regression test.

Development

Successfully merging this pull request may close these issues.

[Bug]: Crash when adding file to library with Japanese filename

2 participants