fix: call ripgrep with explicit utf-8 encoding. #1199

Sola-ris · 2025-11-11T21:50:31Z

Summary

Call ripgrep with explicit UTF-8 encoding to prevent issues with multi-byte characters.

When subprocess.Popen (which is where silent_subprocess::silent_run eventually ends up) is called with text=True and no explicit encoding, Python uses the encoding returned by locale::getencoding.
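As a minimal sketch of the idea (not the actual change in this PR), the snippet below passes `encoding="utf-8"` to `subprocess.run` so the decoding no longer depends on the locale. The helper name `run_utf8` and the child command are hypothetical stand-ins for the project's own wrapper and for ripgrep; the child simply writes UTF-8 bytes to stdout.

```python
import subprocess
import sys


def run_utf8(args: list[str]) -> str:
    """Run a command and decode its stdout as UTF-8, independent of the locale.

    Hypothetical helper for illustration; not the project's silent_run.
    """
    result = subprocess.run(
        args,
        capture_output=True,
        text=True,
        encoding="utf-8",  # explicit, so the locale default is never consulted
        check=True,
    )
    return result.stdout


# Stand-in for ripgrep: a child Python process that emits raw UTF-8 bytes.
# The escapes are interpreted by the child, so the argument itself is ASCII.
CHILD_CODE = (
    r"import sys; "
    r"sys.stdout.buffer.write("
    r"'\u2014 \u3053\u3093\u306b\u3061\u306f.txt\n'.encode('utf-8'))"
)

output = run_utf8([sys.executable, "-c", CHILD_CODE])
```

With the explicit encoding, `output` should contain the em dash and the Japanese file name intact on every platform, whatever the system locale is.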

On Linux this typically returns UTF-8, while on Windows it typically returns cp1252, which causes issues with characters that are multi-byte in UTF-8, such as em dashes or Japanese characters.
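The mismatch can be reproduced without ripgrep or Windows. The sketch below, which assumes the producing program emits UTF-8 bytes, decodes those bytes with Python's `cp1252` codec: the em dash silently turns into mojibake, while the Japanese file name from this PR fails to decode because it contains the byte 0x81, which cp1252 leaves undefined.

```python
# UTF-8 bytes as a UTF-8 producing tool would write them to stdout.
em_dash_bytes = "\u2014".encode("utf-8")        # b"\xe2\x80\x94"
japanese_bytes = "こんにちは.txt".encode("utf-8")

# Decoding the em dash as cp1252 succeeds but yields three wrong characters.
mojibake = em_dash_bytes.decode("cp1252")

# Decoding the Japanese name as cp1252 raises, since 0x81 has no mapping.
try:
    japanese_bytes.decode("cp1252")
    japanese_fails = False
except UnicodeDecodeError:
    japanese_fails = True

# Decoding with the matching codec round-trips cleanly.
round_trip = japanese_bytes.decode("utf-8")
```

Here `mojibake` is a three-character string rather than a single em dash, `japanese_fails` is `True`, and `round_trip` equals the original file name.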

Fixes #1195.

Before

I've omitted こんにちは.txt from the screenshot since it hangs the program with the error reported in #1195.

After

Tasks Completed

Platforms Tested:
- Windows x86
- Windows ARM
- macOS x86
- macOS ARM
- Linux x86
- Linux ARM

Tested For:
- Basic functionality
- PyInstaller executable

Computerdores

Code looks good. Also, very nice that you wrote a regression test.

fix: call ripgrep with explicit utf-8 encoding.

fa64c27

Computerdores approved these changes Nov 12, 2025

Sola-ris mentioned this pull request Nov 12, 2025

feat: run tests on windows and macOS. #1201

Open

8 tasks
