Problem about encoding handling in testing module on Windows with Chinese locale #194

@aqni

Description

I'm encountering what appears to be an encoding issue with the beangulp.testing module when running tests on Windows systems with Chinese locale. I'm not entirely certain if this is an actual issue with the library or if I'm using it incorrectly.

The problem seems to occur because open() is called without an encoding argument in several places in the testing module. Without that argument, Python falls back to the locale's preferred encoding, which on Windows systems with a Chinese locale is typically GBK (code page 936) unless Python's UTF-8 mode is enabled. Many tools (including Fava) appear to expect UTF-8 instead.
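For reference, the fallback encoding can be checked on any machine with a short diagnostic. This is only a sketch for confirming the environment, not part of beangulp; on my setup I would expect it to report a GBK code page, but the value depends on the system locale.

```python
import locale
import sys

# Encoding that open() uses for text files when no encoding= is passed.
default_encoding = locale.getpreferredencoding(False)
print("open() default encoding:", default_encoding)

# UTF-8 mode (python -X utf8 or PYTHONUTF8=1) overrides the locale default.
print("UTF-8 mode enabled:", bool(sys.flags.utf8_mode))
```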

Potential Problem Locations

In beangulp/testing.py, there seem to be open() calls without explicit encoding in:

  1. write_expected_file() function (around line 45):
       with open(filepath, mode) as expfile:
  2. compare_expected() function (around line 59):
       with open(filepath, 'r') as infile:

Observed Behavior

When running tests on Windows systems with Chinese locale:

  1. Expected files seem to be written using GBK encoding
  2. Other tools like Fava appear to attempt to read these files as UTF-8
  3. This seems to cause UnicodeDecodeError or character corruption
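The mismatch can be simulated on any platform by forcing the encodings explicitly. The snippet below is a minimal sketch: the file name and the ledger line are made up for illustration, and encoding="gbk" stands in for what a bare open() would pick on a Chinese-locale Windows system.

```python
import os
import tempfile

# Hypothetical ledger line containing Chinese text.
text = '2024-01-01 * "超市" "日用品"\n'

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "expected.beancount")

    # Stand-in for open(path, "w") under a GBK locale default.
    with open(path, "w", encoding="gbk") as expfile:
        expfile.write(text)

    # Stand-in for a tool that assumes the file is UTF-8.
    try:
        with open(path, "r", encoding="utf-8") as infile:
            infile.read()
        outcome = "decoded"
    except UnicodeDecodeError:
        outcome = "UnicodeDecodeError"

print(outcome)
```

The GBK bytes for the Chinese characters are not valid UTF-8 sequences, so the read fails rather than silently producing different text.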

Possible Solution

Would it be appropriate to specify UTF-8 encoding explicitly in the open() calls?

# In write_expected_file()
with open(filepath, mode, encoding='utf-8') as expfile:

# In compare_expected()
with open(filepath, 'r', encoding='utf-8') as infile:
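To illustrate the effect of the change in isolation, here is a small round-trip sketch. The helper names write_text_utf8 and read_text_utf8 are hypothetical and are not beangulp functions; they only mirror the two open() calls above with the encoding pinned.

```python
import os
import tempfile


def write_text_utf8(filepath, text, mode="w"):
    """Write text with UTF-8 regardless of the locale default."""
    with open(filepath, mode, encoding="utf-8") as expfile:
        expfile.write(text)


def read_text_utf8(filepath):
    """Read text as UTF-8 regardless of the locale default."""
    with open(filepath, "r", encoding="utf-8") as infile:
        return infile.read()


# Hypothetical ledger line containing Chinese text.
sample = '2024-01-01 * "超市" "日用品"\n'

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "expected.beancount")
    write_text_utf8(path, sample)

    # Inspect the raw bytes to confirm what actually landed on disk.
    with open(path, "rb") as raw:
        raw_bytes = raw.read()

    roundtrip = read_text_utf8(path)

print(roundtrip == sample)
```

With the encoding fixed on both sides, the bytes on disk are the same on every platform and locale, so files written on one machine should compare cleanly on another.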

Broader Consideration

While I've specifically noticed this in the testing module, it might be worth checking if similar issues exist in other parts of the beangulp codebase. Explicitly specifying UTF-8 encoding for file operations could help ensure consistent behavior across different operating systems and locales throughout the library.
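As a rough way to look for other occurrences, something like the following could be run over each source file. It is a sketch under simple assumptions: find_open_without_encoding is a hypothetical helper, it only recognizes calls written literally as open(...), and it cannot see an encoding passed through **kwargs or calls such as Path.open().

```python
import ast


def find_open_without_encoding(source, filename="<string>"):
    """Return line numbers of text-mode open() calls that lack encoding=."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        if not (isinstance(node.func, ast.Name) and node.func.id == "open"):
            continue

        # Work out the mode if it is given as a literal.
        mode = None
        if len(node.args) >= 2 and isinstance(node.args[1], ast.Constant):
            mode = node.args[1].value
        for kw in node.keywords:
            if kw.arg == "mode" and isinstance(kw.value, ast.Constant):
                mode = kw.value.value

        # Binary mode takes no encoding, so it is not a problem.
        if isinstance(mode, str) and "b" in mode:
            continue

        # encoding is the 4th positional parameter of open().
        has_encoding = len(node.args) >= 4 or any(
            kw.arg == "encoding" for kw in node.keywords
        )
        if not has_encoding:
            hits.append(node.lineno)
    return sorted(hits)


sample_source = (
    "with open(path, mode) as a:\n"
    "    pass\n"
    "with open(path, 'r', encoding='utf-8') as b:\n"
    "    pass\n"
    "with open(path, 'rb') as c:\n"
    "    pass\n"
)
print(find_open_without_encoding(sample_source))  # [1]
```

Alternatively, Python 3.10 and later can report these at runtime: running the test suite with `python -X warn_default_encoding` emits an EncodingWarning wherever the locale default encoding is used implicitly.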

Environment

  • Windows with Chinese locale
  • Python 3.11
  • beangulp version 0.2
