
[FIX] checks_odoo_module_xml: return bytes from _get_first_tag#186

Merged
moylop260 merged 1 commit into OCA:main from suniagajose:fix/xml-header-bytes-str-mismatch
Apr 16, 2026
@suniagajose
Contributor

Problem

When an XML file in the inspected module contains no tag (e.g. an empty .xml file), ChecksOdooModuleXML._get_first_tag returns the Python str ''. The success branch of the same method returns bytes from b' '.join(buffer), and downstream code in check_xml_header consumes the result as bytes:

if not first_tag.startswith(b"<?xml "):

When first_tag is str, startswith(bytes) raises:

TypeError: startswith first arg must be str or a tuple of str, not bytes

This crashes the entire hook for the repository as soon as one empty/tag-less XML file exists anywhere in the inspected modules — even if xml-header-missing is disabled, because the mismatch happens before the enablement check.
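
The mismatch can be demonstrated without the hook at all. The following is a minimal sketch using only built-in types; the variable names are illustrative and do not come from the project's code.

```python
# Value the no-tag path returned before the fix: a str, not bytes.
first_tag_before = ""
# Value the no-tag path returns after the fix.
first_tag_after = b""

try:
    # Same call shape as in check_xml_header: bytes prefix on the result.
    first_tag_before.startswith(b"<?xml ")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With bytes on both sides the call is valid and simply reports "no header".
has_header = first_tag_after.startswith(b"<?xml ")

print(error_message)
print(has_header)
```

With the bytes value, the check evaluates to False instead of raising, so an empty file is treated as a file with a missing header rather than crashing the run.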

Reproduction

import io
from oca_pre_commit_hooks.checks_odoo_module_xml import ChecksOdooModuleXML

inst = ChecksOdooModuleXML.__new__(ChecksOdooModuleXML)
result = inst._get_first_tag(io.BytesIO(b''))
# Before fix: ('', 0) — str
# After fix:  (b'', 0) — bytes

The existing test_repo/broken_module/xml_empty.xml fixture is a 0-byte file and reproduces the issue on real hook runs.

Fix

Return b"" instead of "" in the no-tag path so both branches of _get_first_tag are type-consistent (always bytes), and downstream startswith(b"<?xml ") / decode("UTF-8") calls keep working.

One-line change in src/oca_pre_commit_hooks/checks_odoo_module_xml.py.
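
To illustrate the type-consistency idea, here is a hedged sketch of a first-tag reader that returns bytes on every path. `get_first_tag_sketch` is a hypothetical stand-in written for this example; it is not the actual body of `_get_first_tag`, whose parsing logic may differ.

```python
import io


def get_first_tag_sketch(stream):
    """Hypothetical stand-in: return (first_tag_bytes, line_number).

    Both return paths yield bytes, so callers can safely use
    startswith(b"...") or decode("UTF-8") on the result.
    """
    buffer = []
    for lineno, line in enumerate(stream, start=1):
        stripped = line.strip()
        # Skip everything until the first line that opens a tag.
        if not buffer and not stripped.startswith(b"<"):
            continue
        buffer.append(stripped)
        # Stop at the first line that closes the tag.
        if b">" in stripped:
            return b" ".join(buffer), lineno
    # No tag found (for example, a 0-byte file): bytes, not str.
    return b"", 0


empty_result = get_first_tag_sketch(io.BytesIO(b""))
header_result = get_first_tag_sketch(
    io.BytesIO(b'<?xml version="1.0"?>\n<odoo/>\n')
)
print(empty_result)
print(header_result)
```

Keeping a single return type means the caller needs no `isinstance` checks, and the empty-file case falls through the same bytes-based header check as every other file.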

Notes

Found while running v0.2.20 against an Odoo 19 addon that contains an empty XML file. The local test suite does not run on Python 3.14 (distutils removed) so verification was done via targeted reproduction; CI will validate against the supported matrix.

When the XML file contains no tag (e.g. an empty file), _get_first_tag
returned the Python str ''. The success branch returns bytes from
b' '.join(buffer), and check_xml_header consumes the result with
first_tag.startswith(b'<?xml '), which raises:

    TypeError: startswith first arg must be str or a tuple of str, not bytes

Return b'' in the no-tag path so both branches are type-consistent and
downstream bytes operations keep working.
@moylop260 moylop260 merged commit 0dfd4fa into OCA:main Apr 16, 2026
18 of 20 checks passed
@moylop260
Collaborator

Thanks, buddy!
