Summary
magic(5) allows a \^ prefix on a use subroutine identifier to mean "invoke the named subroutine but FLIP the endianness of every read inside it." This is how the same subroutine body can serve both little-endian and big-endian variants of a format.
Current state (after PR #233)
The parser accepts and silently strips the \^ prefix in parse_name_or_use_meta, then dispatches to the named subroutine identically to a bare use name. The endianness flip is not applied -- reads inside the subroutine use whatever endianness the rule body specifies.
This means rules like >0 use \^squashfs (little-endian Squashfs detection invoking the big-endian-defined squashfs subroutine) execute every internal belong/bequad read as big-endian, producing garbage values. The top-level format is detected (the calling rule's 0 string hsqs Squashfs filesystem, little endian, matches), but the subroutine's metadata fields render incorrectly.
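To make the failure mode concrete, here is a minimal sketch of the mis-read. The four bytes are illustrative (a little-endian 16-bit major/minor pair encoding 4.0), not taken from a real image, and `read_u16` is a hypothetical helper, not an API from this codebase.

```rust
/// Byte order used to decode a multi-byte field.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Endianness {
    Little,
    Big,
}

/// Hypothetical helper: decode a u16 at `offset` in the given byte order.
/// Returns None when the buffer is too short.
fn read_u16(buf: &[u8], offset: usize, endian: Endianness) -> Option<u16> {
    let b = buf.get(offset..offset.checked_add(2)?)?;
    let bytes = [b[0], b[1]];
    Some(match endian {
        Endianness::Little => u16::from_le_bytes(bytes),
        Endianness::Big => u16::from_be_bytes(bytes),
    })
}

fn main() {
    // Illustrative little-endian encoding of major = 4, minor = 0.
    let version = [0x04, 0x00, 0x00, 0x00];

    // Current behavior: the read uses the byte order written in the rule body.
    let unflipped = read_u16(&version, 0, Endianness::Big).unwrap();
    // What `use \^name` asks for: the same read with the byte order flipped.
    let flipped = read_u16(&version, 0, Endianness::Little).unwrap();

    println!("unflipped major = {}, flipped major = {}", unflipped, flipped);
}
```

With these bytes the unflipped read yields 1024 and the flipped read yields 4, which is the kind of discrepancy that shows up in the rendered metadata.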
Real-world need
/usr/share/file/magic/filesystems:2206: >0 use \^squashfs -- LE Squashfs uses BE-defined subroutine
/usr/share/file/magic/elf:350: >>0 use \^elf-le -- MSB ELF uses LE-defined subroutine
These are common in any format with both endian variants where the magic-file author wants to share the rule body.
Implementation outline
- AST -- Either:
  - (A) Extend MetaType::Use from Use(String) to Use { name: String, endian_flip: bool }, OR
  - (B) Add endian_flip: bool to RuleEnvironment lookup context
  Option A is cleaner -- the flip is a property of the use-site, not the subroutine.
- Parser -- parse_name_or_use_meta already detects and strips the \^ prefix; have it record the flag instead of dropping it.
- Evaluator -- evaluator/engine/mod.rs::evaluate_use_rule (or wherever subroutine dispatch lives). When endian_flip is true, recursively walk the subroutine's rule tree and toggle every Endianness::Little <-> Endianness::Big, including the endianness carried by TypeKind::Short/Long/Quad/Float/Double/Date/QDate/String16/PString and by OffsetSpec::Indirect.endian and OffsetSpec::Indirect.pointer_type.
  The flip must be applied to a cloned subtree per invocation -- the same subroutine may be invoked both flipped and unflipped from different sites. Don't mutate the shared RuleEnvironment::name_table entry.
  Implementation choice: walk-and-clone at invocation time vs. cache pre-flipped versions. Walk-and-clone is simpler; caching is faster if the same flipped subroutine is invoked many times. Start with walk-and-clone.
- Codegen -- Update serialize_meta_type (or wherever MetaType::Use is serialized) to round-trip the flag.
- Tests -- Synthetic LE Squashfs fixture + LE ELF fixture matched against the system magic database; verify the flipped-read fields render correctly (e.g., "version 1.0" not "version 0.1").
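The parser and evaluator steps above could be sketched roughly as follows, assuming Option A. Every type here is a simplified stand-in: the real MetaType, TypeKind and rule structures have more variants and fields, and `parse_use_target` and `flip_rule` are hypothetical names, not existing functions in the codebase.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Endianness {
    Little,
    Big,
}

impl Endianness {
    fn flipped(self) -> Self {
        match self {
            Endianness::Little => Endianness::Big,
            Endianness::Big => Endianness::Little,
        }
    }
}

/// Option A: the flag is stored on the use-site.
#[derive(Clone, Debug, PartialEq)]
enum MetaType {
    Use { name: String, endian_flip: bool },
}

/// Simplified stand-in; only two typed reads are modeled.
#[derive(Clone, Debug, PartialEq)]
enum TypeKind {
    Short { endian: Endianness },
    Long { endian: Endianness },
    Meta(MetaType),
}

#[derive(Clone, Debug, PartialEq)]
struct Rule {
    offset: u64,
    kind: TypeKind,
    children: Vec<Rule>,
}

/// Hypothetical parser step: record the `\^` prefix instead of dropping it.
fn parse_use_target(raw: &str) -> MetaType {
    match raw.strip_prefix("\\^") {
        Some(name) => MetaType::Use { name: name.to_string(), endian_flip: true },
        None => MetaType::Use { name: raw.to_string(), endian_flip: false },
    }
}

/// Walk-and-clone: build a flipped copy and leave the shared rule untouched.
fn flip_rule(rule: &Rule) -> Rule {
    let kind = match &rule.kind {
        TypeKind::Short { endian } => TypeKind::Short { endian: endian.flipped() },
        TypeKind::Long { endian } => TypeKind::Long { endian: endian.flipped() },
        // Nested use-sites are copied as-is; recursive \^ is out of scope here.
        TypeKind::Meta(meta) => TypeKind::Meta(meta.clone()),
    };
    Rule {
        offset: rule.offset,
        kind,
        children: rule.children.iter().map(flip_rule).collect(),
    }
}

fn main() {
    let call = parse_use_target("\\^squashfs");
    println!("{:?}", call);

    // A shared subroutine body defined with big-endian reads.
    let shared = Rule {
        offset: 0,
        kind: TypeKind::Short { endian: Endianness::Big },
        children: vec![Rule {
            offset: 2,
            kind: TypeKind::Long { endian: Endianness::Big },
            children: vec![],
        }],
    };

    let flipped = flip_rule(&shared);
    // The shared definition is unchanged; only the per-call copy is flipped.
    assert_eq!(shared.kind, TypeKind::Short { endian: Endianness::Big });
    assert_eq!(flipped.children[0].kind, TypeKind::Long { endian: Endianness::Little });
}
```

Because `flip_rule` takes `&Rule` and returns a new value, a flipped and an unflipped call site can share one definition safely. A cache of pre-flipped subtrees could be layered on later without changing this signature.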
Acceptance criteria
- >0 use \^squashfs flips every read endian inside the subroutine call
- [...] Use { endian_flip: true }
- [...] file output
Out of scope
- Recursive \^ (a flipped subroutine that itself uses another \^) -- magic(5) doesn't define this; reject or document.
Refs
- [...] (\^ prefix consumed but not honored; documented in commit body and AGENTS.md)
- [...] use directive