Summary
magic(5) defines flag suffixes on search rules (search/N/<flags>) that alter scan semantics:
/s -- search-start: when the pattern matches at offset match_idx inside the window, the new offset becomes match-START (match_idx) rather than match-END (match_idx + pattern.len()). This affects relative-offset children of the search rule.
/b, /B -- blank-handling variants
/c, /C, /w, /W, /t, /T -- shared with string (see Implement string-type flag semantics (/c /C /w /W /B /b /t /T) #234)
The /s semantics are the most load-bearing for correctness because they change anchor-advance behavior for &N child rules.
Current state (after PR #233)
The parser accepts all flag letters after the search count and then drops them (e.g., search/4261301/s in /usr/share/file/magic/images line 114, search/1/w in /usr/share/file/magic/python line 219). The flags are recognized so the magic file loads, but the evaluator does not alter scan or anchor-advance behavior.
This means search/N/s rules behave identically to search/N (the anchor advances to match-end), which produces wrong offsets for relative-offset children that expect the anchor at match-start. This affects TGA footer detection, regex chains, and several archive scan paths.
Real-world need
/usr/share/file/magic/images:114: search/4261301/s TRUEVISION-XFILE.\0 -- TGA footer; relative-offset children walk backwards from the matched signature.
/usr/share/file/magic/python:219: search/1/w #!\040/usr/bin/python -- shebang detection with whitespace flexibility.
/usr/share/file/magic/macintosh:17: search/2652/b (This\ file\ -- BinHex detection with blank-handling.
/usr/share/file/magic/fonts:260: search/432/s name -- sfnt name table.
Implementation outline
TypeKind::Search { range: NonZeroUsize, flags: SearchFlags }. SearchFlags mirrors the string flag struct from Implement string-type flag semantics (/c /C /w /W /B /b /t /T) #234, plus an explicit start_anchor: bool field for /s.
Parser -- parse_search_suffix in parser/grammar/type_suffix.rs already consumes the flag letters; have it return the flag struct instead of discarding the letters.
read_search in evaluator/types/search.rs handles /c, /C, /w, /W, /b, /B like Implement string-type flag semantics (/c /C /w /W /B /b /t /T) #234 (case + whitespace flexibility on the literal pattern match).
search_bytes_consumed returns match_idx + pattern.len() today (match-end). When /s is set, return match_idx instead so the relative-offset anchor lands at match-start. See GOTCHAS S2.6 for the existing match-end vs window-end fix; this is the next axis.
Codegen -- serialize_type_kind for the new field.
Tests -- Each flag combination needs positive/negative tests; /s needs an explicit relative-offset-child anchor test.
Acceptance criteria
/s makes the resolved anchor land at match-START; relative-offset children read from there
/c, /C, /w, /W, /b, /B alter the literal-pattern match per magic(5)
Round-trip codegen
Conformance against GNU file for the four real-world rules above
Refs
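For reference, the /s anchor-advance change from the implementation outline could be sketched roughly as below. This is only an illustration under stated assumptions, not the project's code: SearchFlags, start_anchor and search_bytes_consumed are names taken from this issue, but their signatures here are guesses, and parse_search_flags, find_in_window and resolve_anchor are hypothetical helpers introduced for the example. It also assumes the search count limits the candidate start positions, and it models only /s (the other flag letters are accepted and ignored).

```rust
/// Flags parsed from `search/N/<flags>`. Only `/s` is modeled in this sketch.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct SearchFlags {
    /// `/s`: leave the anchor at match-start instead of match-end.
    pub start_anchor: bool,
}

/// Hypothetical helper: turn the flag letters into a `SearchFlags`.
/// The other letters listed in the issue are accepted but ignored here;
/// any unknown letter is rejected.
pub fn parse_search_flags(letters: &str) -> Option<SearchFlags> {
    let mut flags = SearchFlags::default();
    for ch in letters.chars() {
        match ch {
            's' => flags.start_anchor = true,
            'c' | 'C' | 'w' | 'W' | 'b' | 'B' | 't' | 'T' => {}
            _ => return None,
        }
    }
    Some(flags)
}

/// Hypothetical helper: index of the first match whose start position is
/// below `range` (an assumed reading of the search count).
pub fn find_in_window(window: &[u8], pattern: &[u8], range: usize) -> Option<usize> {
    if pattern.is_empty() {
        return None; // `windows(0)` would panic
    }
    window
        .windows(pattern.len())
        .take(range)
        .position(|candidate| candidate == pattern)
}

/// Match-end by default; match-start when `/s` is set.
pub fn search_bytes_consumed(match_idx: usize, pattern_len: usize, flags: SearchFlags) -> usize {
    if flags.start_anchor {
        match_idx
    } else {
        match_idx + pattern_len
    }
}

/// Hypothetical helper: absolute anchor that relative-offset children
/// would be resolved against, or `None` when the pattern is not found.
pub fn resolve_anchor(
    base: usize,
    window: &[u8],
    pattern: &[u8],
    range: usize,
    flags: SearchFlags,
) -> Option<usize> {
    let match_idx = find_in_window(window, pattern, range)?;
    Some(base + search_bytes_consumed(match_idx, pattern.len(), flags))
}
```

With the 14-byte buffer b"xxTRUEVISIONyy", the pattern b"TRUEVISION" and a range of 16, resolve_anchor(0, ...) yields Some(12) without /s and Some(2) with /s. That difference is what an explicit relative-offset-child anchor test would need to pin down.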