Dream merge: dynamic platform detection + canonical MIL syntax by codegen-sh[bot] · Pull Request #5 · dermitchell1993/ANE

codegen-sh · 2026-03-04T03:30:39Z

What This Does

Unifies PR #3 (M1/M2 canonical verbose MIL syntax + fp16 I/O fallback) with PR #4 (runtime chip/OS detection) into a single solution that works everywhere AND optimizes per-platform.

The Core Insight

PR #3 and PR #4 address orthogonal dimensions:

PR #3 (fix: MIL syntax + M1/M2 backward compatibility): syntax compatibility, i.e. canonical verbose MIL that compiles on every CoreML version
PR #4 (Fix compile-breaking ane_print_platform signature + hardcoded peak TFLOPS): target optimization, i.e. detect the chip/OS and emit the best MIL target

By combining them, we get both. No conflicts, no compromises.

Changes (16 files, +377 / -98)

New:

training/ane_compat.h — Runtime platform detection (chip family M1→M5, macOS 13→15+, peak TFLOPS, MIL target selection)

Converted across 15 files:

38× program(1.0) → program(%s) with g_ane_platform.mil_program
44× func main<ios16> → func main<%s> with ane_mil_target()
1× hardcoded 0.019 TFLOPS → ane_peak_tflops() (chip-specific)
6× appendString:MIL_HDR → appendFormat:MIL_HDR, g_ane_platform.mil_program
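
The conversions listed above replace literal version strings with format arguments. A minimal sketch of that idea in plain C follows. The `ane_platform_t` struct and `build_mil_header` helper are hypothetical names used only for illustration; the actual `g_ane_platform` fields and `ane_mil_target()` are defined in training/ane_compat.h and may look different. Only the two fragments named in this PR are emitted, and the buildInfo dict and function signature are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the platform record kept in ane_compat.h. */
typedef struct {
    const char *mil_program; /* e.g. "1.0" or "1.3" */
    const char *mil_target;  /* e.g. "ios16", "ios17", "ios18" */
} ane_platform_t;

/* Writes the two version-dependent MIL fragments into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
static int build_mil_header(char *buf, size_t cap, const ane_platform_t *p) {
    assert(buf != NULL && p != NULL);
    int n = snprintf(buf, cap, "program(%s)\nfunc main<%s>\n",
                     p->mil_program, p->mil_target);
    if (n < 0 || (size_t)n >= cap) {
        return -1; /* output was truncated or formatting failed */
    }
    return n;
}
```

Checking the `snprintf` return value matters here because a truncated MIL string would otherwise fail later, at compile time, with a less obvious error.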

Preserved:

PR #3's fp16 I/O auto-retry mechanism for M1/M2 (cast op fallback)
Canonical verbose buildInfo syntax (universal CoreML compatibility)
int g_fp16_io = 0 linker fix in train.m

Platform Detection Matrix

macOS	MIL Target	Program	Example Chips
13 (Ventura)	ios16	1.0	M1, M2
14 (Sonoma)	ios17	1.0	M2, M3
15+ (Sequoia)	ios18	1.3	M3, M4, M5
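
The matrix above can be read as a lookup keyed on the macOS major version. The sketch below is one way to express it in C; `mil_selection_t` and `select_mil` are hypothetical names, not the API of ane_compat.h, and the fallback for versions below 13 is an assumption (the matrix does not cover them).

```c
#include <assert.h>

/* Hypothetical result type: one row of the detection matrix. */
typedef struct {
    const char *mil_target;  /* value for func main<...> */
    const char *mil_program; /* value for program(...) */
} mil_selection_t;

/* Maps a macOS major version to a MIL target and program version.
 * Assumption: anything older than macOS 14 uses the most conservative row. */
static mil_selection_t select_mil(int macos_major) {
    assert(macos_major > 0);
    mil_selection_t s;
    if (macos_major >= 15) {
        s.mil_target = "ios18";
        s.mil_program = "1.3";
    } else if (macos_major == 14) {
        s.mil_target = "ios17";
        s.mil_program = "1.0";
    } else {
        s.mil_target = "ios16";
        s.mil_program = "1.0";
    }
    return s;
}
```

Note that this sketch keys on the OS version alone, as the matrix does; the chip column is listed there only as examples.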

TFLOPS (FP16 est.)

M1: 5.5 → M2: 7.9 → M3: 9.0 → M4: 15.8 → M5: 19.0 (Ultra variants 2×)
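
As an illustration of how the estimates above could replace a single hardcoded constant, here is a small table-driven sketch in C. The enum and the function `peak_tflops_fp16` are hypothetical; the real `ane_peak_tflops()` in ane_compat.h may be structured differently. The values are the estimates quoted in this PR, not measurements.

```c
#include <assert.h>

/* Hypothetical chip family enum, ordered to index the table below. */
typedef enum { CHIP_M1, CHIP_M2, CHIP_M3, CHIP_M4, CHIP_M5, CHIP_COUNT } chip_family_t;

/* Returns the estimated FP16 peak in TFLOPS for a chip family.
 * Ultra variants are treated as 2x the base figure, per the PR description. */
static double peak_tflops_fp16(chip_family_t chip, int is_ultra) {
    static const double base[CHIP_COUNT] = {5.5, 7.9, 9.0, 15.8, 19.0};
    assert((int)chip >= 0 && (int)chip < CHIP_COUNT);
    return is_ultra ? 2.0 * base[chip] : base[chip];
}
```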

Supersedes #3 and #4.


Port upstream PR #6 (imperatormk) - fixes MIL scalar type syntax from M4-only shorthand to canonical verbose format that compiles on all Apple Silicon (M1/M2/M3/M4).

Changes:
- program(1.3) to program(1.0), ios18 to ios16 target
- Scalar type shorthand to canonical verbose format
- Simplified buildInfo dict (no M4-specific version strings)
- fp16 I/O fallback: g_fp16_io flag with auto-retry on compile failure for M1/M2 where cast op is unsupported
- Dynamic IOSurface byte calculation (bpe: 2 for fp16, 4 for fp32)

Tested on M1 Pro, macOS 26.3 (per upstream PR author).

train.m includes ane_mil_gen.h (via backward.h -> model.h) which declares extern int g_fp16_io, but train.m never defined it -- producing an undefined symbol linker error.

Changes:
- train.m: add g_fp16_io = 0 at file scope, wrap model_compile_kernels with auto-retry (try fp32, on fail set g_fp16_io=1, retry fp16)
- model.h: compile_conv_kernel IOSurface byte calculation now uses g_fp16_io ? 2 : 4 (was hardcoded to 4)
- .gitignore: add train binary + test/probe binaries
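
The retry and byte-size logic described in this commit can be sketched in plain C as follows. This is an illustration under stated assumptions, not the code in train.m or model.h: `compile_fn`, `compile_with_fp16_fallback`, `io_surface_bytes`, and the simulated compiler are hypothetical, and only the `g_fp16_io` flag and the `g_fp16_io ? 2 : 4` rule come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope definition, matching the extern declaration the commit mentions.
 * 0 = fp32 I/O, 1 = fp16 I/O. */
int g_fp16_io = 0;

/* IOSurface size in bytes: 2 bytes per element for fp16, 4 for fp32. */
static size_t io_surface_bytes(size_t n_elements) {
    return n_elements * (g_fp16_io ? 2u : 4u);
}

/* Hypothetical compile callback: returns 0 on success, nonzero on failure. */
typedef int (*compile_fn)(void *ctx);

/* Tries fp32 I/O first; on failure switches the flag and retries with fp16. */
static int compile_with_fp16_fallback(compile_fn compile, void *ctx) {
    assert(compile != NULL);
    g_fp16_io = 0;
    if (compile(ctx) == 0) {
        return 0;
    }
    g_fp16_io = 1;
    return compile(ctx);
}

/* Simulated compiler for illustration only: rejects fp32 I/O, which is the
 * behaviour the commit describes for M1/M2. ctx counts the attempts made. */
static int simulated_m1_compile(void *ctx) {
    int *attempts = ctx;
    (*attempts)++;
    return g_fp16_io ? 0 : 1;
}
```

Because the flag is global, any buffer sizes computed before the retry must be recomputed afterwards; sizing through a single helper such as `io_surface_bytes` keeps the two in step.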

Integrates both PR #3 (M1/M2 canonical verbose MIL syntax + fp16 I/O fallback) and PR #4 (runtime chip/OS detection via ane_compat.h) into a unified solution that works everywhere AND optimizes per-platform.

Changes across 16 files:
- Add training/ane_compat.h: runtime platform detection library (chip family, macOS version, MIL target selection, peak TFLOPS)
- Convert all 38 hardcoded program(1.0) -> program(%s) with g_ane_platform.mil_program dynamic argument
- Convert all 44 hardcoded func main<ios16> -> func main<%s> with ane_mil_target() dynamic argument
- Replace hardcoded 0.019 TFLOPS constant with ane_peak_tflops()
- Add #include ane_compat.h and platform init to 14 consumer files
- Preserve PR #3's fp16 I/O auto-retry mechanism for M1/M2
- Use canonical verbose buildInfo syntax (universal compatibility)

Co-authored-by: dermitchell1993 <dmitchell1993@aliasvault.net>

codegen-sh bot assigned dermitchell1993 Mar 4, 2026

codegen-sh bot mentioned this pull request Mar 4, 2026

fix: MIL syntax + M1/M2 backward compatibility #3

Closed

codegen-sh bot and others added 3 commits March 5, 2026 15:29

codegen-sh bot force-pushed the codegen-bot/dream-merge-compat-detect-b4e9c2 branch from 929cbe2 to ffd8272 Compare March 5, 2026 15:30


codegen-sh[bot] wants to merge 3 commits into main from codegen-bot/dream-merge-compat-detect-b4e9c2
