
[Dev] Add Llama3 training example and fix cache save #14

Merged
jiahy0825 merged 4 commits into SandAI-org:main from wtr0504:dev/training
Apr 2, 2026
Conversation

@wtr0504 (Collaborator) commented Apr 1, 2026

🗂️ PR Category

  • ✨ New Feature
  • 🚀 Optimization (performance, memory, etc.)
  • 💥 Breaking Change
  • 🐛 Bug Fix
  • 🛠️ Development / Refactoring
  • 📚 Documentation
  • 🧹 Chore (Dependencies, CI/CD, Configuration, etc.)
  • 🧪 Testing

📝 Description

Summary

  • Add end-to-end Llama3 training example (example/training/) with FSDP support, a distributed training script, and an Nsys profiling launch script.
  • Fix a cache save bug where aot_autograd artifacts were empty, causing compiled graphs to fail to persist correctly.
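The class of bug fixed here can be illustrated with a minimal sketch: a cache writer that refuses to persist entries whose aot_autograd artifacts came back empty, instead of silently writing an unusable cache file. All names below (`save_cache_entry`, the JSON layout) are hypothetical illustrations, not the actual magi_compiler API.

```python
import json


def save_cache_entry(path, graph_key, aot_artifacts):
    """Persist compiled-graph artifacts, refusing empty payloads.

    Before a fix like this, an empty artifact dict would still be
    written to disk, so a later cache load would "succeed" but yield
    no usable compiled graph. Skipping the write forces a clean
    recompile on the next run instead.
    """
    if not aot_artifacts:
        return False  # nothing worth caching; leave no file behind
    with open(path, "w") as f:
        json.dump({"key": graph_key, "artifacts": aot_artifacts}, f)
    return True
```

A caller can then treat a `False` return as "cache miss on next load" rather than discovering the corruption later at graph-load time.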

Changes

  • example/training/llama3.py — Llama3 model definition adapted to use magi_compile
  • example/training/train.py — distributed training loop with FSDP and NVTX profiling hooks
  • example/training/train.sh — torchrun launcher with optional Nsys profiling
  • magi_compiler/magi_backend/piecewise_compiler.py — workaround for empty aot_autograd artifacts on cache save
  • magi_compiler/utils/nvtx.py — per-iteration NVTX profiling helper
  • requirements-test.txt — update test dependencies
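The NVTX iteration hooks can be sketched roughly as below. `nvtx_range` is a hypothetical name for illustration; the real helper in `magi_compiler/utils/nvtx.py` may be shaped differently. The guard lets the same training loop run (and be tested) on machines without CUDA, where NVTX ranges are simply no-ops.

```python
from contextlib import contextmanager

# Fall back to no-op ranges when torch or CUDA is unavailable, so the
# training loop does not depend on profiling being possible.
try:
    import torch
    _HAS_CUDA = torch.cuda.is_available()
except ImportError:
    _HAS_CUDA = False


@contextmanager
def nvtx_range(name):
    """Wrap a code region in an NVTX range visible in Nsight Systems."""
    if _HAS_CUDA:
        torch.cuda.nvtx.range_push(name)
    try:
        yield
    finally:
        if _HAS_CUDA:
            torch.cuda.nvtx.range_pop()
```

In a training script this would typically wrap each iteration, e.g. `with nvtx_range(f"iter_{step}"): ...`, so that per-step timelines line up in the Nsys trace produced by the launch script.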

@jiahy0825 (Collaborator) left a comment

LGTM

@jiahy0825 jiahy0825 merged commit 8f931af into SandAI-org:main Apr 2, 2026
3 checks passed