feat(training): add 1B FineWeb-Edu training script with stochastic-de…

9af8361
Open

Model improvements: FSDP dtype + MoE dispatch + ACT deadlock fix + stochastic-depth (Option B) #56

Workflow runs completed with no jobs