forked from apple/axlearn
Forks: 6
Pull requests: patrick-toulme/axlearn
resume training with next batch of data
#32 opened Dec 11, 2024 by aws-zhenguo (Collaborator)
logit_bias support for NEW_UNSHARDED_ATTN_KERNEL
#24 opened Dec 4, 2024 by HahTK (Collaborator)
tokens_per_batch fixed to take into account DP and micro-batch accumu…
#13 opened Oct 25, 2024 by amithrm (Collaborator)
Gradient accumulation with single graph and lax.scan
#6 opened Apr 11, 2024 by apoorvtintin (Collaborator)
Refactor neuron changes to make it compatible with FS repo
#4 opened Mar 27, 2024 by aws-mengchiy (Collaborator)
gradient accumulation using optax multisteps*
#2 opened Mar 25, 2024 by apoorvtintin (Collaborator)