Rename MLM teacher arg #829
Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>
Walkthrough
Two minor parameter updates: one replaces a command-line flag name for teacher model configuration export in a shell script, and another updates the default top-k value for logits selection in a distillation plugin class.
Estimated code review effort: 1 (Trivial), ~5 minutes
Pre-merge checks: 2 passed, 1 failed (1 warning)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
modelopt/torch/distill/plugins/megatron.py (1)
380-386: Confirm the new default top_k=1024 is safe for all supported vocabs.
This default change can alter loss behavior or assert for small vocab sizes if any code relies on the default. Please verify usages and update any configs/tests/docs that assumed 1000.
#!/bin/bash
# Find instantiations relying on the default and config-driven top_k settings.
rg -n -C2 -- 'TopKLogitsKLLoss\(' -g'*.py'
rg -n -C2 -- 'logit_kl_topk' -g'*.py' -g'*.yaml' -g'*.yml' -g'*.conf'
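To illustrate why a larger default could matter for small vocabularies, here is a minimal pure-Python sketch of a top-k restricted KL divergence. It is not the TopKLogitsKLLoss implementation in modelopt/torch/distill/plugins/megatron.py: the function names, the renormalization over the selected entries, and the explicit ValueError are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def topk_kl(teacher_logits, student_logits, top_k=1024):
    """KL(teacher || student) over the teacher's top_k logits.

    Hypothetical helper: both distributions are renormalized over the
    selected indices, which may differ from the real plugin's behavior.
    """
    vocab_size = len(teacher_logits)
    if top_k > vocab_size:
        # Assumed guard: a default larger than the vocabulary cannot be honored.
        raise ValueError(f"top_k={top_k} exceeds vocab size {vocab_size}")
    # Indices of the teacher's top_k largest logits.
    idx = sorted(range(vocab_size), key=lambda i: teacher_logits[i], reverse=True)[:top_k]
    t = log_softmax([teacher_logits[i] for i in idx])
    s = log_softmax([student_logits[i] for i in idx])
    return sum(math.exp(lt) * (lt - ls) for lt, ls in zip(t, s))
```

In this sketch, a caller with a vocabulary smaller than 1024 that relies on the default would hit the guard, which is the kind of usage the search above is meant to surface.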
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #829   +/-   ##
=======================================
  Coverage   73.82%   73.82%
=======================================
  Files         193      193
  Lines       19745    19745
=======================================
  Hits        14577    14577
  Misses       5168     5168
What does this PR do?
Type of change: Refactor
Overview: The MLM arg was renamed from --teacher-model-config to --export-kd-teacher-model-config for consistency.
Usage
# Add a code snippet demonstrating how to use this
Testing
Before your PR is "Ready for review"
Additional Information