Remove unnecessary tied weights for Seamless M4T? #42377
base: main
Conversation
run-slow: seamless_m4t
This comment contains models: ["models/seamless_m4t"]
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
CI Results: ✅ No failing test specific to this PR 🎉!
Integration tests are actually failing: https://github.com/huggingface/transformers/actions/runs/19661588575/job/56309175315#step:14:21132, but as mentioned above, this was already the case before the fix.
[For maintainers] Suggested jobs to run (before merge): run-slow: seamless_m4t
Hi! Likely linked to #42385 as well.
ArthurZucker left a comment
Ty for having a look! This one is a bit bigger, so I'd be happy if we had a more general solution.
```python
def tie_weights(self, missing_keys: Optional[set[str]] = None, recompute_mapping: bool = True):
    """We need to overload here to handle the wrong key saved in some main checkpoints."""
    if self.config.tie_word_embeddings:
        # Some checkpoints like "facebook/hf-seamless-m4t-medium" store the embedding weight
        # under decoder.embed_tokens, so we need to check for that here.
        if missing_keys is not None:
            if "lm_head.weight" in missing_keys and "model.decoder.embed_tokens.weight" not in missing_keys:
                self.lm_head.weight = self.decoder.embed_tokens.weight
                missing_keys.discard("lm_head.weight")

    # needs to be done after, otherwise it raises an error because the correct weights are not present
    super().tie_weights(missing_keys=missing_keys, recompute_mapping=recompute_mapping)
```
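For reference, a minimal sanity check one could run after loading (a sketch only: it assumes the `facebook/hf-seamless-m4t-medium` checkpoint mentioned in the comment and the standard `SeamlessM4TForTextToText` accessors; not part of this PR):

```python
from transformers import SeamlessM4TForTextToText

# Sketch: after loading, lm_head.weight should share storage with the decoder
# embedding weight instead of being reported as a missing key.
model = SeamlessM4TForTextToText.from_pretrained("facebook/hf-seamless-m4t-medium")
tied = model.lm_head.weight.data_ptr() == model.get_input_embeddings().weight.data_ptr()
print("lm_head tied to embeddings:", tied)
```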
Yep, this makes sense, though I think the explicit mapping can basically work in reverse: #42362.
It affects more models!
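For illustration only, a toy sketch of that reverse idea (hypothetical names, not the actual #42362 implementation): given an explicit mapping of tied keys, a loader could resolve missing keys in either direction, whichever side the checkpoint happens to store.

```python
# Hypothetical illustration of resolving tied keys in both directions.
# The mapping and helper below are made up for this sketch and do not
# reflect the actual Transformers internals or the #42362 implementation.
TIED_WEIGHTS_MAP = {"lm_head.weight": "model.decoder.embed_tokens.weight"}


def resolve_tied_keys(missing_keys: set[str], loaded_keys: set[str]) -> set[str]:
    resolved = set(missing_keys)
    for target, source in TIED_WEIGHTS_MAP.items():
        # forward direction: the target can be recreated from the loaded source
        if target in resolved and source in loaded_keys:
            resolved.discard(target)
        # reverse direction: the source can be recreated from the loaded target
        if source in resolved and target in loaded_keys:
            resolved.discard(source)
    return resolved


# e.g. a checkpoint that only stores the decoder embeddings:
print(resolve_tied_keys({"lm_head.weight"}, {"model.decoder.embed_tokens.weight"}))  # -> set()
```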
Converting to draft for now; we'll essentially close it once the reverse-mapping PR lands. It seems more models are affected, so it's not worth creating exceptions everywhere!
What does this PR do?
The tied-weights refactoring from #41580 may have introduced an error for the Seamless M4T model. Opening this PR so we don't lose track of it.
Perhaps #42362 will address it?
Modeling tests before fix (11 errors)
Modeling tests after fix (7 errors)
Essentially, such errors are resolved.
But integration tests are still failing, which is probably better addressed in another PR?
cc: @vasqu, @ArthurZucker