Remove storing gradient of freezed language model parameter #4

Open
floatingsun wants to merge 2 commits into Quentin-Anthony:master from floatingsun:master

@floatingsun

Language model parameters other than the adapter are never updated, so we shouldn't store their gradients. Keeping them wastes a lot of GPU memory when using a large LM.
A test shows that, before this change, all parameters of the language model require grad.
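
For readers who want to reproduce the idea, here is a minimal sketch, assuming PyTorch. It is not the diff in this PR: `TinyLMWithAdapter`, `freeze_except`, and the `"adapter"` name filter are hypothetical stand-ins chosen for illustration. It turns off `requires_grad` for everything except the adapter, runs a backward pass, and then lists which parameters actually hold a gradient.

```python
import torch
from torch import nn


class TinyLMWithAdapter(nn.Module):
    """Hypothetical stand-in for a language model with an adapter."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.backbone = nn.Linear(dim, dim)  # plays the role of the frozen LM weights
        self.adapter = nn.Linear(dim, dim)   # the only part meant to be trained

    def forward(self, x):
        h = self.backbone(x)
        return h + self.adapter(h)


def freeze_except(model: nn.Module, keyword: str = "adapter") -> None:
    """Keep gradients only for parameters whose name contains `keyword`."""
    for name, param in model.named_parameters():
        param.requires_grad = keyword in name


model = TinyLMWithAdapter()
freeze_except(model)

# One backward pass: frozen parameters should never get a .grad tensor.
loss = model(torch.randn(2, 8)).sum()
loss.backward()

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
with_grad = [n for n, p in model.named_parameters() if p.grad is not None]
print("trainable:", trainable)
print("holding a gradient:", with_grad)
```

If the freeze works, both lists contain only the adapter parameters, and the backbone's `.grad` stays `None`, so no gradient memory is allocated for it.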

@floatingsun floatingsun changed the title fix freeze language parameter Remove storing gradient of freezed language parameter Mar 28, 2023
@floatingsun floatingsun changed the title Remove storing gradient of freezed language parameter Remove storing gradient of freezed language model parameter Mar 28, 2023