
Fixing numerical instabilities at long skip connections #31

Open
julianSchilcher wants to merge 1 commit into baofff:main from julianSchilcher:main


@julianSchilcher

I encountered numerical instabilities during training that were caused by the long skip connections, especially in very deep networks. This PR should fix them by normalizing the inputs to the long skip connections (analogous to the MHSA and MLP inputs).
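To illustrate the idea, here is a minimal NumPy sketch, not the actual diff of this PR. It assumes the long skip connection concatenates the current features with the skip branch and projects them back with a linear layer, and that the normalization is a layer norm without learned parameters applied to the concatenated input. The names `layer_norm` and `merge_long_skip`, the shapes, and the `1e4` scale on the skip branch are all hypothetical and chosen only for the demonstration.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance (no learned scale or shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def merge_long_skip(x, skip, weight, bias, normalize=True):
    """Concatenate current features with the skip branch and project back to `dim`.

    x, skip: (tokens, dim); weight: (2 * dim, dim); bias: (dim,)
    """
    merged = np.concatenate([x, skip], axis=-1)
    if normalize:
        # Assumed fix: normalize the input of the skip projection.
        merged = layer_norm(merged)
    return merged @ weight + bias


rng = np.random.default_rng(0)
tokens, dim = 4, 8
x = rng.normal(size=(tokens, dim))
# Hypothetical skip branch whose activations have grown large in a deep network.
skip = 1e4 * rng.normal(size=(tokens, dim))
weight = rng.normal(size=(2 * dim, dim)) / np.sqrt(2 * dim)
bias = np.zeros(dim)

raw = merge_long_skip(x, skip, weight, bias, normalize=False)
normed = merge_long_skip(x, skip, weight, bias, normalize=True)
print("max |output| without normalization:", np.abs(raw).max())
print("max |output| with normalization:   ", np.abs(normed).max())
```

Without the normalization, the magnitude of the merged output follows the magnitude of the skip branch. With it, the output stays at a scale set by the projection weights alone, which is the property the PR relies on.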

