@bfarzin commented Jul 19, 2019

Adds an init step before training the Transformer in the 8-Translation notebook, with a markdown note explaining why. With the init, the model trains better, even without label smoothing.
