mask missing in ScaledDotProductAttention call #8
Open
thierryherrmann wants to merge 1 commit into keitakurita:master from
Conversation
I'm a newbie in attention mechanisms, but I think the mask arg is missing when calling the ScaledDotProductAttention block.
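For concreteness, here is a minimal sketch of what the fix could look like. The module and argument names (ScaledDotProductAttention, queries/keys/values, mask) come from the snippets in this PR; the layer internals below are simplified stand-ins, not the blog's actual code:

```python
import torch.nn as nn
import torch.nn.functional as F

class ScaledDotProductAttention(nn.Module):
    """Minimal stand-in for the blog's module; the signature is assumed."""
    def forward(self, q, k, v, mask=None):
        scores = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)
        if mask is not None:
            # Positions where mask == 0 are blocked from attending.
            scores = scores.masked_fill(mask == 0, float("-inf"))
        return F.softmax(scores, dim=-1) @ v

class AttentionHead(nn.Module):
    """One attention head; the fix is forwarding `mask` to the block above."""
    def __init__(self, d_model, d_feature):
        super().__init__()
        self.attn = ScaledDotProductAttention()
        self.query = nn.Linear(d_model, d_feature)
        self.key = nn.Linear(d_model, d_feature)
        self.value = nn.Linear(d_model, d_feature)

    def forward(self, queries, keys, values, mask=None):
        Q = self.query(queries)
        K = self.key(keys)
        V = self.value(values)
        # Before the fix, `mask` was silently dropped at this call site.
        return self.attn(Q, K, V, mask=mask)
```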
Also, I noticed that on GitHub the right args are passed in the DecoderBlock when calling the 2nd MultiHeadAttention:

att = self.attn_head(queries=x, keys=enc_out, values=enc_out, mask=tgt_mask)

but in the blog it's still:

att = self.attn_head(queries=att, keys=x, values=x, mask=tgt_mask)

It would be nice to fix it there too.
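And a sketch of how the corrected call fits into the decoder block, reusing the AttentionHead stand-in above (single-head only; layer norm, dropout, and the feed-forward sublayer are omitted, and everything beyond the names DecoderBlock, enc_out, and tgt_mask quoted above is assumed):

```python
import torch

class DecoderBlock(nn.Module):
    """Simplified decoder block; residuals kept, normalization omitted."""
    def __init__(self, d_model):
        super().__init__()
        # Single-head stand-ins for the blog's MultiHeadAttention.
        self.masked_attn_head = AttentionHead(d_model, d_model)
        self.attn_head = AttentionHead(d_model, d_model)

    def forward(self, x, enc_out, tgt_mask=None):
        # 1) Masked self-attention over the decoder input.
        att = self.masked_attn_head(queries=x, keys=x, values=x, mask=tgt_mask)
        x = x + att
        # 2) Encoder-decoder attention: keys and values come from the
        #    encoder output, as in the corrected call quoted above.
        #    Passing tgt_mask here mirrors that quote and assumes equal
        #    source and target lengths.
        att = self.attn_head(queries=x, keys=enc_out, values=enc_out, mask=tgt_mask)
        return x + att

# Quick smoke test of the sketch.
block = DecoderBlock(d_model=64)
x = torch.randn(2, 10, 64)                  # (batch, tgt_len, d_model)
enc_out = torch.randn(2, 10, 64)            # (batch, src_len, d_model)
tgt_mask = torch.tril(torch.ones(10, 10))   # causal mask over target positions
out = block(x, enc_out, tgt_mask=tgt_mask)  # -> shape (2, 10, 64)
```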
Kudos for your fantastic blog! Enlightening!