Hello, I have a question about the following line of code.
_, mlm_tgt_encodings, *_ = self.utt_encoder.bert(context_mlm_targets[ctx_mlm_mask], context_utts_attn_mask[ctx_mlm_mask])
context_mlm_targets[ctx_mlm_mask] holds the utterance token IDs from before [MASK] replacement.
context_utts_attn_mask[ctx_mlm_mask] holds the attention mask from after [MASK] replacement.
So the two inputs don't seem to match.
Why isn't the attention mask recalculated for the unmasked targets?
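
To make the question concrete, here is a minimal sketch of what I mean by recalculating the mask. It uses plain Python lists instead of tensors, and it assumes the attention mask simply marks non-padding positions, with padding id 0 and [MASK] id 103 as in bert-base-uncased. The helper `attention_mask` and the toy batch are hypothetical and not taken from the repository.

```python
PAD_ID = 0     # assumed padding token id (0 in bert-base-uncased)
MASK_ID = 103  # assumed [MASK] token id (103 in bert-base-uncased)


def attention_mask(token_ids, pad_id=PAD_ID):
    """Return 1 for real tokens and 0 for padding, row by row."""
    return [[int(tok != pad_id) for tok in row] for row in token_ids]


# Hypothetical batch: two utterances, padded to length 6.
targets = [
    [101, 7592, 2088, 102, 0, 0],
    [101, 2129, 2024, 2017, 102, 0],
]

# The same batch after one token per utterance is replaced by [MASK].
masked = [
    [101, MASK_ID, 2088, 102, 0, 0],
    [101, 2129, 2024, MASK_ID, 102, 0],
]

# Recalculate the mask from each version and compare them.
mask_from_targets = attention_mask(targets)
mask_from_masked = attention_mask(masked)
masks_match = mask_from_targets == mask_from_masked
print(masks_match)
```

In this toy case the two masks coincide, because [MASK] only replaces non-padding tokens. Whether that also holds for context_utts_attn_mask depends on how it is actually built in the code, which is what I would like to confirm.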