Issue with Softmax in _coordinate_selection Leading to Saturated Outputs #15

@dezhi0730

Hello,

Thank you for maintaining this repository and for the effort you've put into it. While working with the model, I ran into an issue with the softmax in the _coordinate_selection function. The softmax output is often extremely saturated: one element of the position_probs tensor is 1 and all the others are 0. This is unexpected and may be causing problems when selecting edit positions.
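
A minimal plain-Python sketch of the kind of saturation described above, assuming it comes from logits with a large spread. The logit values, the `softmax` helper, and the `temperature` parameter are made up for illustration and are not taken from the repository; dividing by a temperature is only one possible way to soften the distribution.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract the max to avoid overflow
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits with one dominant entry (not real model output).
logits = [2.0, 40.0, 5.0, 1.0]

saturated = softmax(logits)                    # looks one-hot after rounding
softened = softmax(logits, temperature=20.0)   # mass spread over all positions

print([round(p, 4) for p in saturated])
print([round(p, 4) for p in softened])
```

If the real logits behave like this, printing their minimum and maximum just before the softmax call should show whether the spread is the cause.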

Issue Details:

  • The issue occurs in the _coordinate_selection function.
  • After softmax(dim=-1) is applied to the position_probs tensor, one element is 1 and all the others are 0.
  • As a result, the position with probability 1 is always selected, and the remaining edit positions are chosen essentially at random, which is likely not the desired outcome.
  • When my is_corrupted tensor targets a specific region, such as the first half of tokenized_seq, the second half of the sequence still changes.
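
For reference, here is a plain-Python sketch of the behaviour I expected from the last point, under the assumption that is_corrupted is meant to restrict which positions can be edited. I have not confirmed that this is how _coordinate_selection is intended to work. `masked_softmax` and `sample_positions` are hypothetical helper names, and the logits are invented.

```python
import math
import random

def masked_softmax(logits, allowed):
    """Softmax that gives zero probability to positions where allowed is False."""
    if not any(allowed):
        raise ValueError("at least one position must be allowed")
    masked = [x if ok else -math.inf for x, ok in zip(logits, allowed)]
    m = max(masked)
    exps = [math.exp(x - m) for x in masked]   # exp(-inf) is 0.0
    total = sum(exps)
    return [e / total for e in exps]

def sample_positions(probs, k, rng):
    """Draw up to k distinct positions, weighted by probability."""
    weights = list(probs)
    chosen = []
    for _ in range(k):
        candidates = [i for i, w in enumerate(weights) if w > 0.0]
        if not candidates:
            break                               # no probability mass left
        idx = rng.choices(candidates,
                          weights=[weights[i] for i in candidates], k=1)[0]
        chosen.append(idx)
        weights[idx] = 0.0                      # sample without replacement
    return chosen

# Hypothetical example: only the first half of the sequence may be edited.
logits = [0.3, 1.2, -0.5, 0.8, 2.0, 1.5, 0.1, 0.9]
is_corrupted = [True] * 4 + [False] * 4

probs = masked_softmax(logits, is_corrupted)
positions = sample_positions(probs, k=3, rng=random.Random(0))
print(positions)
```

With a mask like this, every sampled position falls inside the first half. Note that if the probabilities were saturated to a single 1, the loop would stop after one draw because no probability mass remains for the other positions.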

Example (three screenshots):

Please feel free to reach out if further clarification is needed.

Best regards.
