In machine.py, we do something like this, so that only scores for non-special tokens are bubbled up:
```python
for item in zipped:
    output_ids, scores, sequence_score, attentions = cast(
        Tuple[torch.Tensor, torch.Tensor, Optional[float], Optional[torch.Tensor]], item
    )
    output_tokens: List[str] = []
    output_indices: List[int] = []
    for i, output_id in enumerate(output_ids):
        id = cast(int, output_id.item())
        if id not in all_special_ids:
            output_tokens.append(self.tokenizer.convert_ids_to_tokens(id))
            output_indices.append(i)
    scores = scores[output_indices]
```
In silnlp, we do something similar downstream in hugging_face_config.py:translate().
However, there we take the `sequence_scores` directly from the model outputs, and these scores seem to include the BOS token's score, which is close to 0. Since the sequence score is averaged over the sequence length, including a near-zero term will presumably bias it slightly toward zero, and the effect would be larger for shorter output sequences.
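To illustrate the suspected bias, here is a minimal sketch with made-up per-token log-probabilities (the specific values and the plain averaging are assumptions for illustration, not silnlp's actual computation):

```python
# Hypothetical per-token log-probabilities for two outputs.
# The BOS token is (near-)deterministic, so its log-prob is ~0.
bos_score = -0.0001
short_seq = [bos_score, -1.2, -0.9]             # BOS + 2 real tokens
long_seq = [bos_score, -1.2, -0.9, -1.1, -1.0]  # BOS + 4 real tokens


def mean_score(scores):
    # Length-normalized sequence score: mean of per-token log-probs.
    return sum(scores) / len(scores)


# Including the near-zero BOS score pulls the mean toward zero...
biased_short = mean_score(short_seq)
unbiased_short = mean_score(short_seq[1:])
biased_long = mean_score(long_seq)
unbiased_long = mean_score(long_seq[1:])

# ...and the shift is larger for the shorter sequence, because the
# single near-zero term carries more weight in a shorter average.
short_shift = biased_short - unbiased_short
long_shift = biased_long - unbiased_long
```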
We should confirm that these special token scores are being included in the sequence score and, if they are, update silnlp accordingly (or perhaps even file an issue against transformers).
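If the special-token scores do turn out to be included, one possible downstream fix would be to recompute the sequence score from the per-token scores after the same special-token filtering as above. A plain-Python sketch (the function name and the mean-of-log-probs scoring are hypothetical, not silnlp's actual code):

```python
from typing import List, Set


def filtered_sequence_score(
    token_scores: List[float],  # per-token log-probabilities, aligned with output_ids
    output_ids: List[int],      # generated token ids, same length as token_scores
    all_special_ids: Set[int],  # e.g. BOS/EOS/PAD ids from the tokenizer
) -> float:
    """Average log-probability over non-special tokens only.

    Mirrors the index-filtering loop in machine.py, but also excludes
    the special tokens from the score itself, so a near-zero BOS score
    cannot pull short sequences toward 0.
    """
    kept = [
        score
        for score, token_id in zip(token_scores, output_ids)
        if token_id not in all_special_ids
    ]
    if not kept:
        return float("-inf")
    return sum(kept) / len(kept)
```

For example, with a BOS id of 0 scoring -0.0001, `filtered_sequence_score([-0.0001, -1.2, -0.9], [0, 17, 23], {0})` drops the BOS term and returns the mean of the two remaining scores, ≈ -1.05, rather than ≈ -0.70.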