Fix batch collator padding for training with batch size > 1 #36
Open
stepan-omelka wants to merge 2 commits into antoniorv6:master from
Conversation
Owner
All the changes seem good to me. However, have you tested the full-page scenario? Note that this program currently covers both cases (system-level and full-page). Could you post results for that other scenario so we can merge?
antoniorv6 reviewed on Mar 12, 2026
Owner antoniorv6 left a comment:
Waiting until full-page results are presented.
Contributor (Author)
Hi, I tried to run the fine-tuning, but I am repeatedly running into an error (I created an issue for it). I also tried to run the pretraining and fine-tuning. Until the issue is solved, I am unable to actually test the fine-tuning with an increased batch size.
f4eb9ef to cc786a6
[bug]
When running the training script with a batch size greater than 1,
the process crashed due to mismatched tensor lengths in the decoder
input and ground truth targets.
This change ensures all sequence tensors within a batch are dynamically
padded to the maximum sequence length using the dataset's padding token.
As a result, the model can safely process batches larger than 1 without
encountering tensor dimension conflicts during training or validation.
Adds BatchCollator for dynamic sequence padding.
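The dynamic padding described in the commit message can be sketched roughly as follows. This is an illustrative sketch, not the repository's actual `BatchCollator`: the `pad_batch` helper, the `pad_token` value, and the plain-list sequences are all assumptions made for the example.

```python
# Hypothetical sketch of dynamic batch padding (names and pad token assumed).
def pad_batch(sequences, pad_token=0):
    """Pad every sequence in a batch to the length of the longest one,
    so stacked decoder inputs and ground-truth targets have equal lengths."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_token] * (max_len - len(seq)) for seq in sequences]

batch = [[5, 7, 9], [3, 4], [8]]
padded = pad_batch(batch)
# padded == [[5, 7, 9], [3, 4, 0], [8, 0, 0]]
```

In a real collator the padded lists would typically be stacked into a tensor, and the padding token would come from the dataset's vocabulary rather than being hard-coded.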
