
Update test_gpt_pp baselines for cross_entropy precision alignment#702

Open
zrr1999 wants to merge 1 commit into PaddlePaddle:release/0.2 from zrr1999:paa/update-gpt-pp-baseline

Conversation

@zrr1999
Member

@zrr1999 zrr1999 commented Mar 31, 2026

Summary

Dev PR: #703

Context

Paddle PR #78044 aligns cross_entropy precision with PyTorch under FLAGS_use_accuracy_compatible_kernel. The grad kernel change (using AccT instead of native T arithmetic) affects gradient computation even in the default path, causing the test_gpt_pp multi-card test to fail with mismatched baseline checksums.

New baseline values are taken from the CI run output of PR #78044.
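To illustrate why the kernel change shifts the baselines, here is a minimal NumPy sketch (not Paddle's actual kernel) of a softmax cross-entropy gradient computed entirely in float16 ("native T" arithmetic) versus computed in float32 and cast back at the end ("AccT"/MPType arithmetic). The function names and sample logits are illustrative assumptions; the point is that the two rounding paths can produce different float16 bit patterns, which then propagate through the backward pass and change the checksummed gradients.

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1, -3.0], dtype=np.float16)
label = 0  # one-hot target index

def grad_native_t(x, y):
    # All intermediate arithmetic stays in float16 (the old "T" path).
    e = np.exp(x - x.max())
    p = e / e.sum()
    g = p.copy()
    g[y] -= np.float16(1.0)
    return g

def grad_acct(x, y):
    # Intermediates computed in float32 ("AccT"), cast to float16 at the end.
    x32 = x.astype(np.float32)
    e = np.exp(x32 - x32.max())
    p = e / e.sum()
    g = p.copy()
    g[y] -= 1.0
    return g.astype(np.float16)

g_t = grad_native_t(logits, label)
g_acc = grad_acct(logits, label)
print(g_t)
print(g_acc)
```

Both gradients sum to roughly zero (softmax probabilities sum to 1, minus the one-hot target), but their low-order bits may differ, which is all it takes to break an exact MD5 comparison.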

Test plan

  • The Fleet unit test (multi-card) should pass with the updated baselines when paired with Paddle PR #78044

🤖 Generated with Claude Code

Update test_gpt_pp baselines for cross_entropy precision alignment

Update MD5 baseline checksums in test_gpt_pp.py to match the new gradient
values produced by Paddle PR #78044, which aligns cross_entropy precision
with PyTorch under FLAGS_use_accuracy_compatible_kernel.

The cross_entropy grad kernel now uses AccT (MPType) arithmetic instead of
native T arithmetic, which changes intermediate gradient values and
propagates through the pipeline parallel backward pass.

Loss MD5 remains unchanged (bce3fed95247f1b7a165e32b33d6fca7).
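The baseline mechanism described above can be sketched as follows. This is a hypothetical illustration of the checksum-baseline pattern, not the actual code in test_gpt_pp.py: the test serializes numerical outputs deterministically, hashes the bytes, and compares the digest against a stored MD5 string. All names and values below are assumptions for illustration.

```python
import hashlib

def md5_of_values(values):
    # Serialize floats with a fixed format so the byte stream, and hence
    # the digest, is deterministic across runs.
    data = ",".join(f"{v:.8f}" for v in values).encode("utf-8")
    return hashlib.md5(data).hexdigest()

# Stored baseline digest (illustrative values, not the real baselines).
baseline_md5 = md5_of_values([0.12345678, -0.87654321])

# Any numerical change in the kernel (e.g. switching grad arithmetic from
# native T to AccT) alters the serialized gradient values, so the digest
# stops matching and the baseline must be regenerated from a trusted run.
current_md5 = md5_of_values([0.12345678, -0.87654321])
print(current_md5 == baseline_md5)
```

This explains why the loss MD5 can stay unchanged while the gradient MD5s move: the forward loss values round to the same serialized bytes, but the gradients do not.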

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
