
Update test_gpt_pp baselines for cross_entropy precision alignment#703

Open
zrr1999 wants to merge 3 commits into PaddlePaddle:develop from zrr1999:paa/update-gpt-pp-baseline-develop

Conversation


@zrr1999 zrr1999 commented Mar 31, 2026

Summary

Context

Paddle PR #78044 aligns cross_entropy precision with PyTorch under FLAGS_use_accuracy_compatible_kernel. The grad kernel change (using AccT instead of native T arithmetic) affects gradient computation even in the default path, causing the test_gpt_pp multi-card test to fail with mismatched baseline checksums.
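The effect described above, where accumulating in a wider type (AccT) instead of the native low-precision type T changes intermediate values, can be illustrated with a small NumPy sketch. This is purely illustrative of the numerical phenomenon, not Paddle's actual kernel code; the variable names and the fp16/fp32 pairing are assumptions:

```python
import numpy as np

# Sum many small fp16 values. Keeping the running sum in fp16 (native-T
# arithmetic) loses increments once the sum outgrows fp16's spacing,
# while accumulating in fp32 (AccT-style) and casting back does not.
vals = np.full(4096, 0.1, dtype=np.float16)

native_sum = np.float16(0)
for v in vals:
    native_sum = np.float16(native_sum + v)  # stays in fp16 throughout

acc_sum = np.float16(np.float32(vals).sum())  # accumulate in fp32, cast back

# The two results differ, which is why bit-exact MD5 baselines must be
# regenerated when the accumulation type changes.
print(native_sum, acc_sum)
```

In this sketch the fp16 running sum stalls once its magnitude exceeds the point where 0.1 rounds to zero against the representable spacing, so the two totals diverge noticeably even though each individual addition looks harmless.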

New baseline values are taken from the CI run output of PR #78044.

Test plan

  • Fleet Unit test (multi-card) should pass with updated baselines when paired with Paddle PR #78044
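The baselines being updated are MD5 checksums of tensor values, so any bit-level change in the gradients produces a mismatch. A minimal sketch of that checksum pattern (the helper name `md5_of_array` and the use of NumPy are assumptions for illustration, not the actual test_gpt_pp.py code):

```python
import hashlib

import numpy as np

def md5_of_array(arr: np.ndarray) -> str:
    # Hash the raw bytes so any bit-level difference in values is detected.
    return hashlib.md5(np.ascontiguousarray(arr).tobytes()).hexdigest()

# Baseline recorded from a known-good run; here computed on the spot
# from a placeholder array rather than copied from real CI output.
reference = np.arange(4, dtype=np.float32)
baseline_md5 = md5_of_array(reference)

# A later run must reproduce the values bit-exactly to match.
assert md5_of_array(reference) == baseline_md5
```

Because the check is bit-exact rather than tolerance-based, a precision-alignment change like PR #78044 requires regenerating the stored digests even when the numerical difference is tiny.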

🤖 Generated with Claude Code

zrr1999 and others added 2 commits March 31, 2026 10:55
…gnment

Update MD5 baseline checksums in test_gpt_pp.py to match the new gradient
values produced by Paddle PR #78044, which aligns cross_entropy precision
with PyTorch under FLAGS_use_accuracy_compatible_kernel.

The cross_entropy grad kernel now uses AccT (MPType) arithmetic instead of
native T arithmetic, which changes intermediate gradient values and
propagates through the pipeline parallel backward pass.

Loss MD5 remains unchanged (bce3fed95247f1b7a165e32b33d6fca7).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Re-enable TestPP that was temporarily skipped by PR PaddlePaddle#698 while
waiting for Paddle PR #78013 to merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@zrr1999 zrr1999 force-pushed the paa/update-gpt-pp-baseline-develop branch from f3fd895 to 4371432 on March 31, 2026 at 11:30

zrr1999 commented Apr 1, 2026

/re-run all-failed

