[SOT][3.14] Update the internal function to use PyObject_GC_Track replace _PyObject_GC_TRACK#78558

Merged
SigureMo merged 4 commits into PaddlePaddle:develop from gouzil:sot/fix_314x_build_error
Apr 2, 2026

Conversation

@gouzil
Member

@gouzil gouzil commented Apr 1, 2026

PR Category

Execute Infrastructure

PR Types

Bug fixes

Description

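  • Fix the compilation error for the internal function _PyGen_GetGeneratorFromFrame in debug mode
  • Fix the error "undefined symbol: _Py_TriggerGC", caused by the internal function _PyObject_GC_TRACK pulling in further internal functions through macro expansion. Use the public API PyObject_GC_Track instead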

Related links

Does this cause precision changes

Copilot AI review requested due to automatic review settings April 1, 2026 16:15
@paddle-bot

paddle-bot bot commented Apr 1, 2026

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes CPython 3.14 internal-API integration issues in Paddle’s SOT pybind layer, addressing compilation problems in debug builds by adjusting which CPython headers/APIs are used.

Changes:

  • Include internal/pycore_genobject.h to ensure _PyGen_GetGeneratorFromFrame and related generator internals are available for compilation.
  • Replace _PyObject_GC_TRACK usage with the public PyObject_GC_Track API when tracking frame objects for GC.

Comment on lines 143 to 146
  }
  if (!_PyObject_GC_IS_TRACKED((PyObject *)f)) {
-   _PyObject_GC_TRACK((PyObject *)f);
+   PyObject_GC_Track((PyObject *)f);
  }

Copilot AI Apr 1, 2026

The PR title says to replace PyObject_GC_Track with _PyObject_GC_TRACK, but the actual change does the opposite (switches _PyObject_GC_TRACK to PyObject_GC_Track). Please update the title (and/or description) to match the implemented fix to avoid confusion for reviewers and release notes.

Contributor

The PR title says to replace PyObject_GC_Track with _PyObject_GC_TRACK, but the actual change does the opposite (switches _PyObject_GC_TRACK to PyObject_GC_Track). Please update the title (and/or description) to match the implemented fix to avoid confusion for reviewers and release notes.

Fixed. The PR title and description now both state the actual direction of the code change: replacing _PyObject_GC_TRACK with the public API PyObject_GC_Track.

@SigureMo
Member

SigureMo commented Apr 1, 2026

@ShigureNyako Nyako, please review this. It is the fix PR for the SOT CI issue hit in #78484.

@SigureMo SigureMo changed the title from "fix: Update the internal function to use _PyObject_GC_TRACK replace PyObject_GC_Track" to "[SOT] Update the internal function to use _PyObject_GC_TRACK replace PyObject_GC_Track" Apr 1, 2026
@SigureMo SigureMo changed the title from "[SOT] Update the internal function to use _PyObject_GC_TRACK replace PyObject_GC_Track" to "[SOT][3.14] Update the internal function to use _PyObject_GC_TRACK replace PyObject_GC_Track" Apr 1, 2026
Contributor

@ShigureNyako ShigureNyako left a comment

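I have confirmed the main thrust of this change: it fixes compatibility issues in SOT's cpython_internals under Python 3.14. The actual direction is to add the internal header needed by _PyGen_GetGeneratorFromFrame and to switch _PyObject_GC_TRACK back to the public API PyObject_GC_Track; the fix is not going in the wrong direction.

I ran a minimal local verification against the Python 3.14 headers:

  1. On the current develop branch, paddle/fluid/pybind/sot/cpython_internals/internals_3_14.c fails right away with _PyGen_GetGeneratorFromFrame undeclared, because <internal/pycore_genobject.h> is not included;
  2. With only that header added, keeping _PyObject_GC_TRACK makes the object file additionally depend on the private symbol _Py_TriggerGC (which shows up as __Py_TriggerGC on macOS), and that symbol is unresolved when linking against the Python 3.14 framework; after switching to the public PyObject_GC_Track, linking succeeds.

So this change fixes the root cause rather than just working around the symptom, and it matches what upstream CPython 3.14 expects: Include/objimpl.h publicly exports PyObject_GC_Track, while _PyObject_GC_TRACK in Include/internal/pycore_gc.h remains a private/internal path. This direction is also safer with respect to ABI and private-symbol risk.

The only minor issue is that the PR title/description still state the replacement direction between _PyObject_GC_TRACK and PyObject_GC_Track backwards. That is a wording problem and does not affect the correctness of the code itself. I approve the code change overall.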
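
As a rough illustration of the public/private symbol distinction discussed above, the following sketch probes which C API names the running interpreter exports. It is only an approximation: what an extension module can link against depends on the platform and build, and the helper name `is_exported` is hypothetical, not something from this PR.

```python
import ctypes


def is_exported(symbol_name):
    """Return True if the running interpreter exports `symbol_name`."""
    # ctypes.pythonapi resolves names lazily; a missing symbol surfaces as
    # AttributeError, which hasattr() turns into False.
    return hasattr(ctypes.pythonapi, symbol_name)


if __name__ == "__main__":
    # PyObject_GC_Track is public C API; _Py_TriggerGC is the private
    # symbol named in the link error. Its result may vary by build.
    for name in ("PyObject_GC_Track", "_Py_TriggerGC"):
        status = "exported" if is_exported(name) else "not exported"
        print(f"{name}: {status}")
```

The public name should be found on any regular CPython build; no result is claimed here for the private one, since that is exactly the part that is not guaranteed.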

@gouzil gouzil changed the title from "[SOT][3.14] Update the internal function to use _PyObject_GC_TRACK replace PyObject_GC_Track" to "[SOT][3.14] Update the internal function to use PyObject_GC_Track replace _PyObject_GC_TRACK" Apr 1, 2026
@paddle-bot paddle-bot bot added the contributor External developers label Apr 1, 2026
Contributor

@ShigureNyako ShigureNyako left a comment

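I re-reviewed the latest head (439ae1119622933f3413186bea1afeaf6f5f1680). The new commit only changes the runner group of the SOT job in .github/workflows/_SOT.yml from GZ_BD-CPU to SOT-314-debug; it does not touch the CPython 3.14 compatibility fix reviewed earlier. The direction of PyObject_GC_Track / _PyObject_GC_TRACK in the PR title and description has also been corrected.

I also looked at the currently failing CI jobs:

  • PR-CI-SOT now fails with numpy reporting module 'numpy' has no attribute 'dtype' in the Python 3.14 environment. This is an import-time environment / baseline problem, not a logic regression triggered by the _PyGen_GetGeneratorFromFrame / PyObject_GC_Track changes;
  • Linux-IXUCA fails because an external Paddle-iluvatar patch fails to apply, which is also not directly related to this SOT code change.

Based on the latest commit and the current logs, my conclusion on the code in this PR is unchanged, and I continue to approve.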

@gouzil
Member Author

gouzil commented Apr 2, 2026

@ShigureNyako Please go through the PR-CI-SOT logs and find out what caused the failure.

@ShigureNyako
Contributor

@ShigureNyako Please go through the PR-CI-SOT logs and find out what caused the failure.

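I went through the complete PR-CI-SOT log, and the conclusion is fairly clear: this failure was not caused by the two changes in this PR (_PyGen_GetGeneratorFromFrame / PyObject_GC_Track). The NumPy installation in the Python 3.14 test environment is broken.

The chain of key evidence is as follows:

  1. The script this CI job uses is ci/run_sot_test.sh:45-51, which is also the NumPy workaround added for Python 3.14 in [SOT][3.14] Enable all SOT unittests in Python 3.14 #76804:

    • pip uninstall -y numpy
    • pip install "numpy>=2.3.5"
  2. In this log, the Python 3.14 environment initially sees numpy 2.3.5:

    • Requirement already satisfied: numpy>=1.21 in /usr/local/lib/python3.14/dist-packages ... (2.3.5)
  3. After installing the wheel built from this PR, the script ran the 3.14-specific branch:

    • Found existing installation: numpy 2.3.5
    • Successfully uninstalled numpy-2.3.5
    • Immediately afterwards: Requirement already satisfied: numpy>=2.3.5 ... (2.4.2)

    In other words, this step did not perform a clean reinstall; it hit a numpy 2.4.2 state left over on the runner.

  4. Then the first test case run with python3.14, ./test_01_basic.py, failed during import paddle, with the stack ending at:

    • paddle/pir/core.py:57
    • np.dtype("float32")
    • AttributeError: module 'numpy' has no attribute 'dtype'

I verified separately in a clean local Python 3.14 environment that a normally installed NumPy 2.4.2 does have np.dtype. So the problem in this CI run is not that NumPy 2.4.2 itself lacks this call; rather, the Python 3.14 site-packages on the runner is in a dirty or incomplete state.

So this PR-CI-SOT failure looks more like a Python 3.14 baseline / CI environment problem than a regression introduced by this PR. To make this CI job meaningful again, the 3.14 branch in ci/run_sot_test.sh should preferably force a reinstall, for example:

python3.14 -m pip uninstall -y numpy
python3.14 -m pip install --force-reinstall --no-cache-dir 'numpy>=2.3.5'

Otherwise, with the current state of "uninstalled 2.3.5 but immediately hit the leftover 2.4.2", a rerun will most likely fail with the same import error.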
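
One possible way to spot this kind of broken install before the tests run is sketched below. It rests on an assumption the logs do not confirm: that the leftover state is a package directory without its `__init__.py`, which Python imports as an empty namespace package and which would therefore have no `dtype` attribute. The helper name `describe_install` is hypothetical.

```python
import importlib.util


def describe_install(module_name):
    """Classify how a top-level module would be imported, without importing it."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        # Nothing importable under this name on sys.path.
        return "missing"
    if spec.origin is None:
        # Namespace package: a directory exists but has no __init__.py,
        # which is what a half-removed install can look like.
        return "namespace-only"
    return "regular"


if __name__ == "__main__":
    print("numpy:", describe_install("numpy"))
```

Under that assumption, a result of "namespace-only" for numpy on the runner would point to leftover files in site-packages, which the forced reinstall is meant to overwrite.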

Contributor

@ShigureNyako ShigureNyako left a comment

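I re-reviewed the latest commit. In the Python 3.14 branch of ci/run_sot_test.sh, it changes

python3.14 -m pip install "numpy>=2.3.5"

to

python3.14 -m pip install --force-reinstall --no-cache-dir "numpy>=2.3.5"

This matches the problem identified in the previous round of log analysis: the key reason PR-CI-SOT failed was not this PR's SOT/CPython 3.14 compatibility fix itself, but that NumPy for Python 3.14 on the runner was in a dirty state. This patch makes the NumPy reinstall in the 3.14-specific path actually take effect, which is the right direction.

Together with the two core code fixes confirmed earlier (adding pycore_genobject.h and switching to the public API PyObject_GC_Track), I continue to approve the latest head. PR-CI-SOT has already been re-triggered on the new commit; what remains is to watch the rerun result.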

Member

@SigureMo SigureMo left a comment

LGTMeow 🐾

Looks like it really works now. Nice work, Nyako!

@SigureMo SigureMo merged commit 4f0218b into PaddlePaddle:develop Apr 2, 2026
102 of 105 checks passed