[Cpp API Compatibility] Delete useless code and rename test files #78580
youge325 wants to merge 11 commits into PaddlePaddle:develop
Conversation
Your PR was submitted successfully. Thank you for contributing to this open-source project!
Pull request overview
This PR cleans up the C++ compat test suite by standardizing test file/target naming to ATen/c10 conventions, removing unused-code paths used to silence warnings, and deleting a shared CUDA-runtime skip utility header.
Changes:
- Renamed several compat test targets/files (e.g., compat_* / torch_library_test → ATen_* / c10_*) and updated test/cpp/compat/CMakeLists.txt accordingly.
- Deleted test/cpp/compat/cuda_test_utils.h and removed its include and skip-macro usages from CUDA/HIP-related tests.
- Added/adjusted tests and small warning-suppression tweaks (e.g., (void)state, (void)threw_exception, new assertions).
Reviewed changes
Copilot reviewed 22 out of 26 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| test/cpp/compat/cuda_test_utils.h | Removed shared CUDA runtime availability helper + skip macro. |
| test/cpp/compat/CMakeLists.txt | Renamed several test targets/source filenames to ATen/c10 naming. |
| test/cpp/compat/c10_Stream_test.cc | Removed CUDA runtime skip macro usage from CUDA stream tests. |
| test/cpp/compat/c10_Event_test.cc | Removed CUDA runtime skip macro usage from CUDA event tests. |
| test/cpp/compat/c10_cuda_generator_test.cc | Silenced an unused variable warning. |
| test/cpp/compat/ATen_Utils_test.cc | Removed CUDA runtime skip macro usage from CUDA tensor backend tests. |
| test/cpp/compat/ATen_toString_test.cc | Added renamed toString() API tests (CPU + optional CUDA/HIP section). |
| test/cpp/compat/ATen_torch_library_test.cc | Added renamed torch library/registry API tests. |
| test/cpp/compat/ATen_to_test.cc | Removed CUDA runtime skip macro usage from CUDA to(...) tests. |
| test/cpp/compat/ATen_split_test.cc | Removed CUDA runtime skip macro usage from CUDA split tests. |
| test/cpp/compat/ATen_select_test.cc | Removed CUDA runtime skip macro usage from CUDA select/index_select/masked_select tests. |
| test/cpp/compat/ATen_record_stream_test.cc | Removed CUDA runtime skip macro usage from record_stream tests. |
| test/cpp/compat/ATen_pin_memory_creation_test.cc | Removed CUDA runtime skip macro usage from pinned-memory creation tests. |
| test/cpp/compat/ATen_memory_test.cc | Removed CUDA runtime skip macro usage from CUDA reciprocal/detach tests. |
| test/cpp/compat/ATen_local_scalar_dense_test.cc | Removed CUDA runtime skip macro usage from CUDA local-scalar tests. |
| test/cpp/compat/ATen_from_blob_test.cc | Removed CUDA runtime skip macro usage from GPU-pointer from_blob tests. |
| test/cpp/compat/ATen_eye_test.cc | Removed CUDA runtime skip macro usage from CUDA eye tests. |
| test/cpp/compat/ATen_equal_test.cc | Removed CUDA runtime skip macro usage from CUDA equality tests. |
| test/cpp/compat/ATen_empty_test.cc | Removed CUDA runtime skip macro usage from pinned-memory empty tests. |
| test/cpp/compat/ATen_dense_sparse_conversion_test.cc | Added renamed dense/sparse conversion tests for _PD_ConvertToSparseIfNeeded. |
| test/cpp/compat/ATen_CUDAContext_test.cc | Removed CUDA runtime skip macro usage from CUDA context light tests. |
| test/cpp/compat/ATen_CUDABlas_test.cc | Removed CUDA runtime skip macro usage from CUDABlas tests. |
| test/cpp/compat/ATen_cuda_test.cc | Removed CUDA runtime skip macro usage from Tensor::cuda() tests; adjusted preprocessor structure. |
| test/cpp/compat/ATen_clamp_test.cc | Silenced an unused variable warning in an edge-case test. |
| test/cpp/compat/ATen_basic_test.cc | Added renamed “basic” tests; includes additional CUDA/HIP coverage blocks. |
| test/cpp/compat/ATen_as_strided_test.cc | Added an assertion about data_ptr changes after as_strided_ with offset. |
Comments suppressed due to low confidence (7)
test/cpp/compat/ATen_to_test.cc:199
- This test assumes a CUDA device is available and calls at::tensor(... device=c10::kCUDA). If the binary is run on a machine without GPUs, it will throw and fail the run. Please add a runtime guard (e.g., if !torch::cuda::is_available() then GTEST_SKIP()/return) before the first CUDA tensor creation in each CUDA/HIP test.
#if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP)
TEST(TensorToTest, ToDtype_GPU_FloatToDouble) {
at::Tensor t = at::tensor(
{1.0f, 2.0f},
at::TensorOptions().dtype(at::kFloat).device(c10::Device(c10::kCUDA, 0)));
at::Tensor result = t.to(at::kDouble);
ASSERT_EQ(result.scalar_type(), at::kDouble);
ASSERT_EQ(result.device().type(), c10::DeviceType::CUDA);
test/cpp/compat/ATen_pin_memory_creation_test.cc:46
- Pinned-memory tests are compiled under CUDA/HIP, but now run unconditionally. If the test binary is executed without an available CUDA runtime/device, pinned-memory allocations and/or CUDA device constructs can throw and fail the run. Consider guarding these tests with a runtime availability check (e.g., if !torch::cuda::is_available() then GTEST_SKIP()/return) before exercising pinned-memory behavior that depends on CUDA/HIP.
TEST(ATenPinMemoryCreationTest, FullPinMemory) {
// Test using TensorOptions with pinned_memory
auto by_options = at::full(
{2, 3}, 1.5f, at::TensorOptions().dtype(at::kFloat).pinned_memory(true));
AssertPinned(by_options);
test/cpp/compat/ATen_memory_test.cc:320
- These CUDA/HIP tests now allocate CUDA tensors and invoke CUDA ops without checking runtime/device availability. On CUDA/HIP builds executed on machines without a GPU, at::empty(... device=at::kCUDA) / at::arange(... device=at::kCUDA) can throw and fail the suite. Add a runtime guard (e.g., if !torch::cuda::is_available() then GTEST_SKIP()/return) before the first CUDA tensor creation in each CUDA/HIP test.
#if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP)
// Test reciprocal on CUDA
TEST(ReciprocalTest, ReciprocalCUDA) {
auto tensor =
at::empty({4}, at::TensorOptions().dtype(at::kFloat).device(at::kCUDA));
auto cpu_tensor = at::empty({4}, at::TensorOptions().dtype(at::kFloat));
cpu_tensor.data_ptr<float>()[0] = 1.0f;
test/cpp/compat/ATen_from_blob_test.cc:125
- GPU-pointer tests now run without any runtime/device availability guard. On CUDA/HIP builds executed without GPUs (or with an unusable runtime), cudaMalloc/hipMalloc and subsequent from_blob calls will fail and break the suite. Add a runtime guard (e.g., if c10::cuda::device_count() <= 0 or !torch::cuda::is_available() then GTEST_SKIP()/return) before the first GPU allocation in these tests.
// No device specified: GPU pointer → tensor must be on CUDA automatically.
TEST(ATenFromBlobTest, GpuPtrDefaultsToCuda) {
float* d_data = nullptr;
#if defined(PADDLE_WITH_CUDA)
cudaMalloc(&d_data, 4 * sizeof(float));
#else
hipMalloc(&d_data, 4 * sizeof(float));
#endif
test/cpp/compat/ATen_equal_test.cc:46
- This CUDA/HIP test assumes a CUDA device exists and constructs a CUDA tensor. If the test binary is run without an available GPU, tensor creation will throw and fail the suite. Add a runtime guard (e.g., if !torch::cuda::is_available() then GTEST_SKIP()/return) before creating the CUDA tensor.
#if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP)
TEST(TensorEqualTest, DeviceMismatchThrows) {
at::Tensor cpu = at::ones({2, 2}, at::kFloat);
at::Tensor gpu =
at::ones({2, 2}, at::TensorOptions().dtype(at::kFloat).device(at::kCUDA));
test/cpp/compat/ATen_CUDABlas_test.cc:82
- This CUDA test file performs cudaMalloc/cudaMemcpy/cudaDeviceSynchronize without any runtime/device availability guard. If the binary is executed on a CUDA build without GPUs (device_count==0) or with an unusable runtime, these calls will fail and break the test run. Consider adding a check at the start of each test (e.g., if !at::cuda::is_available() then GTEST_SKIP()/return) before running device allocations.
void Run() {
std::vector<T> h_a = {T(1), T(3), T(2), T(4)};
std::vector<T> h_b = {T(5), T(7), T(6), T(8)};
std::vector<T> h_c(N * N, T(0));
MathT alpha = static_cast<MathT>(1);
MathT beta = static_cast<MathT>(0);
runOnDevice(h_a, h_b, &h_c, [&](T* d_a, T* d_b, T* d_c) {
test/cpp/compat/ATen_cuda_test.cc:40
- The Tensor::cuda() tests now run without checking CUDA availability. In CUDA/HIP builds executed on machines without any GPU devices, cpu_t.cuda() is expected to throw and will fail the suite. Please add a runtime guard (e.g., if !torch::cuda::is_available() then GTEST_SKIP()/return) before calling cuda() in these tests.
// After cuda(), the tensor should reside on a GPU device.
TEST(TensorCudaTest, CpuTensorMovesToCuda) {
at::Tensor cpu_t = at::tensor({1.0f, 2.0f, 3.0f}, at::kFloat);
ASSERT_TRUE(cpu_t.is_cpu());
at::Tensor cuda_t = cpu_t.cuda();
ASSERT_TRUE(cuda_t.is_cuda());
ASSERT_FALSE(cuda_t.is_cpu());
}
ShigureNyako left a comment:
As I understand it, this PR has two main goals:
- unify the naming of some tests under test/cpp/compat (ATen / c10);
- clean up unused variables and delete cuda_test_utils.h along the way.

I'm not approving yet; there are two issues that need to be handled first:

1. Renaming the test targets introduces engineering risk.
   The current CMakeLists.txt changes not only file names but also several test target names. The Coverage job already treats these targets as "deleted unit tests" and fails; the log explicitly lists torch_library_test, compat_basic_test, compat_toString_test, cuda_generator_test, and compat_dense_sparse_conversion_test. If the goal is only to unify source file names, keep the original target names; if the targets really must be renamed, the corresponding deleted-test validation / registration chain has to be updated as well.

2. With the runtime guards removed, CUDA/HIP tests will hard-fail in environments without a usable device.
   These changes delete many SKIP_IF_CUDA_RUNTIME_UNAVAILABLE() calls, but #if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP) only guarantees that GPU support was enabled at compile time, not that a usable GPU/runtime exists at execution time. Calls such as getCurrentCUDAStream(), getNumGPUs(), or constructing a CUDA tensor will fail outright on machines without a card or with an unusable driver. The safer approach is to keep the original guards, or switch uniformly to at::cuda::is_available() + GTEST_SKIP().

A few additional notes after checking CI:
- mergeability currently shows only review/approval missing; required checks are not blocking the merge;
- of the two failing items, the Coverage failure is directly related to this change;
- Linux-IXUCA / Build and Test looks like an external patch-apply failure unrelated to the files this PR touches, so I did not use it as a basis for the blocking conclusion.

Please address the two issues above first, and I'll continue with the next review round.
test/cpp/compat/CMakeLists.txt (outdated)
nv_test(c10_cuda_generator_test SRCS c10_cuda_generator_test.cc)
nv_test(c10_generator_impl_test SRCS c10_generator_impl_test.cc)
nv_test(c10_intrusive_ptr_lifecycle_test
        SRCS c10_intrusive_ptr_lifecycle_test.cc)
nv_test(c10_DispatchKey_test SRCS c10_DispatchKey_test.cc)
nv_test(c10_DispatchKeySet_test SRCS c10_DispatchKeySet_test.cc)
cc_test(torch_library_test SRCS torch_library_test.cc)
cc_test(ATen_torch_library_test SRCS ATen_torch_library_test.cc)
What changed here is not just file names but also test target names. The Coverage job already treats these targets as deleted unit tests and errors out: torch_library_test, compat_basic_test, compat_toString_test, cuda_generator_test, and compat_dense_sparse_conversion_test are all listed in the failure log.
If the main goal of this PR is to unify source file naming, keep the original target names and only change the new file names that SRCS points to; if the targets really need to be renamed as well, the deleted-test validation / registration logic must be updated in sync, otherwise this CI check will stay red.
This one is fine; it just needs to wait for review.
Got it. I just took another look at the Coverage log, and this does look more like an approval-gated deleted-test check rather than a necessary problem with the implementation itself: the target renames are currently detected as deleting torch_library_test / compat_basic_test / compat_toString_test / cuda_generator_test / compat_dense_sparse_conversion_test, and since there is no corresponding review yet, the check goes red first.
If the maintainers confirm that renaming the targets is acceptable, I won't treat this one as an implementation-level blocker on its own; what I'm more concerned about now is the behavior regression below from removing the runtime guards.
TEST(CUDAFunctionsTest, DeviceSynchronize) {
  SKIP_IF_CUDA_RUNTIME_UNAVAILABLE();
  // Exercises the PADDLE_ENFORCE_GPU_SUCCESS(cudaDeviceSynchronize()) branch
  ASSERT_NO_THROW(c10::cuda::device_synchronize());
}

TEST(CUDAFunctionsTest, StreamSynchronize) {
  SKIP_IF_CUDA_RUNTIME_UNAVAILABLE();
  // Exercises phi::backends::gpu::GpuStreamSync()
  auto stream = c10::cuda::getCurrentCUDAStream();
  ASSERT_NO_THROW(c10::cuda::stream_synchronize(stream));
}

TEST(CUDAFunctionsTest, AtNamespaceAliases) {
  SKIP_IF_CUDA_RUNTIME_UNAVAILABLE();
  // Exercises the using aliases in at::cuda namespace
  ASSERT_NO_THROW(at::cuda::device_synchronize());
  auto stream = c10::cuda::getCurrentCUDAStream();
With the runtime guard removed here, #if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP) only guarantees "GPU support was enabled at compile time", not "the execution environment actually has a usable device / runtime".
Calls like getCurrentCUDAStream(), getNumGPUs(), and getCurrentDeviceProperties() can all throw outright in environments without a card or with an unusable driver, whereas the original cuda_test_utils.h checked both device_count() and whether cudaFree/hipFree worked.
I suggest keeping this layer of runtime checking, or switching uniformly to if (!at::cuda::is_available()) { GTEST_SKIP() << ...; }. Otherwise this batch of compat tests goes from "skippable" to "hard-failing".
@youge325 By the way, why was this added in the first place, and why delete it now? Is it because these only run on CUDA? Honestly, I'd still prefer that these unit tests run on CPU wherever possible; adding WITH_GPU to the CMakeLists early on was just to get things running quickly.
It was added automatically back when an Agent was used to add tests and raise the coverage rate. My understanding is that if something compiles successfully, the tests shouldn't just be skipped arbitrarily, so I deleted it now.
Some of these unit tests can indeed run on CPU; conditionally compiling the test cases is enough, e.g. under #if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP) use ASSERT_NO_THROW, and under #else use ASSERT_THROW. I'll look at which test cases can be changed this way.
@ShigureNyako Please take a look at whether if(WITH_GPU) can be removed from test/cpp/compat/CMakeLists.txt.
@ShigureNyako Please give me an example of a GPU build where no device/runtime is available at run time. This situation doesn't occur in CI; a GPU build exists precisely to run in environments with a usable device/runtime. My original code was wrong, which is why I deleted it; you may have reversed cause and effect.
> Please give me an example of a GPU build where no device/runtime is available at run time. This situation doesn't occur in CI; a GPU build exists precisely to run in environments with a usable device/runtime. My original code was wrong, which is why I deleted it; you may have reversed cause and effect.

You're right about that, so let me tighten my wording: based on this PR's current CI evidence, I don't have an actually observed failure of a "GPU build with no device/runtime at run time". The red checks visible right now are still the Coverage deleted-test validation and the external patch-apply failure on Linux-IXUCA.
My more precise concern is actually one level up: we shouldn't conclude that the whole if(WITH_GPU) block can be removed just because some CPU-only compiling branches were added.
The reason is that this batch of tests still contains quite a few clearly GPU-only cases, for example:
- ATen_CUDABlas_test.cc depends directly on cudaMalloc/cudaMemcpy/cudaDeviceSynchronize;
- ATen_cuda_test.cc directly tests Tensor::cuda();
- ATen_record_stream_test.cc still creates a cuda_tensor in its fixture SetUp().
So I agree with you that "current CI has not shown the runtime-unavailable scenario hitting these tests". But from a code-organization perspective, I'd still narrow the conclusion to:
- the direction of a0be3dee, which adds CPU compiling branches to files like ATen_CUDAContext_test.cc / c10_Stream_test.cc / c10_Event_test.cc, is correct;
- but if more compat tests are to be genuinely moved onto the CPU path later, the CPU-capable cases and GPU-only cases should be separated first, rather than removing if(WITH_GPU) wholesale right now.
I'll continue reviewing under this narrowed judgment and will no longer use "current CI already proves a runtime availability problem" as an argument in itself.
@ShigureNyako Strictly speaking, there are indeed scenarios where WITH_GPU is set at compile time but no runtime is available: for example, all the machines in our CI build packages on CPU, including the Coverage build. But even though the Coverage build itself runs on CPU, its artifacts will definitely run on the subsequent Coverage test stage, which does have GPUs at run time. In other words, CI currently operates under the assumption that if GPU is enabled at compile time, GPU is available at run time, so this shouldn't be the main concern.
Back when the first C++ compatibility PR #74402 was rushed in before the 3.2 release branch cut, some things were temporarily skipped. Commit defd6a3 skipped some runtime problems on new hardware (judging from the error in the previous commit 99f7b88 and that commit message, it was XPU and NPU, but the logs have expired and can no longer be seen). Please focus on that scenario: analyze the concrete problems that may exist, and whether there is a recommended solution.
Note: even if we can't generalize to all hardware (e.g. XPU and NPU), skipping some hardware is still better than giving up CPU testing entirely. The goal isn't to fix every backend at once (though that would be even better); we just want some monitoring coverage on CPU.
> Please focus on that scenario: analyze the concrete problems that may exist, and whether there is a recommended solution.

I went back through the relevant commits in PR #74402 following your pointers. What can be confirmed directly from the code is roughly this:
- 99f7b88d321d68d7fe8597c4d5bd31f7b424272b itself only adds a forward declaration for generic_to in ivalue.h; it is mainly a warning-level fix and has little to do with the XPU/NPU scenario.
- The commit that directly points at the "new hardware build problem" is defd6a3cc60ca6bc3a2df5f58f2043ee0616a48c (fix build on xpu&npu), which did at least three crucial things:
  - in test/cpp/compat/CMakeLists.txt, it moved compat_basic_test and torch_library_test from the outer scope into if(WITH_GPU);
  - in paddle/utils/pybind.h, it wrapped <ATen/core/TensorBody.h> in #if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP);
  - the same commit also deleted the old torch_like_api/ATen/core/TensorBase.h.

So, reading it in this historical context, I now lean toward the view that what was really being worked around at the time was not simply "whether these tests can be skipped at run time", but rather "CPU-capable test files had CUDA-only headers / link dependencies mixed in, so XPU/NPU builds could only retreat wholesale into WITH_GPU".

In other words, the concrete problem in this scenario is:
- some cases are CPU-testable in themselves;
- but they are mixed into the same target / file as cases that genuinely depend on the CUDA runtime / CUDA headers / CUDA libs;
- as a result, to work around problems on a few heterogeneous backends, entire files were locked back into WITH_GPU, and CPU monitoring was lost with them.

Working toward that goal, the solution I'd recommend is layered, rather than trying to fix every backend in one step:

1. First split out the CPU-capable cases to restore CPU monitoring.
   - The most obvious are the CPU sections of torch_library_test and compat_basic_test, plus the CPU section of ATen_toString_test;
   - these are better off as standalone cc_test / CPU targets instead of staying bundled with GPU-only cases.
2. Keep "compiles on CPU, but GPU semantics differ" cases as conditional branches.
   - For example, the new ASSERT_THROW / ASSERT_FALSE CPU-only branches added to ATen_CUDAContext_test.cc in this PR, which I think go in the right direction;
   - they capture the contract of the same API under a CPU-only build.
3. Cases that genuinely depend on the CUDA runtime stay under if(WITH_GPU) / nv_test.
   - For example ATen_CUDABlas_test.cc, ATen_cuda_test.cc, ATen_record_stream_test.cc, and the GPU-pointer parts of ATen_from_blob_test.cc;
   - these should not be moved wholesale onto the CPU path just to expand CPU coverage.
4. If individual APIs still fail on XPU / NPU, prefer backend-specific workarounds over abandoning CPU testing.
   - That is, narrow "skip some hardware" down to specific cases / specific backends, rather than switching off a whole batch of CPU-capable monitoring again.

From this angle, I'll narrow my review stance further: I support splitting the CPU-capable cases onto the CPU pipeline, but I don't support removing if(WITH_GPU) wholesale in this step, nor treating targets that still mix in GPU-only dependencies as CPU-runnable by default.

This stays close to the concrete problem defd6a3 was working around, and better matches the goal of "first restore what can be monitored on CPU, then gradually converge the XPU/NPU differences".
I looked at the latest commit d6e9be4 (fix xpu). I understand you're continuing to tighten the CMakeLists for the XPU build, but I want to narrow the conclusion again: I don't recommend pulling this batch of cases back into nv_test wholesale.

Among the files moved from the CPU path back to the GPU path this time, there are at least a few distinct situations:

1. Cases that are clearly still CPU-capable:
   - ATen_all_test.cc
   - ATen_as_strided_test.cc
   - ATen_index_test.cc
   I re-read these files; the bodies are essentially pure CPU tensor / indexing / view semantics with no actual use of the CUDA runtime. This looks more like a file-level dependency or target-partitioning problem, not test cases that inherently belong on the GPU path.
2. Bodies that are essentially CPU cases, but with unrelated CUDA includes at the top of the file:
   - ATen_transpose_test.cc
   - ATen_viewAs_test.cc
   Neither of these actually uses c10::cuda::* / at::cuda::* logic; it seems the unrelated CUDA headers could be removed first, keeping them on the CPU path for monitoring.
3. CPU / GPU cases mixed in the same file, which should be split:
   - ATen_basic_test.cc
   - ATen_from_blob_test.cc
   I can see why these would break on XPU, but if the problem comes from GPU sections mixed into the same file, I'd still prefer splitting the CPU-only parts from the GPU-only parts instead of rolling the entire file back to nv_test.

So from a review perspective, my judgment is unchanged: if the goal is "first preserve what can be monitored on CPU", the priority should be removing unrelated CUDA includes / splitting mixed files, not pulling a batch of CPU-monitorable compat tests back into the GPU-only path.

That is, I understand the direction of this fix xpu, but I can't treat it as the final solution; the more ideal convergence is to keep CPU-capable cases in cc_test and leave only the parts with real CUDA dependencies in nv_test.
This reverts commit 7cee6e8.
/re-run all-failed
@ShigureNyako This doesn't look right. Why does the Mac-CPU build hit this error?

[2487/3112] Building CXX object test/CMakeFiles/ATen_basic_test.dir/cpp/compat/ATen_basic_test.cc.o
FAILED: test/CMakeFiles/ATen_basic_test.dir/cpp/compat/ATen_basic_test.cc.o
/opt/homebrew/bin/ccache /Library/Developer/CommandLineTools/usr/bin/c++ -DGLOG_NO_ABBREVIATED_SEVERITIES -DHPPL_STUB_FUNC -DLAPACK_FOUND -DPADDLE_DISABLE_PROFILER -DPADDLE_SLEEF_POW_PRECISION=10 -DPADDLE_USE_ACCELERATE -DPADDLE_VERSION=0.0.0 -DPADDLE_VERSION_INTEGER=0 -DPADDLE_WITH_ARM -DPADDLE_WITH_CRYPTO -DPADDLE_WITH_POCKETFFT -DPADDLE_WITH_SLEEF -DPADDLE_WITH_TESTING -DPHI_INNER -DPHI_SHARED -DYAML_CPP_STATIC_DEFINE -I. -I../paddle/fluid/framework/io -Ithird_party/install/zlib/include -Ithird_party/install -Ithird_party/install/gflags/include -Ithird_party/install/glog/include -I../third_party/eigen3 -I../third_party/threadpool -I../third_party/dlpack/include -Ithird_party/install/xxhash/include -Ithird_party/install/warpctc/include -Ithird_party/install/warprnnt/include -Ithird_party/install/utf8proc/include -Ithird_party/install/protobuf/include -I../third_party/nlohmann_json/include -Ithird_party/install/yaml-cpp/include -I/Library/Frameworks/Python.framework/Versions/3.10/include/python3.10 -I/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/numpy/_core/include -Ithird_party/pybind/src/extern_pybind/include -Ithird_party/install/gtest/include -Ithird_party/install/libuv/include -Ithird_party/install/cryptopp/include -Ithird_party/pocketfft/src -Ithird_party/install/sleef/include -I../ -I../paddle/phi/api/include/compat -I../paddle/phi/api/include/compat/torch/csrc/api/include -DCRYPTOPP_ARM_CRC32_AVAILABLE=0 -std=c++17 -Wno-deprecated-register -Werror=format -Werror=braced-scalar-init -Werror=uninitialized -Werror=tautological-constant-out-of-range-compare -Werror=literal-conversion -Werror=pragma-pack -Werror=c++17-extensions -fPIC -O3 -DNDEBUG -arch arm64 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX14.4.sdk -mmacosx-version-min=15.1 -MD -MT test/CMakeFiles/ATen_basic_test.dir/cpp/compat/ATen_basic_test.cc.o -MF test/CMakeFiles/ATen_basic_test.dir/cpp/compat/ATen_basic_test.cc.o.d -o 
test/CMakeFiles/ATen_basic_test.dir/cpp/compat/ATen_basic_test.cc.o -c ../test/cpp/compat/ATen_basic_test.cc
In file included from ../test/cpp/compat/ATen_basic_test.cc:15:
In file included from ../paddle/phi/api/include/compat/ATen/Functions.h:17:
In file included from ../paddle/phi/api/include/compat/ATen/ops/_local_scalar_dense.h:21:
In file included from ../paddle/phi/api/include/compat/ATen/core/Tensor.h:17:
In file included from ../paddle/phi/api/include/compat/ATen/core/TensorBody.h:18:
In file included from ../paddle/phi/api/include/compat/ATen/core/TensorBase.h:18:
In file included from ../paddle/phi/api/include/compat/c10/core/Device.h:37:
In file included from ../paddle/phi/core/platform/device_event_base.h:19:
In file included from ../paddle/phi/core/platform/device_context.h:25:
In file included from ../paddle/phi/core/memory/malloc.h:22:
../paddle/phi/core/memory/allocation/allocator.h:293:20: warning: 'AllocateImpl' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
phi::Allocation* AllocateImpl(size_t UNUSED) { return nullptr; }
^
../paddle/phi/core/memory/allocation/allocator.h:210:28: note: overridden virtual function is here
virtual phi::Allocation* AllocateImpl(size_t size) = 0;
^
../test/cpp/compat/ATen_basic_test.cc:222:42: error: a type named 'TensorOptions' is hidden by a declaration in a different namespace
torch::from_blob(pp, {3}, torch::TensorOptions().dtype(torch::kInt64));
~~~~~~~^
../paddle/phi/api/include/compat/c10/core/TensorOptions.h:55:19: note: type declaration hidden
struct PADDLE_API TensorOptions {
^
../paddle/phi/api/include/compat/ATen/core/TensorBody.h:61:7: note: declaration hides type
using TensorOptions = c10::TensorOptions;
^
1 warning and 1 error generated.
The automated message handling still has quite a few problems; this one didn't get a response either. Still optimizing it.
Could we look into skipping these on XPU first?
@youge325 You're right about this. After looking at the Mac-CPU log you posted, let me narrow my judgment again. Regarding torch::from_blob(pp, {3}, torch::TensorOptions().dtype(torch::kInt64)); in the Mac-CPU log you posted, this is what triggers […]
So […] That said, let me also qualify the point I'm trying to make: […]
So my more accurate position now is: […]
I'll keep watching the subsequent CI / splitting work under this more precise framing.
@ShigureNyako Let me paste the full log for you, which shows it really is XPU not supporting some operators; after all, everything passes on CPU.
=========================================
This is the fourth time to re-run
=========================================
The following unittest will be re-run:
ATen_all_test
ATen_as_strided_test
ATen_from_blob_test
ATen_index_test
ATen_transpose_test
ATen_viewAs_test
ATen_basic_test
test_compat_slogdet
=========================================
Test project /paddle/build
Start 207: ATen_all_test
1/8 Test #207: ATen_all_test ....................***Failed 0.41 sec
XCCL /paddle/build/python/paddle/base/../libs/libbkcl.so loaded
[==========] Running 48 tests from 3 test cases.
[----------] Global test environment set-up.
[----------] 17 tests from TestAll
[ RUN ] TestAll.AllNoDim
/paddle/test/cpp/compat/ATen_all_test.cc:40: Failure
Expected equality of these values:
result.item<bool>()
Which is: true
false
[ FAILED ] TestAll.AllNoDim (0 ms)
[ RUN ] TestAll.AllWithDim
/paddle/test/cpp/compat/ATen_all_test.cc:56: Failure
Expected equality of these values:
result_dim0.data_ptr<bool>()[0]
Which is: true
false
[ FAILED ] TestAll.AllWithDim (0 ms)
[ RUN ] TestAll.AllWithDimKeepdim
[ OK ] TestAll.AllWithDimKeepdim (0 ms)
[ RUN ] TestAll.AllWithOptionalDim
[ OK ] TestAll.AllWithOptionalDim (0 ms)
[ RUN ] TestAll.AllNoDimAllFalse
[ OK ] TestAll.AllNoDimAllFalse (0 ms)
[ RUN ] TestAll.AllNoDimSingleElement
[ OK ] TestAll.AllNoDimSingleElement (0 ms)
[ RUN ] TestAll.AllWithNegativeDim
/paddle/test/cpp/compat/ATen_all_test.cc:106: Failure
Expected equality of these values:
result.data_ptr<bool>()[0]
Which is: true
false
[ FAILED ] TestAll.AllWithNegativeDim (0 ms)
[ RUN ] TestAll.AllWithDimKeepdimTrue
/paddle/test/cpp/compat/ATen_all_test.cc:117: Failure
Expected equality of these values:
result_dim0.data_ptr<bool>()[0]
Which is: true
false
[ FAILED ] TestAll.AllWithDimKeepdimTrue (1 ms)
[ RUN ] TestAll.AllWithOptionalDimNullopt
[ OK ] TestAll.AllWithOptionalDimNullopt (0 ms)
[ RUN ] TestAll.AllWithOptionalDimNulloptHasFalse
/paddle/test/cpp/compat/ATen_all_test.cc:143: Failure
Expected equality of these values:
result.item<bool>()
Which is: true
false
[ FAILED ] TestAll.AllWithOptionalDimNulloptHasFalse (0 ms)
[ RUN ] TestAll.AllWithOptionalDimKeepdim
[ OK ] TestAll.AllWithOptionalDimKeepdim (0 ms)
[ RUN ] TestAll.AllWithOptionalMultipleDims
[ OK ] TestAll.AllWithOptionalMultipleDims (0 ms)
[ RUN ] TestAll.MemberAllWithOptionalNullopt
[ OK ] TestAll.MemberAllWithOptionalNullopt (0 ms)
[ RUN ] TestAll.MemberAllWithOptionalNulloptKeepdim
[ OK ] TestAll.MemberAllWithOptionalNulloptKeepdim (0 ms)
[ RUN ] TestAll.StandaloneFunction
/paddle/test/cpp/compat/ATen_all_test.cc:188: Failure
Expected equality of these values:
result.item<bool>()
Which is: true
false
[ FAILED ] TestAll.StandaloneFunction (0 ms)
[ RUN ] TestAll.StandaloneFunctionWithDim
/paddle/test/cpp/compat/ATen_all_test.cc:198: Failure
Expected equality of these values:
result.data_ptr<bool>()[0]
Which is: true
false
[ FAILED ] TestAll.StandaloneFunctionWithDim (0 ms)
[ RUN ] TestAll.AllWith3DTensor
/paddle/test/cpp/compat/ATen_all_test.cc:212: Failure
Expected equality of these values:
result_all.item<bool>()
Which is: true
false
[ FAILED ] TestAll.AllWith3DTensor (0 ms)
[----------] 17 tests from TestAll (1 ms total)
[----------] 24 tests from TestAllclose
[ RUN ] TestAllclose.AllcloseBasic
[ OK ] TestAllclose.AllcloseBasic (0 ms)
[ RUN ] TestAllclose.AllcloseNotEqual
/paddle/test/cpp/compat/ATen_all_test.cc:237: Failure
Expected equality of these values:
result
Which is: true
false
[ FAILED ] TestAllclose.AllcloseNotEqual (0 ms)
[ RUN ] TestAllclose.StandaloneFunction
[ OK ] TestAllclose.StandaloneFunction (0 ms)
[ RUN ] TestAllclose.AllcloseWithCustomRtol
/paddle/test/cpp/compat/ATen_all_test.cc:257: Failure
Expected equality of these values:
result_default
Which is: true
false
[ FAILED ] TestAllclose.AllcloseWithCustomRtol (0 ms)
[ RUN ] TestAllclose.AllcloseWithCustomAtol
/paddle/test/cpp/compat/ATen_all_test.cc:272: Failure
Expected equality of these values:
result_default
Which is: true
false
[ FAILED ] TestAllclose.AllcloseWithCustomAtol (0 ms)
[ RUN ] TestAllclose.AllcloseMemberWithAllParams
[ OK ] TestAllclose.AllcloseMemberWithAllParams (0 ms)
[ RUN ] TestAllclose.AllcloseMemberNotClose
/paddle/test/cpp/compat/ATen_all_test.cc:295: Failure
Expected equality of these values:
result
Which is: true
false
[ FAILED ] TestAllclose.AllcloseMemberNotClose (0 ms)
[ RUN ] TestAllclose.AllcloseMemberWithCustomTolerance
/paddle/test/cpp/compat/ATen_all_test.cc:305: Failure
Expected equality of these values:
tensor1.allclose(tensor2)
Which is: true
false
[ FAILED ] TestAllclose.AllcloseMemberWithCustomTolerance (0 ms)
[ RUN ] TestAllclose.AllcloseExactZeros
[ OK ] TestAllclose.AllcloseExactZeros (0 ms)
[ RUN ] TestAllclose.AllcloseHighDim
[ OK ] TestAllclose.AllcloseHighDim (0 ms)
[ RUN ] TestAllclose.AllcloseEqualNanDefaultFalse
[ OK ] TestAllclose.AllcloseEqualNanDefaultFalse (0 ms)
[ RUN ] TestAllclose.AllcloseEqualNanTrue
[ OK ] TestAllclose.AllcloseEqualNanTrue (0 ms)
[ RUN ] TestAllclose.AllcloseEqualNanTrueAllNan
[ OK ] TestAllclose.AllcloseEqualNanTrueAllNan (0 ms)
[ RUN ] TestAllclose.AllcloseMemberEqualNanTrue
[ OK ] TestAllclose.AllcloseMemberEqualNanTrue (0 ms)
[ RUN ] TestAllclose.AllcloseMixedNanAndValues
[ OK ] TestAllclose.AllcloseMixedNanAndValues (0 ms)
[ RUN ] TestAllclose.AllcloseDouble
/paddle/test/cpp/compat/ATen_all_test.cc:429: Failure
Expected equality of these values:
result_diff
Which is: true
false
[ FAILED ] TestAllclose.AllcloseDouble (0 ms)
[ RUN ] TestAllclose.AllcloseDoubleEqualNan
[ OK ] TestAllclose.AllcloseDoubleEqualNan (0 ms)
[ RUN ] TestAllclose.AllcloseStandaloneWithExplicitParams
[ OK ] TestAllclose.AllcloseStandaloneWithExplicitParams (0 ms)
[ RUN ] TestAllclose.AllcloseInfinityValues
[ OK ] TestAllclose.AllcloseInfinityValues (0 ms)
[ RUN ] TestAllclose.AllcloseInt32
/paddle/test/cpp/compat/ATen_all_test.cc:498: Failure
Expected equality of these values:
result_diff
Which is: true
false
[ FAILED ] TestAllclose.AllcloseInt32 (0 ms)
[ RUN ] TestAllclose.AllcloseInt64
/paddle/test/cpp/compat/ATen_all_test.cc:518: Failure
Expected equality of these values:
result_diff
Which is: true
false
[ FAILED ] TestAllclose.AllcloseInt64 (0 ms)
[ RUN ] TestAllclose.AllcloseEmptyTensor
[ OK ] TestAllclose.AllcloseEmptyTensor (0 ms)
[ RUN ] TestAllclose.AllcloseScalarTensor
[ OK ] TestAllclose.AllcloseScalarTensor (0 ms)
[ RUN ] TestAllclose.AllcloseWithDifferentRtolAtolOrder
/paddle/test/cpp/compat/ATen_all_test.cc:570: Failure
Expected equality of these values:
result2
Which is: true
false
[ FAILED ] TestAllclose.AllcloseWithDifferentRtolAtolOrder (0 ms)
[----------] 24 tests from TestAllclose (0 ms total)
[----------] 7 tests from TestAbsolute
[ RUN ] TestAbsolute.AbsoluteBasic
[ OK ] TestAbsolute.AbsoluteBasic (0 ms)
[ RUN ] TestAbsolute.AbsoluteNegativeOnly
[ OK ] TestAbsolute.AbsoluteNegativeOnly (0 ms)
[ RUN ] TestAbsolute.AbsoluteZero
[ OK ] TestAbsolute.AbsoluteZero (0 ms)
[ RUN ] TestAbsolute.AbsoluteInPlace
[ OK ] TestAbsolute.AbsoluteInPlace (0 ms)
[ RUN ] TestAbsolute.AbsoluteInPlaceNegative
[ OK ] TestAbsolute.AbsoluteInPlaceNegative (0 ms)
[ RUN ] TestAbsolute.AbsoluteDouble
[ OK ] TestAbsolute.AbsoluteDouble (0 ms)
[ RUN ] TestAbsolute.AbsoluteMatchesAbs
[ OK ] TestAbsolute.AbsoluteMatchesAbs (0 ms)
[----------] 7 tests from TestAbsolute (0 ms total)
[----------] Global test environment tear-down
[==========] 48 tests from 3 test cases ran. (1 ms total)
[ PASSED ] 31 tests.
[ FAILED ] 17 tests, listed below:
[ FAILED ] TestAll.AllNoDim
[ FAILED ] TestAll.AllWithDim
[ FAILED ] TestAll.AllWithNegativeDim
[ FAILED ] TestAll.AllWithDimKeepdimTrue
[ FAILED ] TestAll.AllWithOptionalDimNulloptHasFalse
[ FAILED ] TestAll.StandaloneFunction
[ FAILED ] TestAll.StandaloneFunctionWithDim
[ FAILED ] TestAll.AllWith3DTensor
[ FAILED ] TestAllclose.AllcloseNotEqual
[ FAILED ] TestAllclose.AllcloseWithCustomRtol
[ FAILED ] TestAllclose.AllcloseWithCustomAtol
[ FAILED ] TestAllclose.AllcloseMemberNotClose
[ FAILED ] TestAllclose.AllcloseMemberWithCustomTolerance
[ FAILED ] TestAllclose.AllcloseDouble
[ FAILED ] TestAllclose.AllcloseInt32
[ FAILED ] TestAllclose.AllcloseInt64
[ FAILED ] TestAllclose.AllcloseWithDifferentRtolAtolOrder
17 FAILED TESTS
Start 209: ATen_as_strided_test
2/8 Test #209: ATen_as_strided_test .............***Failed 0.40 sec
XCCL /paddle/build/python/paddle/base/../libs/libbkcl.so loaded
[==========] Running 11 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 11 tests from TensorAsStridedTest
[ RUN ] TensorAsStridedTest.AsStridedBasic
[ OK ] TensorAsStridedTest.AsStridedBasic (0 ms)
[ RUN ] TensorAsStridedTest.AsStridedWithOffset
[ OK ] TensorAsStridedTest.AsStridedWithOffset (0 ms)
[ RUN ] TensorAsStridedTest.AsStridedWithDifferentStrides
[ OK ] TensorAsStridedTest.AsStridedWithDifferentStrides (0 ms)
[ RUN ] TensorAsStridedTest.AsStridedInplace
[ OK ] TensorAsStridedTest.AsStridedInplace (0 ms)
[ RUN ] TensorAsStridedTest.AsStridedInplaceWithOffset
[ OK ] TensorAsStridedTest.AsStridedInplaceWithOffset (0 ms)
[ RUN ] TensorAsStridedTest.AsStridedInplaceModifiesView
[ OK ] TensorAsStridedTest.AsStridedInplaceModifiesView (0 ms)
[ RUN ] TensorAsStridedTest.AsStridedScatterBasic
[ OK ] TensorAsStridedTest.AsStridedScatterBasic (0 ms)
[ RUN ] TensorAsStridedTest.AsStridedScatterOriginalUnchanged
[ OK ] TensorAsStridedTest.AsStridedScatterOriginalUnchanged (0 ms)
[ RUN ] TensorAsStridedTest.AsStridedScatterWithOffset
[ OK ] TensorAsStridedTest.AsStridedScatterWithOffset (0 ms)
[ RUN ] TensorAsStridedTest.AsStridedTranspose
unknown file: Failure
C++ exception with description "
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 float* phi::DenseTensor::data<float>()
1 phi::DenseTensor::data()
2 phi::DenseTensor::check_memory_size() const
3 common::enforce::GetCurrentTraceBackString[abi:cxx11](bool)
----------------------
Error Message Summary:
----------------------
FatalError: FLAGS_use_stride_kernel is closed. Not contiguous Tensor found, something wrong has happened! (at /paddle/paddle/phi/core/tensor_meta.cc:221)
" thrown in the test body.
[ FAILED ] TensorAsStridedTest.AsStridedTranspose (4 ms)
[ RUN ] TensorAsStridedTest.AsStridedContiguous
unknown file: Failure
C++ exception with description "
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 common::enforce::GetCurrentTraceBackString[abi:cxx11](bool)
----------------------
Error Message Summary:
----------------------
FatalError: FLAGS_use_stride_kernel is closed. Not contiguous Tensor found, something wrong has happened! (at /paddle/paddle/phi/core/tensor_meta.cc:221)
" thrown in the test body.
[ FAILED ] TensorAsStridedTest.AsStridedContiguous (1 ms)
[----------] 11 tests from TensorAsStridedTest (5 ms total)
[----------] Global test environment tear-down
[==========] 11 tests from 1 test case ran. (5 ms total)
[ PASSED ] 9 tests.
[ FAILED ] 2 tests, listed below:
[ FAILED ] TensorAsStridedTest.AsStridedTranspose
[ FAILED ] TensorAsStridedTest.AsStridedContiguous
2 FAILED TESTS
Start 221: ATen_from_blob_test
3/8 Test #221: ATen_from_blob_test ..............***Failed 0.40 sec
XCCL /paddle/build/python/paddle/base/../libs/libbkcl.so loaded
[==========] Running 7 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 7 tests from ATenFromBlobTest
[ RUN ] ATenFromBlobTest.CpuPtrDefaultsToCpu
[ OK ] ATenFromBlobTest.CpuPtrDefaultsToCpu (1 ms)
[ RUN ] ATenFromBlobTest.CpuPtrWithCpuOptions
[ OK ] ATenFromBlobTest.CpuPtrWithCpuOptions (0 ms)
[ RUN ] ATenFromBlobTest.DataPtrPreserved
[ OK ] ATenFromBlobTest.DataPtrPreserved (0 ms)
[ RUN ] ATenFromBlobTest.ShapeAndStrides
[ OK ] ATenFromBlobTest.ShapeAndStrides (0 ms)
[ RUN ] ATenFromBlobTest.ExplicitStrides
unknown file: Failure
C++ exception with description "
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 phi::DenseTensor::ResetHolder(std::shared_ptr<phi::Allocation> const&)
1 common::enforce::GetCurrentTraceBackString[abi:cxx11](bool)
----------------------
Error Message Summary:
----------------------
FatalError: FLAGS_use_stride_kernel is closed. Not contiguous Tensor found, something wrong has happened! (at /paddle/paddle/phi/core/tensor_meta.cc:221)
" thrown in the test body.
[ FAILED ] ATenFromBlobTest.ExplicitStrides (2 ms)
[ RUN ] ATenFromBlobTest.DeleterCalled
[ OK ] ATenFromBlobTest.DeleterCalled (0 ms)
[ RUN ] ATenFromBlobTest.DeleterWithStrides
[ OK ] ATenFromBlobTest.DeleterWithStrides (0 ms)
[----------] 7 tests from ATenFromBlobTest (3 ms total)
[----------] Global test environment tear-down
[==========] 7 tests from 1 test case ran. (3 ms total)
[ PASSED ] 6 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] ATenFromBlobTest.ExplicitStrides
1 FAILED TEST
Start 223: ATen_index_test
4/8 Test #223: ATen_index_test ..................***Failed 0.49 sec
XCCL /paddle/build/python/paddle/base/../libs/libbkcl.so loaded
[==========] Running 15 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 9 tests from TensorIndexTest
[ RUN ] TensorIndexTest.IndexWithSingleTensor
[ OK ] TensorIndexTest.IndexWithSingleTensor (0 ms)
[ RUN ] TensorIndexTest.SliceKeepsStrideWithoutContiguousCopy
/paddle/test/cpp/compat/ATen_index_test.cc:58: Failure
Value of: transposed.is_contiguous()
Actual: true
Expected: false
[ FAILED ] TensorIndexTest.SliceKeepsStrideWithoutContiguousCopy (0 ms)
[ RUN ] TensorIndexTest.IndexWithEmptyInitializerListReturnsSelf
[ OK ] TensorIndexTest.IndexWithEmptyInitializerListReturnsSelf (0 ms)
[ RUN ] TensorIndexTest.IndexWithTensorInitializerList
[ OK ] TensorIndexTest.IndexWithTensorInitializerList (0 ms)
[ RUN ] TensorIndexTest.MemberIndexWithArrayRefTensorIndices
/paddle/test/cpp/compat/ATen_index_test.cc:107: Failure
Expected equality of these values:
sliced.strides()
Which is: { 3, 1 }
c10::IntArrayRef({1, 6})
Which is: { 1, 6 }
[ FAILED ] TensorIndexTest.MemberIndexWithArrayRefTensorIndices (0 ms)
[ RUN ] TensorIndexTest.MixedSliceAndTensorIndicesThrows
[ OK ] TensorIndexTest.MixedSliceAndTensorIndicesThrows (0 ms)
[ RUN ] TensorIndexTest.IndexWithEmptyList
[ OK ] TensorIndexTest.IndexWithEmptyList (0 ms)
[ RUN ] TensorIndexTest.IndexWithMultipleIndices
[ OK ] TensorIndexTest.IndexWithMultipleIndices (0 ms)
[ RUN ] TensorIndexTest.IndexWithOptionalNone
[ OK ] TensorIndexTest.IndexWithOptionalNone (0 ms)
[----------] 9 tests from TensorIndexTest (0 ms total)
[----------] 6 tests from TensorIndexPutTest
[ RUN ] TensorIndexPutTest.IndexPutInplaceWithTensor
[ OK ] TensorIndexPutTest.IndexPutInplaceWithTensor (70 ms)
[ RUN ] TensorIndexPutTest.IndexPutInplaceWithScalar
[ OK ] TensorIndexPutTest.IndexPutInplaceWithScalar (0 ms)
[ RUN ] TensorIndexPutTest.IndexPutNonInplace
[ OK ] TensorIndexPutTest.IndexPutNonInplace (1 ms)
[ RUN ] TensorIndexPutTest.IndexPutAccumulate
[ OK ] TensorIndexPutTest.IndexPutAccumulate (0 ms)
[ RUN ] TensorIndexPutTest.IndexPutWith2D
[ OK ] TensorIndexPutTest.IndexPutWith2D (0 ms)
[ RUN ] TensorIndexPutTest.IndexPutNonInplaceAccumulate
[ OK ] TensorIndexPutTest.IndexPutNonInplaceAccumulate (0 ms)
[----------] 6 tests from TensorIndexPutTest (71 ms total)
[----------] Global test environment tear-down
[==========] 15 tests from 2 test cases ran. (71 ms total)
[ PASSED ] 13 tests.
[ FAILED ] 2 tests, listed below:
[ FAILED ] TensorIndexTest.SliceKeepsStrideWithoutContiguousCopy
[ FAILED ] TensorIndexTest.MemberIndexWithArrayRefTensorIndices
2 FAILED TESTS
Start 238: ATen_transpose_test
5/8 Test #238: ATen_transpose_test ..............***Failed 0.40 sec
XCCL /paddle/build/python/paddle/base/../libs/libbkcl.so loaded
[==========] Running 6 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 6 tests from TensorTransposeInplaceTest
[ RUN ] TensorTransposeInplaceTest.Transpose2D_SwapDims
[ OK ] TensorTransposeInplaceTest.Transpose2D_SwapDims (0 ms)
[ RUN ] TensorTransposeInplaceTest.Transpose3D_SwapFirstTwo
[ OK ] TensorTransposeInplaceTest.Transpose3D_SwapFirstTwo (0 ms)
[ RUN ] TensorTransposeInplaceTest.Transpose3D_SwapLastTwo
[ OK ] TensorTransposeInplaceTest.Transpose3D_SwapLastTwo (0 ms)
[ RUN ] TensorTransposeInplaceTest.TransposeInplace_PreservesValues
/paddle/test/cpp/compat/ATen_transpose_test.cc:81: Failure
Expected equality of these values:
t[2][0].item<float>()
Which is: 3
2.0f
Which is: 2
[ FAILED ] TensorTransposeInplaceTest.TransposeInplace_PreservesValues (0 ms)
[ RUN ] TensorTransposeInplaceTest.TransposeInplace_SameDim_NoOp
[ OK ] TensorTransposeInplaceTest.TransposeInplace_SameDim_NoOp (0 ms)
[ RUN ] TensorTransposeInplaceTest.TransposeInplace_DoubleTranspose_RestoresShape
[ OK ] TensorTransposeInplaceTest.TransposeInplace_DoubleTranspose_RestoresShape (0 ms)
[----------] 6 tests from TensorTransposeInplaceTest (0 ms total)
[----------] Global test environment tear-down
[==========] 6 tests from 1 test case ran. (1 ms total)
[ PASSED ] 5 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] TensorTransposeInplaceTest.TransposeInplace_PreservesValues
1 FAILED TEST
Start 241: ATen_viewAs_test
6/8 Test #241: ATen_viewAs_test .................***Failed 0.40 sec
XCCL /paddle/build/python/paddle/base/../libs/libbkcl.so loaded
[==========] Running 5 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 5 tests from TensorViewAsTest
[ RUN ] TensorViewAsTest.ViewAsSameShape
[ OK ] TensorViewAsTest.ViewAsSameShape (0 ms)
[ RUN ] TensorViewAsTest.ViewAsDifferentShape_CompatibleNumel
/paddle/test/cpp/compat/ATen_viewAs_test.cc:52: Failure
Expected equality of these values:
result.dim()
Which is: 1
2
[ FAILED ] TensorViewAsTest.ViewAsDifferentShape_CompatibleNumel (0 ms)
[ RUN ] TensorViewAsTest.ViewAsPreservesData
unknown file: Failure
C++ exception with description "
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 paddle::experimental::slice(paddle::Tensor const&, std::vector<long, std::allocator<long> > const&, paddle::experimental::IntArrayBase<paddle::Tensor> const&, paddle::experimental::IntArrayBase<paddle::Tensor> const&, std::vector<long, std::allocator<long> > const&, std::vector<long, std::allocator<long> > const&, paddle::optional<paddle::Tensor*>)
1 common::enforce::GetCurrentTraceBackString[abi:cxx11](bool)
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: The axis value should be less than the rank of input, but received axes[0] = 0, rank of input is 0.
[Hint: Expected axis < in_dims.size(), but received axis:0 >= in_dims.size():0.] (at /paddle/paddle/phi/kernels/funcs/slice_utils.h:252)
" thrown in the test body.
[ FAILED ] TensorViewAsTest.ViewAsPreservesData (2 ms)
[ RUN ] TensorViewAsTest.ViewAs1D_Flattens
/paddle/test/cpp/compat/ATen_viewAs_test.cc:76: Failure
Expected equality of these values:
result.dim()
Which is: 3
1
[ FAILED ] TensorViewAsTest.ViewAs1D_Flattens (0 ms)
[ RUN ] TensorViewAsTest.ViewAs_SameDataPointer
[ OK ] TensorViewAsTest.ViewAs_SameDataPointer (0 ms)
[----------] 5 tests from TensorViewAsTest (2 ms total)
[----------] Global test environment tear-down
[==========] 5 tests from 1 test case ran. (2 ms total)
[ PASSED ] 2 tests.
[ FAILED ] 3 tests, listed below:
[ FAILED ] TensorViewAsTest.ViewAsDifferentShape_CompatibleNumel
[ FAILED ] TensorViewAsTest.ViewAsPreservesData
[ FAILED ] TensorViewAsTest.ViewAs1D_Flattens
3 FAILED TESTS
Start 243: ATen_basic_test
7/8 Test #243: ATen_basic_test ..................***Failed 0.40 sec
XCCL /paddle/build/python/paddle/base/../libs/libbkcl.so loaded
[==========] Running 18 tests from 8 test cases.
[----------] Global test environment set-up.
[----------] 9 tests from TensorBaseTest
[ RUN ] TensorBaseTest.DataPtrAPIs
[ OK ] TensorBaseTest.DataPtrAPIs (0 ms)
[ RUN ] TensorBaseTest.TypeDeviceAPIs
[ OK ] TensorBaseTest.TypeDeviceAPIs (0 ms)
[ RUN ] TensorBaseTest.ModifyOperationAPIs
/paddle/test/cpp/compat/ATen_basic_test.cc:111: Failure
Expected equality of these values:
viewed.sizes()
Which is: { 2, 3 }
std::vector<int64_t>{6}
Which is: { 6 }
[ FAILED ] TensorBaseTest.ModifyOperationAPIs (0 ms)
[ RUN ] TensorBaseTest.LayoutAPI
[ OK ] TensorBaseTest.LayoutAPI (0 ms)
[ RUN ] TensorBaseTest.ResetAPI
[ OK ] TensorBaseTest.ResetAPI (0 ms)
[ RUN ] TensorBaseTest.IsNonOverlappingAndDenseAPI
/paddle/test/cpp/compat/ATen_basic_test.cc:380: Failure
Value of: transposed.is_contiguous()
Actual: true
Expected: false
[ FAILED ] TensorBaseTest.IsNonOverlappingAndDenseAPI (0 ms)
[ RUN ] TensorBaseTest.UndefinedAndNonDenseBranchCoverage
unknown file: Failure
C++ exception with description "
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 common::enforce::GetCurrentTraceBackString[abi:cxx11](bool)
----------------------
Error Message Summary:
----------------------
FatalError: FLAGS_use_stride_kernel is closed. Not contiguous Tensor found, something wrong has happened! (at /paddle/paddle/phi/core/tensor_meta.cc:221)
" thrown in the test body.
[ FAILED ] TensorBaseTest.UndefinedAndNonDenseBranchCoverage (2 ms)
[ RUN ] TensorBaseTest.ToDeviceAndMemoryFormatUnsupportedBranches
[ OK ] TensorBaseTest.ToDeviceAndMemoryFormatUnsupportedBranches (0 ms)
[ RUN ] TensorBaseTest.ToDtypeCastsWhenSupported
[ OK ] TensorBaseTest.ToDtypeCastsWhenSupported (0 ms)
[----------] 9 tests from TensorBaseTest (2 ms total)
[----------] 1 test from tensor_clone_test
[ RUN ] tensor_clone_test.BasicClone
[ OK ] tensor_clone_test.BasicClone (0 ms)
[----------] 1 test from tensor_clone_test (0 ms total)
[----------] 1 test from compat_basic_test
[ RUN ] compat_basic_test.BasicCase
Result[0] = 12
Result[1] = 12
Result[2] = 12
Result[3] = 12
Result[4] = 12
Result[5] = 12
10, 20, 30
[ OK ] compat_basic_test.BasicCase (0 ms)
[----------] 1 test from compat_basic_test (0 ms total)
[----------] 2 tests from TestDevice
[ RUN ] TestDevice.DeviceAPIsOnCUDA
[ OK ] TestDevice.DeviceAPIsOnCUDA (0 ms)
[ RUN ] TestDevice.DeviceAPIsOnCPU
[ OK ] TestDevice.DeviceAPIsOnCPU (0 ms)
[----------] 2 tests from TestDevice (0 ms total)
[----------] 1 test from TestTranspose
[ RUN ] TestTranspose.TransposeAPI
[ OK ] TestTranspose.TransposeAPI (0 ms)
[----------] 1 test from TestTranspose (0 ms total)
[----------] 1 test from TestSize
[ RUN ] TestSize.SizeNegativeIndex
[ OK ] TestSize.SizeNegativeIndex (0 ms)
[----------] 1 test from TestSize (0 ms total)
[----------] 1 test from TestTensorOperators
[ RUN ] TestTensorOperators.SubScriptOperator
[ OK ] TestTensorOperators.SubScriptOperator (0 ms)
[----------] 1 test from TestTensorOperators (0 ms total)
[----------] 2 tests from TensorBodyTest
[ RUN ] TensorBodyTest.ToBackendUnsupportedBranch
[ OK ] TensorBodyTest.ToBackendUnsupportedBranch (0 ms)
[ RUN ] TensorBodyTest.MetaUnsupportedBranch
[ OK ] TensorBodyTest.MetaUnsupportedBranch (0 ms)
[----------] 2 tests from TensorBodyTest (0 ms total)
[----------] Global test environment tear-down
[==========] 18 tests from 8 test cases ran. (2 ms total)
[ PASSED ] 15 tests.
[ FAILED ] 3 tests, listed below:
[ FAILED ] TensorBaseTest.ModifyOperationAPIs
[ FAILED ] TensorBaseTest.IsNonOverlappingAndDenseAPI
[ FAILED ] TensorBaseTest.UndefinedAndNonDenseBranchCoverage
3 FAILED TESTS
Start 1272: test_compat_slogdet
8/8 Test #1272: test_compat_slogdet .............. Passed 1.62 sec
13% tests passed, 7 tests failed out of 8
Total Test time (real) = 4.59 sec
The following tests FAILED:
207 - ATen_all_test (Failed)
209 - ATen_as_strided_test (Failed)
221 - ATen_from_blob_test (Failed)
223 - ATen_index_test (Failed)
238 - ATen_transpose_test (Failed)
241 - ATen_viewAs_test (Failed)
243 - ATen_basic_test (Failed)
Errors while running CTest
@ShigureNyako I asked you to analyze the problem, but you just restated it and called it done. Right now you are being asked to do the work, not to make decisions: you are acting as the developer here, not a reviewer. Please analyze exactly how this problem arose; the logs are at https://github.com/PaddlePaddle/Paddle/actions/runs/24000029847/job/69994604415?pr=78580, and propose a fix. (This kind of task might be better suited to a fresh session.)
@SigureMo I went back and checked the Mac-CPU log you pointed to ( 1.
Are you sure this isn't just hiding the problem? Would a unit test like this also fail under torch? Please identify the root cause and fix that. Also, please focus on the macOS issue; the current problem is not on XPU.
@SigureMo Your follow-up is right. I re-verified the macOS case separately, and I want to correct my previous statement: this time it really is the hidden problem itself, not some other symptom. I can now narrow the root-cause chain down more precisely:

So when clang on macOS performs a qualified lookup of `torch::TensorOptions()` here, it reports exactly the error seen in the current log:

I have already run this pattern in isolation with a minimal reproduction, and it matches what the Mac-CPU log reports. In other words, the problem is not in … For this PR, the minimal fix I suggest is to change the call site directly, keeping

```cpp
torch::Tensor t =
    torch::from_blob(pp, {3}, c10::TensorOptions().dtype(torch::kInt64));
```

which sidesteps this lookup conflict on macOS directly.

If we want a root fix at the compat layer later, I think the header-export approach can be handled separately, for example by
You're right. The export here should be removed; we must not re-export names from c10 directly into torch.

PR Category
Execute Infrastructure
PR Types
Improvements
Description
Standardize test file naming on ATen/c10 conventions to make regression testing easier
Handle unused variables to suppress compiler warnings
Delete the unused helper/preprocessing file
Does this change numerical precision?
No