-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Fix smart_holder
multiple/virtual inheritance bugs in shared_ptr
and unique_ptr
to-Python conversions
#5836
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
``` /__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:44:13: error: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override,-warnings-as-errors] 44 | virtual ~Left() = default; | ~~~~~~~ ^ | override /__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:48:13: error: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override,-warnings-as-errors] 48 | virtual ~Right() = default; | ~~~~~~~ ^ | override ```
CI results @ commit 5f77da3: 66 failing, 2 skipped, 17 successful checks CI: https://github.com/pybind/pybind11/actions/runs/17707681525?pr=5836 |
CI results @ commit 3620ceb: 20 failing, 2 skipped, 63 successful checks CI: https://github.com/pybind/pybind11/actions/runs/17713020523?pr=5836 |
ChatGPT: * shared_ptr’s ctor can throw (control-block alloc). Using get() keeps unique_ptr owning the memory if that happens, so no leak. * Only after the shared_ptr is successfully constructed do you release(), transferring ownership exactly once.
…ampoline code (which often uses the term "alias", too)
…ithin test_class_sh_mi_thunks.cpp
``` /__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:67:5: error: 'auto ptr' can be declared as 'auto *ptr' [readability-qualified-auto,-warnings-as-errors] 67 | auto ptr = new Diamond; | ^~~~ | auto * ```
smart_holder
multiple/virtual inheritance bugs in shared_ptr
and unique_ptr
to-Python conversions
@henryiii This PR is ready for review. @MannixYang If you get a chance to try this out, please report back. |
@rwgk Thank you very much indeed. Yes, I will give it a try. It should be within the next two days. |
@rwgk Great! This is very useful. My project can now run. Thank you again. |
Thanks a lot for confirming! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @iwanders, is there a chance that you could help reviewing this PR?
👍 definitely, I did go through it and made two minor comments. The code around the handling of deleters looks good to me.
// Relinquish ownership only after successful construction of owner | ||
(void) unq_ptr.release(); | ||
|
||
// Publish either the subobject alias (for identity/VI) or the full object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I'd elaborate a bit in this comment. This part feels a bit like magic to me why this is necessary and fixes a bug when this is used in msvc. I think I understand the problem to be that the shared pointer 'owner' must always point to the start of the entire object (and the deleter operates on that), but that the actual smart holder should 'hold' the subclass, which may not be at the start of the object itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: commit 31886ec
tests/test_class_sh_mi_thunks.cpp
Outdated
int vbase_tag = 42; // ensure it's not empty | ||
}; | ||
|
||
// Left/right add some weight to steer layout differences across compilers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this matters, but Left
and Right
are pretty similar a long long
is guaranteed to be 64 bits, and the 7 long array is likely going to be padded to 8, which would make them the same size? To me the comment contradicts the code a bit in that sense. Maybe just having any attribute is enough, but if the sizes ought to be different I'd make it very clear, say char pad_l[4]
and char pad_r[16]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: commit ba89e1d
Here is what ChatGPT is saying:
Good catch — the original comment was misleading, since char[7]
and long long
can end up the same size after padding. The real intent is simply to make the virtual bases non-empty and, ideally, asymmetrically sized/aligned so compilers are more likely to give us non-zero subobject offsets. I’ve updated the code to use small but clearly asymmetric paddings (char[4]
vs char[16]
, with optional alignment) and clarified the comment to reflect this. The test still works correctly even if a compiler chooses offset 0, since that case is explicitly logged via test_virtual_base_at_offset_0()
.
In Ralf's own words: I was just super happy when the GhatGPT-generated test triggered failures on both Linux and Windows [see comment from two weeks ago] and didn't look as closely at the test code as you did. However, now I pushed ChatGPT harder, to "really bend things out of shape", but it assured me that the new code is as good as we can make it.
I'll rerun the CI. When it's done I'll look for test_virtual_base_at_offset_0()
skip messages again. (I think I did that before, although I forgot to log that here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just super happy when the GhatGPT-generated test triggered failures on both Linux and Windows
Agreed, that is pretty impressive and definitely speeds things up. Thanks for sharing your ChatGPT conversation btw, I’ve bookmarked that to read through when I’m not on the road. I don’t have much experience using llms, so expect it’ll be educational for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just super happy when the GhatGPT-generated test triggered failures on both Linux and Windows
Agreed, that is pretty impressive and definitely speeds things up. Thanks for sharing your ChatGPT conversation btw, I’ve bookmarked that to read through when I’m not on the road. I don’t have much experience using llms, so expect it’ll be educational for me.
My time budget for pybind11 maintenance work has become super tiny. I think without ChatGPT I'd have to give up.
Thanks for your review, this was super helpful.
A couple minutes ago I replaced the pytest.skip()
with an assert
, to guarantee test quality in the future. It'd be great if you could take a final look and formally approve if it looks good to you.
``` "D:\a\pybind11\pybind11\build\ALL_BUILD.vcxproj" (default target) (1) -> "D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj" (default target) (13) -> (ClCompile target) -> D:\a\pybind11\pybind11\tests\test_class_sh_mi_thunks.cpp(70,17): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj] D:\a\pybind11\pybind11\tests\test_class_sh_mi_thunks.cpp(80,43): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj] C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: 'std::_Ref_count_obj2<_Ty>': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj] C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: with [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj] C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: [ [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj] C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: _Ty=test_class_sh_mi_thunks::Diamond [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj] C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: ] [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj] D:\a\pybind11\pybind11\include\pybind11\detail\init.h(77,21): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj] ``` The warning came from alignas(16) making Diamond over-aligned, while regular new/make_shared aren’t guaranteed to return 16-byte aligned memory on MSVC (hence C4316). I’ve removed the explicit alignment and switched to asymmetric payload sizes (char[4] vs char[24]), which still nudges MI layout without relying on over-alignment. This keeps the test goal and eliminates the warning across all MSVC builds. If we ever want to stress over-alignment explicitly, we can add aligned operator new/delete under __cpp_aligned_new, but that’s more than we need here.
Checking for
|
…set_0() and replace pytest.skip() with assert. Add helpful comment for future maintainers.
Description
Closes #5786.
This PR fixes two independent bugs in the
smart_holder
machinery, both exposed by new tests involving virtual inheritance.1. Fix in
pybind11/detail/type_caster_base.h
The return-value caster for
std::shared_ptr<T>
was incorrectly usingsrc.get()
(the most-derived pointer) instead ofst.first
(the adjusted subobject pointer appropriate for the registeredtinfo
).On MSVC with virtual inheritance, this mismatch led to passing an
Animal*
(see below) toregister_instance
when bindingTiger
, which caused the virtual-base offset walker to dereference a bogus vbptr.This is the bug reported in #5786. Using
st.first
ensures the correct subobject is registered.2. Fix in
pybind11/detail/struct_smart_holder.h
smart_holder::from_unique_ptr
was storing the subobject pointer as the owned resource in its control block. Under MSVC, destroying through that misaligned pointer caused NX faults in the virtual destructor path.The fix mirrors the
shared_ptr
path: always own the realT*
object start with the deleter, and, if needed, create an aliasingshared_ptr<void>
at the subobject pointer for registration/identity.Tests
test_class_sh_mi_thunks
) was added. This was designed to also fail on Linux/Clang/GCC without the fix (not just on MSVC). Indeed, the diamond case fails across allci.yml
jobs and severaltests-cibw.yml
jobs pre-fix, and passes after fix (see comment).smart_holder_from_unique_ptr
, which immediately exposed the second bug. That test produced 20 MSVC failures pre-fix (comment), and passes after the fix.test_virtual_base_at_offset_0
is included to make it explicit when a compiler/layout places the virtual base at offset 0. This isn’t a true skip, but a way to document in the pytest summary when the test can’t exercise the MI/VI code path. In practice, this skip has never been triggered in any GitHub Actions job.assert
s remain inpybind11/detail/class.h
. They were invaluable during debugging and experimentation, have no impact on non-debug builds, and may be useful again if MI/VI edge cases resurface.Context
The changes here are in the same spirit as PR #4380, which fixed earlier oversights in
smart_holder
MI handling. As noted there, MI use cases are uncommon in downstream projects (and even discouraged in some organizations), so coverage has historically been thin. The new tests give us much better confidence that bothshared_ptr
andunique_ptr
adoption paths are now robust under virtual and multiple inheritance.Full ChatGPT conversation guiding the work on the fixes (very long and involved): https://chatgpt.com/share/68c7950a-61b8-8008-8678-53b96643ce7e
A lot of the credit for the fixes goes to ChatGPT 5 Pro
Suggested changelog entry:
Fixed two
smart_holder
bugs inshared_ptr
andunique_ptr
adoption with multiple/virtual inheritance:shared_ptr
to-Python caster now registers the correct subobject pointer (fixes [BUG]: Using smart_holder in virtual inheritance relationship error #5786).unique_ptr
adoption now owns the proper object start while aliasing subobject pointers for registration, fixing MSVC crashes during destruction.