-
Notifications
You must be signed in to change notification settings - Fork 2.2k
[WIP] Interoperability with other Python binding frameworks #5800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Wow.
That's awesome. I see @pablogsal is involved, so "standard for Python <-> C-ish binding frameworks" seems believable. I'm not sure how much I can help here. This is a huge PR, I don't have a lot of spare time to review. But I sure like the direction.
After this is merged and released, wait a year or so for adoption, then optimize. The ABI break won't matter as much anymore. I'd also purge the conduit code at that time. |
I appreciate the kind words, but I only contributed a small typo fix in a macro after reading the code. All the credit for this great work goes to @oremanj. |
…hat we use 'foreign' specifically for types/frameworks that don't use our internals, rather than the concept of sharing with them
…mation needed for interop with them, plus a destructor that unregisters the enum when it's destroyed
0688784
to
64a6011
Compare
a31f732
to
2c8d4c1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I glanced through most of the changes in this PR. I looked carefully only at a few.
I think it'll be most efficient if we video-meet for a couple hours 1:1 to go through together.
Things to discuss:
- pymetabind → pyunibind ("universal" sounds better and is more to the point than "meta")?
But I'm sticking to "meta" for the rest:
To some degree, the current naming
- equates "interop" with "metabind"
- equates "foreign" with "metabind"
Related:
Existing code in include/pybind11/detail/type_caster_base.h:
if (auto *result = foreign_typeinfo->module_local_load(src.ptr(), foreign_typeinfo)) {
This PR reinterprets "foreign" there as "remote". Hm.
Regarding "interop", see the main README:
pybind11 (v3) — Seamless interoperability between C++ and Python
The terms "interop" and "foreign" are too generic and ambiguous. I think it'll be significantly clearer to always use "metabind" when it actually is about metabind.
E.g.:
py::interoperate_by_default
→ py::metabind_by_default
pybind11/detail/foreign.h -> metabind_adapters.h
foreign_ok
-> metabind_ok
interop_internals
-> metabind_internals
get_interop_internals
-> get_metabind_internals
etc.
Last but not least: WDYT about breaking out a couple smaller PRs, as a preparation for this big one? E.g. the changes in detail/common.h or the native_enum code, or some of the changes in and around type_caster_base.h?
does mean that ``shared_ptr::use_count()`` won't work like you expect. (If | ||
``T`` inherits ``std::enable_shared_from_this``, then pybind11 can use that | ||
to find the existing ``shared_ptr``, and will do so instead.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does mean that ``shared_ptr::use_count()`` won't work like you expect. (If | |
``T`` inherits ``std::enable_shared_from_this``, then pybind11 can use that | |
to find the existing ``shared_ptr``, and will do so instead.) | |
does mean that ``shared_ptr::use_count()`` won't work like you expect. However, if | |
``T`` inherits from ``std::enable_shared_from_this``, then pybind11 will leverage that | |
to find the existing ``shared_ptr`` for reuse. |
* each known foreign binding, in the order in which they were imported, | ||
without making any distinction between other versions of pybind11 and | ||
non-pybind11 frameworks. (If automatic import is enabled, then the import | ||
order will match the original export order.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the order in which they were imported
In a large system, adding a new dependency could alter the import order and trigger subtle, hard-to-debug behavior changes.
If left unaddressed, it may be very difficult to retrofit a solution later. Some form of priority mechanism seems necessary; even a simple one would be far better than none.
For example, exporters could attach a list of string tags (e.g. ["pybind11", "major3", "minor1"]
or ["nanobind", …]
), and an import-side API could use these tags to disambiguate. We don’t need to get fancy in this PR, but establishing the mechanism now would leave the door open for more sophistication later, as needed.
this allows you to return instances of such bindings from outside the module in | ||
which they were defined. | ||
|
||
When performing C++-to-Python conversion of a type for which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was struggling to understand this paragraph, so I quizzed my favorite LLM about it. It came back suggesting this as a possible rewrite (I didn't try to fix up the markup):
When performing C++-to-Python conversion of a type for which :ref:automatic downcasting is applicable, the downcast occurs in whichever binding library actually constructs the Python object from the C++ value. For example, if a pybind11 function returns a Base*, pybind11 will attempt to downcast it to a Derived if appropriate. But if a foreign framework returns the Base*, then its own downcasting rules (if any) apply.
Some frameworks may only support limited downcasting (e.g. to the primary base class, with no multiple-inheritance adjustment), or may not support downcasting at all. In the latter case, the Python object will always appear as the base class, even if the underlying C++ pointer actually refers to a derived object. This means Python code may lose access to derived-class methods or attributes that would otherwise be available if the downcast had occurred.
This does not cause undefined behavior, but it may produce surprising results when mixing bindings from different frameworks. If your code relies on downcasting, ensure that the library responsible for constructing the Python object supports the kind of downcasting you expect.
deprecated APIs and their replacements, build system changes, general code | ||
modernization and other useful information. | ||
|
||
v3.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: I'm skipping over this until later.
#ifndef PYBIND11_INTERNALS_VERSION | ||
// REMINDER for next version bump: remove loader_life_support_tls | ||
// REMINDER for next version bump: remove loader_life_support_tls + | ||
// merge interop_internals into internals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm ... is merging actually a good thing? Leaving this factored would seem more readable/maintainable?
WDYT about moving the new large struct below to e.g. detail/metabind_internals.h, again for clarity?
|
||
using copy_or_move_ctor = void *(*) (const void *); | ||
|
||
// Information to support working with types from other binding frameworks or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: I didn't review this in detail.
uint32_t size_bytes; | ||
bool is_signed; | ||
|
||
static const char *attribute_name() { return "__pybind11_enum__"; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about "__pybind11_native_enum__"
? (To not mix up terminology for py::enum_
and py::native_enum
.)
#define PYBIND11_MODULE_LOCAL_ID \ | ||
"__pybind11_module_local_v" PYBIND11_TOSTRING(PYBIND11_INTERNALS_VERSION) \ | ||
PYBIND11_COMPILER_TYPE_LEADING_UNDERSCORE PYBIND11_PLATFORM_ABI_ID "__" | ||
#define PYBIND11_ABI_TAG \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it's also used in foreign.h:
self->name = "pybind11 " PYBIND11_ABI_TAG;
The name is referenced only in three places, one in a different file (I'm thinking of that as "far away").
My preference: long and meaningful, easy to read also in foreign.h:
E.g.: PYBIND11_INTERNALS_VERSION_PLATFORM_ABI_ID
|
||
private: | ||
object parent_scope; | ||
handle parent_scope; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is keeping the parent scope alive? Could you please add a comment to explain?
str enum_name; | ||
str native_type_name; | ||
std::string class_doc; | ||
capsule enum_info; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
capsule metabind_info;
?
See also wjakob/nanobind#1140, the same feature for nanobind.
pymetabind is a proposed standard for Python <-> C-ish binding frameworks to be able to find and work with each other's types. For example, assuming versions of both pybind11 and nanobind that have adopted this standard, it would allow a pybind11-bound function to accept a parameter whose type is bound using nanobind, or to return such a type, or vice versa. Interoperability between different ABI versions or different domains of the same framework is supported under the same terms as interoperability between different frameworks. Compared to the
_pybind11_conduit_v1_
API, this one also supports implicit conversions and to-Python conversions, and should have significantly less overhead.The essence of this technique has been in use in production by my employer for a couple of years now to enable a large amount of pybind11 binding code to be ported to nanobind one compilation unit at a time. Almost everything that works natively works across framework boundaries too, at only a minor performance cost. Inheritance relationships and relinquishment (from-Python conversion of
unique_ptr<T>
) don't work cross-framework, and some of the more subtle corners of shared_ptr support probably don't transfer over; if you try to get ashared_ptr<T>
from a foreign T, you'll either get a new shared_ptr control block whose deleter drops a pyobject reference, or a new reference to theenable_shared_from_this
shared_ptr if T has one of those.This PR adds pybind11 support for exposing pybind11 types to pymetabind for other frameworks to use ("exporting") and using other frameworks' types that they have exposed to pymetabind ("importing"). Types bound by a different framework than the internals version of pybind11 that an extension module links with are called "foreign" to that module. This PR does not introduce an ABI break, but there are some ways that it could be simplified and sped up if/when you're willing to take one. I know I just missed 3.0 so I'm guessing that won't be for a while. :-)
One notable impact of this PR is that pybind11 now generates copy and move constructors for every type, instead of generating them lazily when an instance of that type is returned from a bound function. This is needed because we might be asked to copy or move an instance of that type via the foreign bindings framework even though it's never returned from a pybind11-bound function. The new constructors can break code that previously worked: if a type has no copy constructor and none of its immediate members are non-copyable but some of their subobjects are non-copyable, it will look copyable to std type traits, but actually generating the copy constructor produces an error. Similarly, you can now get "definition of implicit copy constructor is deprecated because it has a user-declared destructor" warnings on types that previously didn't produce them. I'm not overly worried about this since it's no worse than what would already happen when returning even
const Foo&
, but we could consider a #define that suppresses the constructor generation if you think this is too much of an upgrade hazard.Current status: nominally code complete and existing tests pass, but I haven't added interop-specific tests or public-facing docs yet.
Performance: Due to the inability to modify internals and type_info without an ABI break, once pybind11 knows about any foreign types, all failed casts incur an additional map lookup. Overhead should still be low for people who don't use the interoperability features at all. I haven't measured it.
Things that need to happen before this can be released:
[x] add unit tests
[x] add user-facing documentation
[ ] test correctness of nanobind/pybind11 interop
[ ] test performance
[ ] solicit feedback from maintainers of other binding libraries
[ ] release pymetabind v1.0, incorporating said feedback
Suggested changelog entry:
py::interoperate_by_default()
, a function or method that is bound using pybind11 can accept and return values of types that were bound using other binding libraries that support pymetabind, notably including nanobind versions !TBD! and later. This is intended to replace the pybind11 "conduit" feature eventually, but the conduit is still supported for compatibility with existing extensions. Compared to the conduit feature, this interoperability mechanism supports both from- and to-Python conversions, and correctly handles implicit conversions, conversions to and from copyable holders, and conversions tounique_ptr
.📚 Documentation preview 📚: https://pybind11--5800.org.readthedocs.build/