Skip to content

Conversation

@rFlex
Copy link

@rFlex rFlex commented Aug 15, 2025

This change adds support for generating C API from djinni files. This allows consumers to use Djinni APIs if they are not using C++, or they want to use an API from a previously built dynamic library without having to worry about ABI stability.

The change has two main components:

  • Runtime C helper utility: provides core reusability functionality used by the C codegen.
  • Djinni C codegen support: generates C header and C++ impl files that bridges C++ to C and vice versa.

To provide a stable ABI, one of the core aspect of this change is that beside primitive types like ints, floats, most types are represented as ref countable pointers. Optional primitive types are represented as an optional struct, the rest of the types are heap allocated pointers (array, record, interface etc...).

Here is an example of djinni file:

client_returned_record = record {
    record_id: i64;
    content: string;
    misc: optional<string>;
}

and its C header output:

/** Record returned by a client */
typedef djinni_record_ref testsuite_client_returned_record_ref;

testsuite_client_returned_record_ref testsuite_client_returned_record_create(int64_t record_id, djinni_string_ref content, djinni_string_ref misc);
int64_t testsuite_client_returned_record_get_record_id(testsuite_client_returned_record_ref instance);
void testsuite_client_returned_record_set_record_id(testsuite_client_returned_record_ref instance, int64_t value);

djinni_string_ref testsuite_client_returned_record_get_content(testsuite_client_returned_record_ref instance);
void testsuite_client_returned_record_set_content(testsuite_client_returned_record_ref instance, djinni_string_ref value);

djinni_string_ref testsuite_client_returned_record_get_misc(testsuite_client_returned_record_ref instance);
void testsuite_client_returned_record_set_misc(testsuite_client_returned_record_ref instance, djinni_string_ref value);

Here is the type mapping between djinni and djinni c:

i8 -> int8_t
i16 -> int16_t
i32 -> int32_t
i64 -> int64_t
f32 -> float
f64 -> double
bool -> bool
string -> djinni_string_ref
binary -> djinni_binary_ref
DataRef -> djinni_binary_ref
DataView -> djinni_binary_ref
list<T> -> djinni_array_ref
set<T> -> djinni_array_ref
map<K, V> -> djinni_keyval_ref
optional<i32> -> djinni_optional_int64_t (same for all primitive types)
optional<string> -> djinni_string_ref (check for nullptr to see if the type is null)

record -> djinni_record_ref (each record gets a new dedicated typedef)
interface -> djinni_interface_ref (each interface gets a new dedicated typedef)
enum -> <generated> (each enum gets its own C enum)

Note that set types are just converted into array, and map are just converted into key value arrays (an array of key value).
All Djinni C objects inherit a common base ref countable types, which make it possible to create re-usable containers like djinni_array_ref. Passing binaries around are copy less in many scenarios.

C Proxy objects are also supported. For +interface types, the C codegen generates two methods:

  • proxy_class_new: Creates a new proxy class with a provided method table
  • new: Creates a new proxy object with the given proxy class and opaque pointer.

Example:

typedef struct 
{
    void (*callForObjC)(void *, testsuite_objc_only_listener_ref);
    testsuite_objc_only_listener_ref (*returnForObjC)(void *);
    void (*callForJava)(void *, testsuite_java_only_listener_ref);
    testsuite_java_only_listener_ref (*returnForJava)(void *);
} testsuite_uses_single_language_listeners_method_defs;

testsuite_uses_single_language_listeners_proxy_class_ref testsuite_uses_single_language_listeners_proxy_class_new(const testsuite_uses_single_language_listeners_method_defs *method_defs, djinni_opaque_deallocator opaque_deallocator);

testsuite_uses_single_language_listeners_ref testsuite_uses_single_language_listeners_new(testsuite_uses_single_language_listeners_proxy_class_ref proxy_class, void *opaque);

Unlike on the other languages integration, there is no pointer stability when sharing proxy objects between languages, meaning that it's not possible to do == at the C level for checking whether the instance is actually the same.

For Protobuf support, djinni_binary_ref is used and the codegen automatically serializes/deserializes the Proto message. C users are expected to provide Proto data as binary.

The PR supports also the additional lib types that djinni supports, like future and outcome. C code can create promises and provide future on them to C++ code. C++ code can provide futures to C code, C code can observe completion of these futures.

@li-feng-sc
Copy link
Contributor

li-feng-sc commented Aug 20, 2025

Thanks @scorsinsc .

  • For String and Binary, since they can't contain other objects, I think it might be possible to implement them based on their C++ types (std::string, std::vector). This means we can potentially move the C++ type into them (especially when they are return values).
  • Correct me but I think it should be possible to implement object identity. The Java and ObjC proxy cache was implemented in a rather convoluted way using templates. But the wasm marshalling library has a relatively clean reimplementation. The object identity feature not only allows instance comparison, but also prevents nested proxies when the same object crosses the boundary back and force multiple times.

@li-feng-sc
Copy link
Contributor

btw, the CI error is from a sanity checker that checks whether all generated files are up to date after changing codegen.

@rFlex
Copy link
Author

rFlex commented Aug 20, 2025

@li-feng-sc yes I considered this too, that's why I kept the move semantics in the helpers for strings and binaries to make this optimization possible:

class String {
public:
  static djinni_string_ref fromCpp(std::string &&str);
  static djinni_string_ref fromCpp(const std::string &str);
  static std::string toCpp(djinni_string_ref str);
};

class Binary {
public:
  static std::vector<uint8_t> toCpp(djinni_binary_ref binary);
  static djinni_binary_ref fromCpp(std::vector<uint8_t> &&binary);
  static djinni_binary_ref fromCpp(const std::vector<uint8_t> &binary);
};

I didn't already do it for string because the only cases where you'd get some level of gain for strings are for heap allocated strings that are above 23 characters. We heap allocate the ref counted String object anyway, with an inline container that is sized precisely (+ alignment padding) to the actual size of the string. Any string less than 23 characters would end up using less memory with this system that if it was embedding an std string. As such I wasn't sure it was worth it. The best case scenario for moving the string are for very large strings, which are copied anyway in the other languages when crossing language boundaries, so data ref or data view are just better representation anyway for these types. I'll add the optimization for binary, this one is a straightforward case.

Let me see for the wasm impl if I can find some inspiration to implement this easily. Note that object equality will work at the C++ level: a proxy created from C results in a single C++ proxy object that is shared (as the C wrapper holds an actual C++ object). What doesn't work is the other way around, passing a C++ object (interface, not record) back into C, which recreates the proxy (which is just a single heap allocated object holding the std::shared_ptr). However you can't have nested proxies, because unwrapping works in one direction.

@rFlex
Copy link
Author

rFlex commented Aug 20, 2025

actually I forgot I had already implemented that binary optimization: https://github.com/Snapchat/djinni/pull/202/files#diff-828edbe15d68e62e31ea2a16831565b4f308ed3d20c8f708c23250ffed33db52R19 .
The only case where binary is a copy is when retrieving binary from a member of a record. That is because we can't manage memory on that binary vector: how can we make sure the vector that is inside the record stays alive as long as the C side holds it? I thought to solve this in these cases by having the binary ref holds a retaining reference on the record itself (since it is ref counted in this C system).

For example for:

UUID = record {
  data: binary;
}

could have an equivalent C implementation like this:

static void djiniBinaryReleaseRef(uint8_t *data, size_t size, void *opaque) {
  djinni_ref_release(opaque);
}

djinni_binary_ref UUID_get_data(UUID_ref *instance) {
  const auto &vec = toCpp<UUID>(instance);
  djinni_ref_retain(instance);
  return djinni_binary_new_with_bytes(vec.data(), vec.size(), instance, &djiniBinaryReleaseRef);
}

@li-feng-sc
Copy link
Contributor

The only case where binary is a copy is when retrieving binary from a member of a record

This has to copy. We cannot mutate a record simply by accessing its member.

Comment on lines 338 to 348
val proxyClassNameCpp = writeProxyClass(w, ident, methodDefsStructName, resolvedMethods)
w.w(s"${proxyClassName} ${prefix}_proxy_class_new(const ${methodDefsStructName} *method_defs, djinni_opaque_deallocator opaque_deallocator)")
w.braced {
w.wl(s"return ::djinni::c_api::ProxyTranslator<${methodDefsStructName}>::makeClass(method_defs, opaque_deallocator);")
}
w.wl

w.w(s"${typeName} ${prefix}_new(${proxyClassName} proxy_class, void *opaque)")
w.braced {
w.wl(s"return ::djinni::c_api::ProxyTranslator<${methodDefsStructName}>::make<${proxyClassNameCpp}, ${selfCpp}>(proxy_class, opaque);")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to generate these when the interface is not marked as implementable in C (+cc).

Most interfaces are +c only, and we can save some code size by omitting the proxy code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed so that proxy are not emitted unless +cc is passed

Comment on lines 29 to 32
// Any way to make this better?

return futureHolder->getFuture().then(
[](::djinni::Future<T> value) { return value.get(); });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class FutureHolder {
    Future<T> moveFuture() { return std::move(_future); }
}
...
return futurHolder->moveFuture();

Will this work?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added complete support for future

Comment on lines 24 to 35
constexpr inline ValueType *getContainerStartPtr(T *object) {
return reinterpret_cast<ValueType *>(
reinterpret_cast<uint8_t *>(object) +
alignUp(sizeof(T), alignof(ValueType)));
}

constexpr inline const ValueType *getContainerStartPtr(const T *object) {
return reinterpret_cast<const ValueType *>(
reinterpret_cast<const uint8_t *>(object) +
alignUp(sizeof(T), alignof(ValueType)));
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can be made static. So you won't need to create an instance just to get data offset.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you don't customize the allocator with user states, you could just call operator new directly auto *arrayRegion = operator new(allocSize); and turn the allocate method to static too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied this class from Valdi. It was using the same pattern from C++: a stateless std::allocator . Constructing the instance had no effect, but changed to static methods anyway.

@rFlex
Copy link
Author

rFlex commented Aug 21, 2025

@li-feng-sc updated the PR a bunch:

  • No C proxy generated unless +cc is passed
  • Moved ProxyTranslator methods into InterfaceTranslator
  • Cleaned up inline container allocation code a bit following your suggestions
  • Codegen no longer uses lambda for converting nested types, it passes nested dependent translator types as type arguments to the "container" translators:
::djinni::c_api::OptionalPtrTranslator<::djinni::c_api::InterfaceTranslator<::testsuite::ClientInterface>>::toCpp(i)

@rFlex
Copy link
Author

rFlex commented Aug 21, 2025

Example of promise created from C, passed to C++ as future, C provides an int, C++ receives the value as int, convert to string, C retrieves the value back as string:

TEST(DjinniCAPI, supportsFutureCreatedFromC) {

Exception of exception handling:
TEST(DjinniCAPI, canCatchExceptions) {

C code can call djinni_push_exception_handler() to put an exception handler on top of the exception handler stack.

@li-feng-sc
Copy link
Contributor

lgtm. happy to merge once all the ci failures are fixed (i think right now it's still complaining about some stale codegen)

@rFlex
Copy link
Author

rFlex commented Aug 22, 2025

@li-feng-sc fixed it, I updated Bazel to a more a recent version in the PR (djinni was using an outdated and deprecated version of Bazel) and had to explicitly disable bzlmod to prevent additional files to be generated by Bazel.

@li-feng-sc li-feng-sc merged commit 3c6267a into Snapchat:experimental-record Aug 26, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants