Discussion: musli as buffa's text-format framework #100

@iainmcgin

Came up while planning YAML support (a protoyaml-go equivalent for buffa). Recording the analysis here so the option can be re-evaluated as both buffa and musli mature. This is not a near-term plan — the conclusion is "not now" — but the architectural fit is strong enough that it deserves a written case rather than a Slack thread.

Summary

buffa's three text-format paths today use three different mechanisms:

| Format | Mechanism | Lives in |
| --- | --- | --- |
| protobuf-canonical JSON | serde::{Serialize, Deserialize} derives + with-modules | buffa::json_helpers, buffa-types/*_ext.rs, codegen #[serde(...)] attributes |
| textproto | hand-rolled TextFormat trait | buffa::text |
| YAML (proposed) | reuse the JSON serde derives behind a YAML carrier | buffa-yaml extension crate |

musli is a serialization framework in the same vein as serde, by udoprog. It has two design properties that map directly onto problems buffa has worked around:

  1. Mode-parameterized encode/decode. The same type can have different encoding rules per "mode" — selected by a type parameter, not ambient state.
  2. First-class Context. Diagnostics and configuration are passed in, not smuggled through thread-locals or globals.

What musli's mode system fixes

The thread-local / global wart

serde's Deserialize trait has no context parameter. To vary parsing behaviour at runtime — e.g. ignore_unknown_enum_values, or accepting YAML-canonical float specials only when parsing YAML — buffa uses ambient state (buffa::json::JsonParseOptions):

  • std: a thread-local, scoped via with_json_parse_options(&opts, || ...).
  • no_std: a leaked Box behind an AtomicPtr, set once for process lifetime via set_global_json_parse_options, with a debug_assert! if options change after first set.

This works but is a recurring source of friction: the two APIs are mutually exclusive, the no_std variant has set-once semantics that surprise, and the container-filtering behaviour (repeated_enum/map_enum skip-unknown) is unavailable under no_std because it needs the scoped override.
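For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a scoped thread-local override. It is not buffa's code: ParseOptions, with_parse_options and parse_enum are hypothetical stand-ins for JsonParseOptions, with_json_parse_options and a context-free Deserialize impl.

```rust
use std::cell::Cell;

// Hypothetical stand-in for buffa::json::JsonParseOptions.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct ParseOptions {
    ignore_unknown_enum_values: bool,
}

thread_local! {
    // Ambient state: the options currently in effect on this thread.
    static CURRENT: Cell<ParseOptions> = Cell::new(ParseOptions::default());
}

// Puts the previous options back when dropped, including during unwinding.
struct Restore(ParseOptions);

impl Drop for Restore {
    fn drop(&mut self) {
        CURRENT.with(|c| c.set(self.0));
    }
}

// Runs `f` with `opts` installed, then restores whatever was set before.
fn with_parse_options<R>(opts: ParseOptions, f: impl FnOnce() -> R) -> R {
    let previous = CURRENT.with(|c| c.replace(opts));
    let _restore = Restore(previous);
    f()
}

// A parse function with no context parameter (like serde's Deserialize):
// the only way it can see the options is by reading the thread-local.
fn parse_enum(name: &str) -> Result<Option<i32>, String> {
    match name {
        "RED" => Ok(Some(0)),
        "GREEN" => Ok(Some(1)),
        other => {
            if CURRENT.with(|c| c.get()).ignore_unknown_enum_values {
                Ok(None) // unknown value silently skipped
            } else {
                Err(format!("unknown enum value: {other}"))
            }
        }
    }
}
```

Calling parse_enum("BLUE") fails by default and yields Ok(None) inside with_parse_options(ParseOptions { ignore_unknown_enum_values: true }, ...). Nothing in parse_enum's signature reveals that dependency, which is the wart described above.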

musli's Decode trait is parameterized over a mode type:

pub trait Decode<'de, M, A>: Sized
where
    A: Allocator,
{
    fn decode<D>(decoder: D) -> Result<Self, D::Error>
    where
        D: Decoder<'de>;
}

Format variations become distinct mode types resolved at compile time:

struct ProtoJson;   // strict protojson semantics
struct ProtoYaml;   // protoyaml-go-style lenience

impl<'de, A: Allocator> Decode<'de, ProtoJson, A> for Duration {
    fn decode<D: Decoder<'de>>(d: D) -> Result<Self, D::Error> {
        // strict: only "1.5s"
    }
}
impl<'de, A: Allocator> Decode<'de, ProtoYaml, A> for Duration {
    fn decode<D: Decoder<'de>>(d: D) -> Result<Self, D::Error> {
        // also "1m30.5s"
    }
}

Compiler-checked, monomorphized, no ambient state, no std/no_std split. The shared parts (most of the implementation) factor into a private helper that both modes call; only the divergent parsing arms differ.
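The shape of that factoring can be shown without musli at all. The sketch below is a simplified model, not musli's API: DecodeText and decode_as are hypothetical, the decoder is replaced by a plain &str, and only non-negative durations are handled.

```rust
// Marker types playing the role of musli modes.
struct ProtoJson; // strict protojson semantics
struct ProtoYaml; // protoyaml-go-style lenience

#[derive(Debug, PartialEq)]
struct Duration {
    seconds: i64,
    nanos: i32,
}

// Simplified stand-in for a mode-parameterized Decode trait.
trait DecodeText<M>: Sized {
    fn decode_text(input: &str) -> Result<Self, String>;
}

// Shared helper: parses "<digits>[.<1-9 digits>]" into (seconds, nanos).
fn parse_seconds(text: &str) -> Result<(i64, i32), String> {
    let (whole, frac) = text.split_once('.').unwrap_or((text, "0"));
    let digits = |s: &str| !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
    if !digits(whole) || !digits(frac) || frac.len() > 9 {
        return Err(format!("invalid seconds value {text:?}"));
    }
    let seconds = whole.parse::<i64>().map_err(|e| e.to_string())?;
    // Right-pad the fraction to nine digits to obtain nanoseconds.
    let nanos = format!("{frac:0<9}").parse::<i32>().map_err(|e| e.to_string())?;
    Ok((seconds, nanos))
}

fn strip_unit(input: &str) -> Result<&str, String> {
    input
        .strip_suffix('s')
        .ok_or_else(|| format!("expected trailing 's' in {input:?}"))
}

impl DecodeText<ProtoJson> for Duration {
    // Strict: only "<seconds>s", e.g. "1.5s".
    fn decode_text(input: &str) -> Result<Self, String> {
        let (seconds, nanos) = parse_seconds(strip_unit(input)?)?;
        Ok(Duration { seconds, nanos })
    }
}

impl DecodeText<ProtoYaml> for Duration {
    // Lenient: also accepts a leading "<minutes>m", e.g. "1m30.5s".
    fn decode_text(input: &str) -> Result<Self, String> {
        let (minutes, rest) = match input.split_once('m') {
            Some((m, rest)) => (m.parse::<i64>().map_err(|e| e.to_string())?, rest),
            None => (0, input),
        };
        let (seconds, nanos) = parse_seconds(strip_unit(rest)?)?;
        Ok(Duration { seconds: minutes * 60 + seconds, nanos })
    }
}

// The mode is chosen by the caller as a type argument; no ambient state.
fn decode_as<M, T: DecodeText<M>>(input: &str) -> Result<T, String> {
    T::decode_text(input)
}
```

With this, decode_as::<ProtoJson, Duration>("1m30.5s") is an error while decode_as::<ProtoYaml, Duration>("1m30.5s") yields 90 seconds and 500,000,000 nanos. Both impls share parse_seconds and differ only in the prefix handling.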

The diagnostics gap

The value proposition of protoyaml-go over plain yaml.Unmarshal is its error reporting: every error carries file:line:col, the offending source line, and a ^ pointer. To do this it hand-walks the YAML AST tracking node spans alongside protoreflect field descriptors.
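To make the target concrete, the following sketch renders that kind of report from a byte offset into the source. It is an illustration only, not protoyaml-go's or buffa's implementation; render_diagnostic and its exact output layout are assumptions.

```rust
// Renders "file:line:col: message", the offending source line, and a caret.
// `offset` is a byte offset into `source`; columns are counted in characters.
fn render_diagnostic(file: &str, source: &str, offset: usize, message: &str) -> String {
    // Clamp to the input and back up to a char boundary so slicing cannot panic.
    let mut offset = offset.min(source.len());
    while !source.is_char_boundary(offset) {
        offset -= 1;
    }
    // Bounds of the line that contains `offset`.
    let line_start = source[..offset].rfind('\n').map_or(0, |i| i + 1);
    let line_end = source[offset..].find('\n').map_or(source.len(), |i| offset + i);
    // 1-based line and column numbers.
    let line_no = source[..offset].matches('\n').count() + 1;
    let col = source[line_start..offset].chars().count() + 1;
    let text = &source[line_start..line_end];
    format!(
        "{file}:{line_no}:{col}: {message}\n{text}\n{}^",
        " ".repeat(col - 1)
    )
}
```

The hard part in practice is not this rendering step but getting a trustworthy offset for every error, which is why the span has to be threaded through the decoder.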

serde has no built-in path or span tracking. The serde_path_to_error crate bolts it on by wrapping the deserializer, but only path-level (no spans), and it cannot reach inside with-module or custom-Visitor code.

musli's Context trait carries this natively:

pub trait Context: Copy {
    type Mark;
    fn mark(self) -> Self::Mark;
    fn message_at<M>(self, mark: &Self::Mark, message: M) -> Self::Error;
    fn enter_struct(self, type_name: &'static str);
    fn enter_named_field<F>(self, type_name: &'static str, field: F);
    fn enter_variant<V>(self, type_name: &'static str, tag: V);
    fn enter_map_key<K>(self, field: K);
    fn enter_sequence_index(self, index: usize);
    // ...
}

The derive macro emits enter_* calls automatically, and a tracing Context implementation collects them. A YAML format implementation that maps Mark to byte offsets gets protoyaml-go-equivalent diagnostics — line/col plus field path — without any per-message hand-written code.
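A toy model of such a tracing context is sketched below. It does not use musli: Tracing, Step, leave and decode_hops are hypothetical names, and the enter/leave calls that a derive macro would emit are written by hand.

```rust
use std::cell::RefCell;
use std::fmt::Write;

// One element of the path from the document root to the current value.
enum Step {
    Field(&'static str),
    Index(usize),
    Key(String),
}

// Collects the path through interior mutability, so it can be shared by
// reference across nested decode calls.
#[derive(Default)]
struct Tracing {
    path: RefCell<Vec<Step>>,
}

impl Tracing {
    fn enter_named_field(&self, name: &'static str) {
        self.path.borrow_mut().push(Step::Field(name));
    }
    fn enter_sequence_index(&self, index: usize) {
        self.path.borrow_mut().push(Step::Index(index));
    }
    fn enter_map_key(&self, key: &str) {
        self.path.borrow_mut().push(Step::Key(key.to_owned()));
    }
    fn leave(&self) {
        self.path.borrow_mut().pop();
    }
    // Builds an error message prefixed with the current field path.
    fn message(&self, msg: &str) -> String {
        let mut out = String::new();
        for step in self.path.borrow().iter() {
            match step {
                Step::Field(name) => {
                    if !out.is_empty() {
                        out.push('.');
                    }
                    out.push_str(name);
                }
                Step::Index(i) => {
                    let _ = write!(out, "[{i}]");
                }
                Step::Key(k) => {
                    let _ = write!(out, "[{k:?}]");
                }
            }
        }
        format!("{out}: {msg}")
    }
}

// Hand-written equivalent of what a derive might emit for `hops: Vec<u8>`.
fn decode_hops(cx: &Tracing, raw: &[&str]) -> Result<Vec<u8>, String> {
    cx.enter_named_field("hops");
    let mut out = Vec::new();
    for (i, item) in raw.iter().enumerate() {
        cx.enter_sequence_index(i);
        let value = item.parse::<u8>().map_err(|e| cx.message(&e.to_string()))?;
        out.push(value);
        cx.leave();
    }
    cx.leave();
    Ok(out)
}
```

Decoding ["1", "x"] produces an error whose message starts with "hops[1]: ". The leaf parser never mentions the path itself; combining this with a span mark is what would give line/col plus field path.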

Unifying textproto

buffa::text is a hand-written TextFormat trait because serde's data model is a poor fit for textproto (extension keys, [type.googleapis.com/...] Any syntax, field-name/field-number addressing). musli's data model is intentionally "does not speak Rust" — it is closer to a protocol-shaped abstract machine — which makes it a more natural fit for textproto's quirks. A mode::TextProto under the same framework would replace the bespoke trait and benefit from the same Context diagnostics.

What it costs

There is no musli-yaml

musli ships musli::storage, musli::wire, musli::descriptive, musli::json, musli::value, and musli-serde (a serde→musli bridge). There is no YAML format, first- or third-party (verified against crates.io as of 2026-05). Adopting musli for YAML means writing a YAML encoder/decoder from scratch, or wrapping an existing parser (e.g. saphyr) the way serde-saphyr does. That is a project in itself, not an integration step.

Migration surface

Replacing serde with musli for buffa's text formats touches every text-format-aware part of the codebase:

| Component | Lines (approx.) | Change |
| --- | --- | --- |
| buffa/src/json_helpers.rs | 1773 | Rewrite with-modules as per-mode Encode/Decode impls |
| buffa-types/src/*_ext.rs | ~400 (serde portions) | Rewrite hand-written Serialize/Deserialize for WKTs |
| buffa-codegen attribute emission | spread across message.rs, oneof.rs | Replace #[serde(rename, alias, with, flatten, skip)] with #[musli(...)] equivalents |
| buffa-codegen custom Deserialize | generate_custom_deserialize | Reimplement against musli's MapDecoder |
| Conformance | n/a | Re-validate the full JSON suite |
| New musli-yaml format crate | ~thousands | Write |

Downstream API contract

Generated message structs currently derive serde::Serialize and serde::Deserialize. Downstream consumers depend on this for things buffa has nothing to do with — axum extractors, sqlx::types::Json, config crates, anything that takes T: Serialize. Removing the serde derives is a hard breaking change for those consumers.

Two non-options:

  • Coexist. #[derive(serde::Serialize, serde::Deserialize, musli::Encode, musli::Decode)] is legal, but then the protobuf-special encode/decode logic (camelCase mapping, int64-as-string, base64 bytes, WKT special forms, oneof flattening) exists twice. That is duplication, not migration. The maintenance burden of keeping two parallel implementations conformant is worse than the wart it's replacing.
  • Bridge. musli-serde lets a musli format drive serde traits, but only in the JSON-out direction: a musli encoder serializes a serde-only type. It does not give serde-only consumers access to a musli-defined format, so it does not shrink the migration surface.

The realistic shape is a major version bump where buffa's text-format API changes from "your type is serde::Serialize" to "use buffa::json::to_string(&msg)". Downstream serde integrations would need a thin adapter.

no_std and allocator alignment

musli's Decode<'de, M, A> carries an Allocator parameter. buffa uses extern crate alloc for no_std heap types. These are compatible in principle but differ in idiom — buffa's types own their Vec/String allocations directly; musli's allocator parameter is for the framework's own internal buffers. The fit needs validation, particularly for the decode_view zero-copy path.

musli's stability trajectory

The bare version 0.0.149 is misleading. The author uses a single 0.0.x series for the umbrella musli crate, but the cadence and the recent split of musli-core tell a more nuanced story.

Release timeline for the musli crate:

  • 2024-03 to 2024-04: 14 releases (0.0.107 → 0.0.122). Heavy churn: GAT migration, Context introduction, allocator rework.
  • 2024-05 to 2025-04: 9 releases over a year (0.0.122 → 0.0.131). Slowing.
  • 2025-08 to 2025-09: 18 releases in five weeks (0.0.132 → 0.0.149). Cleanup burst. Notable changes: unit → empty rename (in musli::value), System → Global allocator rename, #[non_exhaustive] additions, Value internals encapsulation, derive-macro restructuring ("remove type-specific attributes"), #[musli::trait_defaults] requirement for manual Decoder impls.
  • 2025-09-14 to 2026-05: zero musli releases. Eight months of stability.

musli-core (the crate housing Encode, Decode, Encoder, Decoder, Context, and the mode machinery) was split out and given its own version series at 0.1.0 on 2025-09-10. It went 0.1.0 → 0.1.4 in eight days, then nothing. This is a deliberate stabilization signal: the trait surface is now versioned independently from the umbrella crate, and the author is implicitly committing to a 0.1.x compatibility window for it.

Diffing musli-core/src/de/decoder.rs and en/encoder.rs between 0.0.131 (2025-04) and 0.0.149 (2025-09) shows the changes are dominated by rustfmt re-indentation; the only functional change to the Decoder trait itself in that window is the #[musli::trait_defaults] requirement for manual implementors. The breaking changes are concentrated in musli::value (the buffered tree type), the allocator naming, and the derive macro.

Verdict: the trait surface buffa would build on (Encode/Decode/Encoder/Decoder/Context in musli-core) has been stable since September 2025 and is now versioned independently. The derive macro syntax churned more recently. Pinning musli-core and watching its changelog is viable. The risk is real but smaller than the version number suggests.

Recommendation

Not now. The case against is decisive in the near term:

  1. The serde derives on generated types are part of buffa's public contract. Removing them breaks downstream consumers in ways unrelated to text formats.
  2. The migration surface (json_helpers.rs, *_ext.rs, codegen attribute emission, custom-Deserialize codegen, conformance) is measured in thousands of lines.
  3. There is no musli-yaml, so this does not unblock YAML support — it just changes what's underneath the YAML support that still has to be written.
  4. musli-core is at 0.1.x. It is settling, but buffa publishing crates.io artifacts against a 0.x framework dependency means breaking bumps in musli are breaking bumps for buffa consumers.

Revisit when:

  • buffa is contemplating a 1.0 / 2.0 API break for unrelated reasons. The serde-derive removal cost is amortized across whatever else breaks.
  • musli-core reaches 1.0, or has a multi-year 0.1.x/0.2.x track record.
  • A musli-yaml exists, or buffa wants to invest in writing one (in which case writing it as a musli format is no more work than writing it as a serde format).
  • The JsonParseOptions thread-local / global design is causing measurable pain (today it is awkward but functional).

What to do instead, now

Build buffa-yaml on serde_norway, reusing the existing serde derives. The JsonParseOptions mechanism can grow a gated option (accept_yaml_float_literals) if a future carrier needs it. With serde_norway it is not needed: its scalar resolver delivers .inf / .nan to visit_f64 rather than as canonical strings, so buffa's existing float/double helpers already accept them.
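The reasoning about float specials can be modelled in a few lines. This is a sketch of the idea rather than buffa's helper: Scalar and decode_double are hypothetical, with Scalar::Float standing for a visit_f64 call and Scalar::Str for a visit_str call.

```rust
// What a carrier hands to the float helper: a resolved number or a string.
enum Scalar<'a> {
    Float(f64),
    Str(&'a str),
}

// Hypothetical stand-in for a protojson double helper.
fn decode_double(scalar: Scalar<'_>) -> Result<f64, String> {
    match scalar {
        // A carrier that resolves `.inf` / `.nan` itself arrives here, so the
        // helper needs no YAML-specific knowledge.
        Scalar::Float(v) => Ok(v),
        // protojson's canonical string spellings for the special values.
        Scalar::Str("NaN") => Ok(f64::NAN),
        Scalar::Str("Infinity") => Ok(f64::INFINITY),
        Scalar::Str("-Infinity") => Ok(f64::NEG_INFINITY),
        // Quoted ordinary numbers. The character check matters because
        // Rust's f64 parser would otherwise accept spellings such as "inf".
        Scalar::Str(other)
            if other
                .bytes()
                .all(|b| b.is_ascii_digit() || matches!(b, b'+' | b'-' | b'.' | b'e' | b'E')) =>
        {
            other
                .parse::<f64>()
                .map_err(|e| format!("invalid double {other:?}: {e}"))
        }
        Scalar::Str(other) => Err(format!("invalid double {other:?}")),
    }
}
```

Under this model a YAML-style literal only becomes a problem if the carrier passes it through as a string: decode_double(Scalar::Str(".inf")) is rejected, while decode_double(Scalar::Float(f64::INFINITY)) succeeds. That string case is the one a gated accept_yaml_float_literals option would cover.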
