feat: opt-in non-Candid wire format via encoder / decoder parenthetical #5996
Draft
Adds syntax `(with encoder = <func>) public func name() : async T = ...` to annotate the serialization encoder for an actor method's return value.
- `vis'` type gains `exp option` (the annotation) alongside the deprecation string; moved into the `exp` mutual recursion group to resolve the forward reference
- Parser: `vis → parenthetical PUBLIC` with `%prec VIS_NO_PAREN` to resolve the 3 shift/reduce conflicts from the nullable vis prefix; `--strict` preserved
- Type checker: `check_vis_parenthetical` validates the encoder in checking mode against the expected type `ret_typ → Blob` (async peeled via `Promises`/`ts2`); non-method placements warn with M0212; effects forbidden (M0215)
- Test: `test/run-drun/parenthetical-public.mo`
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…freedom
IR representation:
- `FuncE` gains an 8th `exp option` field for the encoder (None everywhere except public actor methods with a `(with encoder = …)` annotation)
- All IR passes, rename, subst_var, arrange, freevars, interpreter, and both codegens updated: pass-through or wildcard `_enc` as appropriate
Desugaring:
- `build_encoders` mirrors `build_stabs` to correctly pair each IR dec with its optional encoder across IncludeD/TypD expansion
- In `build_actor` the encoder expression is extracted from the vis parenthetical's `encoder` field and injected into the IR `FuncE` slot
Checking:
- IR type-checker (`check_ir.ml`): verifies encoder type is `ret_typ → Blob` and effect is `T.Triv`
- Source type-checker (`typing.ml`): `check_vis_parenthetical` additionally guards `note_eff = T.Triv`, emitting M0215 if the annotation has effects
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Refactors `ICReplyPrim` to carry an `exp option` encoder slot instead of using a mutable `reply_encoder` field on the compilation environment.
The encoder expression (user-supplied `T -> Blob` function) is injected by `desugar.ml`'s `build_actor` into the `FuncE` IR node, propagated through all IR passes (`erase_typ_field`, `show`, `eq`, `await`, `async`), and consumed by `async.ml`'s CPS transform to build the reply continuation `k` as `ICReplyPrim (ts, Some enc')` instead of the default Candid path.
Both classical and enhanced backends call the encoder closure and send the raw blob bytes via `IC.reply_with_data`, bypassing Candid serialization. `arrange_ir` and `check_ir` are updated to print/validate the encoder.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- `check_vis_parenthetical` rewritten to do per-field bidirectional
checking against a known-fields table. `encoder` keeps its type
`ret_typ -> Blob`; `decode` is added with the symmetric type
`Blob -> arg_typ` (the method's ingress type). No desugaring yet —
the field is plumbed through typing only, future work will tighten
semantics or invert the direction (codec types driving method sig).
- Effect-free check is now per-field and names the offending label
in M0215, replacing the encoder-only message.
- Positive test (`parenthetical-decode.mo`): a `Blob -> ?Nat` flow
pipeline composing `decodeUtf8` (Blob -> ?Text) with `Nat.fromText`
(Text -> ?Nat) via `do ?` ; method returns Nat over standard Candid.
- Fail tests (`parenthetical-{encoder,decode}-effect.mo`): a parenthetical
field with embedded async block triggers M0215.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
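The per-field check described in this commit can be modelled roughly as follows. This is a language-neutral sketch in Python, not the OCaml in `typing.ml`: types are plain strings, `Func`, `expected_codec_types` and `check_parenthetical` are hypothetical names, and the wording of the diagnostics is invented for illustration (only the codes M0212 and M0215 come from the commit messages).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:
    """A function type `dom -> cod`; types are modelled as plain strings."""
    dom: str
    cod: str


def expected_codec_types(arg_typ: str, ret_typ: str) -> dict:
    """Known-fields table: field label -> type expected for that field."""
    return {
        "encoder": Func(ret_typ, "Blob"),  # ret_typ -> Blob
        "decode": Func("Blob", arg_typ),   # Blob -> arg_typ (label as of this commit)
    }


def check_parenthetical(fields: dict, arg_typ: str, ret_typ: str) -> list:
    """Check each field of a `(with ...)` parenthetical independently.

    `fields` maps a label to `(type, has_effect)`. Returns a list of
    diagnostic strings; an empty list means the parenthetical is accepted.
    """
    table = expected_codec_types(arg_typ, ret_typ)
    diags = []
    for label, (typ, has_effect) in fields.items():
        if label not in table:
            diags.append(f"M0212: unrecognised field `{label}`")
            continue
        if typ != table[label]:
            diags.append(f"type mismatch on `{label}`")
        if has_effect:
            # The effect check is per field and names the offending label.
            diags.append(f"M0215: field `{label}` must be effect-free")
    return diags
```

For a method with ingress type `?Nat` and return type `Nat`, a `decode` field of type `Blob -> ?Nat` passes, while the same field containing an async block would be reported under M0215 with its label.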
Rejects a `Blob -> ?Text` decoder when the method's ingress is `?Nat` — bidirectional check pushes the expected return into the FuncE body and catches the mismatch at M0095 (finer than the field-level M0214). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror of parenthetical-decode-mismatch on the encoder side: a `Nat -> Text` encoder for a method returning `Nat` is rejected at M0095, pointed precisely at the `"oops"` literal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The AST interpreter never visits `encoder`/`decode` payloads on `Public(_, Some par)`: it models the high-level `[run]` semantics where Candid serialization isn't modelled, so any wire-byte transform is moot. Comment surfaces this so reviewers don't wonder why the parenthetical is invisible here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the ad-hoc `exp option (* encoder *)` trailing field on
`FuncE` with a labeled record `{ encoder; decoder }`. The encoder
behaviour is unchanged; `decoder` is always `None` until desugaring
lands. Pure rename/repackage: no behaviour change, all tests pass.
Threading sites updated:
- ir.ml: type def + record + doc note (incl. actor-level inheritance
TODO)
- construct.ml: `no_codecs` helper, three FuncE constructors
- check_ir.ml: type-check the decoder field on the same footing as
encoder (Blob -> seq ts1, effect-free)
- arrange_ir.ml: print decoder alongside encoder
- desugar.ml: build_actor still installs encoder; decoder stays None
- async.ml, await.ml, erase_typ_field.ml: thread codecs through
- rename/subst_var/tailcall/freevars/const/eq/show/interpret_ir/
compile_classical/compile_enhanced: pass-through pattern bindings
silently retyped from `exp option` to `codecs`, no source change
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the encoder/decoder parenthetical work — current state, pending desugaring + codegen, actor-level inheritance idea, OpenAPI/ Web2 motivation as the strategic driver. Replaces an in-code TODO in ir.ml with a pointer to the plan. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Desugaring (`desugar.ml`):
- Factor `find_codec_in_par lab` from the existing `find_encoder_in_par`.
- New `find_decode_in_par` mirror.
- `build_codecs` (replacing `build_encoders`) returns `(enc_opt,
dec_opt) list`, one per dec-field.
- `build_actor` now stamps both fields onto the FuncE's `codecs`
record, preserving any pre-existing per-field override.
Codegen (`compile_classical.ml` & `compile_enhanced.ml`):
- New `?(decoder=None)` optional argument threaded through
`FuncDec.{lit,closed,compile_const_message}`. Type:
`(E.t -> VarEnv.t -> G.t) option`, a thunk that compiles the
decoder `exp` with the inner env/ae of the message handler.
Caller (in `compile_exp` / `compile_const_exp`) wraps
`compile_exp_vanilla` so the closure is generated where the
function is actually in scope.
- Inside `compile_const_message`, branch on `decoder` at the
argument-decoding step: `None` keeps `Serialization.deserialize`;
`Some compile_dec` emits `compile_dec; closure-call` on the raw
`IC.arg_data` instead.
Test (`parenthetical-decode.mo` & `.ok` regen):
- Method ingress is `?Nat`; decoder is `Blob -> ?Nat` built as
`do ? { Nat.fromText((decodeUtf8 b)!)! }`.
- //CALL payload is the raw three ASCII bytes "123" (0x313233),
*not* a Candid envelope. The decoder turns it into `?123`; the
method echoes `123` back via standard Candid (`0x...017d7b`).
- Demonstrates that ingress Candid is genuinely bypassed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
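The ingress branch and the test payload above can be traced with a small model. This is a Python sketch, not the compiler's codegen: `handle_ingress`, `decode_utf8`, `nat_from_text` and `candid_nat` are hypothetical stand-ins (the Motoko `Nat.fromText` may accept inputs this stand-in rejects), and `candid_nat` only covers the single-`Nat` reply used by the test.

```python
from typing import Callable, Optional


def decode_utf8(b: bytes) -> Optional[str]:
    """Stand-in for `decodeUtf8 : Blob -> ?Text` (None models `null`)."""
    try:
        return b.decode("utf-8")
    except UnicodeDecodeError:
        return None


def nat_from_text(t: str) -> Optional[int]:
    """Stand-in for `Nat.fromText`; assumes plain ASCII digits only."""
    if t != "" and all(c in "0123456789" for c in t):
        return int(t)
    return None


def decoder(b: bytes) -> Optional[int]:
    """Models `do ? { Nat.fromText((decodeUtf8 b)!)! }`: any null short-circuits."""
    t = decode_utf8(b)
    return None if t is None else nat_from_text(t)


def handle_ingress(arg_data: bytes,
                   decoder: Optional[Callable] = None,
                   candid_deserialize: Optional[Callable] = None):
    """Branch at the argument-decoding step: None keeps the Candid path."""
    if decoder is None:
        return candid_deserialize(arg_data)
    return decoder(arg_data)  # raw bytes go straight to the user decoder


def leb128(n: int) -> bytes:
    """Unsigned LEB128 encoding of a non-negative integer."""
    out = bytearray()
    while True:
        byte, n = n & 0x7F, n >> 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def candid_nat(n: int) -> bytes:
    """Candid message carrying one `nat`: magic, empty type table, 1 arg, type 0x7d, value."""
    return b"DIDL" + b"\x00" + b"\x01" + b"\x7d" + leb128(n)
```

With the raw payload `0x313233`, `handle_ingress` yields `123` through the decoder, and `candid_nat(123)` reproduces the reply bytes ending in `017d7b` that the test expects.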
The codec-annotation commit (f9b629c) widened `Public`'s payload from `None` to `(None, None)`, taking `extract.ml:84` over the line-length limit. Splitting the trailing `@@ d.at)` onto its own line restores ocamlformat conformance and unblocks CI.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Methods that opt out of Candid via `encoder`/`decode` shouldn't appear in the Candid interface surfaces:
- the `__get_candid_interface_tmpl_v1` canister metadata blob the IC serves to Candid-aware tooling, and
- the `.did` file produced by `moc --idl`.
Candid-only clients (other canisters, `dfx call`, `didc`) would otherwise Candid-encode arguments and Candid-decode replies that the canister never produces.
Cleanest first cut is to suppress the whole method when either codec is set; partial entries (e.g. decoder-only methods keeping their Candid-shaped reply in the dictionary) can come later.
The `Type.t` for an actor is currently codec-blind, so we'll either need a side-table mapping method-name → codec-presence, or to filter at the IR level before type extraction. Side-table keeps `Type.t` clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two pending items "decoder desugaring" and "codegen hook" both landed in 07b02b8. Section heading updated, content rewritten to describe what shipped (incl. the thunk parameter through FuncDec and the `0x313233 -> Nat 123` end-to-end test). Removed the two items from "Pending work" and renumbered the rest.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symmetry with `encoder`. Touch list:
- typing.ml: known-fields table flips the label string.
- desugar.ml: `find_decode_in_par` → `find_decoder_in_par`, `find_codec_in_par "decoder"` lookup.
- Three test files renamed (`parenthetical-decode*.mo` → `-decoder*`); .ok files regenerated via `accept` to pick up the new field name in the rendered diagnostics and the new test-name prefix in the location prefix.
- Plan doc rewritten throughout to use `decoder`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d-trip
First step toward a benchmark that decodes a real Apple Event Object
Specifier (a non-Candid wire format) on ingress, identity-passes it
through the actor body, and re-encodes it on egress — counting cycles
and heap delta on each codec leg.
This commit lays the foundation:
- Adds the `query.md` design plan (AEOM-inspired heap querying) to
the tracked plans, since the bench architecture follows it.
- Scaffolds `test/bench/object-spec.mo` with the full type surface
from that plan — `ObjectSpec`, `KeyForm`, `BoolExpr`, `Comparison`,
`CandidValue` — and a builder for the running example query
("every client's yearly income whose country is Germany and age
between 45 and 55 years").
- Codec stubs (`encode`, `decode`) plus a harness that already wires
up `payload_bytes` / `decode_{heap,cycles}` / `encode_{heap,cycles}`
reporting, so the schema is stable as the codec fills in.
The roadmap (sketched in the file's header comment): AE binary samples
generated via macOS `osarun`/`osacompile` → Motoko AE decoder → matching
encoder → wired into a `(with decoder = …; encoder = …)` parenthetical
on a public actor method whose body is the identity. Each step is a
separate commit and the harness already reports the right keys.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two changes from review:
1. Rename `encode` → `encoder`, `decode` → `decoder` so the function names match the parenthetical-field names they'll eventually be wired into (`(with encoder = …; decoder = …)`).
2. Move the cycle/heap measurement *inside* `encoder` and `decoder` themselves, parameterised by a `stage` label. Each call now self-reports, including the previously-untimed pre-encode that builds the wire fixture. When v4 lands a real encoder, all three legs (pre / ingress / egress) will yield separate cost lines under `encoder/pre`, `decoder/ingress`, `encoder/egress`.
`go` now does no measurement of its own; it just sequences the three calls. Output schema gains a `stage` key so a parser can attribute costs unambiguously.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an `appscript-src` flake input pointing at hhas/appscript on GitHub and a `nix/ae-encoder.nix` derivation that builds the Python `appscript` package from its `py-appscript/` subdirectory and exposes a small CLI script via `pkgs.writeShellApplication`.
The intent is to provide a reproducible AE compact-binary fixture generator for `test/bench/object-spec.mo` without checking any Python code into the motoko source tree: the harness body lives as a string literal in `nix/ae-encoder.nix`, not as a file in the repo.
The derivation is darwin-only: `appscript` links against `AEvent.framework` and uses PyObjC, neither of which has working Linux builds. `flake.nix` exposes `packages.<system>.ae-encoder` only on `aarch64-darwin` / `x86_64-darwin` via `lib.optionalAttrs`; `nix flake show` on Linux silently omits the attribute.
Build fix-ups for upstream's setuptools incompatibility:
- `lib/appscript/__init__.py` declares `__version__ = 'dev'` which modern setuptools (PEP 440) rejects; `postPatch` substitutes a conventional placeholder version `1.3.0`.
- `doCheck = false` since upstream ships no usable Python test suite.
v1 harness is a smoke-test only: `nix run .#ae-encoder` confirms the appscript/aem stack imports cleanly. Subsequent commits grow the harness into a real fixture generator that takes a query name and prints AE-binary hex for the bench to embed as `Blob` literals.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the v1 smoke-test stub with a real fixture generator. The
harness now constructs the German-clients query
every client's yearly income whose country is Germany and 45 <= age <= 55
via aem's reference builder (`app.elements('clnt').byfilter(...)`),
packs it through `AEM_packself`, and prints the flattened compact-
binary form as `<name>=<hex>` on stdout. The hex output matches the
spec verbatim — `obj `, `want`, `clnt`, `cmpd`, `logi`, `AND `,
`>= `, `Germany` (UTF-16), all visible in the bytes.
Catalogue is a single entry today (`german_midlife_client_income`);
new queries register by adding a builder function and a `QUERIES`
entry — no other surface area to touch.
Subsequent commits will:
- pipe the hex output into the bench's `Blob` literals (manually for
now; possibly via a generated `.mo` file later);
- write the Motoko AE decoder so the bench actually times the
ingress codec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`check_vis_parenthetical` was being invoked with the *outer* env at
[infer_obj][src/mo_frontend/typing.ml#L3805], not with the env
enriched by the actor's `scope`. As a result, a parenthetical like
(with encoder; decoder)
public func go(spec : ObjectSpec) : async ObjectSpec { spec }
failed with `M0057 unbound variable encoder` whenever `encoder` and
`decoder` were sibling actor-field bindings rather than module-level
or imported names — even though the bindings are in scope everywhere
else inside the actor body.
Fix: build a single `par_env = adjoin_vals env scope.val_env` once
before the iteration and pass it to every `check_vis_parenthetical`
call. The full actor scope is now visible to parenthetical typing,
which matches the user-facing scoping intuition (other actor-fields
*are* in scope).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`go(spec : ObjectSpec) : async ObjectSpec { spec }` is now decorated
with the punning parenthetical `(with encoder; decoder)`, pairing the
sibling actor-fields with the framework's ingress/egress hooks. The
body is the identity, so any cycles/heap reported come purely from the
codec round-trip.
The //CALL hex below is the AE compact-binary form of the German-
clients query, produced by `nix run .#ae-encoder` (see
`nix/ae-encoder.nix`). Until v3 lands real codec bodies the encoder
still returns "" and the decoder returns `#root` — the wiring is
end-to-end correct, the numbers are just trivial today.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drops walkthrough-style commentary (parenthetical-pun explainer, roadmap, decorative banners) — readers are Motoko experts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Decoder/encoder reports both fire end-to-end (cycles=279 each, stub bodies); reply is empty because encoder returns "". Numbers become meaningful once the AE codec bodies land. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Forward-only Iter<Nat8> reader for AE binary parsing — `take n` returns ?Blob (null on short read, no zero-fill), `readU32BE` reads big-endian u32. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
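The reader contract described here (null on a short read, no zero-fill, big-endian u32) can be sketched in a few lines. This is a Python model over an iterator of byte values, not the Motoko source; `Reader`, `take` and `read_u32_be` are illustrative names, with `None` standing in for Motoko's `null`.

```python
from typing import Iterator, Optional


class Reader:
    """Forward-only reader over an iterator of byte values (0..255)."""

    def __init__(self, it: Iterator[int]):
        self._it = it

    def take(self, n: int) -> Optional[bytes]:
        """Return exactly n bytes, or None on a short read (never zero-fills)."""
        out = bytearray()
        for _ in range(n):
            b = next(self._it, None)
            if b is None:
                return None  # input exhausted before n bytes were read
            out.append(b)
        return bytes(out)

    def read_u32_be(self) -> Optional[int]:
        """Read a big-endian unsigned 32-bit integer, or None on a short read."""
        raw = self.take(4)
        return None if raw is None else int.from_bytes(raw, "big")
```

A typical use is to `take(4)` a four-character code such as `obj ` and then `read_u32_be()` for the length that follows; note that a failed `take` still consumes whatever bytes were available, which matches a forward-only design but is an assumption about the original.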
parseDescBody dispatches 'obj '/'null'; parseObjBody walks the 4 record fields and recurses on 'from'. Errors trap via prim. parseValue scaffolded for utxt/long/enum/null. 117k cycles / 2.8k heap-words on the 638-byte fixture; want/form/seld bodies still consumed-and-discarded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Decoder: SELD now interprets the 4cc body as a property name when form=='prop' (test form remains a TODO; logi predicate body is skipped). Encoder: Writer class with pre-computed length, encDescLen, writeDesc, writeObjBody — emits the full 'dle2' envelope + recursive obj/null tree. 35k cycles / 1.7k heap-words to encode a 152-byte wire on the partially- decoded fixture; 'inco' property round-trips, predicate is lost. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Decoder: parseBoolExpr / parseValue / parseLogiBody / parseCmpdBody + 'exmn' (typeObjectBeingExamined) collapsed to #root. SELD interprets the 'logi' predicate when form=='test'; 'cmpd' obj1/relo/obj2 yield #compare with prop/op/value extracted from the obj1 ObjectSpec, the relo enum, and the obj2 literal (utxt/long/enum/null). Encoder still stubs #test as 'prop' form + 4-zero seld (writeBoolExpr TODO). go() now decodes the encoded wire and prints the round-trip, making the encoder loss visible. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
writeBoolExpr / writeValue / writeLogiHeader emit 'logi'/'cmpd'/'utxt'/ 'long'/'enum'/'null' descriptors; valueDescLen and boolExprDescLen feed the pre-allocated Writer. encDescLen is now seld-aware so test-form objs size correctly. Round-trip print confirms decoded == roundtrip (structurally; original 'exmn' iterand collapses to 'null' since #it isn't modeled). 638-byte wire in / 638-byte wire out. counters() now returns (Int, Nat64), matching iter.mo / alloc.mo / heap-32.mo. textToUtf16 carries an ASCII-assumption note pointing at surrogate-pair encoding for non-BMP if extended later. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
100-element flat client array (deterministic Array_tabulate) plus a hand-coded countMatchers helper. With 60% Germany / 50% age-in-range, joint hit rate is 31/100 — within the planned 30% band for the running query (country=="Germany" AND 45<=age<=55). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Smurf = typed accessor for one Client property, indexed by the AE 4cc
name the decoder produces ("cntr"/"age "/"inco"). lookupSmurf maps the
4cc → Client field reader; cmp does typed comparison over CandidValue;
evalBoolExpr recursively evaluates the BoolExpr tree.
extractPredicate digs the #test out of the running query shape; go now
runs countMatchersDecoded against the 100-client DB and reports its
own cycles/heap. Decoded-predicate matchers (31) match the hand-coded
countMatchers (31), proving the predicate round-trips through the wire
into a faithful boolean.
Limitation: Smurfs are monomorphic. Future work — when Client gets a
nested Address subobject (or any non-leaf field) — will need an
existential "this is your container, go fishing" Smurf shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
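The lookup-then-compare structure of this commit can be illustrated with a compact model. It is a Python sketch under assumptions: the tuple encoding of `BoolExpr`, the `PROP_READERS` table and the dictionary-shaped clients are all hypothetical; only the 4cc keys (`cntr`, `age `, `inco`) and the running query come from the commit messages.

```python
import operator

# 4cc property name -> typed reader over a client record (the "PropReader" role).
PROP_READERS = {
    "cntr": lambda c: c["country"],
    "age ": lambda c: c["age"],
    "inco": lambda c: c["income"],
}

# Comparison operators available to a compare node.
OPS = {
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def eval_bool_expr(expr, client) -> bool:
    """Recursively evaluate a BoolExpr tree against one client."""
    tag = expr[0]
    if tag == "and":
        return all(eval_bool_expr(e, client) for e in expr[1])
    if tag == "or":
        return any(eval_bool_expr(e, client) for e in expr[1])
    if tag == "not":
        return not eval_bool_expr(expr[1], client)
    if tag == "compare":
        _, prop, op, value = expr
        return OPS[op](PROP_READERS[prop](client), value)
    raise ValueError(f"unknown BoolExpr node: {tag!r}")


def count_matchers(expr, clients) -> int:
    """Count the clients for which the predicate holds."""
    return sum(1 for c in clients if eval_bool_expr(expr, c))


# The running query: country is Germany and 45 <= age <= 55.
QUERY = ("and", [
    ("compare", "cntr", "==", "Germany"),
    ("compare", "age ", ">=", 45),
    ("compare", "age ", "<=", 55),
])
```

Comparing `count_matchers(QUERY, clients)` with a hand-written filter over the same records is the same cross-check the bench performs between the decoded predicate and `countMatchers`.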
runQuery walks the running shape (#obj prop → #obj clnt #test → #root) and emits one CandidValue per matching client (the requested property). For the German-midlife-income query against the 100-client mock DB, that's 31 yearly incomes from $51k to $146k. Two-pass resolution since `mo:⛔` has no Buffer: count matchers, allocate [var CandidValue], fill, freeze via Array_tabulate. 369k cycles / 13k heap-words for the full query. Smurf type carries a TODO marking the zipper-like evolution needed once nested entities (Address, etc.) appear. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Frees the name 'Smurf' for the upcoming existential-via-Candid protocol (generic, blob-keyed, polymorphic). The current monomorphic accessor becomes PropReader — the typed-fast-path used by the running query while the protocol lands. Behaviour unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Smurf, Accessor, LookupKey land as type definitions. Smurf carries the existential boundary (blob: Blob; methods close over T via from_candid<T>); Accessor is the navigation hook (form, fourcc, lookUp). Mutual recursion typechecks. No implementations or wiring yet — those come in subsequent commits as constructors and concrete accessors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three protocol-level cleanups, all type-only:
- blob is now a thunk `() -> Blob`. VarAccessor-class Smurfs can return "" when no consumer pulls; eager Candid encode is paid only at boundaries that need it.
- primaryKey field gone; it was an implementation detail of toDesc. Each Smurf constructor closes over the primary-key logic locally and bakes it into toDesc directly.
- Accessor.lookUp now takes the whole parent Smurf, not just a Blob. The child closes over parent.toDesc() (the zipper edge) and pulls parent.blob() lazily when from_candid<P> needs it.
No instances yet; constructors and concrete accessors land separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The AE-404 of the Smurf protocol: every accessor returns this when lookup misses. blob is "", accessors empty, enumerate immediately exhausted, readField returns null, isNotFound=true so the encoder can special-case (eventual: emit errAENoSuchObject = -1728 envelope). Underscore-prefixed since no accessor instances exist yet. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Terminal leaf in the Smurf protocol. Reads `fieldName` from `parent` at construction (via parent.readField), stores the CandidValue, and discards the parent reference. The resulting Smurf has classFourcc="" (no class), no accessors, no enumeration; readField returns the stored value regardless of name; toDesc is a placeholder (#root) since AE ObjSpecs are references — the encoder treats leaf values via classFourcc="" and routes through the value rather than the spec. Underscore-prefixed since no instances yet. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Typed escape hatch for stable-var-backed accessors. Captures `stab : [T]` at construction and ignores `parent.blob()` — no Candid round-trip on input. `wrap : T -> Smurf` lifts each typed element back into the existential protocol; for #indexed keys it picks `stab[n-1]` (1-based, AppleScript convention) and applies wrap. Negative or out-of-bounds positions return _notFoundSmurf. #named and #test still TODO. Class instance is structurally an Accessor. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
_clientSmurf : Client -> Smurf wraps a Client as an existential Smurf.
readField maps wire 4ccs ("cntr"/"age "/"inco") to typed fields; blob
lazily Candid-encodes via to_candid; classFourcc is "clnt"; accessors
empty for now (per-property accessors land separately). toDesc still a
placeholder (#root) until VarAccessor threads parent through.
_actorSmurf is the canister root: classFourcc="", accessors hosts a
single _VarAccessor<Client>(clients, "clnt", #indexed, _clientSmurf).
The class instance fits [Accessor] by structural subtyping. toDesc=#root.
Both underscore-prefixed; runQuery still uses the typed-fast-path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Public method that exercises the existential protocol: looks up _actorSmurf.accessors[0] (the clnt VarAccessor), calls lookUp(parent, #indexed 1), and surfaces a few fields of the resulting clientSmurf via readField. Output proves the chain: VarAccessor takes clients[0], clientSmurf wraps, readField bridges 4ccs to typed fields. Got the expected (Germany, 35, 50000) for the first generated client. Demonstrates that class instances satisfy the Accessor type via structural subtyping and that readField is the existential boundary the predicate evaluator will use. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Client gains a Text `name` (the natural primary key for stable
references via #name). Names dispatch by country: German clients draw
from {Hans/Anna/Otto/Maria/Karl/Helga} × {Müller/Schmidt/Weber/Fischer/
Bauer/Hoffmann}, French from {Jean/Marie/Pierre/Anne/Michel/Claire} ×
{Martin/Bernard/Dubois/Petit/Moreau/Leroy}. First client at index 0
(German) is "Hans Müller".
readField on _clientSmurf accepts "name" → #text c.name; tiny1 surfaces
it in its debug print.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three coupled changes:
- _VarAccessor's wrap signature is now (T, Smurf) -> Smurf. lookUp
threads `parent` through to the wrap callback so the resulting child
Smurf can close over `parent.toDesc()` (the zipper edge).
- _clientSmurf accepts (Client, Smurf). Its toDesc returns
#obj { class_="clnt"; container=parent.toDesc(); key=#name (c.name) },
the AppleScript-equivalent of `client "<name>" of <root>`.
- tiny1 now returns ObjectSpec via (with encoder) and emits
`s.toDesc()`. Encoder side gains #name keyform support
(NAME = 'name', seld='utxt' with the BE UTF-16 of c.name).
The 104-byte reply is `'dle2' + 'obj ' clnt … name "Hans Müller" … 'null'`.
ü leaks via the ASCII-assumption bug in textToUtf16 — deterministic but
malformed; surrogate-pair handling stays a TODO.
Also pun spec=spec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
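For reference, a correct big-endian UTF-16 encoding, including the surrogate-pair case that the commit leaves as a TODO, can be sketched as below. This is Python rather than the bench's Motoko, and `text_to_utf16_be` is an illustrative name, not the bench's `textToUtf16`.

```python
def text_to_utf16_be(text: str) -> bytes:
    """Encode text as big-endian UTF-16, code point by code point."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # Basic Multilingual Plane: one 16-bit code unit.
            out += cp.to_bytes(2, "big")
        else:
            # Supplementary planes: split into a high/low surrogate pair.
            v = cp - 0x10000
            high = 0xD800 | (v >> 10)
            low = 0xDC00 | (v & 0x3FF)
            out += high.to_bytes(2, "big") + low.to_bytes(2, "big")
    return bytes(out)
```

Under this scheme `ü` (U+00FC) is a single code unit `00 FC`, so it needs no surrogate pair; only code points above U+FFFF do.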
Two coupled tweaks to _VarAccessor.lookUp:
- Negative indices count from the end (AE/AppleScript convention): -1 is last, -size is first; out of range → _notFoundSmurf. Math is local since the accessor knows stab.size().
- Dispatch on (form_, key) tuple so the indexed branch fires only when the accessor was declared with form_ = #indexed. A #named-declared VarAccessor receiving a #indexed key returns notFound.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tiny1(i : Int) drives the clnt accessor with #indexed i, returning the resulting clientSmurf's stable reference (or the AE-404 spec when out of range). Two //CALL lines exercise the new convention: tiny1(1) returns "Hans Müller", tiny1(-1) returns "Anne Moreau" — the 100th client via negative-from-end addressing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 6×6 pools collided every i mod 36 — 0 and 36 both mapped to
"Hans Müller" (both inside the German subset). Two coupled fixes:
- Expand each name pool to 10 entries: with firstIdx=i%10 and
lastIdx=(i/10)%10, all i in [0, 99] yield a unique (firstIdx, lastIdx)
pair. Adds Klaus/Ingrid/Werner/Ursula + Schulz/Wagner/Becker/Koch
(German) and Henri/Sophie/Paul/Camille + Roux/Vincent/Fournier/Girard
(French).
- Add an init-time assertion (`do { for-for }`, O(n²)): for each client,
count occurrences of its name across the whole array and trap if it
isn't exactly 1. Caught at canister_init so a future tweak that
reintroduces collisions screams loudly.
tiny1(-1) now resolves to "Camille Girard" (was "Anne Moreau" with the
old pools).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
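The arithmetic behind the collision and its fix can be checked directly. This is a Python sketch: `new_indices` follows the formula in the commit message, while `old_indices` is an assumed reconstruction of the 6×6 scheme (chosen so that 0 and 36 collide as described), and `assert_unique_names` only mirrors the shape of the O(n²) init-time check.

```python
def new_indices(i: int) -> tuple:
    """(firstIdx, lastIdx) with 10-entry pools, as given in the commit message."""
    return (i % 10, (i // 10) % 10)


def old_indices(i: int) -> tuple:
    """Assumed form of the previous 6x6 scheme; it repeats with period 36."""
    return (i % 6, (i // 6) % 6)


def assert_unique_names(names: list) -> None:
    """O(n^2) check: every name must occur exactly once in the whole list."""
    for name in names:
        occurrences = sum(1 for other in names if other == name)
        if occurrences != 1:
            raise AssertionError(f"duplicate client name: {name}")
```

With 100 clients, `new_indices` produces 100 distinct pairs, whereas any scheme with only 36 pairs must repeat; the assertion turns a reintroduced collision into an immediate failure instead of a silent ambiguity in `#name` lookups.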
Demonstrates tiny1(100) and tiny1(-1) resolve to the same client
("Camille Girard") via the two branches of the indexed math:
positive 100 → n=100, negative -1 → n=size-1+1=100. Same array
index 99, identical 108-byte reply, identical cycle count.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
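The two branches of the indexed math can be written out as follows. This is a Python sketch with a hypothetical `resolve_indexed` helper, where `None` stands in for `_notFoundSmurf`; treating index 0 as a miss is an assumption, since the commit messages only describe the positive and negative cases.

```python
from typing import Optional


def resolve_indexed(i: int, size: int) -> Optional[int]:
    """Map a 1-based (or negative, from-the-end) position to a 0-based array index."""
    if i > 0:
        n = i                # positive: 1 is first
    elif i < 0:
        n = size + i + 1     # negative: -1 is last, -size is first
    else:
        return None          # assumption: 0 is not a valid position
    return n - 1 if 1 <= n <= size else None
```

For a 100-element array, `resolve_indexed(100, 100)` and `resolve_indexed(-1, 100)` both give index 99, which is why the two calls return the same client.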
Snapshot of the AEOM-inspired bench against the original roadmap as of 2026-04-27. What's shipped (AE codec, Smurf protocol skeleton, _VarAccessor typed-fast-path, _clientSmurf with parent-zipper toDesc, _actorSmurf, tiny1 public method, mock DB with uniqueness assertion). What's left (full resolver wiring through the existential boundary, #named/#test forms, Accessors.mo codegen library, certified-data integration, HTTP/JSON endpoint, RTS hooks). Plus the protocol-level gaps (#named lookup, #test as filter on Smurf, leaf accessors on _clientSmurf, 'exmn' ↔ #it, real UTF-16 BE, #ne desugar). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three coupled additions to drive a two-step navigation through the
existential protocol:
- _ValueSmurf now closes over parent.toDesc() (not the parent's data),
so its toDesc emits `<#property fieldName> of <parentDesc>`. The leaf
is now a real spec node in the chain, not a placeholder.
- _VarAccessor gains #named handling: scans `stab` for the entry whose
`getName` matches the lookup key (relies on the init-time uniqueness
assertion). Constructor takes a getName : T -> Text param (ignored
for #indexed).
- _actorSmurf hosts two clnt accessors: one #indexed at accessors[0],
one #named at accessors[1].
tiny2(input : Text) drives accessors[1].lookUp(parent, #named input),
materialises `_ValueSmurf(clientS, "name")`, returns its toDesc.
Demoed with tiny2("Hans Müller") → 172-byte AE wire reply.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Per-public-method, opt-in escape hatch from Candid in both directions:
- `encoder : T -> Blob` replaces Candid serialization on reply.
- `decoder : Blob -> A` replaces Candid deserialization on ingress.

Both fields are independently optional; either, both, or neither can appear, and unrecognised fields warn (M0212). The strategic driver is direct OpenAPI/Web2 interfacing; see `.claude/plans/non-candid.md` for the design memo, including the actor-level inheritance idea and the motivation. (The OpenAPI integration is its own follow-on.)

Implementation
Frontend (`src/mo_frontend/typing.ml`)
- `check_vis_parenthetical` does per-field bidirectional checking against a known-fields table `[("encoder", T -> Blob); ("decoder", Blob -> A)]`. Encoder type is driven by the method's return type, decoder by its ingress type.
- Direction (method → codec) is documented in a comment alongside the alternative inverse direction (codec types driving the method signature) for future exploration.

IR (`src/ir_def/ir.ml`)
- `FuncE`'s trailing `exp option (* encoder *)` replaced by a labeled record `codecs = { encoder : exp option; decoder : exp option }`.
- `no_codecs` helper for the common no-annotation case.
- Future codec-shaped fields (e.g. inbound-cycles caps, schema pins) land additively.
- `check_ir` independently type-checks both fields.

IR passes
- `rename`, `subst_var`, `freevars`, `await`, `async`, `const`, `tailcall`, `erase_typ_field`, `eq`, `show`: pass-through, retyped from `exp option` to `codecs`.
- `arrange_ir.ml` prints both fields side by side.

Desugaring (`src/lowering/desugar.ml`)
- `find_codec_in_par` factored out from the existing `find_encoder_in_par`.
- `find_decoder_in_par` mirror.
- `build_codecs` (replacing `build_encoders`) returns `(enc_opt, dec_opt) list`.
- `build_actor` stamps both fields onto each public method's `FuncE.codecs`, preserving any pre-existing per-field override.

Async lowering (`src/ir_passes/async.ml`)
- Consumes `codecs.encoder` when synthesising the reply continuation, lifting it into `ICReplyPrim(ts, Some enc)`. Decoder unaffected here; it lives only on `FuncE`.

Codegen (`src/codegen/compile_classical.ml` & `compile_enhanced.ml`)
- `ICReplyPrim` branches on `enc_opt`: `Some` → call closure, `IC.reply_with_data`; `None` → `Serialization.serialize`.
- `FuncDec.{lit, closed, compile_const_message}` gain a `?(decoder=None)` thunk parameter `(E.t -> VarEnv.t -> G.t) option`. The thunk is constructed at the FuncE call site (where `compile_exp_vanilla` is in scope) wrapping the decoder expression.
- Inside `compile_const_message`, branch at the argument-decoding step: `None` → `Serialization.deserialize`; `Some compile_dec` → `compile_dec env ae0 ; closure-call` on raw `IC.arg_data`.

AST interpreter (`src/mo_interpreter/interpret.ml`)
- Does not visit the `encoder`/`decoder` payloads: it models the high-level semantics, where Candid serialization isn't modelled, so any wire-byte transform is moot. A comment says so.

Tests
Run-drun:
- `parenthetical-public.mo`: encoder, returns `()`, all phases.
- `parenthetical-decoder.mo`: full pipeline. Method's ingress is `?Nat`; decoder is the flow `Blob -> ?Text -> ?Nat` composed via `decodeUtf8` and `Nat.fromText`. The `//CALL` payload is the raw three ASCII bytes `"123"` (`0x313233`), not a Candid envelope. With the decoder active that blob deserialises as `?123`, and the reply is Candid `Nat 123` (`0x4449444c00017d7b`). Without the decoder Candid would reject `0x313233` as malformed input, so the green test is end-to-end proof that ingress Candid is bypassed.

Fail (matched pairs, encoder ↔ decoder):
- `parenthetical-{encoder,decoder}-effect.mo`: M0215 effect-free check fires.
- `parenthetical-{encoder,decoder}-mismatch.mo`: M0095 (finer than field-level M0214) on a wrong codec signature.

Test plan
- `make -C test/run-drun parenthetical-public.only` passes
- `make -C test/run-drun parenthetical-decoder.only` passes
- `make -C test/fail` includes the four new fail tests with stable output

🤖 Generated with Claude Code