Integration: WTRMRK on-chain behavioral history as eval signal provider — the "trust scoring is a separate concern" layer #2

@64R3N

Context

The registry's README explicitly states: "Trust scoring is a separate concern." This issue proposes WTRMRK as the implementation of that separate concern — specifically the behavioral history dimension of trust scoring.

The gap

agentctl push registers an agent's AgentCard. The registry tracks metadata, OCI references, eval records, and promotion lifecycle. The trust question the registry explicitly defers: "How long has this agent been operating reliably?"

An agent can have a perfectly valid AgentCard, pass all evals, and be promoted to published after existing for only 5 minutes. The current schema has no signal for sustained operation history.

What WTRMRK adds as an eval signal provider

WTRMRK (wtrmrk.io) provides append-only on-chain behavioral history on Base L2:

  • Agent registers once, gets a persistent UID + Ed25519 keypair
  • Signs action records at execution time (before outcomes are known)
  • Sequence is publicly queryable: depth, recency, head hash, isAncestorInSequence

This fits directly into the registry's eval record model as an external signal source.
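
To make the queryable properties above (depth, head hash, isAncestorInSequence) concrete, here is a minimal in-memory sketch of an append-only hash-chained sequence. It is illustrative only: the SHA-256 chaining, the JSON encoding, the GENESIS value, and every name in the block are assumptions, not WTRMRK's actual record format, and Ed25519 signing and the on-chain write are left out entirely.

```python
import hashlib
import json
from dataclasses import dataclass, field

# ASSUMPTION: hypothetical starting head for an empty sequence.
GENESIS = "0x" + "00" * 32


def record_hash(prev_head: str, payload: dict) -> str:
    """Hash a record together with the head it extends (assumed scheme)."""
    body = json.dumps({"prev": prev_head, "payload": payload}, sort_keys=True)
    return "0x" + hashlib.sha256(body.encode("utf-8")).hexdigest()


@dataclass
class Sequence:
    """Append-only list of head hashes, one per recorded action."""

    heads: list = field(default_factory=list)

    @property
    def depth(self) -> int:
        # Number of records appended so far.
        return len(self.heads)

    @property
    def head(self) -> str:
        # Most recent head hash, or GENESIS if nothing was recorded yet.
        return self.heads[-1] if self.heads else GENESIS

    def append(self, payload: dict) -> str:
        # Each new head commits to the previous head, so history
        # cannot be rewritten without changing every later hash.
        new_head = record_hash(self.head, payload)
        self.heads.append(new_head)
        return new_head

    def is_ancestor_in_sequence(self, candidate: str) -> bool:
        # True if `candidate` was the head at some earlier point.
        return candidate in self.heads
```

Because each head commits to its predecessor, a consumer holding an old head hash can later confirm that the current sequence still extends it, which is the property an ancestry query would expose.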

Proposed integration: onchain_history eval record type

{
  "eval_tool": "wtrmrk",
  "eval_type": "onchain_history",
  "source": "https://wtrmrk.io/registry",
  "signals": {
    "sequence_depth": 847,
    "first_recorded": "2026-02-15T00:00:00Z",
    "last_active": "2026-04-09T00:00:00Z",
    "sequence_head": "0xabc123...",
    "uid": "f2a35e43-f316-408a-a5e4-020bb008628a"
  },
  "verification": "chain_verifiable",
  "score": null
}

score: null is intentional — this is raw signal, not a computed trust score. The registry's design principle ("Does not compute trust scores") is preserved.
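
As a rough illustration of how a consumer might check that a record matches the proposed shape, including the intentional score: null, consider the sketch below. The function name and the choice to treat all five signals as required are assumptions made for this example; the registry's real validation rules are not specified in this issue.

```python
# ASSUMPTION: all five signals from the example record are treated as required.
REQUIRED_SIGNALS = (
    "sequence_depth",
    "first_recorded",
    "last_active",
    "sequence_head",
    "uid",
)


def validate_onchain_history(record: dict) -> list:
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    if record.get("eval_type") != "onchain_history":
        problems.append("eval_type must be 'onchain_history'")
    signals = record.get("signals") or {}
    for key in REQUIRED_SIGNALS:
        if key not in signals:
            problems.append(f"missing signal: {key}")
    # The key must exist and be null: raw signal, not a computed score.
    if record.get("score", "absent") is not None:
        problems.append("score must be present and null")
    return problems
```

Rejecting a non-null score at ingestion keeps computed trust values out of the registry, leaving scoring to whatever policy layer consumes the record.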

Why this matters at promotion time

The registry's promotion lifecycle: draft → evaluated → approved → published → deprecated → archived.

At the evaluated → approved transition, an operator can check:

  • Does this agent have an onchain_history eval record?
  • Is sequence_depth > threshold?
  • Is last_active within the last N days?

This gives governance systems a behavioral history gate without the registry computing scores. The registry stores the signal; the policy layer decides what to do with it.
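
The three checks above could live in a policy layer along these lines. This is a sketch under stated assumptions: passes_history_gate, min_depth, and max_idle_days are hypothetical names, and the eval records are assumed to be plain dicts in the shape proposed earlier in this issue.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def passes_history_gate(eval_records, min_depth, max_idle_days, now=None):
    """Policy check: is there a deep-enough, recently active history record?"""
    now = now or datetime.now(timezone.utc)
    for record in eval_records:
        # Check 1: does an onchain_history eval record exist?
        if record.get("eval_type") != "onchain_history":
            continue
        signals = record.get("signals") or {}
        if "last_active" not in signals:
            continue
        # Check 2: is sequence_depth above the threshold?
        deep_enough = signals.get("sequence_depth", 0) > min_depth
        # Check 3: was the agent active within the last N days?
        idle = now - parse_ts(signals["last_active"])
        recent = idle <= timedelta(days=max_idle_days)
        if deep_enough and recent:
            return True
    return False
```

The thresholds are arguments rather than constants, so each governance system sets its own policy while the registry stores only the raw signal.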

Reference implementation

UID f2a35e43-f316-408a-a5e4-020bb008628a has an active sequence on Base mainnet. Registry endpoint: https://wtrmrk.io/registry/{uid}.
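
A small sketch of building the lookup URL from the endpoint pattern above. The URL template comes from this issue; everything else is an assumption, in particular that the endpoint returns a JSON body, since the response format is not documented here. The fetch_history helper is hypothetical and is shown only to indicate where a network call would go.

```python
import json
import uuid
from urllib.request import urlopen

REGISTRY_BASE = "https://wtrmrk.io/registry"


def registry_url(uid: str) -> str:
    """Build the per-agent lookup URL, rejecting malformed UIDs."""
    canonical = str(uuid.UUID(uid))  # raises ValueError if not a UUID
    return f"{REGISTRY_BASE}/{canonical}"


def fetch_history(uid: str, timeout: float = 10.0) -> dict:
    # ASSUMPTION: the endpoint responds with a JSON document.
    with urlopen(registry_url(uid), timeout=timeout) as response:
        return json.load(response)
```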

Also relevant: agentoperations/agent-registry is a natural integration target for the behavioral history provider type being proposed at FransDevelopment/open-agent-trust-registry#27 and discussed in a2aproject/A2A#1672.
