patch_document_from_markdown silently rewrites content through undisclosed LLM #20

@adambalm

Description

Summary

patch_document_from_markdown passes user-authored content through a model that rewrites prose, not just converts markdown to Portable Text. The tool description says content will be "converted to the appropriate structure based on the target field schema," implying structural format conversion. What actually happens is semantic rewriting — different words, different sentence structure, different meaning.

Disclosure vs. consent

The MCP server troubleshooting docs mention that "some models may struggle to map markdown onto complex Portable Text schemas." This acknowledges model involvement in general terms. However:

  1. Tool naming implies deterministic conversion. The name patch_document_from_markdown and the description ("converted to the appropriate structure based on the target field schema") read as format conversion, not content transformation.
  2. No point-of-use disclosure. When the tool is invoked via MCP, no response or parameter indicates that a model will process the input.
  3. No opt-in or configuration. Unlike transform_image, create_version, and generate_image — which all expose an instruction parameter — this tool provides no mechanism to control or disable model involvement.
  4. No diff. The user receives the patched result with no indication of what changed between input and output.
  5. No verbatim mode. There is no way to request that the tool preserve exact wording during the markdown-to-Portable-Text conversion.

The issue is not that model involvement is secret — it is mentioned in troubleshooting docs. The issue is that users invoking the tool have no disclosure at call time, no consent mechanism, and no way to request verbatim handling.

Reproduction

Step 1: Create a test document with a Portable Text field (e.g., an article with a body field).

Step 2: Call patch_document_from_markdown with the following exact input:

The project contains exactly fourteen schema files. The server ran for three hours
without interruption. Margaret said the delivery would arrive on Thursday. None of
this is true, but the sentences are precise.
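For reference, the tool was invoked through a standard MCP tools/call request. A minimal sketch of the request shape follows; the argument names documentId, targetPath, and markdown are illustrative placeholders, not the tool's documented parameter names, so check the tool's declared input schema before reusing this:

```python
import json

# Hypothetical request shape for an MCP tools/call invocation.
# Only the tool name and the markdown payload are taken from the repro;
# the argument keys are assumptions for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "patch_document_from_markdown",
        "arguments": {
            "documentId": "article-test",  # illustrative
            "targetPath": "body",          # illustrative
            "markdown": (
                "The project contains exactly fourteen schema files. "
                "The server ran for three hours without interruption. "
                "Margaret said the delivery would arrive on Thursday. "
                "None of this is true, but the sentences are precise."
            ),
        },
    },
}

print(json.dumps(request, indent=2))
```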

Step 3: Read back the document.

Expected output: Four sentences in a Portable Text block, preserving the exact wording:

The project contains exactly fourteen schema files. The server ran for three hours without interruption. Margaret said the delivery would arrive on Thursday. None of this is true, but the sentences are precise.

Actual output:

The sentences are precise but factually false: there are not exactly fourteen schema files, the server did not run for three hours without interruption, and Margaret's delivery is not actually on Thursday.

The tool:

  • Deleted three of four sentences
  • Rewrote the remaining sentence into a summary of the others
  • Reversed the rhetorical structure — input states claims then discloses they are false; output leads with "factually false" and lists negations
  • Changed the voice from declarative to explanatory
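The missing diff (point 4 above) is straightforward to approximate client-side: comparing the submitted markdown against the text read back makes the rewrite measurable. A minimal sketch using Python's difflib, with both strings taken from the repro above:

```python
import difflib

# Exact input submitted to patch_document_from_markdown (from the repro).
submitted = (
    "The project contains exactly fourteen schema files. The server ran for "
    "three hours without interruption. Margaret said the delivery would arrive "
    "on Thursday. None of this is true, but the sentences are precise."
)

# Text read back from the patched document.
returned = (
    "The sentences are precise but factually false: there are not exactly "
    "fourteen schema files, the server did not run for three hours without "
    "interruption, and Margaret's delivery is not actually on Thursday."
)

# A verbatim structural conversion would score ~1.0 here; anything well
# below that indicates the wording itself was changed.
ratio = difflib.SequenceMatcher(None, submitted, returned).ratio()
print(f"similarity: {ratio:.2f}")
```

A client could run this check after every patch call and flag any result below a chosen threshold as a semantic rewrite rather than a format conversion.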

Second reproduction (longer input)

In a separate test, two paragraphs of editorial framing were submitted:

Input (excerpt):

This is an illustrative artifact, not a truth claim. It was generated in January 2026 by Claude Opus 4.5 reading structured notes from Basic Memory...

Output (excerpt):

This demo artifact captures a historical snapshot of the system's evolution.

The phrase "not a truth claim" — the operative editorial statement — was deleted entirely. The output was restructured into a bulleted explainer with bold headers that the author did not write.

Why this matters

Every other Sanity MCP/Agent Action tool that uses a model discloses it and provides an instruction parameter (transform, create_version, generate_image, transform_image). The Patch Agent Action docs are explicit, under the heading "Patch doesn't use an LLM": "Unlike many Agent Actions, Patch doesn't use an LLM and instead relies on your schema."

patch_document_from_markdown breaks this pattern — it uses a model with no point-of-use disclosure and no user control. Users submitting authored prose expect format conversion, not content transformation. The troubleshooting note ("Some models may struggle to map markdown onto complex Portable Text schemas") frames model involvement as a formatting concern, but the observed behavior is semantic rewriting of simple schemas with trivially mappable fields.

Environment

  • MCP server: https://mcp.sanity.io (remote, hosted)
  • Client: Claude Code (Claude Opus 4.6)
  • Date: 2026-02-19
  • Workspace schema deployed via sanity schema deploy

Suggested fix

Any of these would resolve the issue:

  1. Add a verbatim mode — a parameter (e.g., verbatim: true) that guarantees no semantic rewriting. If the markdown cannot be mapped to the schema without model interpretation, fail with an explicit error rather than silently rewriting.
  2. Remove the model from the conversion path for simple/standard Portable Text schemas — markdown-to-block is a deterministic structural conversion that does not require AI for basic schemas.
  3. Disclose model involvement in the tool description (not just in separate troubleshooting docs) and add an instruction parameter consistent with other AI-powered tools, so users can opt in to content transformation rather than having it applied by default.
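Fix 2 is tractable because plain-paragraph markdown maps onto Portable Text mechanically. A minimal sketch of a deterministic, model-free conversion, handling only the simplest case (plain paragraphs; headings, inline marks, and lists would need additional rules, and anything unmappable raises an error rather than being reinterpreted):

```python
import uuid

def markdown_paragraphs_to_portable_text(md: str) -> list[dict]:
    """Deterministically convert plain markdown paragraphs to Portable Text blocks.

    No model involved: the input wording is preserved exactly. Unsupported
    markdown constructs raise instead of being silently rewritten.
    """
    blocks = []
    for para in md.split("\n\n"):
        # Join hard-wrapped lines within a paragraph, preserving the words.
        text = " ".join(line.strip() for line in para.splitlines()).strip()
        if not text:
            continue
        if text.startswith(("#", "-", "*", ">", "`")):
            raise ValueError(
                f"unsupported markdown construct, refusing to guess: {text[:40]!r}"
            )
        blocks.append({
            "_type": "block",
            "_key": uuid.uuid4().hex[:12],
            "style": "normal",
            "markDefs": [],
            "children": [{
                "_type": "span",
                "_key": uuid.uuid4().hex[:12],
                "text": text,
                "marks": [],
            }],
        })
    return blocks
```

Feeding the repro input through this sketch yields a single block whose span text matches the submitted sentences verbatim, which is the behavior the tool name implies.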
