MCP: auto-coerce string-encoded arrays/objects in tool call arguments (tags, types, metadata) #849

@gurumee329

Problem

Multiple MCP clients (Claude Code, Cursor with various LLM backends) consistently serialize tags, types, and metadata parameters as JSON-encoded strings instead of native JSON arrays/objects. This causes Pydantic validation failures:

Error: 1 validation error for call[retain]
tags
  Input should be a valid list [type=list_type, input_value='["educore", "scope:ident...olicies", "type:guide"]', input_type=str]

Error: 1 validation error for call[recall]
types
  Input should be a valid list [type=list_type, input_value='["world"]', input_type=str]

Error: 1 validation error for call[retain]
metadata
  Input should be a valid dictionary [type=dict_type, input_value='{"branch": "develop", "f...y/domain/constants.ts"}', input_type=str]
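
The failures above can be reproduced outside MCP. The following is a minimal sketch, assuming Pydantic v2 and Python 3.9+. `error_type` is a hypothetical helper written for this illustration, and `TypeAdapter` stands in for whatever validation the server actually performs on tool arguments:

```python
from pydantic import TypeAdapter, ValidationError


def error_type(annotation, value):
    """Return the first Pydantic error type raised for `value`, or None if it validates."""
    try:
        TypeAdapter(annotation).validate_python(value)
    except ValidationError as exc:
        return exc.errors()[0]["type"]
    return None


# String-encoded containers are rejected, matching the errors reported above.
print(error_type(list[str], '["world"]'))              # list_type
print(error_type(dict[str, str], '{"key": "value"}'))  # dict_type

# A native list validates without error.
print(error_type(list[str], ["world"]))                # None
```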

Key observations

  • This is not a user error — the LLM agent generates tool calls with string-wrapped arrays/objects
  • The agent often retries with the exact same broken format 4+ times in a row, wasting tokens and failing silently
  • Extensive prompt engineering (templates, ✗/✓ comparisons, recovery procedures) has been attempted and has proven ineffective
  • The values inside the strings are valid JSON — they just need to be unwrapped

Affected parameters

| Parameter | Expected type | Actually received |
|---|---|---|
| tags | list[str] | '["tag1", "tag2"]' (string) |
| types | list[str] | '["world"]' (string) |
| metadata | dict[str, str] | '{"key": "value"}' (string) |

Proposed fix

Add a Pydantic @field_validator (or BeforeValidator) that auto-coerces string-encoded JSON:

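import json

from pydantic import field_validator

# Both validators belong inside the Pydantic model class that declares
# the tags, types, and metadata fields.

@field_validator("tags", "types", mode="before")
@classmethod
def coerce_string_to_list(cls, v):
    # Unwrap a JSON-encoded list; anything else is passed through untouched
    # so that normal validation still reports genuinely invalid input.
    if isinstance(v, str):
        try:
            parsed = json.loads(v)
            if isinstance(parsed, list):
                return parsed
        except (json.JSONDecodeError, TypeError):
            pass
    return v

@field_validator("metadata", mode="before")
@classmethod
def coerce_string_to_dict(cls, v):
    # Same idea for JSON-encoded objects.
    if isinstance(v, str):
        try:
            parsed = json.loads(v)
            if isinstance(parsed, dict):
                return parsed
        except (json.JSONDecodeError, TypeError):
            pass
    return v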

This is a common pattern for APIs consumed by LLM agents. The fix is backward-compatible — native arrays/objects pass through unchanged.
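
If the tool arguments are validated from function signatures rather than from a model class, the `BeforeValidator` variant mentioned above may fit better. The following is a sketch only, assuming Pydantic v2 and Python 3.9+. The names `unwrap_json`, `StrList`, and `StrDict` are hypothetical and not part of any existing API in this project:

```python
import json
from typing import Annotated, Any

from pydantic import BeforeValidator, TypeAdapter, ValidationError


def unwrap_json(expected: type):
    """Build a before-validator that unwraps a JSON string holding `expected`."""
    def _coerce(v: Any) -> Any:
        if isinstance(v, str):
            try:
                parsed = json.loads(v)
            except json.JSONDecodeError:
                return v  # not JSON: let Pydantic report the usual error
            if isinstance(parsed, expected):
                return parsed
        return v  # native values and non-matching JSON pass through unchanged
    return _coerce


# Hypothetical reusable annotations for the affected parameters.
StrList = Annotated[list[str], BeforeValidator(unwrap_json(list))]
StrDict = Annotated[dict[str, str], BeforeValidator(unwrap_json(dict))]

tags_adapter = TypeAdapter(StrList)
metadata_adapter = TypeAdapter(StrDict)

# String-encoded and native inputs validate to the same result.
assert tags_adapter.validate_python('["tag1", "tag2"]') == ["tag1", "tag2"]
assert tags_adapter.validate_python(["tag1", "tag2"]) == ["tag1", "tag2"]
assert metadata_adapter.validate_python('{"key": "value"}') == {"key": "value"}

# A string that is not a JSON list is still rejected.
try:
    tags_adapter.validate_python("not a list")
    rejected = False
except ValidationError:
    rejected = True
assert rejected
```

Because the annotation carries the coercion, the same `StrList` or `StrDict` type could be reused on every tool parameter that needs it, without repeating a validator per field.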

Impact

Without this fix, a significant percentage of MCP retain/recall/reflect calls fail silently, causing data loss in memory operations. The agent often gives up after repeated failures, leaving the user without stored memories for the session.
