[Feature Request] Add WFGY multi-agent failure map (ProblemMap No.13) to LightAgent docs #23

@onestardao

Hi, thank you for publishing LightAgent. The framework is very helpful for people trying to implement real multi-agent systems without a lot of boilerplate.

I maintain WFGY ProblemMap (MIT), a 16-mode failure taxonomy for RAG + agents:

https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md

Besides “classic” RAG issues, it has a dedicated slot for multi-agent failures:

  • No.13 – multi-agent chaos {OBS}
    • Sub-pages on role drift, cross-agent memory overwrite, and black-box interactions.

This ProblemMap has already been integrated or referenced by:

  • Harvard MIMS Lab – ToolUniverse (LLM tools benchmark; WFGY used for robustness / RAG debugging).
  • QCRI LLM Lab – Multimodal-RAG-Survey.
  • Univ. of Innsbruck Data Science Group – Rankify.

In my experience, LightAgent-style setups hit a very specific cluster of problems:

  • Agents gradually overwrite or corrupt each other’s memory.
  • Roles drift over time (planner becomes worker, critic starts acting like another planner, etc.).
  • Logs are hard to interpret when something goes wrong (“black box chaos”).
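
To make the first and third problems concrete, here is a minimal, framework-independent sketch of a shared memory that records every cross-agent overwrite. `AuditedMemory`, its methods, and the agent names are hypothetical and only for illustration; this is not LightAgent's memory API, and a real integration would need to wrap whatever store LightAgent actually uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class AuditedMemory:
    """Hypothetical shared key-value memory that records cross-agent overwrites."""

    # key -> (name of the agent that last wrote it, stored value)
    _data: dict[str, tuple[str, Any]] = field(default_factory=dict)
    # one record per write that replaced a value owned by a different agent
    overwrites: list[dict[str, Any]] = field(default_factory=list)

    def write(self, agent: str, key: str, value: Any) -> None:
        previous = self._data.get(key)
        if previous is not None and previous[0] != agent:
            self.overwrites.append(
                {
                    "key": key,
                    "owner": previous[0],
                    "writer": agent,
                    "old": previous[1],
                    "new": value,
                }
            )
        self._data[key] = (agent, value)

    def read(self, key: str) -> Any:
        return self._data[key][1]


# Illustrative run: the worker silently replaces the planner's goal.
memory = AuditedMemory()
memory.write("planner", "goal", "summarize the report")
memory.write("worker", "goal", "translate the report")
memory.write("worker", "draft", "v1")

for event in memory.overwrites:
    print(f"{event['writer']} overwrote {event['owner']}'s key {event['key']!r}")
```

With a record like this, a memory overwrite shows up as an explicit event instead of a silent change in later agent behavior.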

Proposal

I would like to contribute a multi-agent troubleshooting section for LightAgent:

  1. A short doc (e.g. docs/troubleshooting/multi_agent_problemmap.md) that:

    • Summarizes ProblemMap No.13 and its sub-modes (role drift, memory overwrite, hidden feedback loops).
    • Shows how they manifest inside LightAgent (chat history growth, tool calls, shared vs. local memory, etc.).
    • Gives concrete logging / instrumentation tips to make these issues visible.
  2. A small “debugging checklist” table for LightAgent users, mapping:

    • Symptom → ProblemMap mode → what to inspect in LightAgent.
    • For example:
      • “Agent A suddenly starts talking like Agent B” → multi-agent chaos / role drift → check prompt templates + naming + routing.
      • “All agents forget instructions after a few turns” → memory overwrite → inspect global memory append rules.
  3. Optionally, a minimal example script where I intentionally create a multi-agent failure, then show how to diagnose it using the ProblemMap lens.
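
As a rough illustration of item 2, the checklist could also be kept as plain data so it can be searched or rendered as a table. The sketch below only encodes the two example rows given above; `ChecklistEntry`, `CHECKLIST`, and `lookup` are hypothetical names, not part of LightAgent or WFGY.

```python
from typing import NamedTuple


class ChecklistEntry(NamedTuple):
    symptom: str   # what the user observes
    mode: str      # ProblemMap mode it maps to
    inspect: str   # what to look at in the agent setup


# The two example rows from the proposal; a real table would have more.
CHECKLIST = [
    ChecklistEntry(
        "Agent A suddenly starts talking like Agent B",
        "multi-agent chaos / role drift",
        "prompt templates, naming, routing",
    ),
    ChecklistEntry(
        "All agents forget instructions after a few turns",
        "memory overwrite",
        "global memory append rules",
    ),
]


def lookup(keyword: str) -> list[ChecklistEntry]:
    """Return entries whose symptom mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [entry for entry in CHECKLIST if needle in entry.symptom.lower()]


for entry in lookup("forget"):
    print(f"{entry.symptom} -> {entry.mode} -> inspect: {entry.inspect}")
```

Keeping the rows as data means the same source can feed both the docs table and a small diagnostic helper.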

Everything would live in the docs only, so there would be no breaking changes to the framework.
If this sounds interesting, I can open a PR with a first draft so you can see the style and decide how deeply you want it integrated.
