diff --git a/_vale/config/vocabularies/Docker/accept.txt b/_vale/config/vocabularies/Docker/accept.txt index 0370dcd7c16e..8f99fceffc9a 100644 --- a/_vale/config/vocabularies/Docker/accept.txt +++ b/_vale/config/vocabularies/Docker/accept.txt @@ -290,4 +290,6 @@ Zsh [Vv]irtiofs [Vv]irtualize [Ww]alkthrough +[Tt]oolsets? +[Rr]erank(ing|ed)? diff --git a/content/manuals/ai/cagent/_index.md b/content/manuals/ai/cagent/_index.md index 77ea20c69b29..bb7576f3197b 100644 --- a/content/manuals/ai/cagent/_index.md +++ b/content/manuals/ai/cagent/_index.md @@ -5,236 +5,148 @@ weight: 60 params: sidebar: group: Open source + badge: + color: violet + text: Experimental keywords: [ai, agent, cagent] --- {{< summary-bar feature_name="cagent" >}} -[cagent](https://github.com/docker/cagent) lets you build, orchestrate, and share -AI agents. You can use it to define AI agents that work as a team. +[cagent](https://github.com/docker/cagent) is an open source tool for building +teams of specialized AI agents. Instead of prompting one generalist model, you +define agents with specific roles and instructions that collaborate to solve +problems. Run these agent teams from your terminal using any LLM provider. -cagent relies on the concept of a _root agent_ that acts as a team lead and -delegates tasks to the sub-agents you define. -Each agent: -- uses the model of your choice, with the parameters of your choice. -- has access to the [built-in tools](#built-in-tools) and MCP servers - configured in the [Docker MCP gateway](/manuals/ai/mcp-catalog-and-toolkit/mcp-gateway.md). -- works in its own context. They do not share knowledge. +## Why agent teams -The root agent is your main contact point. Each agent has its own context, -they don't share knowledge. +One agent handling complex work means constant context-switching. Split the +work across focused agents instead - each handles what it's best at. cagent +manages the coordination. 
-## Key features +Here's a two-agent team that debugs problems: -- ️Multi-tenant architecture with client isolation and session management. -- Rich tool ecosystem via Model Context Protocol (MCP) integration. -- Hierarchical agent system with intelligent task delegation. -- Multiple interfaces including CLI, TUI, API server, and MCP server. -- Agent distribution via Docker registry integration. -- Security-first design with proper client scoping and resource isolation. -- Event-driven streaming for real-time interactions. -- Multi-model support (OpenAI, Anthropic, Gemini, DMR, Docker AI Gateway). - -## Get started with cagent - -1. The easiest way to get cagent is to [install Docker Desktop version 4.49 or later](/manuals/desktop/release-notes.md) for your operating system. - - > [!NOTE] - > You can also build cagent from the source. For more information, see the [cagent GitHub repository](https://github.com/docker/cagent?tab=readme-ov-file#build-from-source). - -1. Set the following environment variables: - - ```bash - export OPENAI_API_KEY= # For OpenAI models - export ANTHROPIC_API_KEY= # For Anthropic models - export GOOGLE_API_KEY= # For Gemini models - ``` - -1. Create an agent by saving this sample as `assistant.yaml`: - - ```yaml {title="assistant.yaml"} - agents: - root: - model: openai/gpt-5-mini - description: A helpful AI assistant - instruction: | - You are a knowledgeable assistant that helps users with various tasks. - Be helpful, accurate, and concise in your responses. - ``` - -1. Start your prompt with your agent: - - ```bash - cagent run assistant.yaml - ``` - -## Create an agentic team - -You can use AI prompting to generate a team of agents with the `cagent new` -command: - -```console -$ cagent new - -For any feedback, visit: https://docker.qualtrics.com/jfe/form/SV_cNsCIg92nQemlfw - -Welcome to cagent! (Ctrl+C to exit) - -What should your agent/agent team do? (describe its purpose): - -> I need a cross-functional feature team. 
The team owns a specific product - feature end-to-end. Include the key responsibilities of each of the roles - involved (engineers, designer, product manager, QA). Keep the description - short, clear, and focused on how this team delivers value to users and the business. -``` - -Alternatively, you can write your configuration file manually. For example: - -```yaml {title="agentic-team.yaml"} +```yaml agents: root: - model: claude - description: "Main coordinator agent that delegates tasks and manages workflow" + model: openai/gpt-5-mini # Change to the model that you want to use + description: Bug investigator instruction: | - You are the root coordinator agent. Your job is to: - 1. Understand user requests and break them down into manageable tasks. - 2. Delegate appropriate tasks to your helper agent. - 3. Coordinate responses and ensure tasks are completed properly. - 4. Provide final responses to the user. - When you receive a request, analyze what needs to be done and decide whether to: - - Handle it yourself if it's simple. - - Delegate to the helper agent if it requires specific assistance. - - Break complex requests into multiple sub-tasks. - sub_agents: ["helper"] + Analyze error messages, stack traces, and code to find bug root causes. + Explain what's wrong and why it's happening. + Delegate fix implementation to the fixer agent. + sub_agents: [fixer] + toolsets: + - type: filesystem + - type: mcp + ref: docker:duckduckgo - helper: - model: claude - description: "Assistant agent that helps with various tasks as directed by the root agent" + fixer: + model: anthropic/claude-sonnet-4-5 # Change to the model that you want to use + description: Fix implementer instruction: | - You are a helpful assistant agent. Your role is to: - 1. Complete specific tasks assigned by the root agent. - 2. Provide detailed and accurate responses. - 3. Ask for clarification if tasks are unclear. - 4. Report back to the root agent with your results. 
- - Focus on being thorough and helpful in whatever task you're given. - -models: - claude: - provider: anthropic - model: claude-sonnet-4-0 - max_tokens: 64000 -``` - -[See the reference documentation](https://github.com/docker/cagent?tab=readme-ov-file#-configuration-reference). - -## Built-in tools - -cagent includes a set of built-in tools that enhance your agents' capabilities. -You don't need to configure any external MCP tools to use them. - -```yaml -agents: - root: - # ... other config + Write fixes for bugs diagnosed by the investigator. + Make minimal, targeted changes and add tests to prevent regression. toolsets: - - type: todo - - type: transfer_task + - type: filesystem + - type: shell ``` -### Think tool +The root agent investigates and explains the problem. When it understands the +issue, it hands off to `fixer` for implementation. Each agent stays focused on +its specialty. -The think tool allows agents to reason through problems step by step: +## Installation -```yaml -agents: - root: - # ... other config - toolsets: - - type: think -``` +cagent is included in Docker Desktop 4.49 and later. -### Todo tool +For Docker Engine users or custom installations: -The todo tool helps agents manage task lists: +- **Homebrew**: `brew install cagent` +- **Pre-built binaries**: [GitHub releases](https://github.com/docker/cagent/releases) +- **From source**: See the [cagent repository](https://github.com/docker/cagent?tab=readme-ov-file#build-from-source) -```yaml -agents: - root: - # ... other config - toolsets: - - type: todo -``` +## Get started -### Memory tool +Try the bug analyzer team: -The memory tool provides persistent storage: +1. Set your API key for the model provider you want to use: -```yaml -agents: - root: - # ... 
other config - toolsets: - - type: memory - path: "./agent_memory.db" -``` - -### Task transfer tool + ```console + $ export ANTHROPIC_API_KEY= # For Claude models + $ export OPENAI_API_KEY= # For OpenAI models + $ export GOOGLE_API_KEY= # For Gemini models + ``` -The task transfer tool is an internal tool that allows an agent to delegate a task -to sub-agents. To prevent an agent from delegating work, make sure it doesn't have -sub-agents defined in its configuration. +2. Save the [example configuration](#why-agent-teams) as `debugger.yaml`. -### Using tools via the Docker MCP Gateway +3. Run your agent team: -If you use the [Docker MCP gateway](/manuals/ai/mcp-catalog-and-toolkit/mcp-gateway.md), -you can configure your agent to interact with the -gateway and use the MCP servers configured in it. See [docker mcp -gateway run](/reference/cli/docker/mcp/gateway/gateway_run.md). + ```console + $ cagent run debugger.yaml + ``` -For example, to enable an agent to use Duckduckgo via the MCP Gateway: +You'll see a prompt where you can describe bugs or paste error messages. The +investigator analyzes the problem, then hands off to the fixer for +implementation. -```yaml -toolsets: - - type: mcp - command: docker - args: ["mcp", "gateway", "run", "--servers=duckduckgo"] -``` +## How it works -## CLI interactive commands +You interact with the _root agent_, which can delegate work to sub-agents you +define. 
Each agent: -You can use the following CLI commands, during -CLI sessions with your agents: +- Uses its own model and parameters +- Has its own context (agents don't share knowledge) +- Can access built-in tools like todo lists, memory, and task delegation +- Can use external tools via [MCP servers](/manuals/ai/mcp-catalog-and-toolkit/mcp-gateway.md) -| Command | Description | -|----------|------------------------------------------| -| /exit | Exit the program | -| /reset | Clear conversation history | -| /eval | Save current conversation for evaluation | -| /compact | Compact the current session | +The root agent delegates tasks to agents listed under `sub_agents`. Sub-agents +can have their own sub-agents for deeper hierarchies. -## Share your agents +## Configuration options -Agent configurations can be packaged and shared via Docker Hub. -Before you start, make sure you have a [Docker repository](/manuals/docker-hub/repos/create.md). +Agent configurations are YAML files. A basic structure looks like this: -To push an agent: +```yaml +agents: + root: + model: claude-sonnet-4-0 + description: Brief role summary + instruction: | + Detailed instructions for this agent... + sub_agents: [helper] -```bash -cagent push ./.yaml / + helper: + model: gpt-5-mini + description: Specialist agent role + instruction: | + Instructions for the helper agent... ``` -To pull an agent to the current directory: +You can also configure model settings (like context limits), tools (including +MCP servers), and more. See the [configuration reference](https://github.com/docker/cagent?tab=readme-ov-file#-configuration-reference) +for complete details. + +## Share agent teams -```bash -cagent pull / +Agent configurations are packaged as OCI artifacts. Push and pull them like +container images: + +```console +$ cagent push ./debugger.yaml myusername/debugger +$ cagent pull myusername/debugger ``` -The agent's configuration file is named `_.yaml`. Run -it with the `cagent run ` command. 
+Use Docker Hub or any OCI-compatible registry. Pushing creates the repository +if it doesn't exist yet. -## Related pages +## What's next -- For more information about cagent, see the -[GitHub repository](https://github.com/docker/cagent). -- [Docker MCP Gateway](/manuals/ai/mcp-catalog-and-toolkit/mcp-gateway.md) +- Follow the [tutorial](./tutorial.md) to build your first coding agent +- Learn [best practices](./best-practices.md) for building effective agents +- Integrate cagent with your [editor](./integrations/acp.md) or use agents as + [tools in MCP clients](./integrations/mcp.md) +- Browse example agent configurations in the [cagent repository](https://github.com/docker/cagent/tree/main/examples) +- Use `cagent new` to generate agent teams with AI +- Connect agents to external tools via the [Docker MCP Gateway](/manuals/ai/mcp-catalog-and-toolkit/mcp-gateway.md) +- Read the full [configuration reference](https://github.com/docker/cagent?tab=readme-ov-file#-configuration-reference) diff --git a/content/manuals/ai/cagent/best-practices.md b/content/manuals/ai/cagent/best-practices.md new file mode 100644 index 000000000000..3015aee0d72d --- /dev/null +++ b/content/manuals/ai/cagent/best-practices.md @@ -0,0 +1,259 @@ +--- +title: Best practices +description: Patterns and techniques for building effective cagent agents +keywords: [cagent, best practices, patterns, agent design, optimization] +weight: 20 +--- + +Patterns you learn from building and running cagent agents. These aren't +features or configuration options - they're approaches that work well in +practice. + +## Handling large command outputs + +Shell commands that produce large output can overflow your agent's context +window. Validation tools, test suites, and build logs often generate thousands +of lines. If you capture this output directly, it consumes all available +context and the agent fails. + +The solution: redirect output to a file, then read the file. 
The Read tool +automatically truncates large files to 2000 lines, and your agent can navigate +through it if needed. + +**Don't do this:** + +```yaml +reviewer: + instruction: | + Run validation: `docker buildx bake validate` + Check the output for errors. + toolsets: + - type: shell +``` + +The validation output goes directly into context. If it's large, the agent +fails with a context overflow error. + +**Do this:** + +```yaml +reviewer: + instruction: | + Run validation and save output: + `docker buildx bake validate > validation.log 2>&1` + + Read validation.log to check for errors. + The file can be large - read the first 2000 lines. + Errors usually appear at the beginning. + toolsets: + - type: filesystem + - type: shell +``` + +The output goes to a file, not context. The agent reads what it needs using +the filesystem toolset. + +**Important details:** + +- Use `>` to redirect, not `tee`. The `tee` command writes to both the file + and stdout, defeating the purpose. +- Redirect both stdout and stderr with `2>&1` +- Write to the working directory, not `/tmp` (permission issues) +- Tell your agent that errors usually appear early in logs so it knows to read + from the beginning + +This pattern works for any command with potentially large output: test runs, +build logs, linting tools, search results, database dumps. + +## Structuring agent teams + +A single agent handling multiple responsibilities makes instructions complex +and behavior unpredictable. Breaking work across specialized agents produces +better results. + +The coordinator pattern works well: a root agent understands the overall task +and delegates to specialists. Each specialist focuses on one thing. + +**Example: Documentation writing team** + +```yaml +agents: + root: + description: Technical writing coordinator + instruction: | + Coordinate documentation work: + 1. Delegate to writer for content creation + 2. Delegate to editor for formatting polish + 3. 
Delegate to reviewer for validation + 4. Loop back through editor if reviewer finds issues + sub_agents: [writer, editor, reviewer] + toolsets: [filesystem, todo] + + writer: + description: Creates and edits documentation content + instruction: | + Write clear, practical documentation. + Focus on content quality - the editor handles formatting. + toolsets: [filesystem, think] + + editor: + description: Polishes formatting and style + instruction: | + Fix formatting issues, wrap lines, run prettier. + Remove AI-isms and polish style. + Don't change meaning or add content. + toolsets: [filesystem, shell] + + reviewer: + description: Runs validation tools + instruction: | + Run validation suite, report failures. + toolsets: [filesystem, shell] +``` + +Each agent has clear responsibilities. The writer doesn't worry about line +wrapping. The editor doesn't generate content. The reviewer just runs tools. + +**When to use teams:** + +- Multiple distinct steps in your workflow +- Different skills required (writing ↔ editing ↔ testing) +- One step might need to retry based on later feedback + +**When to use a single agent:** + +- Simple, focused tasks +- All work happens in one step +- Adding coordination overhead doesn't help + +## Optimizing RAG performance + +RAG indexing takes time when you have many files. A configuration that indexes +your entire codebase might take minutes to start. Optimize for what your agent +actually needs. + +**Narrow the scope:** + +Don't index everything. Index what's relevant for the agent's work. + +```yaml +# Too broad - indexes entire codebase +rag: + codebase: + docs: [./] + +# Better - indexes only relevant directories +rag: + codebase: + docs: [./src/api, ./docs, ./examples] +``` + +If your agent only works with API code, don't index tests, vendor directories, +or generated files. + +**Increase batching and concurrency:** + +Process more chunks per API call and make parallel requests. 
+ +```yaml +strategies: + - type: chunked-embeddings + embedding_model: openai/text-embedding-3-small + batch_size: 50 # More chunks per API call + max_embedding_concurrency: 10 # Parallel requests + chunking: + size: 2000 # Larger chunks = fewer total chunks + overlap: 150 +``` + +This reduces both API calls and indexing time. + +**Consider BM25 for fast local search:** + +If you need exact term matching (function names, error messages, identifiers), +BM25 is fast and runs locally without API calls. + +```yaml +strategies: + - type: bm25 + database: ./bm25.db + chunking: + size: 1500 +``` + +Combine with embeddings using hybrid retrieval when you need both semantic +understanding and exact matching. + +## Preserving document scope + +When building agents that update documentation, a common problem: the agent +transforms minimal guides into comprehensive tutorials. It adds prerequisites, +troubleshooting, best practices, examples, and detailed explanations to +everything. + +These additions might individually be good, but they change the document's +character. A focused 90-line how-to becomes a 200-line reference. + +**Build this into instructions:** + +```yaml +writer: + instruction: | + When updating documentation: + + 1. Understand the current document's scope and length + 2. Match that character - don't transform minimal guides into tutorials + 3. Add only what's genuinely missing + 4. Value brevity - not every topic needs comprehensive coverage + + Good additions fill gaps. Bad additions change the document's character. + When in doubt, add less rather than more. +``` + +Tell your agents explicitly to preserve the existing document's scope. Without +this guidance, they default to being comprehensive. + +## Model selection + +Choose models based on the agent's role and complexity. 
+ +**Use larger models (Sonnet, GPT-5) for:** + +- Complex reasoning and planning +- Writing and editing content +- Coordinating multiple agents +- Tasks requiring judgment and creativity + +**Use smaller models (Haiku, GPT-5 Mini) for:** + +- Running validation tools +- Simple structured tasks +- Reading logs and reporting errors +- High-volume, low-complexity work + +Example from the documentation writing team: + +```yaml +agents: + root: + model: anthropic/claude-sonnet-4-5 # Complex coordination + writer: + model: anthropic/claude-sonnet-4-5 # Creative content work + editor: + model: anthropic/claude-sonnet-4-5 # Judgment about style + reviewer: + model: anthropic/claude-haiku-4-5 # Just runs validation +``` + +The reviewer uses Haiku because it runs commands and checks for errors. No +complex reasoning needed, and Haiku is faster and cheaper. + +## What's next + +- Review [configuration reference](./reference/config.md) for all available + options +- Check [toolsets reference](./reference/toolsets.md) to understand what tools + agents can use +- See [example configurations](https://github.com/docker/cagent/tree/main/examples) + for complete working agents +- Read the [RAG guide](./rag.md) for detailed retrieval optimization diff --git a/content/manuals/ai/cagent/examples.md b/content/manuals/ai/cagent/examples.md deleted file mode 100644 index 8f0388852a2a..000000000000 --- a/content/manuals/ai/cagent/examples.md +++ /dev/null @@ -1,166 +0,0 @@ ---- -title: cagent examples -description: Get inspiration from agent examples -keywords: [ai, agent, cagent] -weight: 10 ---- - -Get inspiration from the following agent examples. - -## Agentic development team - -```yaml {title="dev-team.yaml"} -agents: - root: - model: claude - description: Technical lead coordinating development - instruction: | - You are a technical lead managing a development team. - Coordinate tasks between developers and ensure quality. 
- sub_agents: [developer, reviewer, tester] - - developer: - model: claude - description: Expert software developer - instruction: | - You are an expert developer. Write clean, efficient code - and follow best practices. - toolsets: - - type: filesystem - - type: shell - - type: think - - reviewer: - model: gpt4 - description: Code review specialist - instruction: | - You are a code review expert. Focus on code quality, - security, and maintainability. - toolsets: - - type: filesystem - - tester: - model: gpt4 - description: Quality assurance engineer - instruction: | - You are a QA engineer. Write tests and ensure - software quality. - toolsets: - - type: shell - - type: todo - -models: - gpt4: - provider: openai - model: gpt-4o - - claude: - provider: anthropic - model: claude-sonnet-4-0 - max_tokens: 64000 -``` - -## Research assistant - -```yaml {title="research-assistant.yaml"} -agents: - root: - model: claude - description: Research assistant with web access - instruction: | - You are a research assistant. Help users find information, - analyze data, and provide insights. - toolsets: - - type: mcp - command: mcp-web-search - args: ["--provider", "duckduckgo"] - - type: todo - - type: memory - path: "./research_memory.db" - -models: - claude: - provider: anthropic - model: claude-sonnet-4-0 - max_tokens: 64000 -``` - -## Technical blog writer - -```yaml {title="tech-blog-writer.yaml"} -#!/usr/bin/env cagent run -version: "1" - -agents: - root: - model: anthropic - description: Writes technical blog posts - instruction: | - You are the leader of a team of AI agents for a technical blog writing workflow. - - Here are the members in your team: - - - web_search_agent: Searches the web - - writer: Writes a 750-word technical blog post based on the chosen prompt - - - - 1. Call the `web_search_agent` agent to search the web to get - important information about the task that is asked - - 2. 
Call the `writer` agent to write a 750-word technical blog - post based on the research done by the web_search_agent - - - - Use the transfer_to_agent tool to call the right agent at the right - time to complete the workflow. - - DO NOT transfer to multiple members at once - - ONLY CALL ONE AGENT AT A TIME - - When using the `transfer_to_agent` tool, make exactly one call - and wait for the result before making another. Do not batch or - parallelize tool calls. - sub_agents: - - web_search_agent - - writer - toolsets: - - type: think - - web_search_agent: - model: anthropic - add_date: true - description: Search the web for information - instruction: | - Search the web for information - - Always include sources - toolsets: - - type: mcp - command: uvx - args: ["duckduckgo-mcp-server"] - - writer: - model: anthropic - description: Writes a 750-word technical blog post based on the chosen prompt. - instruction: | - You are an agent that receives a single technical writing prompt - and generates a detailed, informative, and well-structured technical blog post. - - - Ensure the content is technically accurate and includes relevant - code examples, diagrams, or technical explanations where appropriate. - - Structure the blog post with clear sections, including an introduction, - main content, and conclusion. - - Use technical terminology appropriately and explain complex concepts clearly. - - Include practical examples and real-world applications where relevant. - - Make sure the content is engaging for a technical audience while - maintaining professional standards. - - Constraints: - - DO NOT use lists - -models: - anthropic: - provider: anthropic - model: claude-3-5-sonnet-latest -``` - -See more examples in the [repository](https://github.com/docker/cagent/tree/main/examples). 
\ No newline at end of file diff --git a/content/manuals/ai/cagent/images/cagent-acp-zed.avif b/content/manuals/ai/cagent/images/cagent-acp-zed.avif new file mode 100644 index 000000000000..258e751effaf Binary files /dev/null and b/content/manuals/ai/cagent/images/cagent-acp-zed.avif differ diff --git a/content/manuals/ai/cagent/integrations/_index.md b/content/manuals/ai/cagent/integrations/_index.md new file mode 100644 index 000000000000..d3f7f7f2700a --- /dev/null +++ b/content/manuals/ai/cagent/integrations/_index.md @@ -0,0 +1,6 @@ +--- +build: + render: never +title: Integrations +weight: 50 +--- diff --git a/content/manuals/ai/cagent/integrations/acp.md b/content/manuals/ai/cagent/integrations/acp.md new file mode 100644 index 000000000000..37e6437cdded --- /dev/null +++ b/content/manuals/ai/cagent/integrations/acp.md @@ -0,0 +1,251 @@ +--- +title: ACP integration +description: Configure your editor or IDE to use cagent agents as coding assistants +keywords: [cagent, acp, editor, ide, vscode, neovim, integration] +weight: 40 +--- + +Run cagent agents directly in your editor using the Agent Client Protocol (ACP). +Your agent gets access to your editor's filesystem context and can read and +modify files as you work. The editor handles file operations while cagent +provides the AI capabilities. + +This guide shows you how to configure VS Code, Neovim, or Zed to run cagent +agents. If you're looking to expose cagent agents as tools to MCP clients like +Claude Desktop or Claude Code, see [MCP integration](./mcp.md) instead. + +## How it works + +When you run cagent with ACP, it becomes part of your editor's environment. You +select code, highlight a function, or reference a file - the agent sees what +you see. No copying file paths or switching to a terminal. + +Ask "explain this function" and the agent reads the file you're viewing. Ask it +to "add error handling" and it edits the code right in your editor. 
The agent +works with your editor's view of the project, not some external file system it +has to navigate. + +The difference from running cagent in a terminal: file operations go through +your editor instead of the agent directly accessing your filesystem. When the +agent needs to read or write a file, it requests it from your editor. This +keeps the agent's view of your code synchronized with yours - same working +directory, same files, same state. + +## Prerequisites + +Before configuring your editor, you need: + +- **cagent installed** - See the [installation guide](../_index.md#installation) +- **Agent configuration** - A YAML file defining your agent. See the + [tutorial](../tutorial.md) or [example + configurations](https://github.com/docker/cagent/tree/main/examples) +- **Editor with ACP support** - VS Code, Neovim, Zed, or any editor you can + configure for stdio-based tools + +Your agents will use model provider API keys from your shell environment +(`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, etc.). Make sure these are set before +launching your editor. + +## Editor configuration + +Your editor needs to know how to start cagent and communicate with it over +stdio. Most editors support this through extension systems, configuration files, +or plugin managers. You're telling your editor: "When I want to use an AI +agent, run this command and talk to it." + +### Zed + +Zed has built-in ACP support. + +1. Add cagent to your agent servers in `settings.json`: + + ```json + { + "agent_servers": { + "my-cagent-team": { + "command": "cagent", + "args": ["acp", "agent.yml"] + } + } + } + ``` + + Replace: + - `my-cagent-team` with the name you want to use for the agent + - `agent.yml` with the path to your agent configuration file + + If you have multiple agent files that you want to run separately, + create one entry under `agent_servers` for each agent. + +2. Start a new external agent thread. Select your agent in the drop-down list. 
+ + ![New external thread with cagent in Zed](../images/cagent-acp-zed.avif) + +### Neovim + +Use the [CodeCompanion](https://github.com/olimorris/codecompanion.nvim) plugin, +which has native support for cagent through a built-in adapter: + +1. [Install CodeCompanion](https://codecompanion.olimorris.dev/installation) + through your plugin manager. +2. Extend the `cagent` adapter in your CodeCompanion config: + + ```lua + require("codecompanion").setup({ + adapters = { + acp = { + cagent = function() + return require("codecompanion.adapters").extend("cagent", { + commands = { + default = { + "cagent", + "acp", + "agent.yml", + }, + }, + }) + end, + }, + }, + }) + ``` + + Replace `agent.yml` with the path to your agent configuration file. If you + have multiple agent files that you like to run separately, you can create + multiple commands for each agent. + +3. Restart Neovim and launch CodeCompanion: + + ```plaintext + :CodeCompanion + ``` + +4. Switch to the cagent adapter (keymap `ga` in the CodeCompanion buffer, by + default). + +See the [CodeCompanion ACP documentation](https://codecompanion.olimorris.dev/usage/acp-protocol) +for more information about ACP support in CodeCompanion. Note that terminal +operations are not supported, so [toolsets](../reference/toolsets.md) like +`shell` or `script_shell` are not usable through CodeCompanion. + +### VS Code + +VS Code [doesn't support ACP](https://github.com/microsoft/vscode/issues/265496) +natively yet. + +### Other editors + +For other editors with ACP support, you need a way to: + +1. Start cagent with `cagent acp your-agent.yml --working-dir /project/path` +2. Send prompts to its stdin +3. Read responses from its stdout + +This typically requires writing a small plugin or extension for your editor, or +using your editor's external tool integration if it supports stdio +communication. 
+ +## Agent references + +You can specify your agent configuration as a local file path or OCI registry +reference: + +```console +# Local file path +$ cagent acp ./agent.yml + +# OCI registry reference +$ cagent acp agentcatalog/pirate +$ cagent acp dockereng/myagent:v1.0.0 +``` + +Use the same syntax in your editor configuration: + +```json +{ + "agent_servers": { + "myagent": { + "command": "cagent", + "args": ["acp", "agentcatalog/pirate"] + } + } +} +``` + +Registry references enable team sharing, version management, and clean +configuration without local file paths. See [Sharing agents](../sharing-agents.md) +for details on using OCI registries. + +## Testing your setup + +Verify your configuration works: + +1. Start the cagent ACP server using your editor's configured method +2. Send a test prompt through your editor's interface +3. Check that the agent responds +4. Verify filesystem operations work by asking the agent to read a file + +If the agent starts but can't access files or perform other actions, check that: + +- The working directory in your editor is set to your project root +- The agent configuration file path is absolute or relative to the working directory +- Your editor or plugin properly implements the required ACP features + +## Common workflows + +### Explain code at cursor + +Select a function or code block, then ask: "Explain what this code does." + +The agent reads the file, analyzes the selection, and explains the +functionality. + +### Add functionality + +With a function selected, ask: "Add error handling to this function." + +The agent reads the current code, writes improved code with error handling, and +explains the changes. + +### Search the codebase + +For agents configured with RAG (see examples directory), ask: "Where is +authentication implemented?" + +The agent searches your indexed codebase and points to relevant files and +functions. + +### Multi-step refactoring + +Ask: "Rename this function and update all callers." 
+ +The agent finds all references, makes changes across files, and reports what +it updated. + +## ACP vs MCP integration + +Both protocols let you integrate cagent agents with other tools, but they're +designed for different use cases: + +| Feature | ACP Integration | MCP Integration | +| ----------- | ---------------------------- | ------------------------------ | +| Use case | Embedded agents in editors | Agents as tools in MCP clients | +| Filesystem | Delegated to client (editor) | Direct cagent access | +| Working dir | Client workspace | Configurable per agent | +| Best for | Code editing workflows | Using agents as callable tools | + +Use ACP when you want agents embedded in your editor. Use MCP when you want to +expose agents as tools to MCP clients like Claude Desktop or Claude Code. + +For MCP integration setup, see [MCP integration](./mcp.md). + +## What's next + +- Review the [configuration reference](../reference/config.md) for advanced agent + setup +- Explore the [toolsets reference](../reference/toolsets.md) to learn what tools + are available +- Add [RAG for codebase search](../rag.md) to your agent +- Check the [CLI reference](../reference/cli.md) for all `cagent acp` options +- Browse [example configurations](https://github.com/docker/cagent/tree/main/examples) + for inspiration diff --git a/content/manuals/ai/cagent/integrations/mcp.md b/content/manuals/ai/cagent/integrations/mcp.md new file mode 100644 index 000000000000..31066f4108e6 --- /dev/null +++ b/content/manuals/ai/cagent/integrations/mcp.md @@ -0,0 +1,278 @@ +--- +title: MCP integration +description: Expose cagent agents as tools to MCP clients like Claude Desktop and Claude Code +keywords: [cagent, mcp, model context protocol, claude desktop, claude code, integration] +weight: 50 +--- + +When you run cagent in MCP mode, your agents show up as tools in Claude Desktop +and other MCP clients. 
Instead of switching to a terminal to run your security +agent, you ask Claude to use it and Claude calls it for you. + +This guide covers setup for Claude Desktop and Claude Code. If you want agents +embedded in your editor instead, see [ACP integration](./acp.md). + +## How it works + +You configure Claude Desktop (or another MCP client) to connect to cagent. Your +agents appear in Claude's tool list. When you ask Claude to use one, it calls +that agent through MCP. + +Say you have a security agent configured. Ask Claude Desktop "Use the security +agent to audit this authentication code" and Claude calls it. The agent runs +with its configured tools (filesystem, shell, whatever you gave it), then +returns results to Claude. + +If your configuration has multiple agents, each one becomes a separate tool. +A config with `root`, `designer`, and `engineer` agents gives Claude three +tools to choose from. Claude might call the engineer directly or use the root +coordinator, depending on your agent descriptions and what you ask for. + +## Prerequisites + +Before configuring MCP integration, you need: + +- **cagent installed** - See the [installation guide](../_index.md#installation) +- **Agent configuration** - A YAML file defining your agent. See the + [tutorial](../tutorial.md) or [example + configurations](https://github.com/docker/cagent/tree/main/examples) +- **MCP client** - Claude Desktop, Claude Code, or another MCP-compatible + application +- **API keys** - Environment variables for any model providers your agents use + (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, etc.) + +## MCP client configuration + +Your MCP client needs to know how to start cagent and communicate with it. This +typically involves adding cagent as an MCP server in your client's configuration.
+ +### Claude Desktop + +Add cagent to your Claude Desktop MCP settings file: + +- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json` +- Windows: `%APPDATA%\Claude\claude_desktop_config.json` + +Example configuration: + +```json +{ + "mcpServers": { + "myagent": { + "command": "/usr/local/bin/cagent", + "args": ["mcp", "/path/to/agent.yml", "--working-dir", "/Users/yourname/projects"], + "env": { + "ANTHROPIC_API_KEY": "your_anthropic_key_here", + "OPENAI_API_KEY": "your_openai_key_here" + } + } + } +} +``` + +Configuration breakdown: + +- `command`: Full path to your `cagent` binary (use `which cagent` to find it) +- `args`: MCP command arguments: + - `mcp`: The subcommand to run cagent in MCP mode + - `/path/to/agent.yml`: Your agent configuration (local file path or OCI + reference) + - `--working-dir`: Optional working directory for agent execution +- `env`: Environment variables your agents need: + - Model provider API keys (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, etc.) + - Any other environment variables your agents reference + +After updating the configuration, restart Claude Desktop. Your agents will +appear as available tools.
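
If you manage several agents, generating the server entry can be less error-prone than editing it by hand. The following is a minimal sketch, assuming the `mcpServers` layout shown in the example above. The function names (`build_mcp_server_entry`, `build_config`) are hypothetical and not part of cagent or Claude Desktop. The script only prints JSON; merging it into an existing settings file is left to you.

```python
import json
import os
import shutil


def build_mcp_server_entry(cagent_path, agent_ref, working_dir=None, env=None):
    """Build one server entry: command, args, and optional env block."""
    args = ["mcp", agent_ref]
    if working_dir:
        args += ["--working-dir", working_dir]
    entry = {"command": cagent_path, "args": args}
    if env:
        entry["env"] = dict(env)
    return entry


def build_config(name, entry):
    """Wrap a server entry under the top-level mcpServers key."""
    return {"mcpServers": {name: entry}}


if __name__ == "__main__":
    # Equivalent of `which cagent`; the fallback path is only a placeholder.
    cagent_path = shutil.which("cagent") or "/usr/local/bin/cagent"
    # Forward only the provider keys that are actually set in this shell.
    env = {
        key: os.environ[key]
        for key in ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")
        if key in os.environ
    }
    entry = build_mcp_server_entry(
        cagent_path, "/path/to/agent.yml", "/Users/yourname/projects", env
    )
    print(json.dumps(build_config("myagent", entry), indent=2))
```

Resolving the binary with `shutil.which` mirrors the advice to use a full path, since a desktop application may not inherit your shell's `PATH`.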
+ +### Claude Code + +Add cagent as an MCP server using the `claude mcp add` command: + +```console +$ claude mcp add --transport stdio myagent \ + --env OPENAI_API_KEY=$OPENAI_API_KEY \ + --env ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \ + -- cagent mcp /path/to/agent.yml --working-dir $(pwd) +``` + +Command breakdown: + +- `claude mcp add`: Claude Code command to register an MCP server +- `--transport stdio`: Use stdio transport (standard for local MCP servers) +- `myagent`: Name for this MCP server in Claude Code +- `--env`: Pass environment variables (repeat for each variable) +- `--`: Separates Claude Code options from the MCP server command +- `cagent mcp /path/to/agent.yml`: The cagent MCP command with the path to your + agent configuration +- `--working-dir $(pwd)`: Set the working directory for agent execution + +After adding the server, your agents will be available as tools in Claude Code +sessions. + +### Other MCP clients + +For other MCP-compatible clients, you need to: + +1. Start cagent with `cagent mcp /path/to/agent.yml --working-dir /project/path` +2. Configure the client to communicate with cagent over stdio +3. Pass required environment variables (API keys, etc.) + +Consult your MCP client's documentation for specific configuration steps. + +## Agent references + +You can specify your agent configuration as a local file path or OCI registry +reference: + +```console +# Local file path +$ cagent mcp ./agent.yml + +# OCI registry reference +$ cagent mcp agentcatalog/pirate +$ cagent mcp dockereng/myagent:v1.0.0 +``` + +Use the same syntax in MCP client configurations: + +```json +{ + "mcpServers": { + "myagent": { + "command": "/usr/local/bin/cagent", + "args": ["mcp", "agentcatalog/pirate"] + } + } +} +``` + +Registry references let your team use the same agent configuration without +managing local files. See [Sharing agents](../sharing-agents.md) for details. 
+ +## Designing agents for MCP + +MCP clients see each of your agents as a separate tool and can call any of them +directly. This changes how you should think about agent design compared to +running agents with `cagent run`. + +### Write good descriptions + +The `description` field tells the MCP client what the agent does. This is how +the client decides when to call it. "Analyzes code for security vulnerabilities +and compliance issues" is specific. "A helpful security agent" doesn't say what +it actually does. + +```yaml +agents: + security_auditor: + description: Analyzes code for security vulnerabilities and compliance issues + # Not: "A helpful security agent" +``` + +### MCP clients call agents directly + +The MCP client can call any of your agents, not just root. If you have `root`, +`designer`, and `engineer` agents, the client might call the engineer directly +instead of going through root. Design each agent to work on its own: + +```yaml +agents: + engineer: + description: Implements features and writes production code + instruction: | + You implement code based on requirements provided. + You can work independently without a coordinator. + toolsets: + - type: filesystem + - type: shell +``` + +If an agent needs others to work properly, say so in the description: +"Coordinates design and engineering agents to implement complete features." + +### Test each agent on its own + +MCP clients call agents individually, so test them that way: + +```console +$ cagent run agent.yml --agent engineer +``` + +Make sure the agent works without going through root first. Check that it has +the right tools and that its instructions make sense when it's called directly. + +## Testing your setup + +Verify your MCP integration works: + +1. Restart your MCP client after configuration changes +2. Check that cagent agents appear as available tools +3. Invoke an agent with a simple test prompt +4. Verify the agent can access its configured tools (filesystem, shell, etc.) 
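
A small preflight script can catch common setup problems before you restart your MCP client. The following is a minimal sketch, not a cagent feature: the `preflight` function is hypothetical, it assumes the agent is a local YAML file (it does not apply to OCI references), and it checks only paths and environment variables, not whether the configuration itself is valid.

```python
import os
import shutil


def preflight(cagent_path, agent_file, working_dir, required_env, environ=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    environ = os.environ if environ is None else environ
    problems = []

    # shutil.which returns None unless the path exists and is executable.
    if shutil.which(cagent_path) is None:
        problems.append(f"cagent binary not found or not executable: {cagent_path}")

    if not os.path.isfile(agent_file):
        problems.append(f"agent configuration file not found: {agent_file}")

    if not os.path.isdir(working_dir):
        problems.append(f"working directory does not exist: {working_dir}")

    for name in required_env:
        if not environ.get(name):
            problems.append(f"environment variable not set: {name}")

    return problems


if __name__ == "__main__":
    # Placeholder values; substitute the ones from your client configuration.
    issues = preflight(
        "/usr/local/bin/cagent",
        "/path/to/agent.yml",
        "/Users/yourname/projects",
        ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"],
    )
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("all checks passed")
```

Pass the same values you put in the client's `command`, `args`, and `env` fields so the script checks what the client will actually run.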
+ +If agents don't appear or fail to execute, check: + +- `cagent` binary path is correct and executable +- Agent configuration file exists and is valid +- All required API keys are set in environment variables +- Working directory path exists and has appropriate permissions +- MCP client logs for connection or execution errors + +## Common workflows + +### Call specialist agents + +You have a security agent that knows your compliance rules and common +vulnerabilities. In Claude Desktop, paste some authentication code and ask +"Use the security agent to review this." The agent checks the code and reports +what it finds. You stay in Claude's interface the whole time. + +### Work with agent teams + +Your configuration has a coordinator that delegates to designer and engineer +agents. Ask Claude Code "Use the coordinator to implement a login form" and +the coordinator hands off UI work to the designer and code to the engineer. +You get a complete implementation without running `cagent run` yourself. + +### Run domain-specific tools + +You built an infrastructure agent with custom deployment scripts and monitoring +queries. Ask any MCP client "Use the infra agent to check production status" +and it runs your tools and returns results. Your deployment knowledge is now +available wherever you use MCP clients. + +### Share agents + +Your team keeps agents in an OCI registry. Everyone adds +`agentcatalog/security-expert` to their MCP client config. When you update the +agent, they get the new version on their next restart. No YAML files to pass +around. 
+ +## ACP vs MCP integration + +Both protocols let you integrate cagent agents with other tools, but they're +designed for different use cases: + +| Feature | ACP Integration | MCP Integration | +| ----------- | ---------------------------- | ----------------------------- | +| Use case | Embedded agents in editors | Agents as tools in MCP clients| +| Filesystem | Delegated to client (editor) | Direct cagent access | +| Working dir | Client workspace | Configurable per agent | +| Best for | Code editing workflows | Using agents as callable tools| + +Use ACP when you want agents embedded in your editor. Use MCP when you want to +expose agents as tools to MCP clients like Claude Desktop or Claude Code. + +For ACP integration setup, see [ACP integration](./acp.md). + +## What's next + +- Review the [configuration reference](../reference/config.md) for advanced agent + setup +- Explore the [toolsets reference](../reference/toolsets.md) to learn what tools + agents can use +- Add [RAG for codebase search](../rag.md) to your agent +- Check the [CLI reference](../reference/cli.md) for all `cagent mcp` options +- Browse [example configurations](https://github.com/docker/cagent/tree/main/examples) + for different agent types diff --git a/content/manuals/ai/cagent/rag.md b/content/manuals/ai/cagent/rag.md new file mode 100644 index 000000000000..9a4535617cb2 --- /dev/null +++ b/content/manuals/ai/cagent/rag.md @@ -0,0 +1,436 @@ +--- +title: RAG in cagent +description: How RAG gives your cagent agents access to codebases and documentation +keywords: [cagent, rag, retrieval, embeddings, semantic search] +weight: 60 +--- + +When you configure a RAG source in cagent, your agent automatically gains a +search tool for that knowledge base. The agent decides when to search, retrieves +only relevant information, and uses it to answer questions or complete tasks - +all without you manually managing what goes in the prompt. 
+ +This guide explains how cagent's RAG system works, when to use it, and how to +configure it effectively for your content. + +> [!NOTE] +> RAG is an advanced feature that requires configuration and tuning. The defaults +> work well for getting started, but tailoring the configuration to your specific +> content and use case significantly improves results. + +## The problem: too much context + +Your agent can work with your entire codebase, but it can't fit everything in +its context window. Even with 200K token limits, medium-sized projects are too +large. Finding relevant code buried in hundreds of files wastes context. + +Filesystem tools help agents read files, but the agent has to guess which files +to read. It can't search by meaning, only by filename. Ask "find the retry +logic" and the agent reads files hoping to stumble on the right code. + +Grep finds exact text matches but misses related concepts. Searching +"authentication" won't find code using "auth" or "login." You either get +hundreds of matches or zero, and grep doesn't understand code structure - it +just matches strings anywhere they appear. + +RAG indexes your content ahead of time and enables semantic search. The agent +searches pre-indexed content by meaning, not exact words. It retrieves only +relevant chunks that respect code structure. No wasted context on exploration. + +## How RAG works in cagent + +Configure a RAG source in your cagent config: + +```yaml +rag: + codebase: + docs: [./src, ./pkg] + strategies: + - type: chunked-embeddings + embedding_model: openai/text-embedding-3-small + vector_dimensions: 1536 + database: ./code.db + +agents: + root: + model: openai/gpt-5 + instruction: You are a coding assistant. Search the codebase when needed. + rag: [codebase] +``` + +When you reference `rag: [codebase]`, cagent: + +1. **At startup** - Indexes your documents (first run only, blocks until complete) +2. **During conversation** - Gives the agent a search tool +3. 
**When the agent searches** - Retrieves relevant chunks and adds them to context +4. **On file changes** - Automatically re-indexes modified files + +The agent decides when to search based on the conversation. You don't manage +what goes in context - the agent does. + +### The indexing process + +On first run, cagent: + +- Reads files from configured paths +- Respects `.gitignore` patterns (can be disabled) +- Splits documents into chunks +- Creates searchable representations using your chosen strategy +- Stores everything in a local database + +Subsequent runs reuse the index. If files change, cagent detects this and +re-indexes only what changed, keeping your knowledge base up to date without +manual intervention. + +## Retrieval strategies + +Different content requires different retrieval approaches. cagent supports +three strategies, each optimized for different use cases. The defaults work +well, but understanding the trade-offs helps you choose the right approach. + +### Semantic search (chunked-embeddings) + +Converts text to vectors that represent meaning, enabling search by concept +rather than exact words: + +```yaml +strategies: + - type: chunked-embeddings + embedding_model: openai/text-embedding-3-small + vector_dimensions: 1536 + database: ./docs.db + chunking: + size: 1000 + overlap: 100 +``` + +During indexing, documents are split into chunks and each chunk is converted +to a 1536-dimensional vector by the embedding model. These vectors are +essentially coordinates in a high-dimensional space where similar concepts are +positioned close together. + +When you search for "how do I authenticate users?", your query becomes a vector +and the database finds chunks with nearby vectors using cosine similarity +(measuring the angle between vectors). The embedding model learned that +"authentication," "auth," and "login" are related concepts, so searching for +one finds the others. + +Example: The query "how do I authenticate users?" 
finds both "User +authentication requires a valid API token" and "Token-based auth validates +requests" despite different wording. It won't find "The authentication tests +are failing" because that's a different meaning despite containing the word. + +This works well for documentation where users ask questions using different +terminology than your docs. The downside is it may miss exact technical terms +and sometimes you want literal matches, not semantic ones. Requires embedding +API calls during indexing. + +### Keyword search (BM25) + +Statistical algorithm that matches and ranks by term frequency and rarity: + +```yaml +strategies: + - type: bm25 + database: ./bm25.db + k1: 1.5 + b: 0.75 + chunking: + size: 1000 + overlap: 100 +``` + +During indexing, documents are tokenized and the algorithm calculates how often +each term appears (term frequency) and how rare it is across all documents +(inverse document frequency). The scoring index is stored in a local SQLite +database. + +When you search for "HandleRequest function", the algorithm finds chunks +containing these exact terms and scores them based on term frequency, term +rarity, and document length. Finding "HandleRequest" is scored as more +significant than finding common words like "function". Think of it as grep with +statistical ranking. + +Example: Searching "HandleRequest function" finds `func HandleRequest(w +http.ResponseWriter, r *http.Request)` and "The HandleRequest function +processes incoming requests", but not "process HTTP requests" despite that +being semantically similar. + +The `k1` parameter (default 1.5) controls how much repeated terms matter - +higher values emphasize repetition more. The `b` parameter (default 0.75) +controls length normalization - higher values penalize longer documents more. + +This is fast, local (no API costs), and predictable for finding function names, +class names, API endpoints, and any identifier that appears verbatim. 
The +trade-off is zero understanding of meaning - "RetryHandler" and "retry logic" +won't match despite being related. Essential complement to semantic search. + +### LLM-enhanced semantic search (semantic-embeddings) + +Generates semantic summaries with an LLM before embedding, enabling search by +what code does rather than what it's called: + +```yaml +strategies: + - type: semantic-embeddings + embedding_model: openai/text-embedding-3-small + chat_model: openai/gpt-5-mini + vector_dimensions: 1536 + database: ./code.db + ast_context: true + chunking: + size: 1000 + code_aware: true +``` + +During indexing, code is split using AST structure (functions stay intact), +then the `chat_model` generates a semantic summary of each chunk. The summary +gets embedded, not the raw code. When you search, your query matches against +these summaries, but the original code is returned. + +This solves a problem with regular embeddings: raw code embeddings are +dominated by variable names and implementation details. A function called +`processData` that implements retry logic won't semantically match "retry". But +when the LLM summarizes it first, the summary explicitly mentions "retry +logic," making it findable. + +Example: Consider this code: + +```go +func (c *Client) Do(req *Request) (*Response, error) { + for i := 0; i < 3; i++ { + resp, err := c.attempt(req) + if err == nil { return resp, nil } + time.Sleep(time.Duration(1< [!NOTE] +> Currently only Go is supported; support for additional languages is planned. + +For short, focused content like API references: + +```yaml +chunking: + size: 500 + overlap: 50 +``` + +Brief sections need less overlap since they're naturally self-contained. + +Experiment with these values. If retrieval misses context, increase chunk size +or overlap. If results are too broad, decrease chunk size. 
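
To build intuition for how `size` and `overlap` interact, the following is a minimal sketch of fixed-size chunking with overlap. It is an illustration, not cagent's implementation: it assumes sizes are counted in characters (this guide does not state the unit), the `chunk_text` function is hypothetical, and it ignores code-aware splitting entirely.

```python
def chunk_text(text, size, overlap):
    """Split text into windows of `size` characters, each sharing `overlap`
    characters with the previous window."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be at least 0 and smaller than size")

    step = size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "abcdefghij"
    for chunk in chunk_text(sample, size=4, overlap=1):
        print(chunk)
```

With `size=4` and `overlap=1`, each window advances three characters, so the last character of one chunk is repeated as the first character of the next. Larger overlap preserves more context across boundaries at the cost of more chunks to index.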
+ +## Making decisions about RAG + +### When to use RAG + +Use RAG when: + +- Your content is too large for the context window +- You want targeted retrieval, not everything at once +- Content changes and needs to stay current +- Agent needs to search across many files + +Don't use RAG when: + +- Content is small enough to include in agent instructions +- Information rarely changes (consider prompt engineering instead) +- You need real-time data (RAG uses pre-indexed snapshots) +- Content is already in a searchable format the agent can query directly + +### Choosing retrieval strategies + +Use semantic search (chunked-embeddings) for user-facing documentation, content +with varied terminology, and conceptual searches where users phrase questions +differently than your docs. + +Use keyword search (BM25) for code identifiers, function names, API endpoints, +error messages, and any content where exact term matching matters. Essential +for technical jargon and proper nouns. + +Use LLM-enhanced semantic (semantic-embeddings) for code search by +functionality, finding implementations by behavior rather than name, or complex +technical content requiring deep understanding. Choose this when accuracy +matters more than indexing speed. + +Use hybrid (multiple strategies) for general-purpose search across mixed +content, when you're unsure which approach works best, or for production +systems where quality matters most. Maximum coverage at the cost of complexity. + +### Tuning for your project + +Start with defaults, then adjust based on results. 
+ +If retrieval misses relevant content: + +- Increase `limit` in strategies to retrieve more candidates +- Adjust `threshold` to be less strict +- Increase chunk `size` to capture more context +- Add more retrieval strategies + +If retrieval returns irrelevant content: + +- Decrease `limit` to fewer candidates +- Increase `threshold` to be more strict +- Add reranking with specific criteria +- Decrease chunk `size` for more focused results + +If indexing is too slow: + +- Increase `batch_size` for fewer API calls +- Increase `max_embedding_concurrency` for parallelism +- Consider BM25 instead of embeddings (local, no API) +- Use smaller embedding models + +If results lack context: + +- Increase chunk `overlap` +- Increase chunk `size` +- Use `return_full_content: true` to return entire documents +- Add neighboring chunks to results + +## Further reading + +- [Configuration reference](reference/config.md#rag) - Complete RAG options and + parameters +- [RAG examples](https://github.com/docker/cagent/tree/main/examples/rag) - + Working configurations for different scenarios +- [Tools reference](reference/toolsets.md) - How RAG search tools work in agent workflows diff --git a/content/manuals/ai/cagent/reference/_index.md b/content/manuals/ai/cagent/reference/_index.md new file mode 100644 index 000000000000..1e3fdb26253f --- /dev/null +++ b/content/manuals/ai/cagent/reference/_index.md @@ -0,0 +1,6 @@ +--- +build: + render: never +title: Reference +weight: 40 +--- diff --git a/content/manuals/ai/cagent/reference/cli.md b/content/manuals/ai/cagent/reference/cli.md new file mode 100644 index 000000000000..b20e8846fcb3 --- /dev/null +++ b/content/manuals/ai/cagent/reference/cli.md @@ -0,0 +1,482 @@ +--- +title: CLI reference +linkTitle: CLI +description: Complete reference for cagent command-line interface +keywords: [ai, agent, cagent, cli, command line] +weight: 30 +--- + +Command-line interface for running, managing, and deploying AI agents. 
+ +For agent configuration file syntax, see the [Configuration file +reference](./config.md). For toolset capabilities, see the [Toolsets +reference](./toolsets.md). + +## Synopsis + +```console +$ cagent [command] [flags] +``` + +## Global flags + +Work with all commands: + +| Flag | Type | Default | Description | +| --------------- | ------- | ------- | -------------------- | +| `-d`, `--debug` | boolean | false | Enable debug logging | +| `-o`, `--otel` | boolean | false | Enable OpenTelemetry | +| `--log-file` | string | - | Debug log file path | + +Debug logs write to `~/.cagent/cagent.debug.log` by default. Override with +`--log-file`. + +## Runtime flags + +Work with most commands. Supported commands link to this section. + +| Flag | Type | Default | Description | +| ------------------- | ------- | ------- | ------------------------------------ | +| `--models-gateway` | string | - | Models gateway address | +| `--env-from-file` | array | - | Load environment variables from file | +| `--code-mode-tools` | boolean | false | Enable JavaScript tool orchestration | +| `--working-dir` | string | - | Working directory for the session | + +Set `--models-gateway` via `CAGENT_MODELS_GATEWAY` environment variable. + +## Commands + +### a2a + +Expose agent via the Agent2Agent (A2A) protocol. Allows other A2A-compatible +systems to discover and interact with your agent. Auto-selects an available +port if not specified. + +```console +$ cagent a2a agent-file|registry-ref +``` + +> [!NOTE] +> A2A support is currently experimental and needs further work. Tool calls are +> handled internally and not exposed as separate ADK events. Some ADK features +> are not yet integrated. 
+ +Arguments: + +- `agent-file|registry-ref` - Path to YAML or OCI registry reference (required) + +Flags: + +| Flag | Type | Default | Description | +| --------------- | ------- | ------- | ----------------- | +| `-a`, `--agent` | string | root | Agent name | +| `--port` | integer | 0 | Port (0 = random) | + +Supports [runtime flags](#runtime-flags). + +Examples: + +```console +$ cagent a2a ./agent.yaml --port 8080 +$ cagent a2a agentcatalog/pirate --port 9000 +``` + +### acp + +Start agent as ACP (Agent Client Protocol) server on stdio for editor integration. +See [ACP integration](../integrations/acp.md) for setup guides. + +```console +$ cagent acp agent-file|registry-ref +``` + +Arguments: + +- `agent-file|registry-ref` - Path to YAML or OCI registry reference (required) + +Supports [runtime flags](#runtime-flags). + +### alias add + +Create alias for agent. + +```console +$ cagent alias add name target +``` + +Arguments: + +- `name` - Alias name (required) +- `target` - Path to YAML or registry reference (required) + +Examples: + +```console +$ cagent alias add dev ./dev-agent.yaml +$ cagent alias add prod docker.io/user/prod-agent:latest +$ cagent alias add default ./agent.yaml +``` + +Setting alias name to "default" lets you run `cagent run` without arguments. + +### alias list + +List all aliases. + +```console +$ cagent alias list +$ cagent alias ls +``` + +### alias remove + +Remove alias. + +```console +$ cagent alias remove name +$ cagent alias rm name +``` + +Arguments: + +- `name` - Alias name (required) + +### api + +HTTP API server. 
+ +```console +$ cagent api agent-file|agents-dir +``` + +Arguments: + +- `agent-file|agents-dir` - Path to YAML or directory with agents (required) + +Flags: + +| Flag | Type | Default | Description | +| -------------------- | ------- | ---------- | --------------------------------- | +| `-l`, `--listen` | string | :8080 | Listen address | +| `-s`, `--session-db` | string | session.db | Session database path | +| `--pull-interval` | integer | 0 | Auto-pull OCI ref every N minutes | + +Supports [runtime flags](#runtime-flags). + +Examples: + +```console +$ cagent api ./agent.yaml +$ cagent api ./agents/ --listen :9000 +$ cagent api docker.io/user/agent --pull-interval 10 +``` + +The `--pull-interval` flag works only with OCI references. Automatically pulls and reloads at the specified interval. + +### build + +Build Docker image for agent. + +```console +$ cagent build agent-file|registry-ref [image-name] +``` + +Arguments: + +- `agent-file|registry-ref` - Path to YAML or OCI registry reference (required) +- `image-name` - Docker image name (optional) + +Flags: + +| Flag | Type | Default | Description | +| ------------ | ------- | ------- | -------------------------- | +| `--dry-run` | boolean | false | Print Dockerfile only | +| `--push` | boolean | false | Push image after build | +| `--no-cache` | boolean | false | Build without cache | +| `--pull` | boolean | false | Pull all referenced images | + +Example: + +```console +$ cagent build ./agent.yaml myagent:latest +$ cagent build ./agent.yaml --dry-run +``` + +### catalog list + +List catalog agents. + +```console +$ cagent catalog list [org] +``` + +Arguments: + +- `org` - Organization name (optional, default: `agentcatalog`) + +Queries Docker Hub for agent repositories. + +### debug config + +Show resolved agent configuration. 
+ +```console +$ cagent debug config agent-file|registry-ref +``` + +Arguments: + +- `agent-file|registry-ref` - Path to YAML or OCI registry reference (required) + +Supports [runtime flags](#runtime-flags). + +Shows canonical configuration in YAML after all processing and defaults. + +### debug toolsets + +List agent tools. + +```console +$ cagent debug toolsets agent-file|registry-ref +``` + +Arguments: + +- `agent-file|registry-ref` - Path to YAML or OCI registry reference (required) + +Supports [runtime flags](#runtime-flags). + +Lists all tools for each agent in the configuration. + +### eval + +Run evaluation tests. + +```console +$ cagent eval agent-file|registry-ref [eval-dir] +``` + +Arguments: + +- `agent-file|registry-ref` - Path to YAML or OCI registry reference (required) +- `eval-dir` - Evaluation files directory (optional, default: `./evals`) + +Supports [runtime flags](#runtime-flags). + +### exec + +Single message execution without TUI. + +```console +$ cagent exec agent-file|registry-ref [message|-] +``` + +Arguments: + +- `agent-file|registry-ref` - Path to YAML or OCI registry reference (required) +- `message` - Prompt, or `-` for stdin (optional) + +Same flags as [run](#run). + +Supports [runtime flags](#runtime-flags). + +Examples: + +```console +$ cagent exec ./agent.yaml +$ cagent exec ./agent.yaml "Check for security issues" +$ echo "Instructions" | cagent exec ./agent.yaml - +``` + +### feedback + +Submit feedback. + +```console +$ cagent feedback +``` + +Shows link to submit feedback. + +### mcp + +MCP (Model Context Protocol) server on stdio. Exposes agents as tools to MCP +clients. See [MCP integration](../integrations/mcp.md) for setup guides. + +```console +$ cagent mcp agent-file|registry-ref +``` + +Arguments: + +- `agent-file|registry-ref` - Path to YAML or OCI registry reference (required) + +Supports [runtime flags](#runtime-flags). 
+ +Examples: + +```console +$ cagent mcp ./agent.yaml +$ cagent mcp docker.io/user/agent:latest +``` + +### new + +Create agent configuration interactively. + +```console +$ cagent new [message...] +``` + +Flags: + +| Flag | Type | Default | Description | +| ------------------ | ------- | ------- | ------------------------------- | +| `--model` | string | - | Model as `provider/model` | +| `--max-iterations` | integer | 0 | Maximum agentic loop iterations | + +Supports [runtime flags](#runtime-flags). + +Opens interactive TUI to configure and generate agent YAML. + +### pull + +Pull agent from OCI registry. + +```console +$ cagent pull registry-ref +``` + +Arguments: + +- `registry-ref` - OCI registry reference (required) + +Flags: + +| Flag | Type | Default | Description | +| --------- | ------- | ------- | --------------------------- | +| `--force` | boolean | false | Pull even if already exists | + +Example: + +```console +$ cagent pull docker.io/user/agent:latest +``` + +Saves to local YAML file. + +### push + +Push agent to OCI registry. + +```console +$ cagent push agent-file registry-ref +``` + +Arguments: + +- `agent-file` - Path to local YAML (required) +- `registry-ref` - OCI reference like `docker.io/user/agent:latest` (required) + +Example: + +```console +$ cagent push ./agent.yaml docker.io/myuser/myagent:latest +``` + +### run + +Interactive terminal UI for agent sessions. 
+ +```console +$ cagent run [agent-file|registry-ref] [message|-] +``` + +Arguments: + +- `agent-file|registry-ref` - Path to YAML or OCI registry reference (optional) +- `message` - Initial prompt, or `-` for stdin (optional) + +Flags: + +| Flag | Type | Default | Description | +| --------------- | ------- | ------- | ---------------------------- | +| `-a`, `--agent` | string | root | Agent name | +| `--yolo` | boolean | false | Auto-approve all tool calls | +| `--attach` | string | - | Attach image file | +| `--model` | array | - | Override model (repeatable) | +| `--dry-run` | boolean | false | Initialize without executing | +| `--remote` | string | - | Remote runtime address | + +Supports [runtime flags](#runtime-flags). + +Examples: + +```console +$ cagent run ./agent.yaml +$ cagent run ./agent.yaml "Analyze this codebase" +$ cagent run ./agent.yaml --agent researcher +$ echo "Instructions" | cagent run ./agent.yaml - +$ cagent run +``` + +Running without arguments uses the default agent or a "default" alias if configured. + +Shows interactive TUI in a terminal. Falls back to exec mode otherwise. + +#### Interactive commands + +TUI slash commands: + +| Command | Description | +| ---------- | -------------------------------- | +| `/exit` | Exit | +| `/reset` | Clear history | +| `/eval` | Save conversation for evaluation | +| `/compact` | Compact conversation | +| `/yolo` | Toggle auto-approval | + +### version + +Print version information. + +```console +$ cagent version +``` + +Shows cagent version and commit hash. 
+ +## Environment variables + +| Variable | Description | +| ------------------------------ | ------------------------------- | +| `CAGENT_MODELS_GATEWAY` | Models gateway address | +| `TELEMETRY_ENABLED` | Telemetry control (set `false`) | +| `CAGENT_HIDE_TELEMETRY_BANNER` | Hide telemetry banner (set `1`) | +| `OTEL_EXPORTER_OTLP_ENDPOINT` | OpenTelemetry endpoint | + +## Model overrides + +Override models specified in your configuration file using the `--model` flag. + +Format: `[agent=]provider/model` + +Without an agent name, the model applies to all agents. With an agent name, it applies only to that specific agent. + +Apply to all agents: + +```console +$ cagent run ./agent.yaml --model gpt-5 +$ cagent run ./agent.yaml --model anthropic/claude-sonnet-4-5 +``` + +Apply to specific agents only: + +```console +$ cagent run ./agent.yaml --model researcher=gpt-5 +$ cagent run ./agent.yaml --model "agent1=gpt-5,agent2=claude-sonnet-4-5" +``` + +Providers: `openai`, `anthropic`, `google`, `dmr` + +Omit provider for automatic selection based on model name. diff --git a/content/manuals/ai/cagent/reference/config.md b/content/manuals/ai/cagent/reference/config.md new file mode 100644 index 000000000000..b644b536012d --- /dev/null +++ b/content/manuals/ai/cagent/reference/config.md @@ -0,0 +1,557 @@ +--- +title: Configuration file reference +linkTitle: Configuration file +description: Complete reference for the cagent YAML configuration file format +keywords: [ai, agent, cagent, configuration, yaml] +weight: 10 +--- + +This reference documents the YAML configuration file format for cagent agents. +It covers file structure, agent parameters, model configuration, toolset setup, +and RAG sources. + +For detailed documentation of each toolset's capabilities and specific options, +see the [Toolsets reference](./toolsets.md). 
+ +## File structure + +A configuration file has four top-level sections: + +```yaml +agents: # Required - agent definitions + root: + model: anthropic/claude-sonnet-4-5 + description: What this agent does + instruction: How it should behave + +models: # Optional - model configurations + custom_model: + provider: openai + model: gpt-5 + +rag: # Optional - RAG sources + docs: + docs: [./documents] + strategies: [...] + +metadata: # Optional - author, license, readme + author: Your Name +``` + +## Agents + +| Property | Type | Description | Required | +| ---------------------- | ------- | ---------------------------------------------- | -------- | +| `model` | string | Model reference or name | Yes | +| `description` | string | Brief description of agent's purpose | Yes | +| `instruction` | string | Detailed behavior instructions | Yes | +| `sub_agents` | array | Agent names for task delegation | No | +| `handoffs` | array | Agent names for conversation handoff | No | +| `toolsets` | array | Available tools | No | +| `welcome_message` | string | Message displayed on start | No | +| `add_date` | boolean | Include current date in context | No | +| `add_environment_info` | boolean | Include working directory, OS, Git info | No | +| `add_prompt_files` | array | Prompt file paths to include | No | +| `max_iterations` | integer | Maximum tool call loops (unlimited if not set) | No | +| `num_history_items` | integer | Conversation history limit | No | +| `code_mode_tools` | boolean | Enable Code Mode for tools | No | +| `commands` | object | Named prompts accessible via `/command_name` | No | +| `structured_output` | object | JSON schema for structured responses | No | +| `rag` | array | RAG source names | No | + +### Task delegation vs conversation handoff + +Use `sub_agents` to break work into tasks. The root agent assigns work to a +sub-agent and gets results back while staying in control. + +Use `handoffs` to transfer the entire conversation to a different agent. 
The +new agent takes over completely. + +### Commands + +Named prompts users invoke with `/command_name`. Supports JavaScript template +literals with `${env.VARIABLE}` for environment variables: + +```yaml +commands: + greet: "Say hello to ${env.USER}" + analyze: "Analyze ${env.PROJECT_NAME || 'demo'}" +``` + +Run with: `cagent run config.yaml /greet` + +### Structured output + +Constrain responses to a JSON schema (OpenAI and Gemini only): + +```yaml +structured_output: + name: code_analysis + strict: true + schema: + type: object + properties: + issues: + type: array + items: { ... } + required: [issues] +``` + +## Models + +| Property | Type | Description | Required | +| --------------------- | ------- | ---------------------------------------------- | -------- | +| `provider` | string | `openai`, `anthropic`, `google`, `dmr` | Yes | +| `model` | string | Model name | Yes | +| `temperature` | float | Randomness (0.0-2.0) | No | +| `max_tokens` | integer | Maximum response length | No | +| `top_p` | float | Nucleus sampling (0.0-1.0) | No | +| `frequency_penalty` | float | Repetition penalty (-2.0 to 2.0, OpenAI only) | No | +| `presence_penalty` | float | Topic penalty (-2.0 to 2.0, OpenAI only) | No | +| `base_url` | string | Custom API endpoint | No | +| `parallel_tool_calls` | boolean | Enable parallel tool execution (default: true) | No | +| `token_key` | string | Authentication token key | No | +| `track_usage` | boolean | Track token usage | No | +| `thinking_budget` | mixed | Reasoning effort (provider-specific) | No | +| `provider_opts` | object | Provider-specific options | No | + +### Alloy models + +Use multiple models in rotation by separating names with commas: + +```yaml +model: anthropic/claude-sonnet-4-5,openai/gpt-5 +``` + +### Thinking budget + +Controls reasoning depth. 
Configuration varies by provider: + +- **OpenAI**: String values - `minimal`, `low`, `medium`, `high` +- **Anthropic**: Integer token budget (1024-32768, must be less than `max_tokens`) + - Set `provider_opts.interleaved_thinking: true` for tool use during reasoning +- **Gemini**: Integer token budget (0 to disable, -1 for dynamic, max 24576) + - Gemini 2.5 Pro: 128-32768, cannot disable (minimum 128) + +```yaml +# OpenAI +thinking_budget: low + +# Anthropic +thinking_budget: 8192 +provider_opts: + interleaved_thinking: true + +# Gemini +thinking_budget: 8192 # Fixed +thinking_budget: -1 # Dynamic +thinking_budget: 0 # Disabled +``` + +### Docker Model Runner (DMR) + +Run local models. If `base_url` is omitted, cagent auto-discovers via Docker +Model plugin. + +```yaml +provider: dmr +model: ai/qwen3 +max_tokens: 8192 +base_url: http://localhost:12434/engines/llama.cpp/v1 # Optional +``` + +Pass llama.cpp options via `provider_opts.runtime_flags` (array, string, or multiline): + +```yaml +provider_opts: + runtime_flags: ["--ngl=33", "--threads=8"] + # or: runtime_flags: "--ngl=33 --threads=8" +``` + +Model config fields auto-map to runtime flags: + +- `temperature` → `--temp` +- `top_p` → `--top-p` +- `max_tokens` → `--context-size` + +Explicit `runtime_flags` override auto-mapped flags. + +Speculative decoding for faster inference: + +```yaml +provider_opts: + speculative_draft_model: ai/qwen3:0.6B-F16 + speculative_num_tokens: 16 + speculative_acceptance_rate: 0.8 +``` + +## Tools + +Configure tools in the `toolsets` array. Three types: built-in, MCP +(local/remote), and Docker Gateway. + +> [!NOTE] +> This section covers toolset configuration syntax. For detailed documentation +> of each toolset's capabilities, available tools, and specific configuration +> options, see the [Toolsets reference](./toolsets.md). 
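
The three toolset types can be mixed in a single `toolsets` array. The following is a minimal sketch rather than a recommended setup: the agent's description and instruction text are placeholders, and the entries reuse the built-in, local MCP, and Docker MCP Gateway forms shown in the subsections of this reference.

```yaml
agents:
  root:
    model: anthropic/claude-sonnet-4-5
    description: Assistant with mixed toolsets # placeholder text
    instruction: Use the available tools to answer questions # placeholder text
    toolsets:
      # Built-in toolset
      - type: filesystem

      # Local MCP server started as a child process
      - type: mcp
        command: npx
        args:
          ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/files"]

      # Containerized tool from the Docker MCP Gateway
      - type: mcp
        ref: docker:duckduckgo
```

Both MCP entries use `type: mcp`. What distinguishes them is whether the entry carries `command`, `remote`, or `ref`.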
+ +All toolsets support common properties like `tools` (whitelist), `defer` (deferred loading), `toon` (output compression), `env` (environment variables), and `instruction` (usage guidance). See the [Toolsets reference](./toolsets.md) for details on these properties and what each toolset does. + +### Built-in tools + +```yaml +toolsets: + - type: filesystem + - type: shell + - type: think + - type: todo + shared: true + - type: memory + path: ./memory.db +``` + +All agents automatically have a `transfer_task` tool for delegating to +sub-agents. See the [Toolsets reference](./toolsets.md) for complete documentation +of each toolset. + +### MCP tools + +Local process: + +```yaml +- type: mcp + command: npx + args: + ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/files"] + tools: ["read_file", "write_file"] # Optional: limit to specific tools + env: + NODE_OPTIONS: "--max-old-space-size=8192" +``` + +Remote server: + +```yaml +- type: mcp + remote: + url: https://mcp-server.example.com + transport_type: sse + headers: + Authorization: Bearer token +``` + +### Docker MCP Gateway + +Containerized tools from [Docker MCP Catalog](/manuals/ai/mcp-catalog-and-toolkit/mcp-gateway.md): + +```yaml +- type: mcp + ref: docker:duckduckgo +``` + +## RAG + +Retrieval-augmented generation for document knowledge bases. Define sources at +the top level, reference in agents. + +```yaml +rag: + docs: + docs: [./documents, ./README.md] + strategies: + - type: chunked-embeddings + embedding_model: openai/text-embedding-3-small + vector_dimensions: 1536 + database: ./embeddings.db + +agents: + root: + rag: [docs] +``` + +### Retrieval strategies + +All strategies support chunking configuration. Chunk size and overlap are +measured in characters (Unicode code points), not tokens. + +#### Chunked-embeddings + +Direct semantic search using vector embeddings. Best for understanding intent, +synonyms, and paraphrasing. 
+ +| Field | Type | Default | +| ---------------------------------- | ------- | ------- | +| `embedding_model` | string | - | +| `database` | string | - | +| `vector_dimensions` | integer | - | +| `similarity_metric` | string | cosine | +| `threshold` | float | 0.5 | +| `limit` | integer | 5 | +| `chunking.size` | integer | 1000 | +| `chunking.overlap` | integer | 75 | +| `chunking.respect_word_boundaries` | boolean | true | +| `chunking.code_aware` | boolean | false | + +```yaml +- type: chunked-embeddings + embedding_model: openai/text-embedding-3-small + vector_dimensions: 1536 + database: ./vector.db + similarity_metric: cosine_similarity + threshold: 0.5 + limit: 10 + chunking: + size: 1000 + overlap: 100 +``` + +#### Semantic-embeddings + +LLM-enhanced semantic search. Uses a language model to generate rich semantic +summaries of each chunk before embedding, capturing deeper meaning. + +| Field | Type | Default | +| ---------------------------------- | ------- | ------- | +| `embedding_model` | string | - | +| `chat_model` | string | - | +| `database` | string | - | +| `vector_dimensions` | integer | - | +| `similarity_metric` | string | cosine | +| `threshold` | float | 0.5 | +| `limit` | integer | 5 | +| `ast_context` | boolean | false | +| `semantic_prompt` | string | - | +| `chunking.size` | integer | 1000 | +| `chunking.overlap` | integer | 75 | +| `chunking.respect_word_boundaries` | boolean | true | +| `chunking.code_aware` | boolean | false | + +```yaml +- type: semantic-embeddings + embedding_model: openai/text-embedding-3-small + vector_dimensions: 1536 + chat_model: openai/gpt-5-mini + database: ./semantic.db + threshold: 0.3 + limit: 10 + chunking: + size: 1000 + overlap: 100 +``` + +#### BM25 + +Keyword-based search using BM25 algorithm. Best for exact terms, technical +jargon, and code identifiers. 
+ +| Field | Type | Default | +| ---------------------------------- | ------- | ------- | +| `database` | string | - | +| `k1` | float | 1.5 | +| `b` | float | 0.75 | +| `threshold` | float | 0.0 | +| `limit` | integer | 5 | +| `chunking.size` | integer | 1000 | +| `chunking.overlap` | integer | 75 | +| `chunking.respect_word_boundaries` | boolean | true | +| `chunking.code_aware` | boolean | false | + +```yaml +- type: bm25 + database: ./bm25.db + k1: 1.5 + b: 0.75 + threshold: 0.3 + limit: 10 + chunking: + size: 1000 + overlap: 100 +``` + +### Hybrid retrieval + +Combine multiple strategies with fusion: + +```yaml +strategies: + - type: chunked-embeddings + embedding_model: openai/text-embedding-3-small + vector_dimensions: 1536 + database: ./vector.db + limit: 20 + - type: bm25 + database: ./bm25.db + limit: 15 + +results: + fusion: + strategy: rrf # Options: rrf, weighted, max + k: 60 # RRF smoothing parameter + deduplicate: true + limit: 5 +``` + +Fusion strategies: + +- `rrf`: Reciprocal Rank Fusion (recommended, rank-based, no normalization needed) +- `weighted`: Weighted combination (`fusion.weights: {chunked-embeddings: 0.7, bm25: 0.3}`) +- `max`: Maximum score across strategies + +### Reranking + +Re-score results with a specialized model for improved relevance: + +```yaml +results: + reranking: + model: openai/gpt-5-mini + top_k: 10 # Only rerank top K (0 = all) + threshold: 0.3 # Minimum score after reranking + criteria: | # Optional domain-specific guidance + Prioritize official docs over blog posts + limit: 5 +``` + +DMR native reranking: + +```yaml +models: + reranker: + provider: dmr + model: hf.co/ggml-org/qwen3-reranker-0.6b-q8_0-gguf + +results: + reranking: + model: reranker +``` + +### Code-aware chunking + +For source code, use AST-based chunking. 
With semantic-embeddings, you can +include AST metadata in the LLM prompts: + +```yaml +- type: semantic-embeddings + embedding_model: openai/text-embedding-3-small + vector_dimensions: 1536 + chat_model: openai/gpt-5-mini + database: ./code.db + ast_context: true # Include AST metadata in semantic prompts + chunking: + size: 2000 + code_aware: true # Enable AST-based chunking +``` + +### RAG properties + +Top-level RAG source: + +| Field | Type | Description | +| ------------ | -------- | --------------------------------------------------------------- | +| `docs` | []string | Document paths (supports glob patterns, respects `.gitignore`) | +| `tool` | object | Customize RAG tool name/description/instruction | +| `strategies` | []object | Retrieval strategies (see above for strategy-specific fields) | +| `results` | object | Post-processing (fusion, reranking, limits) | + +Results: + +| Field | Type | Default | +| --------------------- | ------- | ------- | +| `limit` | integer | 15 | +| `deduplicate` | boolean | true | +| `include_score` | boolean | false | +| `fusion.strategy` | string | - | +| `fusion.k` | integer | 60 | +| `fusion.weights` | object | - | +| `reranking.model` | string | - | +| `reranking.top_k` | integer | 0 | +| `reranking.threshold` | float | 0.5 | +| `reranking.criteria` | string | "" | +| `return_full_content` | boolean | false | + +## Metadata + +Documentation and sharing information: + +| Property | Type | Description | +| --------- | ------ | ------------------------------- | +| `author` | string | Author name | +| `license` | string | License (e.g., MIT, Apache-2.0) | +| `readme` | string | Usage documentation | + +```yaml +metadata: + author: Your Name + license: MIT + readme: | + Description and usage instructions +``` + +## Example configuration + +Complete configuration demonstrating key features: + +```yaml +agents: + root: + model: claude + description: Technical lead + instruction: Coordinate development tasks and delegate to
specialists + sub_agents: [developer, reviewer] + toolsets: + - type: filesystem + - type: mcp + ref: docker:duckduckgo + rag: [readmes] + commands: + status: "Check project status" + + developer: + model: gpt + description: Software developer + instruction: Write clean, maintainable code + toolsets: + - type: filesystem + - type: shell + + reviewer: + model: claude + description: Code reviewer + instruction: Review for quality and security + toolsets: + - type: filesystem + +models: + gpt: + provider: openai + model: gpt-5 + + claude: + provider: anthropic + model: claude-sonnet-4-5 + max_tokens: 64000 + +rag: + readmes: + docs: ["**/README.md"] + strategies: + - type: chunked-embeddings + embedding_model: openai/text-embedding-3-small + vector_dimensions: 1536 + database: ./embeddings.db + limit: 10 + - type: bm25 + database: ./bm25.db + limit: 10 + results: + fusion: + strategy: rrf + k: 60 + limit: 5 +``` + +## What's next + +- Read the [Toolsets reference](./toolsets.md) for detailed toolset documentation +- Review the [CLI reference](./cli.md) for command-line options +- Browse [example configurations](https://github.com/docker/cagent/tree/main/examples) +- Learn about [sharing agents](../sharing-agents.md) diff --git a/content/manuals/ai/cagent/reference/examples.md b/content/manuals/ai/cagent/reference/examples.md new file mode 100644 index 000000000000..82fa44dbe01e --- /dev/null +++ b/content/manuals/ai/cagent/reference/examples.md @@ -0,0 +1,33 @@ +--- +title: Examples +description: Get inspiration from agent examples +keywords: [ai, agent, cagent] +weight: 40 +--- + +Get inspiration from the following agent examples. +See more examples in the [cagent GitHub repository](https://github.com/docker/cagent/tree/main/examples). 
+ +## Development team + +{{% cagent-example.inline "dev-team.yaml" %}} +{{- $example := .Get 0 }} +{{- $baseUrl := "https://raw.githubusercontent.com/docker/cagent/refs/heads/main/examples" }} +{{- $url := fmt.Printf "%s/%s" $baseUrl $example }} +{{- with resources.GetRemote $url }} +{{ $data := .Content | transform.Unmarshal }} + +```yaml {collapse=true} +{{ .Content }} +``` + +{{ end }} +{{% /cagent-example.inline %}} + +## Go developer + +{{% cagent-example.inline "gopher.yaml" /%}} + +## Technical blog writer + +{{% cagent-example.inline "blog.yaml" /%}} diff --git a/content/manuals/ai/cagent/reference/toolsets.md b/content/manuals/ai/cagent/reference/toolsets.md new file mode 100644 index 000000000000..5d66355388fd --- /dev/null +++ b/content/manuals/ai/cagent/reference/toolsets.md @@ -0,0 +1,493 @@ +--- +title: Toolsets reference +linkTitle: Toolsets +description: Complete reference for cagent toolsets and their capabilities +keywords: [ai, agent, cagent, tools, toolsets] +weight: 20 +--- + +This reference documents the toolsets available in cagent and what each one +does. Tools give agents the ability to take action—interacting with files, +executing commands, accessing external resources, and managing state. + +For configuration file syntax and how to set up toolsets in your agent YAML, +see the [Configuration file reference](./config.md). + +## How agents use tools + +When you configure toolsets for an agent, those tools become available in the +agent's context. The agent can invoke tools by name with appropriate parameters +based on the task at hand. + +Tool invocation flow: + +1. Agent analyzes the task and determines which tool to use +2. Agent constructs tool parameters based on requirements +3. cagent executes the tool and returns results +4. Agent processes results and decides next steps + +Agents can call multiple tools in sequence or make decisions based on tool +results. 
Tool selection is automatic based on the agent's understanding of the +task and available capabilities. + +## Tool types + +cagent supports three types of toolsets: + +Built-in toolsets +: Core functionality built directly into cagent (`filesystem`, `shell`, + `memory`, etc.). These provide essential capabilities for file operations, + command execution, and state management. +MCP toolsets +: Tools provided by Model Context Protocol servers, either local processes + (stdio) or remote servers (HTTP/SSE). MCP enables access to a wide ecosystem + of standardized tools. +Custom toolsets +: Shell scripts wrapped as tools with typed parameters (`script_shell`). This + lets you define domain-specific tools for your use case. + +## Configuration + +Toolsets are configured in your agent's YAML file under the `toolsets` array: + +```yaml +agents: + my_agent: + model: anthropic/claude-sonnet-4-5 + description: A helpful coding assistant + toolsets: + # Built-in toolset + - type: filesystem + + # Built-in toolset with configuration + - type: memory + path: ./memories.db + + # Local MCP server (stdio) + - type: mcp + command: npx + args: ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"] + + # Remote MCP server (SSE) + - type: mcp + remote: + url: https://mcp.example.com/sse + transport_type: sse + headers: + Authorization: Bearer ${API_TOKEN} + + # Custom shell tools + - type: script_shell + tools: + build: + cmd: npm run build + description: Build the project +``` + +### Common configuration options + +All toolset types support these optional properties: + +| Property | Type | Description | +| ------------- | ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `instruction` | string | Additional instructions for using the toolset | +| `tools` | array | Specific tool names to 
enable (defaults to all) | +| `env` | object | Environment variables for the toolset | +| `toon` | string | Comma-delimited regex patterns matching tool names whose JSON outputs should be compressed. Reduces token usage by simplifying/compressing JSON responses from matched tools using automatic encoding. Example: `"search.*,list.*"` | +| `defer` | boolean or array | Control which tools load into initial context. Set to `true` to defer all tools, or array of tool names to defer specific tools. Deferred tools don't consume context until explicitly loaded via `search_tool`/`add_tool`. | + +### Tool selection + +By default, agents have access to all tools from their configured toolsets. You +can restrict this using the `tools` option: + +```yaml +toolsets: + - type: filesystem + tools: [read_file, write_file, list_directory] +``` + +This is useful for: + +- Limiting agent capabilities for security +- Reducing context size for smaller models +- Creating specialized agents with focused tool access + +### Deferred loading + +Deferred loading keeps tools out of the initial context window, loading them +only when explicitly requested. This is useful for large toolsets where most +tools won't be used, significantly reducing context consumption. + +Defer all tools in a toolset: + +```yaml +toolsets: + - type: mcp + command: npx + args: ["-y", "@modelcontextprotocol/server-filesystem", "/path"] + defer: true # All tools load on-demand +``` + +Or defer specific tools while loading others immediately: + +```yaml +toolsets: + - type: mcp + command: npx + args: ["-y", "@modelcontextprotocol/server-filesystem", "/path"] + defer: [search_files, list_directory] # Only these are deferred +``` + +Agents can discover deferred tools via `search_tool` and load them into context +via `add_tool` when needed. Best for toolsets with dozens of tools where only a +few are typically used. 
+ +### Output compression + +The `toon` property compresses JSON outputs from matched tools to reduce token +usage. When a tool's output is JSON, it's automatically compressed using +efficient encoding before being returned to the agent: + +```yaml +toolsets: + - type: mcp + command: npx + args: ["-y", "@modelcontextprotocol/server-github"] + toon: "search.*,list.*" # Compress outputs from search/list tools +``` + +Useful for tools that return large JSON responses (API results, file listings, +search results). The compression is transparent to the agent but can +significantly reduce context consumption for verbose tool outputs. + +### Per-agent tool configuration + +Different agents can have different toolsets: + +```yaml +agents: + coordinator: + model: anthropic/claude-sonnet-4-5 + sub_agents: [code_writer, code_reviewer] + toolsets: + - type: filesystem + tools: [read_file] + + code_writer: + model: openai/gpt-5-mini + toolsets: + - type: filesystem + - type: shell + + code_reviewer: + model: anthropic/claude-sonnet-4-5 + toolsets: + - type: filesystem + tools: [read_file, read_multiple_files] +``` + +This allows specialized agents with focused capabilities, security boundaries, +and optimized performance. + +## Built-in tools reference + +### Filesystem + +The `filesystem` toolset gives your agent the ability to work with +files and directories. Your agent can read files to understand +context, write new files, make targeted edits to existing files, +search for content, and explore directory structures. Essential for +code analysis, documentation updates, configuration management, and +any agent that needs to understand or modify project files. + +Access is restricted to the current working directory by default. Agents can +request access to additional directories at runtime, which requires your +approval. 
+ +#### Configuration + +```yaml +toolsets: + - type: filesystem + + # Optional: restrict to specific tools + - type: filesystem + tools: [read_file, write_file, edit_file] +``` + +#### Available tools + +| Tool | Capability | +| --------------------------- | --------------------------------------------- | +| `read_file` | Read file contents | +| `write_file` | Write new files | +| `edit_file` | Make targeted edits to existing files | +| `read_multiple_files` | Read multiple files at once | +| `search_files` | Find files by name or pattern | +| `search_files_content` | Search file contents (grep-like) | +| `list_directory` | List directory contents | +| `directory_tree` | View recursive directory structure | +| `create_directory` | Create directories | +| `move_file` | Move or rename files | +| `get_file_info` | Get file/directory metadata | +| `add_allowed_directory` | Request access to additional directories | +| `list_allowed_directories` | Check currently allowed directories | +| `list_directory_with_sizes` | List directory contents with size information | + +### Shell + +The `shell` toolset lets your agent execute commands in your system's shell +environment. Use this for agents that need to run builds, execute tests, manage +processes, interact with CLI tools, or perform system operations. The agent can +run commands in the foreground or background. + +Commands execute in the current working directory and inherit environment +variables from the cagent process. This toolset is powerful but should be used +with appropriate security considerations. + +#### Configuration + +```yaml +toolsets: + - type: shell +``` + +#### Available tools + +| Tool | Capability | +| ------- | ----------------------------------------- | +| `shell` | Execute shell commands in the environment | + +### Think + +The `think` toolset provides your agent with a reasoning scratchpad. The agent +can record thoughts and reasoning steps without taking actions or modifying +data. 
Particularly useful for complex tasks where the agent needs to plan +multiple steps, verify requirements, or maintain context across a long +conversation. + +Agents use this to break down problems, list applicable rules, verify they have +all needed information, and document their reasoning process before acting. + +#### Configuration + +```yaml +toolsets: + - type: think +``` + +#### Available tools + +| Tool | Capability | +| ------- | ----------------------------------------------- | +| `think` | Record reasoning thoughts without taking action | + +### Todo + +The `todo` toolset gives your agent task-tracking capabilities for managing +multi-step operations. Your agent can break down complex work into discrete +tasks, track progress through each step, and ensure nothing is missed before +completing a request. Especially valuable for agents handling complex +workflows with multiple dependencies. + +The `shared` option allows todos to persist across different agents in a +multi-agent system, enabling coordination. + +#### Configuration + +```yaml +toolsets: + - type: todo + + # Optional: share todos across agents + - type: todo + shared: true +``` + +#### Available tools + +| Tool | Capability | +| -------------- | --------------------------------- | +| `create_todo` | Create individual todo items | +| `create_todos` | Create multiple todos at once | +| `update_todo` | Update todo status | +| `list_todos` | List all current todos and status | + +### Memory + +The `memory` toolset allows your agent to store and retrieve information across +conversations and sessions. Your agent can remember user preferences, project +context, previous decisions, and other information that should persist. Useful +for agents that interact with users over time or need to maintain state about +a project or environment. + +Memories are stored in a local database file and persist across cagent +sessions. 
+ +#### Configuration + +```yaml +toolsets: + - type: memory + + # Optional: specify database location + - type: memory + path: ./agent-memories.db +``` + +#### Available tools + +| Tool | Capability | +| ----------------- | ---------------------------------- | +| `store_memory` | Store information persistently | +| `retrieve_memory` | Retrieve stored information by key | +| `search_memory` | Search memories by content or tags | +| `list_memories` | List all stored memories | + +### Fetch + +The `fetch` toolset enables your agent to retrieve content from HTTP/HTTPS URLs. +Your agent can fetch documentation, API responses, web pages, or any content +accessible via HTTP GET requests. Useful for agents that need to access +external resources, check API documentation, or retrieve web content. + +The agent can specify custom HTTP headers when needed for authentication or +other purposes. + +#### Configuration + +```yaml +toolsets: + - type: fetch +``` + +#### Available tools + +| Tool | Capability | +| ------- | --------------------------------------- | +| `fetch` | Retrieve content from URLs via HTTP GET | + +### API + +The `api` toolset lets you define custom tools that call HTTP APIs. Similar to +`script_shell` but for web services, this allows you to expose REST APIs, +webhooks, or any HTTP endpoint as a tool your agent can use. The agent sees +these as typed tools with automatic parameter validation. + +Use this to integrate with external services, call internal APIs, trigger +webhooks, or interact with any HTTP-based system. 
+ +#### Configuration + +Each API tool is defined with an `api_config` containing the endpoint, HTTP method, and optional typed parameters: + +```yaml +toolsets: + - type: api + api_config: + name: search_docs + endpoint: https://api.example.com/search + method: GET + instruction: Search the documentation database + headers: + Authorization: Bearer ${API_TOKEN} + args: + query: + type: string + description: Search query + limit: + type: number + description: Maximum results + required: [query] + + - type: api + api_config: + name: create_ticket + endpoint: https://api.example.com/tickets + method: POST + instruction: Create a support ticket + args: + title: + type: string + description: Ticket title + description: + type: string + description: Ticket description + required: [title, description] +``` + +For GET requests, parameters are interpolated into the endpoint URL. For POST +requests, parameters are sent as JSON in the request body. + +Supported argument types: `string`, `number`, `boolean`, `array`, `object`. + +### Script Shell + +The `script_shell` toolset lets you define custom tools by wrapping shell +commands with typed parameters. This allows you to expose domain-specific +operations to your agent as first-class tools. The agent sees these custom +tools just like built-in tools, with parameter validation and type checking +handled automatically. + +Use this to create tools for deployment scripts, build commands, test runners, +or any operation specific to your project or workflow. 
+ +#### Configuration + +Each custom tool is defined with a command, description, and optional typed +parameters: + +```yaml +toolsets: + - type: script_shell + tools: + deploy: + cmd: ./deploy.sh + description: Deploy the application to an environment + args: + environment: + type: string + description: Target environment (dev, staging, prod) + version: + type: string + description: Version to deploy + required: [environment] + + run_tests: + cmd: npm test + description: Run the test suite + args: + filter: + type: string + description: Test name filter pattern +``` + +Supported argument types: `string`, `number`, `boolean`, `array`, `object`. + +#### Tools + +The tools you define become available to your agent. In the previous example, +the agent would have access to `deploy` and `run_tests` tools. + +## Automatic tools + +Some tools are automatically added to agents based on their configuration. You +don't configure these explicitly—they appear when needed. + +### transfer_task + +Automatically available when your agent has `sub_agents` configured. Allows +the agent to delegate tasks to sub-agents and receive results back. + +### handoff + +Automatically available when your agent has `handoffs` configured. Allows the +agent to transfer the entire conversation to a different agent. 
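
Both automatic tools follow from agent properties rather than from `toolsets` entries. The following is a minimal sketch, assuming hypothetical agent names (`researcher`, `billing`) and placeholder descriptions and instructions, of a root agent that would receive both tools:

```yaml
agents:
  root:
    model: anthropic/claude-sonnet-4-5
    description: Support coordinator # placeholder text
    instruction: Delegate research tasks and hand billing questions to the billing agent
    sub_agents: [researcher] # transfer_task becomes available
    handoffs: [billing] # handoff becomes available

  # Hypothetical sub-agent: receives tasks and returns results to root
  researcher:
    model: openai/gpt-5-mini
    description: Background researcher
    instruction: Research the question and report findings

  # Hypothetical handoff target: takes over the conversation
  billing:
    model: openai/gpt-5-mini
    description: Billing specialist
    instruction: Answer billing questions directly
```

In this sketch, `root` stays in control when it delegates to `researcher`, while a handoff to `billing` transfers the whole conversation.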
+ +## What's next + +- Read the [Configuration file reference](./config.md) for YAML file structure +- Review the [CLI reference](./cli.md) for running agents +- Explore [MCP servers](/manuals/ai/mcp-catalog-and-toolkit/mcp-gateway.md) for extended capabilities +- Browse [example configurations](https://github.com/docker/cagent/tree/main/examples) diff --git a/content/manuals/ai/cagent/sharing-agents.md b/content/manuals/ai/cagent/sharing-agents.md new file mode 100644 index 000000000000..0d6a5efa38ce --- /dev/null +++ b/content/manuals/ai/cagent/sharing-agents.md @@ -0,0 +1,96 @@ +--- +title: Sharing agents +description: Distribute agent configurations through OCI registries +keywords: [cagent, oci, registry, docker hub, sharing, distribution] +weight: 30 +--- + +Push your agent to a registry and share it by name. Your teammates +reference `agentcatalog/security-expert` instead of copying YAML files +around or asking you where your agent configuration lives. + +When you update the agent in the registry, everyone gets the new version +the next time they pull or restart their client. + +## Prerequisites + +To push agents to a registry, authenticate first: + +```console +$ docker login +``` + +For other registries, use their authentication method. + +## Publishing agents + +Push your agent configuration to a registry: + +```console +$ cagent push ./agent.yml myusername/agent-name +``` + +Push creates the repository if it doesn't exist yet. Use Docker Hub or +any OCI-compatible registry. + +Tag specific versions: + +```console +$ cagent push ./agent.yml myusername/agent-name:v1.0.0 +$ cagent push ./agent.yml myusername/agent-name:latest +``` + +## Using published agents + +Pull an agent to inspect it locally: + +```console +$ cagent pull agentcatalog/pirate +``` + +This saves the configuration as a local YAML file. 
+ +Run agents directly from the registry: + +```console +$ cagent run agentcatalog/pirate +``` + +Or reference it directly in integrations: + +### Editor integration (ACP) + +Use registry references in ACP configurations so your editor always uses +the latest version: + +```json +{ + "agent_servers": { + "myagent": { + "command": "cagent", + "args": ["acp", "agentcatalog/pirate"] + } + } +} +``` + +### MCP client integration + +Agents can be exposed as tools in MCP clients: + +```json +{ + "mcpServers": { + "myagent": { + "command": "/usr/local/bin/cagent", + "args": ["mcp", "agentcatalog/pirate"] + } + } +} +``` + +## What's next + +- Set up [ACP integration](./integrations/acp.md) with shared agents +- Configure [MCP integration](./integrations/mcp.md) with shared agents +- Browse the [agent catalog](https://hub.docker.com/u/agentcatalog) for examples diff --git a/content/manuals/ai/cagent/tutorial.md b/content/manuals/ai/cagent/tutorial.md new file mode 100644 index 000000000000..d23bf1207b96 --- /dev/null +++ b/content/manuals/ai/cagent/tutorial.md @@ -0,0 +1,288 @@ +--- +title: Building a coding agent +description: Create a coding agent that can read, write, and validate code changes in your projects +keywords: [cagent, tutorial, coding agent, ai assistant] +weight: 10 +--- + +This tutorial teaches you how to build a coding agent that can help with software +development tasks. You'll start with a basic agent and progressively add +capabilities until you have a production-ready assistant that can read code, +make changes, run tests, and even look up documentation. + +By the end, you'll understand how to structure agent instructions, configure +tools, and compose multiple agents for complex workflows. 
+ +## What you'll build + +A coding agent that can: + +- Read and modify files in your project +- Run commands like tests and linters +- Follow a structured development workflow +- Look up documentation when needed +- Track progress through multi-step tasks + +## What you'll learn + +- How to configure cagent agents in YAML +- How to give agents access to tools (filesystem, shell, etc.) +- How to write effective agent instructions +- How to compose multiple agents for specialized tasks +- How to adapt agents for your own projects + +## Prerequisites + +Before starting, you need: + +- **cagent installed** - See the [installation guide](_index.md#installation) +- **API key configured** - Set `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` in your + environment. Get keys from [Anthropic](https://console.anthropic.com/) or + [OpenAI](https://platform.openai.com/api-keys) +- **A project to work with** - Any codebase where you want agent assistance + +## Creating your first agent + +A cagent agent is defined in a YAML configuration file. The minimal agent needs +just a model and instructions that define its purpose. + +Create a file named `agents.yml`: + +```yaml +agents: + root: + model: openai/gpt-5 + description: A basic coding assistant + instruction: | + You are a helpful coding assistant. + Help me write and understand code. +``` + +Run your agent: + +```console +$ cagent run agents.yml +``` + +Try asking it: "How do I read a file in Python?" + +The agent can answer coding questions, but it can't see your files or run +commands yet. To make it useful for real development work, it needs access to +tools. + +## Adding tools + +A coding agent needs to interact with your project files and run commands. You +enable these capabilities by adding toolsets. + +Update `agents.yml` to add filesystem and shell access: + +```yaml +agents: + root: + model: openai/gpt-5 + description: A coding assistant with filesystem access + instruction: | + You are a helpful coding assistant. 
+ You can read and write files to help me develop software. + Always check if code works before finishing a task. + toolsets: + - type: filesystem + - type: shell +``` + +Run the updated agent and try: "Read the README.md file and summarize it." + +Your agent can now: + +- Read and write files in the current directory +- Execute shell commands +- Explore your project structure + +> [!NOTE] +> By default, filesystem access is restricted to the current working directory. +> The agent will request permission if it needs to access other directories. + +The agent can now interact with your code, but its behavior is still generic. +Next, you'll teach it how to work effectively. + +## Structuring agent instructions + +Generic instructions produce generic results. For production use, you want your +agent to follow a specific workflow and understand your project's conventions. + +Update your agent with structured instructions. This example shows a Go +development agent, but you can adapt the pattern for any language: + +```yaml +agents: + root: + model: anthropic/claude-sonnet-4-5 + description: Expert Go developer + instruction: | + Your goal is to help with code-related tasks by examining, modifying, + and validating code changes. + + + # Workflow: + # 1. Analyze: Understand requirements and identify relevant code. + # 2. Examine: Search for files, analyze structure and dependencies. + # 3. Modify: Make changes following best practices. + # 4. Validate: Run linters/tests. If issues found, return to Modify. 
+ + + Constraints: + - Be thorough in examination before making changes + - Always validate changes before considering the task complete + - Write code to files, don't show it in chat + + ## Development Workflow + - `go build ./...` - Build the application + - `go test ./...` - Run tests + - `golangci-lint run` - Check code quality + + add_date: true + add_environment_info: true + toolsets: + - type: filesystem + - type: shell + - type: todo +``` + +Try asking: "Add error handling to the `parseConfig` function in main.go" + +The structured instructions give your agent: + +- A clear workflow to follow (analyze, examine, modify, validate) +- Project-specific commands to run +- Constraints that prevent common mistakes +- Context about the environment (`add_date` and `add_environment_info`) + +The `todo` toolset helps the agent track progress through multi-step tasks. +When you ask for complex changes, the agent will break down the work and update +its progress as it goes. + +## Composing multiple agents + +Complex tasks often benefit from specialized agents. You can add sub-agents that +handle specific responsibilities, like researching documentation while your main +agent stays focused on coding. + +Add a librarian agent that can search for documentation: + +```yaml +agents: + root: + model: anthropic/claude-sonnet-4-5 + description: Expert Go developer + instruction: | + Your goal is to help with code-related tasks by examining, modifying, + and validating code changes. + + When you need to look up documentation or research how something works, + delegate to the librarian agent. + + (rest of instructions from previous section...) + toolsets: + - type: filesystem + - type: shell + - type: todo + sub_agents: + - librarian + + librarian: + model: anthropic/claude-haiku-4-5 + description: Documentation researcher + instruction: | + You are the librarian. Your job is to find relevant documentation, + articles, or resources to help the developer agent. 
+ + Search the internet and fetch web pages as needed. + toolsets: + - type: mcp + ref: docker:duckduckgo + - type: fetch +``` + +Try asking: "How do I use `context.Context` in Go? Then add it to my server code." + +Your main agent will delegate the research to the librarian, then use that +information to modify your code. This keeps the main agent's context focused on +the coding task while still having access to up-to-date documentation. + +Using a smaller, faster model (Haiku) for the librarian saves costs since +documentation lookup doesn't need the same reasoning depth as code changes. + +## Adapting for your project + +Now that you understand the core concepts, adapt the agent for your specific +project: + +### Update the development commands + +Replace the Go commands with your project's workflow: + +```yaml +## Development Workflow +- `npm test` - Run tests +- `npm run lint` - Check code quality +- `npm run build` - Build the application +``` + +### Add project-specific constraints + +If your agent keeps making the same mistakes, add explicit constraints: + +```yaml +Constraints: + - Always run tests before considering a task complete + - Follow the existing code style in src/ directories + - Never modify files in the generated/ directory + - Use TypeScript strict mode for new files +``` + +### Choose the right models + +For coding tasks, use reasoning-focused models: + +- `anthropic/claude-sonnet-4-5` - Strong reasoning, good for complex code +- `openai/gpt-5` - Fast, good general coding ability + +For auxiliary tasks like documentation lookup, smaller models work well: + +- `anthropic/claude-haiku-4-5` - Fast and cost-effective +- `openai/gpt-5-mini` - Good for simple tasks + +### Iterate based on usage + +The best way to improve your agent is to use it. When you notice issues: + +1. Add specific instructions to prevent the problem +2. Update constraints to guide behavior +3. Add relevant commands to the development workflow +4. 
Consider adding specialized sub-agents for complex areas + +## What you learned + +You now know how to: + +- Create a basic cagent configuration +- Add tools to enable agent capabilities +- Write structured instructions for consistent behavior +- Compose multiple agents for specialized tasks +- Adapt agents for different programming languages and workflows + +## Next steps + +- Learn [best practices](best-practices.md) for handling large outputs, structuring + agent teams, and optimizing performance +- Integrate cagent with your [editor](integrations/acp.md) or use agents as + [tools in MCP clients](integrations/mcp.md) +- Review the [Configuration reference](reference/config.md) for all available + options +- Explore the [Tools reference](reference/toolsets.md) to see what capabilities you can + enable +- Check out [example configurations](https://github.com/docker/cagent/tree/main/examples) + for different use cases +- See the full [golang_developer.yaml](https://github.com/docker/cagent/blob/main/golang_developer.yaml) + that the Docker team uses to develop cagent