Cat-Research is a deep research system built on multi-agent collaboration. It automatically decomposes a research question into specialized stages handled by AI agents with distinct responsibilities, and ultimately delivers a high-quality research report whose sources have been vetted for authority, whose facts have been cross-checked, and whose conclusions have been logically validated.
Unlike ordinary AI Q&A, Cat-Research:
- 🔍 Searches actively: fetches up-to-date web information in real time via the Zhipu Search API, with automatic recency filtering (the last 3 months are prioritized)
- 🔗 Verifies sources: 4-tier classification + 0–100 domain scoring + 4-level confidence rating
- ✅ Checks facts: cross-validates key claims and outputs confidence scores (0.0–1.0)
- 🔄 Improves iteratively: enforces a quality-improvement loop of at least 2 and at most 5 rounds, with a ratchet mechanism that preserves the best version
- 📊 Reports confidence: the final output includes a weighted overall quality score (sources 25% + facts 35% + conclusions 40%)
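The weighting in the last bullet (sources 25% + facts 35% + conclusions 40%) can be sketched as a small helper. The function below is illustrative only; its name and input shape are assumptions, not the project's actual API:

```python
def overall_confidence(source_score: float, fact_score: float, conclusion_score: float) -> float:
    """Combine the three verification scores (each 0.0-1.0) into the
    weighted overall quality score: sources 25% + facts 35% + conclusions 40%."""
    return round(0.25 * source_score + 0.35 * fact_score + 0.40 * conclusion_score, 3)

# Strong sources, very solid facts, somewhat weaker conclusions:
print(overall_confidence(0.8, 0.9, 0.7))  # → 0.795
```

Because conclusions carry the largest weight (40%), a report with weak conclusions is penalized hardest even when its sourcing is excellent.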
User question
     │
     ▼
┌─────────────────────────────────────────────────────┐
│                    Orchestrator                     │
│               (research coordinator)                │
└─────────────────────────────────────────────────────┘
     │
     ├─→ [Stage 1]   🗣️ Clarifier          Intent recognition & question clarification
     │
     ├─→ [Stage 2]   📋 Planner            Research planning (generates 10–16 multilingual search queries)
     │
     ├─→ [Stage 3]   🔍 Researcher         Web search + page scraping + multi-source collection
     │
     ├─→ [Stage 3.5] 🔎 SourceVerifier     Source authority assessment (Tier 1–4 + 0–100 score)
     │
     ├─→ [Stage 4]   🧐 Analyst            Synthesis (causal / trend / comparative analysis)
     │
     ├─→ [Stage 5]   ✍️ Writer             First draft (structured research report)
     │
     ├─→ [Stage 5.5] 🔬 FactChecker        Fact-checking (cross-validation of key claims)
     │
     └─→ [Stage 6]   🔄 Quality improvement loop (≥ 2 rounds enforced, max 5)
              ├─→ 🎯 Critic                Multi-dimensional review (7 scores, out of 10)
              ├─→ 🔍 Researcher            Supplemental research (first 3 rounds only)
              ├─→ ✔️ ConclusionValidator   Conclusion validation (5 scores)
              └─→ ✍️ Writer                Iterative revision (ratchet keeps the best version)
     │
     ▼
Final research report (09_final.md) + confidence report (08_verification/confidence_report.json)
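The Stage 6 loop above can be sketched as follows. The critic is stubbed with a fixed list of scores, every name here is illustrative rather than taken from the project's code, and the constants mirror the defaults described later (minimum 2 rounds, maximum 5, early stop at 8.0):

```python
MIN_CYCLES, MAX_CYCLES, THRESHOLD = 2, 5, 8.0

def quality_loop(critic_scores):
    """Ratcheted improvement loop: run at least MIN_CYCLES rounds, at most
    MAX_CYCLES, stop early once THRESHOLD is reached, and always keep the
    highest-scoring draft seen so far (the 'ratchet')."""
    best_round, best_score = 0, float("-inf")
    rounds_run = 0
    for i, score in enumerate(critic_scores[:MAX_CYCLES], start=1):
        rounds_run = i
        if score > best_score:  # ratchet: only ever move to a better draft
            best_round, best_score = i, score
        if i >= MIN_CYCLES and score >= THRESHOLD:
            break               # early stop once quality is good enough
    return best_round, best_score, rounds_run

# A run where round 3 regresses and round 4 clears the threshold:
print(quality_loop([6.5, 7.4, 7.1, 8.3, 7.9]))  # → (4, 8.3, 4)
```

Note how the regression in round 3 never overwrites the best draft, and the loop cannot stop before round 2 even if the first score already clears the threshold.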
| Agent | Role |
|---|---|
| 🗣️ Clarifier | Identifies user intent, decides whether clarification is needed, and adjusts research direction and depth automatically |
| 📋 Planner | Decomposes the question into 10–16 multilingual search queries, layered by recency and domain |
| 🔍 Researcher | Runs web searches (Zhipu Search API, with DuckDuckGo as fallback), scrapes page bodies, and aggregates raw multi-source data |
| 🔎 SourceVerifier | Labels each source Tier 1–4 with a 0–100 domain score and flags unreliable sources |
| 🧐 Analyst | Consolidates raw material, distills key findings, and performs causal, trend, and comparative analysis |
| ✍️ Writer | Writes the structured research report and iterates over multiple rounds based on review feedback |
| 🎯 Critic | Scores the report on 7 dimensions, including completeness, accuracy, depth, and clarity |
| 🔬 FactChecker | Cross-validates key claims in the report and assigns 0.0–1.0 confidence scores |
| ✔️ ConclusionValidator | Validates the logical rigor, completeness, and practical value of the conclusions (5 scores) |
cat-research/
├── main.py                      # Command-line entry point
├── run_api.py                   # Web UI + API server entry point
├── orchestrator.py              # Research orchestrator (main flow control)
├── config.py                    # Global configuration (model / search / quality parameters)
├── requirements.txt             # Python dependencies
├── settings.example.json        # Configuration template
│
├── agents/                      # All agent implementations
│   ├── base_agent.py            # Base agent (conversation / context compression)
│   ├── clarifier.py             # Clarifier
│   ├── planner.py               # Planner
│   ├── researcher.py            # Researcher
│   ├── analyst.py               # Analyst
│   ├── writer.py                # Writer
│   ├── critic.py                # Critic
│   ├── source_verifier.py       # Source verifier
│   ├── fact_checker.py          # Fact checker
│   └── conclusion_validator.py  # Conclusion validator
│
├── api/                         # FastAPI web service
│   ├── app.py                   # FastAPI app + all route definitions
│   ├── models/                  # Pydantic data models
│   └── services/                # Business-logic layer
│
├── tools/                       # Tool modules
│   ├── web_search.py            # Zhipu Web Search API + page scraping (DuckDuckGo fallback)
│   ├── fact_tools.py            # Fact-checking tools
│   ├── file_tools.py            # File I/O tools
│   ├── domain_checker.py        # Domain authority checks
│   └── verification_registry.py # Verification-result cache registry
│
├── static/                      # Web UI static assets
└── workspace/                   # Research output directory (one folder per run)
git clone https://github.com/mmlong818/cat-research.git
cd cat-research
pip install -r requirements.txt
cp settings.example.json settings.json
# Edit settings.json and fill in your API key

Example settings.json:

{
  "api_key": "your_api_key_here",
  "base_url": "https://open.bigmodel.cn/api/paas/v4/",
  "core_model": "glm-4.7",
  "support_model": "glm-4.7-flash"
}

You can also configure via environment variables (these take precedence over settings.json):

# Zhipu GLM
export ZHIPU_API_KEY=your_api_key_here
# or OpenAI
export OPENAI_API_KEY=your_api_key_here
export OPENAI_BASE_URL=https://api.openai.com/v1

Option 1: ask directly from the command line

python main.py "What were the major developments in the LLM industry in 2025?"

Option 2: interactive command-line mode

python main.py
# Enter your research question at the prompt

Option 3: Web UI + REST API server

python run_api.py
# Open http://localhost:8000

Each run creates a separate session folder under workspace/:
workspace/session_20250322_143022/
├── 01_question.txt                    # Original research question
├── 02_clarification.txt               # Intent-clarification result
├── 03_plan.json                       # Research plan (list of search queries)
├── 04_research/                       # Raw search data and page content
├── 05_analysis.md                     # Synthesized analysis report
├── 06_drafts/                         # Draft history for each round
├── 07_reviews/                        # Review records (incl. 7-dimension scores)
├── 08_verification/                   # All verification results
│   ├── source_verification.json       # Source authority assessment
│   ├── fact_check.json                # Fact-check results
│   ├── conclusion_validation.json     # Conclusion validation results
│   └── confidence_report.json         # 📊 Overall confidence report
└── 09_final.md                        # 📄 Final research report
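These artifacts can be consumed programmatically like any other directory tree. The loader below is a sketch; the field names inside confidence_report.json are assumptions for illustration, not the file's documented schema:

```python
import json
import pathlib
import tempfile

# Hypothetical confidence_report.json content; the field names are assumptions.
sample = {"source_score": 0.9, "fact_score": 0.85, "conclusion_score": 0.8, "overall": 0.84}

def load_confidence(session_dir: str) -> dict:
    """Read 08_verification/confidence_report.json from one session folder."""
    path = pathlib.Path(session_dir) / "08_verification" / "confidence_report.json"
    return json.loads(path.read_text(encoding="utf-8"))

# Demo against a throwaway session folder:
with tempfile.TemporaryDirectory() as d:
    ver = pathlib.Path(d) / "08_verification"
    ver.mkdir()
    (ver / "confidence_report.json").write_text(json.dumps(sample), encoding="utf-8")
    print(load_confidence(d)["overall"])  # → 0.84
```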
After starting python run_api.py, the following endpoints are available:

Research tasks

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/research | Start a research task |
| GET | /api/research/{id}/stream | Stream real-time progress via SSE |
| GET | /api/research/{id}/status | Query a task's current status |
| GET | /api/research/{id}/result | Fetch the result of a completed task |
| POST | /api/research/{id}/message | Inject a message into a running task |
| POST | /api/research/{id}/pause | Pause a task |
| POST | /api/research/{id}/resume | Resume a task |
| POST | /api/research/{id}/stop | Stop a task |
| DELETE | /api/research/{id} | Stop and delete a task |
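Any HTTP client can drive these endpoints. The sketch below only builds the request shapes; the request-body field name ("question") is an assumption, so check /docs for the actual schema:

```python
BASE = "http://localhost:8000"

def start_research_request(question: str):
    """Shape of the POST /api/research call: (method, url, json_body)."""
    return ("POST", f"{BASE}/api/research", {"question": question})

def stream_url(task_id: str) -> str:
    """SSE endpoint that pushes real-time progress for a running task."""
    return f"{BASE}/api/research/{task_id}/stream"

def status_url(task_id: str) -> str:
    """Polling alternative to the SSE stream."""
    return f"{BASE}/api/research/{task_id}/status"

method, url, body = start_research_request("What changed in the LLM industry in 2025?")
print(method, url)           # → POST http://localhost:8000/api/research
print(stream_url("abc123"))  # → http://localhost:8000/api/research/abc123/stream
```

From here the flow is: send the POST, take the task id from the response, then either open the SSE stream or poll the status URL until the task completes.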
Session management

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/sessions | List past research sessions |
| GET | /api/sessions/{id}/report | Get a session's final report |
| GET | /api/sessions/{id}/plan | Get a session's research plan |
| GET | /api/sessions/{id}/phases | Get a session's per-stage output |
| DELETE | /api/sessions/{id} | Delete a session and its workspace |
Clarification dialog

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/clarify | Start an intent-clarification session |
| POST | /api/clarify/{id}/message | Continue a clarification dialog |
| POST | /api/clarify/{id}/confirm | Confirm the question and start research |
System

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/health | Health check |
| GET | /api/config | Get the current system configuration |
| POST | /api/config | Update the system configuration at runtime |
| GET | /api/settings | Get the current API settings |
| POST | /api/settings | Update and persist API key / model settings |
| GET | / | Web UI |
| GET | /docs | Swagger interactive API docs |
Every setting can be overridden with an environment variable, which takes precedence over settings.json:

# ── API access ──────────────────────────────────────────────────
ZHIPU_API_KEY=your_key           # Zhipu GLM API key
OPENAI_API_KEY=your_key          # or an OpenAI API key
OPENAI_BASE_URL=https://...      # Custom base URL

# ── Model selection ─────────────────────────────────────────────
CORE_MODEL=glm-4.7               # Core model (planning / research / analysis / writing)
SUPPORT_MODEL=glm-4.7-flash      # Support model (review / validation / fact-checking)

# ── Research quality ────────────────────────────────────────────
MAX_IMPROVEMENT_CYCLES=5         # Maximum improvement rounds (default 5)
MIN_IMPROVEMENT_CYCLES=2         # Minimum enforced improvement rounds (default 2)
QUALITY_THRESHOLD=8.0            # Early-stop quality threshold (out of 10)
MAX_SEARCH_RESULTS=8             # Maximum results per search
MAX_FETCH_CHARS=6000             # Maximum characters extracted per page

# ── Context management ──────────────────────────────────────────
COMPRESS_THRESHOLD_CHARS=100000  # Auto-compress context beyond this many characters
COMPRESS_KEEP_RECENT=6           # Keep the most recent N messages when compressing

# ── Subprocess isolation (experimental) ─────────────────────────
USE_SUBPROCESS=false             # Run selected agents in separate subprocesses
SUBPROCESS_AGENTS=researcher,analyst,writer

# ── API server ──────────────────────────────────────────────────
API_HOST=0.0.0.0
API_PORT=8000
CORS_ORIGINS=http://localhost:8000

Cat-Research works with any API service that implements the OpenAI interface standard:
| Provider | Base URL | Recommended Models |
|---|---|---|
| Zhipu AI | https://open.bigmodel.cn/api/paas/v4/ | glm-4.7, glm-4.7-flash |
| OpenAI | https://api.openai.com/v1 | gpt-5 |
| DeepSeek | https://api.deepseek.com/v1 | deepseek |
| Moonshot | https://api.moonshot.cn/v1 | moonshot-2.5 |
| Local Ollama | http://localhost:11434/v1 | qwen3.5, etc. |
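The precedence rule stated above (environment variable over settings.json, with built-in defaults as a fallback) can be sketched like this; load_setting and DEFAULTS are illustrative, not the project's actual config.py:

```python
import os

# Built-in fallbacks for two of the quality parameters shown above.
DEFAULTS = {"QUALITY_THRESHOLD": 8.0, "MAX_IMPROVEMENT_CYCLES": 5}

def load_setting(name, settings_json):
    """Resolve one setting: environment variable > settings.json > default."""
    if name in os.environ:
        raw = os.environ[name]
        default = DEFAULTS.get(name)
        # Coerce the string from the environment to the default's type.
        if isinstance(default, float):
            return float(raw)
        if isinstance(default, int):
            return int(raw)
        return raw
    if name in settings_json:
        return settings_json[name]
    return DEFAULTS.get(name)

os.environ["QUALITY_THRESHOLD"] = "8.5"
print(load_setting("QUALITY_THRESHOLD", {"QUALITY_THRESHOLD": 7.0}))  # → 8.5
print(load_setting("MAX_IMPROVEMENT_CYCLES", {}))                     # → 5
```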
- Python 3.9+
- A valid API key (for any of the services listed above)
- A working network connection (for real-time web search)
- Memory: 512 MB+ (2 GB or more recommended)
MIT License · Free to use, modify, and distribute.
Made with ❤️ · Issues · Discussions