27 changes: 25 additions & 2 deletions README.md
@@ -93,7 +93,7 @@ Point the MCP client to the local build output:

| Stage | Tools | Description |
| --- | --- | --- |
| PDF parsing | `doc2x_parse_pdf_submit` / `doc2x_parse_pdf_status` / `doc2x_parse_pdf_wait_text` | Submit a task, query its status, wait for and fetch the text |
| PDF parsing | `doc2x_parse_pdf_submit` / `doc2x_parse_pdf_status` / `doc2x_parse_pdf_wait_text` / `doc2x_materialize_pdf_layout_json` | Submit a task, query its status, wait for and fetch the text, or write the v3 layout result to a local JSON file |
| Result export | `doc2x_convert_export_submit` / `doc2x_convert_export_result` / `doc2x_convert_export_wait` | Start an export, query the result, wait for the export to finish |
| Download to disk | `doc2x_download_url_to_file` / `doc2x_materialize_convert_zip` | Download a URL to a local path, unpack the convert zip |
| Image layout parsing | `doc2x_parse_image_layout_sync` / `doc2x_parse_image_layout_submit` / `doc2x_parse_image_layout_status` / `doc2x_parse_image_layout_wait_text` | Sync/async image OCR and layout parsing |
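@@ -102,7 +102,7 @@ Point the MCP client to the local build output:
### PDF parse model (`doc2x_parse_pdf_submit` / `doc2x_parse_pdf_wait_text`)

- Optional parameter: `model`
- Allowed value: `v3-2026` (latest model)
- Allowed values: `v2` (default) / `v3-2026` (latest model)
- Defaults to `v2` when omitted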

```json
@@ -111,6 +111,23 @@ Point the MCP client to the local build output:
}
```

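### Writing PDF layout JSON to disk (`doc2x_materialize_pdf_layout_json`)

- Required parameter: `output_path`
- Provide either `uid` or `pdf_path`
- `v2` does not support `layout`; use `v3-2026` when you need `pages[].layout`
- If `pdf_path` is passed without `model`, this tool defaults to `v3-2026`
- On success, the raw `result` JSON is written to a local file

`layout` holds the page block structure and coordinates, which suits figure/table cropping, region highlighting, structured extraction, and layout analysis. If you only want the body text, prefer the Markdown / DOCX export.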

```json
{
"pdf_path": "/absolute/path/to/input.pdf",
"output_path": "/absolute/path/to/input_v3.layout.json"
}
```
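
The parameter rules above can be sketched as a small validation step. This is a minimal illustration, not the tool's actual implementation: the function name `normalizeMaterializeArgs` and the thrown error messages are hypothetical, and it assumes "either `uid` or `pdf_path`" means exactly one of the two. Only the field names and the `v2` / `v3-2026` values come from the README.

```typescript
// Hypothetical argument shape; only the field names come from the README.
type LayoutModel = "v2" | "v3-2026";

interface MaterializeLayoutArgs {
  output_path: string;
  uid?: string;
  pdf_path?: string;
  model?: LayoutModel;
}

// Applies the documented rules and returns a copy with the default model filled in.
// Throwing is an assumption; the real tool may report errors differently.
function normalizeMaterializeArgs(args: MaterializeLayoutArgs): MaterializeLayoutArgs {
  if (!args.output_path) {
    throw new Error("output_path is required");
  }
  const hasUid = typeof args.uid === "string" && args.uid.length > 0;
  const hasPdf = typeof args.pdf_path === "string" && args.pdf_path.length > 0;
  if (hasUid === hasPdf) {
    // Assumed reading of "either uid or pdf_path": exactly one must be set.
    throw new Error("provide exactly one of uid or pdf_path");
  }
  if (args.model === "v2") {
    throw new Error("v2 does not support layout; use v3-2026");
  }
  if (hasPdf && args.model === undefined) {
    // Documented default when pdf_path is given without a model.
    return { ...args, model: "v3-2026" };
  }
  return { ...args };
}
```

Calling it with only `pdf_path` and `output_path`, as in the JSON example above, would yield the same arguments plus `model: "v3-2026"`.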

### Export formula parameters (`doc2x_convert_export_submit` / `doc2x_convert_export_wait`)

- Required parameter: `formula_mode` (`normal` / `dollar`)
@@ -134,6 +151,12 @@ Point the MCP client to the local build output:
1. Call `doc2x_parse_image_layout_sync` to parse synchronously.
2. If you need steady polling, use the submit/status/wait combination instead.

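### Workflow 3: PDF -> local v3 layout JSON file

1. Call `doc2x_materialize_pdf_layout_json`, passing `pdf_path` and `output_path`.
2. The tool waits for the parse to succeed and writes the raw `result` JSON to a local file.
3. The JSON can be fed directly to downstream figure/table cropping scripts.

## Local development

### Requirements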
27 changes: 25 additions & 2 deletions README_EN.md
@@ -93,7 +93,7 @@ Point MCP client to your local build output:

| Stage | Tools | Purpose |
| --- | --- | --- |
| PDF parse | `doc2x_parse_pdf_submit` / `doc2x_parse_pdf_status` / `doc2x_parse_pdf_wait_text` | Submit parse tasks, check status, wait and fetch text |
| PDF parse | `doc2x_parse_pdf_submit` / `doc2x_parse_pdf_status` / `doc2x_parse_pdf_wait_text` / `doc2x_materialize_pdf_layout_json` | Submit parse tasks, check status, wait and fetch text, or materialize v3 layout JSON locally |
| Export | `doc2x_convert_export_submit` / `doc2x_convert_export_result` / `doc2x_convert_export_wait` | Start export, read export result, wait for completion |
| Download | `doc2x_download_url_to_file` / `doc2x_materialize_convert_zip` | Download export URL to local path, materialize convert zip |
| Image layout parse | `doc2x_parse_image_layout_sync` / `doc2x_parse_image_layout_submit` / `doc2x_parse_image_layout_status` / `doc2x_parse_image_layout_wait_text` | Sync/async OCR and layout parse for images |
@@ -102,7 +102,7 @@ Point MCP client to your local build output:
### PDF Parse Model (`doc2x_parse_pdf_submit` / `doc2x_parse_pdf_wait_text`)

- Optional parameter: `model`
- Supported value: `v3-2026` (latest model)
- Supported values: `v2` (default) / `v3-2026` (latest model)
- Default (when omitted): `v2`

```json
@@ -111,6 +111,23 @@ Point MCP client to your local build output:
}
```
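
The `model` defaulting rule for the parse tools can be illustrated with a short helper. This is a sketch under stated assumptions: `resolveParseModel` is a hypothetical name, and rejecting unknown values with an exception is a guess at reasonable behavior, not something the README specifies.

```typescript
// The two values listed in the README after this change.
type ParseModel = "v2" | "v3-2026";

// Returns the model to use for a parse request.
// Omitted -> "v2" (documented default); unknown values are rejected (assumed behavior).
function resolveParseModel(model?: string): ParseModel {
  if (model === undefined) {
    return "v2";
  }
  if (model === "v2" || model === "v3-2026") {
    return model;
  }
  throw new Error(`unsupported model: ${model}`);
}
```

Note that this default differs from `doc2x_materialize_pdf_layout_json`, which the README says falls back to `v3-2026` when `pdf_path` is given without `model`.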

### PDF Layout JSON Materialization (`doc2x_materialize_pdf_layout_json`)

- Required: `output_path`
- Provide either `uid` or `pdf_path`
- `v2` does not support `layout`; use `v3-2026` when `pages[].layout` is required
- When `pdf_path` is used and `model` is omitted, this tool defaults to `v3-2026`
- On success it writes the raw parse `result` JSON locally

`layout` contains page block structure and coordinates, which is useful for figure/table crops, region highlighting, structured extraction, and layout analysis. If the goal is readable full text, prefer Markdown / DOCX export.

```json
{
"pdf_path": "/absolute/path/to/input.pdf",
"output_path": "/absolute/path/to/input_v3.layout.json"
}
```
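
For context, an MCP client delivers these arguments to the server in a JSON-RPC `tools/call` request. The sketch below builds such a message by hand to show where the JSON above ends up; `buildToolCall` and the `ToolCallRequest` type are illustrative names, and in practice an MCP client library would normally construct and send the message for you.

```typescript
// Minimal shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a request object.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The README's example arguments, wrapped for the layout tool.
const layoutRequest = buildToolCall(1, "doc2x_materialize_pdf_layout_json", {
  pdf_path: "/absolute/path/to/input.pdf",
  output_path: "/absolute/path/to/input_v3.layout.json",
});
```

Serializing `layoutRequest` with `JSON.stringify` gives the payload a stdio transport would write to the server, one message per line.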

### Export Formula Parameters (`doc2x_convert_export_submit` / `doc2x_convert_export_wait`)

- Required: `formula_mode` (`normal` / `dollar`)
@@ -134,6 +151,12 @@ Point MCP client to your local build output:
1. Use `doc2x_parse_image_layout_sync` for a direct, synchronous parse.
2. For more robust polling, switch to the submit/status/wait flow.

### Workflow 3: PDF -> local v3 layout JSON

1. Call `doc2x_materialize_pdf_layout_json` with `pdf_path` and `output_path`.
2. The tool waits for parse success and writes the raw `result` JSON locally.
3. The saved JSON can be consumed directly by downstream figure/table crop scripts.
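
Before handing the saved file to a crop script, it can help to confirm that every page actually carries a `layout` entry, since a `v2` result would not. The check below is a sketch that assumes only what the README states, namely that layout lives at `pages[].layout`; the `ParseResultLike` type and `pagesMissingLayout` function are hypothetical, and the contents of `layout` are treated as opaque.

```typescript
// Assumed minimal shape: only `pages[].layout` is named in the README.
interface ParseResultLike {
  pages?: Array<{ layout?: unknown }>;
}

// Returns the zero-based indices of pages that carry no `layout` value.
function pagesMissingLayout(result: ParseResultLike): number[] {
  const missing: number[] = [];
  const pages = result.pages ?? [];
  pages.forEach((page, index) => {
    if (page.layout === undefined || page.layout === null) {
      missing.push(index);
    }
  });
  return missing;
}

// Example with an inline stand-in for the saved file's contents.
const sampleResult: ParseResultLike = JSON.parse('{"pages":[{"layout":{}},{}]}');
const missingPages = pagesMissingLayout(sampleResult);
```

An empty return value means every page has layout data; a non-empty one suggests the task was parsed with a model that does not emit `layout`.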

## Local Development

### Requirements
6 changes: 3 additions & 3 deletions package.json
@@ -1,6 +1,6 @@
{
"name": "@noedgeai-org/doc2x-mcp",
"version": "0.1.3",
"version": "0.1.4",
"description": "Doc2x MCP server (stdio, MCP SDK).",
"license": "MIT",
"engines": {
@@ -31,13 +31,13 @@
"skill:install:ps": "pwsh -NoProfile -ExecutionPolicy Bypass -File scripts/install-skill.ps1",
"skill:install:winps": "powershell -NoProfile -ExecutionPolicy Bypass -File scripts/install-skill-winps.ps1",
"start": "node dist/index.js",
"test:unit": "npm run build && node --test test/unit/registerToolsShared.test.js",
"test:unit": "npm run build && node --test test/unit/registerToolsShared.test.js test/unit/materialize.test.js",
"test:e2e": "npm run build && node --test test/e2e/mcpServer.e2e.test.js",
"test": "npm run test:unit && npm run test:e2e",
"prepublishOnly": "pnpm run build"
},
"dependencies": {
"@modelcontextprotocol/sdk": "1.26.0",
"@modelcontextprotocol/sdk": "1.27.1",
"@types/lodash": "^4.17.23",
"lodash": "4.17.23",
"lru-cache": "^11.2.6",
76 changes: 38 additions & 38 deletions pnpm-lock.yaml

Some generated files are not rendered by default.
