Document processing toolkit for Word, PowerPoint, Excel, and CSV files.
Bytes in, bytes out. Pure processing logic with no file I/O assumptions — use it from CLI scripts, web apps, or any Python program.
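To make the bytes-in, bytes-out idea concrete, here is a minimal sketch of a function written in that style. It does not use dockit itself: `lowercase_headers` is a hypothetical stand-in, built only on the standard library, and dockit's real functions may be structured differently.

```python
import csv
import io


def lowercase_headers(data: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical bytes-in, bytes-out tool: lowercase the CSV header row."""
    # Parse entirely in memory; no file paths are involved.
    reader = csv.reader(io.StringIO(data.decode(encoding), newline=""))
    rows = list(reader)
    if rows:
        rows[0] = [cell.lower() for cell in rows[0]]

    # Serialize back to bytes so the caller decides where the result goes.
    out = io.StringIO(newline="")
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue().encode(encoding)


# The input could come from a file, an HTTP upload, or a message queue.
result = lowercase_headers(b"Name,AGE\nAda,36\n")
```

Because the function never opens a file, the same code can sit behind a CLI script or a web upload handler without changes.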
https://dockit.tianlizeng.cloud
Upload your document, pick a tool, download the result. Zero setup.
| Tool | What it does | Input | Output |
|---|---|---|---|
| Word Format Fix | Fix quotes, punctuation, units in .docx | Word file | Word file |
| Word Text Extract | Full text extraction to Markdown | Word file | Markdown |
| Word Style Cleanup | Remove unused styles, rename/merge styles | Word file | Word file |
| Format Convert | Convert between XLSX, CSV, TXT | Spreadsheet | Spreadsheet |
| PPT Standardize | Unify fonts, fix text, set table styles | PowerPoint | PowerPoint |
| PPT to Markdown | Extract slide text with speaker notes | PowerPoint | Markdown |
| Table Merge | Merge multiple spreadsheets by column matching | Excel files | Excel file |
| Table Standardize | Check headers, reorder, auto-complete | Excel file | Excel file |
| Chart Generation | Bar charts, Gantt charts, flow diagrams | JSON config | PNG |
| Markdown to Word | Convert Markdown to styled Word document | Markdown | Word file |
| Format Inspection | Inspect document structure and styles | Word file | Report |
| Revision Marks | Add revision marks to Word documents | Word file | Word file |
| Image Captions | Add captions to images in Word documents | Word file | Word file |
```bash
# From GitHub (recommended)
pip install git+https://github.com/zengtianli/dockit.git

# For local development
git clone https://github.com/zengtianli/dockit.git
cd dockit && pip install -e .
```

```python
from dockit.docx import format_text

with open("input.docx", "rb") as f:
    doc_bytes = f.read()

result = format_text(doc_bytes, fix_quotes=True, fix_punctuation=True, fix_units=True)

with open("output.docx", "wb") as f:
    f.write(result.data)

print(result.stats)  # {"quotes": 5, "punctuation": 12, "units": 3}
```

- Fix quote pairing (smart Chinese quotes)
- Convert English punctuation to Chinese equivalents
- Convert Chinese unit names to standard symbols (e.g. 平方米 → m²)
- Format text in Word documents (quotes, punctuation, units)
- Quote font splitting (set a specific font for quote characters)
- Process paragraphs, tables, headers, and footers
- Unify fonts across all slides and masters
- Fix text formatting (quotes, punctuation, units)
- Set table style options (header row, banded rows, first column)
- One-click standardization (all of the above)
- Convert between XLSX, CSV, and TXT formats
- Split workbook into per-sheet files
- Lowercase column headers
- Convert legacy .xls to .xlsx
- Auto-detect delimiters
- Convert between CSV and delimited text
- Replace circled numbers with plain format
- Reorder rows by a reference list
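A few of the text operations listed above can be sketched with the standard library alone. This is an illustration, not dockit's implementation: the function names, the mapping tables, the rule for when punctuation is converted, and the `(1)` output style for circled numbers are all assumptions made for the example.

```python
import re

# Illustrative mappings only; dockit's actual tables are not shown in this README.
UNIT_SYMBOLS = {"平方米": "m²", "立方米": "m³", "千克": "kg"}
PUNCTUATION = {",": "，", ";": "；", ":": "：", "?": "？", "!": "！"}
# ① (U+2460) through ⑳ (U+2473), mapped to an assumed "(n)" plain format.
CIRCLED = {chr(0x2460 + i): f"({i + 1})" for i in range(20)}

# Match an ASCII mark only when it directly follows a CJK character.
_AFTER_CJK = re.compile(r"(?<=[\u4e00-\u9fff])([,;:?!])")


def convert_units(text: str) -> str:
    """Replace Chinese unit names with symbols, longest names first."""
    for name in sorted(UNIT_SYMBOLS, key=len, reverse=True):
        text = text.replace(name, UNIT_SYMBOLS[name])
    return text


def to_chinese_punctuation(text: str) -> str:
    """Convert ASCII punctuation that follows Chinese text; leave "3,000" alone."""
    return _AFTER_CJK.sub(lambda m: PUNCTUATION[m.group(1)], text)


def replace_circled_numbers(text: str) -> str:
    """Swap circled digits such as ① for a plain "(1)" form."""
    return "".join(CIRCLED.get(ch, ch) for ch in text)


print(convert_units("面积为50平方米"))
print(to_chinese_punctuation("你好,world 3,000"))
print(replace_circled_numbers("①准备 ②执行"))
```

Restricting the punctuation rule to marks that follow a CJK character is one way to avoid rewriting numbers and English phrases; a real tool would likely need more context than this.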
Run your own instance with Docker or directly:
```bash
# Docker
docker build -t dockit .
docker run -p 8503:8503 dockit

# Or run directly (quotes keep shells such as zsh from expanding the brackets)
pip install -e ".[web]"
streamlit run app/app.py
```

Or just use the hosted version: https://dockit.tianlizeng.cloud
License: MIT
