AI-powered XML analysis using Saxon-HE (XPath 3.1, XQuery 3.1, XSLT 3.0) through Claude Desktop's natural language interface.
-
Clone the repository.
git clone https://github.com/newtfire/digitxml-mcp.git
-
Set up a Python virtual environment.
cd digitxml-mcp python3.12 -m venv .venv # MacOS: source .venv/bin/activate # Windows: source .venv\Scripts\activate
-
Install dependencies.
pip install -r requirements.txt
-
Verify installation.
python3.12 diagnose.py
-
Configure Claude Desktop.
# MacOS: nano ~/Library/Application\ Support/Claude/claude_desktop_config.json # Windows: nano %APPDATA%\Claude\claude_desktop_config.json
-
Add the "mcpServers" section below the preferences (replace with YOUR path):
{ "preferences": { "sidebarMode": "chat", "coworkScheduledTasksEnabled": false }, "mcpServers": { "digitxml-mcp": { "command": "/Users/USERNAME/Documents/GitHub/repos/digitxml-mcp/.venv/bin/python", "args": [ "/Users/USERNAME/Documents/GitHub/repos/digitxml-mcp/mcp_server.py" ] } } }
- Restart Claude Desktop (completely quit and reopen)
- Ask Claude:
What XML tools do you have available? - You should see:
xpath_query, xquery_query, xslt_transform...
Find all ingredients in syllabubRecipe.xml
Count the number of paragraphs in chapter1.xml
Extract all @id attributes from the document
Use XQuery to find ingredients not mentioned in any recipe step
Check if all @ref attributes point to valid @id values
Generate a report of all element types and their frequencies
Add @unit="cups" to all quantity elements without units
Convert all dates from MM/DD/YYYY to ISO format
Create an HTML table of contents from section headings
Validate manuscript.xml against the TEI schema
Find all elements that don't follow the schema rules
Check for duplicate @xml:id values
What XML files are in my workspace?
Switch to chapter2.xml
Which file am I currently using?
Drag any XML file into Claude chat:
[drag file] Validate this against my schema
[drag file] Find all person names in this document
For each XML file in my workspace:
- Validate against schema
- Count elements
- Generate summary report
All paths are relative to the installation directory (no absolute paths needed!):
{
"xml_data_path": "./data/syllabubRecipe.xml",
"xml_schema_path": "./schemas/recipe.rnc",
"backup_dir": "./backups",
"log_dir": "./logs"
}Work with multiple files:
{
"xml_data_path": "./data/",
"default_file": "syllabubRecipe.xml",
"xml_schema_path": "./schemas/",
"backup_dir": "./backups"
}| Tool | Purpose | Example |
|---|---|---|
xpath_query |
XPath 3.1 queries | Find all <title> elements |
xquery_query |
XQuery 3.1 queries | Complex data extraction |
xslt_transform |
XSLT 3.0 transforms | Modify document structure |
get_structure_summary |
Analyze structure | Document overview |
find_irregularities |
Validation checks | Find issues |
apply_transformation |
Safe transforms | With validation |
batch_corrections |
Multiple fixes | Systematic updates |
switch_xml_file |
Change files | Work with different docs |
list_workspace_files |
Browse files | See available XMLs |
create_backup |
Save state | Before changes |
reload_document |
Refresh | After external edits |
You: Check if all ingredients in the recipe are actually used in the steps
Claude: [Uses XQuery to cross-reference]
Found 8 ingredients
6 are mentioned in steps
2 unused: vanilla, nutmeg
Would you like me to suggest where to add them?
You: For each XML file in data/, validate against schema and report errors
Claude: Processing 5 files...
✓ syllabubRecipe.xml - Valid
⚠ chapter1.xml - Missing required @id on line 45
✓ chapter2.xml - Valid
✗ notes.xml - Not well-formed (unclosed tag line 12)
✓ metadata.xml - Valid
You: Add a @created attribute with today's date to all <chapter> elements
Claude: [Generates XSLT]
Preview of changes:
Before: <chapter title="Introduction">
After: <chapter title="Introduction" created="2026-02-10">
Apply to all 15 chapters? (I'll create a backup first)
Check paths are relative:
✓ "./data/file.xml"
✗ "/Users/you/data/file.xml"Verify files exist:
ls -la data/Run diagnostic:
python diagnose.pyCheck Claude Desktop logs:
- macOS:
~/Library/Logs/Claude/mcp*.log - Windows:
%APPDATA%\Claude\logs\
Test server manually:
source .venv/bin/activate
python mcp_server.py
# Should start without errorsActivate virtual environment:
source .venv/bin/activate # or .venv\Scripts\activate
pip list | grep saxonche # Should show saxonche 12.9.0Claude Desktop (UI)
↓ MCP Protocol
MCP Server (Python)
↓ Calls
Saxon-HE Engine
↓ Processes
Your XML Files
- Saxon-HE 12.9 - Industry-standard XML processor
- XPath 3.1 - Advanced querying with functions
- XQuery 3.1 - FLWOR expressions, modules
- XSLT 3.0 - Streaming, try/catch, packages
- Relax NG - Schema validation (via jingtrang)
- MCP - Model Context Protocol for tool integration
- ✅ Portable paths - Works on any computer without config changes
- ✅ Read-only safe - Works even in restricted filesystems
- ✅ Multi-file - Switch between documents dynamically
- ✅ Upload support - Drag & drop processing
- ✅ Natural language - No need to remember syntax
- ✅ Safe operations - Automatic backups before changes
- ✅ Professional grade - Saxon-HE is industry standard
digitxml-mcp/
├── mcp_server.py # MCP server (use this!)
├── saxon_mcp_server.py # Saxon engine wrapper
├── config.json # Configuration
├── requirements.txt # Python dependencies
├── diagnose.py # Installation checker
│
├── data/ # Your XML files
│ └── syllabubRecipe.xml # Example
│
├── schemas/ # Validation schemas
│ └── recipe.rnc # Example Relax NG
│
├── backups/ # Automatic backups
└── logs/ # Server logs
- TEI document processing
- Manuscript validation
- Cross-reference checking
- Metadata enrichment
- DocBook transformation
- Link validation
- Format conversion
- Content management
- Batch validation
- Schema compliance
- Consistency checking
- Error reporting
- Multi-format output
- Content transformation
- Quality assurance
- Automated workflows
Edit saxon_mcp_server.py:
self.xpath_proc.declare_namespace('tei', 'http://www.tei-c.org/ns/1.0')Show me all files matching "chapter*.xml"
List files in subdirectory data/manuscripts/
Extract character names from chapter1.xml
Now check if these characters appear in chapter2.xml
Generate a character consistency report
- Saxon Documentation: https://www.saxonica.com/documentation/
- XPath 3.1: https://www.w3.org/TR/xpath-31/
- XQuery 3.1: https://www.w3.org/TR/xquery-31/
- XSLT 3.0: https://www.w3.org/TR/xslt-30/
- Relax NG: https://relaxng.org/
- MCP Protocol: https://modelcontextprotocol.io/
Need help?
- Run
python diagnose.pyto check installation - Check logs for error messages
- Verify paths are relative in config.json
- Test server manually:
python mcp_server.py
Ready to process XML with AI? Configure Claude Desktop and start asking questions in natural language! 🚀