Convert Confluence Space XML exports (ZIP or extracted folder) into Markdown and attachments, then browse them with a local web viewer.
- Converter (`server/convert_all_confluence_exports.py`): Reads Confluence XML backup (`.xml.zip` or extracted folder with `entities.xml`), extracts pages as Markdown with YAML frontmatter (created, modified, creator, parent, pageId) and attachments. Output is a folder `{name}_OUT` with `pages/`, `attachments/`, and `entities.xml`.
- Server (`server/server.py`): Serves a REST API and static assets. It scans `server/data` for any folder that contains a `pages/` subdir and exposes folders, pages, versions, search, and attachment URLs.
- Viewer (React app in `viewer/`): Browse folders, open pages (with version history), view diff between versions, and open attachments.
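
The folder scan described for the server can be pictured with a short sketch. This is not the code in `server/server.py`; the function name `find_export_folders`, the recursive walk, and the returned keys are assumptions made for illustration, loosely following the `name`, `path`, and `page_count` fields listed for `GET /api/folders`.

```python
import os
from pathlib import Path


def find_export_folders(data_dir):
    """Return one dict per directory under data_dir that has a pages/ subdirectory.

    Hypothetical helper: the real server may scan differently (for example,
    only one level deep).
    """
    root = Path(data_dir)
    folders = []
    for current, dirnames, _filenames in os.walk(root):
        if "pages" in dirnames:
            folder = Path(current)
            folders.append({
                "name": folder.name,
                "path": folder.relative_to(root).as_posix(),
                "page_count": len(list((folder / "pages").glob("*.md"))),
            })
            # A converted export was found; do not descend into it.
            dirnames.clear()
    return sorted(folders, key=lambda item: item["path"])


if __name__ == "__main__":
    for entry in find_export_folders("server/data"):
        print(entry["path"], entry["page_count"])
```

Folders without a `pages/` subdirectory are skipped, which matches the rule stated above.
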
- Python 3 with `lxml`: `pip install -r server/requirements.txt` (or `pip3 install -r server/requirements.txt`). Use `python3` to run the converter and server if `python` is not available.
- Node.js and pnpm (or npm) for the viewer
- Clone the repo: `git clone https://github.com/YOUR_USER/confluence-export-viewer.git`, then `cd confluence-export-viewer`.
- Convert an export
  - Place one or more Confluence export ZIPs (e.g. `Space-export-2024-01-15.xml.zip`) in `server/data`.
  - From the project root: `npm run convert`
  - Or run the converter from `server/data` so input and output stay there: `cd server/data && python ../convert_all_confluence_exports.py`
  - This creates `{basename}_OUT/` with `pages/*.md`, `attachments/`, and `entities.xml`.
- Start the app: run `npm install`, then `cd viewer && pnpm install && cd ..`, then `npm run dev`.
  - Server: http://localhost:8000
  - Viewer: http://localhost:5173 (proxies `/api` and `/attachments` to the server)
- Open http://localhost:5173, pick a folder from the list, then open pages and versions.
- Input: Confluence Space XML export, either a `.xml.zip` file or an already-extracted directory whose name ends in `.xml` and that contains `entities.xml` (and optionally an `attachments/` tree).
- Output: For each input, a directory `{base_name}_OUT` is created with:
  - `pages/`: one `.md` file per page; frontmatter includes `created`, `modified`, `creator`, `lastModifier`, `version`, `parent`, `pageId`.
  - `attachments/`: extracted files (names from Confluence metadata; extensions from MIME when needed).
  - `entities.xml`: copied so the viewer can map attachment IDs to pages.
- Options:
  - `--no-version`: do not keep previous versions (only latest page and attachment version).
- Where to put output: Place each `*_OUT` folder inside `server/data` so the server can list and serve it.
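
To use the converted pages outside the viewer, the frontmatter can be separated from the Markdown body. The sketch below is a minimal illustration, not the converter's or server's own parser: it assumes the frontmatter is a block of flat `key: value` lines between two `---` delimiters, and the function name `split_frontmatter` and the sample values are made up for the example. For nested or multi-line YAML values a full YAML parser would be the safer choice.

```python
def split_frontmatter(text):
    """Split a Markdown document into (metadata dict, body string).

    Assumes flat `key: value` frontmatter between two `---` lines.
    Returns ({}, text) when no complete frontmatter block is present.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the page body.
            return meta, "\n".join(lines[index + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip("\"'")
    return {}, text  # closing delimiter never found


if __name__ == "__main__":
    # Hypothetical page content; real field values may be formatted differently.
    sample = (
        "---\n"
        "created: 2024-01-15T10:00:00\n"
        "creator: alice\n"
        "pageId: 12345\n"
        "---\n"
        "# Home\n"
    )
    meta, body = split_frontmatter(sample)
    print(meta["pageId"], body)
```
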
- Folders: List of converted exports (from `server/data`); filter by name.
- Pages: List of pages in the selected folder (by date, newest first); favorites supported.
- Page content: Rendered Markdown, metadata (author, dates), and list of attachments.
- Versions: Side panel with version history; select a version to load it. Diff view (current vs previous version) available.
- Attachments: Per-page or per-folder attachment list; open images/videos in a dialog or download other files.
- Search: Full-text search over page body (all folders or current folder only).
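
The version diff listed above can be reproduced offline for two saved versions of a page. This sketch uses Python's standard `difflib` and is only an illustration of the idea; the viewer is a React app and its actual diff implementation is not described here. The function name `diff_versions` and the labels are assumptions.

```python
import difflib


def diff_versions(previous, current, previous_label="previous", current_label="current"):
    """Return a unified diff between two versions of a page's Markdown text.

    Returns an empty string when the two versions are identical.
    """
    diff_lines = difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile=previous_label,
        tofile=current_label,
        lineterm="",
    )
    return "\n".join(diff_lines)


if __name__ == "__main__":
    old = "# Release notes\n\nShip on Monday.\n"
    new = "# Release notes\n\nShip on Tuesday.\n"
    print(diff_versions(old, new, "v1", "current"))
```

Lines starting with `-` come from the previous version and lines starting with `+` from the current one.
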
| Endpoint | Description |
|---|---|
| `GET /api/folders` | List folders (each has name, path, page_count, attachment_count) |
| `GET /api/folder/pages?folder=<path>` | List pages in a folder (filename, title, metadata) |
| `GET /api/page?folder=<path>&page=<filename>&version=current\|vN` | Page content (markdown + frontmatter), attachments, filename_map |
| `GET /api/page/versions?folder=<path>&page=<filename>` | List versions (id, label, date) |
| `GET /api/folder/attachments?folder=<path>` | List all attachments in a folder |
| `GET /api/search?q=<query>&folder=<path>` | Search in page body (optional folder scope) |
| `GET /attachments/<filename>` | Serve an attachment file |
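
The endpoints above can also be called from a script. The following sketch builds request URLs with the standard library and assumes the server is running on `http://localhost:8000` and that the API responds with JSON; the response shape is not documented here, so the example only prints what it receives. The helper names `api_url` and `get_json`, and the folder and page names in the comments, are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # default server address from this README


def api_url(path, **params):
    """Build a full URL for an API path, skipping parameters set to None."""
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


def get_json(path, **params):
    """Fetch an endpoint and decode the body, assuming it is JSON."""
    with urlopen(api_url(path, **params)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires a running server; prints whatever the API returns.
    print(get_json("/api/folders"))
    # Search with an optional folder scope, e.g. folder="Space_OUT" (made-up name).
    print(get_json("/api/search", q="release notes", folder=None))
```

Query values are URL-encoded, so folder paths and search terms containing spaces or slashes are passed safely.
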
- Run both server and viewer: `npm run dev` (uses `concurrently`).
- Run separately: `npm run server` (Python server on port 8000), `npm run viewer` (Vite dev server on 5173 with proxy to 8000).
- Build viewer for production: `npm run build` (output in `viewer/dist`). You can then serve the API from the Python server and host the built frontend from the same origin (e.g. copy `viewer/dist` into `server` and point the server at `index.html` for `/`).
MIT — see LICENSE.