ilmakio/confluence-export-viewer

Confluence Export Viewer

Convert Confluence Space XML exports (ZIP or extracted folder) into Markdown and attachments, then browse them with a local web viewer.

What it does

  • Converter (server/convert_all_confluence_exports.py): Reads a Confluence XML backup (a .xml.zip file or an extracted folder containing entities.xml) and extracts each page as Markdown with YAML frontmatter (created, modified, creator, lastModifier, version, parent, pageId), along with its attachments. The output is a folder named {base_name}_OUT containing pages/, attachments/, and entities.xml.
  • Server (server/server.py): Serves a REST API and static assets. It scans server/data for any folder that contains a pages/ subdir and exposes folders, pages, versions, search, and attachment URLs.
  • Viewer (React app in viewer/): Browse folders, open pages (with version history), view diff between versions, and open attachments.
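
As an illustration of the folder scan described above, here is a minimal Python sketch. It is not the code in server/server.py; the function name find_export_folders is hypothetical, and it assumes that only direct children of the data directory are considered.

```python
from pathlib import Path


def find_export_folders(data_dir: Path) -> list[Path]:
    """Return direct subdirectories of data_dir that contain a pages/ subdir.

    Hypothetical helper: the real server may scan differently
    (for example, recursively).
    """
    if not data_dir.is_dir():
        return []
    return sorted(
        child
        for child in data_dir.iterdir()
        if child.is_dir() and (child / "pages").is_dir()
    )
```

Calling find_export_folders(Path("server/data")) would then return one path per converted export, which is roughly the information GET /api/folders exposes.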

Requirements

  • Python 3 with lxml: pip install -r server/requirements.txt (or pip3 install -r server/requirements.txt). Use python3 to run the converter and server if python is not available.
  • Node.js and pnpm (or npm) for the viewer

Quick start

  1. Clone the repo

    git clone https://github.com/YOUR_USER/confluence-export-viewer.git
    cd confluence-export-viewer
  2. Convert an export

    • Place one or more Confluence export ZIPs (e.g. Space-export-2024-01-15.xml.zip) in server/data.
    • From the project root:
      npm run convert
    • Or run the converter from server/data so input and output stay there:
      cd server/data && python ../convert_all_confluence_exports.py
    • This creates {base_name}_OUT/ with pages/*.md, attachments/, and entities.xml.
  3. Start the app

    npm install
    cd viewer && pnpm install && cd ..
    npm run dev
  4. Open http://localhost:5173, pick a folder from the list, then open pages and versions.

Conversion details

  • Input: Confluence Space XML export — either a .xml.zip file or an already-extracted directory whose name ends in .xml and that contains entities.xml (and optionally an attachments/ tree).
  • Output: For each input, a directory {base_name}_OUT is created with:
    • pages/ — one .md file per page; frontmatter includes created, modified, creator, lastModifier, version, parent, pageId.
    • attachments/ — extracted files (names from Confluence metadata; extensions from MIME when needed).
    • entities.xml — copied so the viewer can map attachment IDs to pages.
  • Options: --no-version discards previous versions and keeps only the latest version of each page and attachment.
  • Where to put output: Place each *_OUT folder inside server/data so the server can list and serve it.
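
To show what consuming a converted page could look like, the sketch below splits a pages/*.md file into its frontmatter fields and Markdown body. It is an assumption-laden illustration, not the project's parser: split_frontmatter is a hypothetical name, and it handles only flat key: value lines between two --- delimiters, with every value kept as a string. A full YAML parser would be needed for nested or multi-line values.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split Markdown text into (frontmatter fields, body).

    Hypothetical helper: supports only flat `key: value` lines between
    an opening and a closing `---` line. Values stay strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at the top
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # opening delimiter without a closing one
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip("\"'")
    body = "\n".join(lines[end + 1:])
    return meta, body
```

For a file that starts with a block such as created, creator, and pageId lines, the first return value holds those fields and the second holds the page body.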

Viewer features

  • Folders: List of converted exports (from server/data); filter by name.
  • Pages: List of pages in the selected folder (by date, newest first); favorites supported.
  • Page content: Rendered Markdown, metadata (author, dates), and list of attachments.
  • Versions: Side panel with version history; select a version to load it. A diff view compares the current version with the previous one.
  • Attachments: Per-page or per-folder attachment list; open images/videos in a dialog or download other files.
  • Search: Full-text search over page body (all folders or current folder only).
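
The version diff is easy to reproduce outside the viewer. The sketch below uses Python's standard difflib to compare two versions of a page body line by line. This is only an illustration of the idea: the viewer is a React app and presumably computes its diff in the browser, and diff_versions is a hypothetical name.

```python
import difflib


def diff_versions(previous: str, current: str) -> str:
    """Return a unified diff between two versions of a page body.

    Hypothetical helper for illustration; returns an empty string
    when the two versions are identical.
    """
    lines = difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",  # inputs have no trailing newlines, so emit none
    )
    return "\n".join(lines)
```

Lines prefixed with - exist only in the previous version, and lines prefixed with + exist only in the current one.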

API (server)

Endpoint                                                        Description
GET /api/folders                                                List folders (each has name, path, page_count, attachment_count)
GET /api/folder/pages?folder=<path>                             List pages in a folder (filename, title, metadata)
GET /api/page?folder=<path>&page=<filename>&version=current|vN  Page content (markdown + frontmatter), attachments, filename_map
GET /api/page/versions?folder=<path>&page=<filename>            List versions (id, label, date)
GET /api/folder/attachments?folder=<path>                       List all attachments in a folder
GET /api/search?q=<query>&folder=<path>                         Search in page body (optional folder scope)
GET /attachments/<filename>                                     Serve an attachment file

Development

  • Run both server and viewer: npm run dev (uses concurrently).
  • Run separately: npm run server (Python server on port 8000) and npm run viewer (Vite dev server on port 5173, proxying to port 8000).
  • Build viewer for production: npm run build (output in viewer/dist). You can then serve the API from the Python server and host the built frontend from the same origin (e.g. copy viewer/dist into server and point the server at index.html for /).

License

MIT — see LICENSE.
