Export AI chat conversations to durable JSON and Markdown formats.
A Chrome browser extension that extracts conversations from AI chat platforms (Claude.ai, ChatGPT, Gemini, Grok) into portable, machine-readable files. Zero network requests. Zero telemetry. Your conversations never leave your browser.
Acknowledgement and thanks to Jerren@trifall.com for hinting at use of gpt icon actions to capture user and assistant content in his security-first chat-export repo https://github.com/Trifall/chat-export
- 5 Platform Support: Claude, ChatGPT, Gemini, Grok (grok.com), Grok-X (x.com/i/grok)
- Two Export Formats: JSON (canonical, schema v1.0) and Markdown with metadata
- Smart Extraction: Three-pass strategy (clipboard → heuristics → ML fallback)
- Privacy First: All processing happens in-browser. No data sent to external servers.
- Safety Guarantees: Hard limits (500 turns, 60s timeout), integrity checks, circuit breakers
- Cross-Platform: Works on Windows, macOS, Linux (any Chromium browser)
-
Download or clone this repository
git clone https://github.com/fxops-ai/chat-archive.git cd chat-archive git checkout chat-archive -
Load in Chrome (or Edge, Brave, Vivaldi, Arc, Opera)
- Open
chrome://extensions/ - Enable "Developer mode" (toggle in top-right)
- Click "Load unpacked"
- Select the
chat-archivefolder - Extension icon appears in toolbar
- Open
-
Navigate to a supported AI chat platform:
-
Open an existing conversation (must have at least one exchange)
-
Click the Chat Archive extension icon in your toolbar
-
Click "Export JSON" (or "Export Markdown")
-
Choose where to save the file
Your conversation is now a durable, addressable artifact. Use it as:
- Input to scripts or automation
- Training data or evaluation sets
- Documentation or knowledge base content
- Backup before deleting conversations
- Migration between platforms
Chat Archive uses a resilient, multi-layer approach to handle platform DOM differences:
Uses each platform's native copy buttons to extract content. This is the highest-fidelity method because it retrieves the same formatted content the platform gives to users.
- Claude:
button[data-testid="action-bar-copy"]on both user and assistant turns - ChatGPT:
button[data-testid="copy-turn-action-button"]with role fromdata-turnattribute - Gemini:
aria-label="Copy prompt"(user) /data-test-id="copy-button"(assistant) - Grok:
aria-label="Copy"with 6-signal role detection (alignment, button count, styling) - Grok-X:
aria-label="Copy text"with React Native Web class fallbacks
When clipboard extraction fails, uses 7 rule-based classifiers:
- H1: Explicit role attributes (0.95 confidence)
- H2: Alternating pattern detection (0.85)
- H3: Text length asymmetry (0.70)
- H5: Content signals (code blocks, questions) (0.65)
- H7: Code block density (0.60)
Ensemble voting with 0.75 acceptance threshold. Uncertain turns flagged for manual resolution.
Optional micro-classifier (<500KB) for genuinely ambiguous cases. Not yet implemented because Pass 0+1 achieve 95%+ confidence on all verified platforms.
JSON (Canonical Format)
{
"schema_version": "1.0",
"export_metadata": {
"source_platform": "claude.ai",
"source_url": "https://claude.ai/chat/abc-123",
"export_timestamp": "2026-02-15T14:30:00Z",
"total_turns": 24,
"flagged_turns": 0
},
"conversation": [
{
"turn": 1,
"role": "user",
"content": "Hello, can you help me with...",
"classification_confidence": 0.95,
"classification_source": "clipboard"
}
]
}Markdown
# Chat Export — Claude.ai
**Exported:** 2026-02-15 14:30 UTC
**Source:** https://claude.ai/chat/abc-123
**Turns:** 24
---
## User
Hello, can you help me with...
---
## Assistant
Of course! Here's how...Chat Archive makes zero network requests. Conversations never leave your browser.
- ✅ No telemetry, analytics, or crash reporting
- ✅ No external API calls
- ✅ No background service that phones home
- ✅ All processing happens locally in the browser tab
| Permission | Why We Need It | Risk Level |
|---|---|---|
activeTab |
Access current tab DOM for extraction | Minimal - only active tab, only on user action |
downloads |
Write export files to your device | Low - write-only, user chooses save location |
storage |
Cache user classification corrections | Low - never stores conversation content |
clipboardRead |
Read clipboard after programmatic copy-button click | Low - read-only, scoped to extraction flow |
| Host permissions (4 domains) | Inject content script on chat platforms | Scoped - only the 4 target domains |
MAX_TURNS: 500 // Hard cap on turns per export
MAX_EXTRACTION_TIME_MS: 60_000 // Kill switch: abort after 60 seconds
MAX_SINGLE_TURN_SIZE: 100_000 // Flag turns exceeding 100KB
SCROLL_STABILITY_THRESHOLD: 3 // Stop scrolling after 3 unchanged countsAll extraction operations are bounded by hard limits that cannot be overridden. If limits are hit, you get a partial export with clear error messages—never silently truncated data.
| Platform | Test Type | Turns | Result | Confidence | Warnings |
|---|---|---|---|---|---|
| Claude.ai | ARCHIVETEST | 8/8 | ✅ Pass | 0.95 | 0 |
| ChatGPT | ARCHIVETEST | 8/8 | ✅ Pass | 0.99 | 0 |
| Gemini | ARCHIVETEST | 8/8 | ✅ Pass | 0.99 | 0 |
| Grok | ARCHIVETEST | 8/8 | ✅ Pass | 0.95 | 0 |
| Grok | Real conversation | 32/32 | ✅ Pass | 0.95 | 0 |
| Grok-X | ARCHIVETEST | 8/8 | ✅ Pass | 0.95 | 0 |
Tested on: Windows 11 (Chrome 132), macOS (Safari for DOM analysis, Chrome for validation)
ARCHIVETEST Conversation (standardized validation):
USER: "ARCHIVETEST: What are the three primary colors?"
ASST: [response about red, blue, yellow / red, green, blue]
USER: "ARCHIVETEST: Can you write a short 2-line poem about rain?"
ASST: [short poem response]
USER: "ARCHIVETEST: What is 42 * 17?"
ASST: [714, possibly with explanation]
USER: "ARCHIVETEST: Summarize the previous conversation in one sentence."
ASST: [summary response]
All platforms: Zero extraction errors, zero integrity warnings, all roles correctly classified.
Detailed technical documentation is available in the repository:
- chat-archive-architecture.md - Complete system design, three-pass strategy, export formats, security architecture
- DOM_STRATEGY_ANALYSIS.md - How the extension handles platform DOM differences, reverse-engineering methodology
- Appendices in architecture.md:
- Appendix A: Grok DOM Analysis (grok.com)
- Appendix B: Grok-X DOM Analysis (x.com/i/grok)
- Appendix C: Claude DOM Analysis
- Appendix D: Gemini DOM Analysis
- Appendix E: ChatGPT DOM Analysis
Each appendix includes:
- Turn container structure
- Selector stability assessment (HIGH/MEDIUM/LOW ratings)
- Role detection signals with confidence scores
- Complete extraction implementation
- Elements to exclude
- Testing checklists
The extension uses a simple shell script to concatenate source files:
# Make the build script executable (first time only)
chmod +x build.sh
# Build content.js
./build.shOn Windows: Use Git Bash (comes with Git for Windows) to run ./build.sh
chat-archive/
├── manifest.json # Chrome extension manifest (version source of truth)
├── background.js # Service worker (handles downloads)
├── popup.html # Extension popup UI
├── popup.js # Popup logic — platform detection, export trigger
├── content.js # BUILT FILE — generated by build.sh, do not edit directly
├── build.sh # Build script — concatenates src/ into content.js
├── test_v021_patch.js # Unit tests for v0.2.1 HTML escaping patch
├── src/
│ ├── content.js # Main orchestrator — routes extraction by platform
│ ├── utils/
│ │ ├── constants.js # ⚠️ Shared utilities module (not just constants!)
│ │ │ # Contains: SAFETY_LIMITS, EXTENSION_VERSION,
│ │ │ # detectPlatform(), wait(), testClipboardAccess(),
│ │ │ # findActionButton(), clickCopyAndRead(),
│ │ │ # scrollToLoadAll(), findScrollableAncestor(),
│ │ │ # flagIfOversized(), platformToDisplayName()
│ │ │ # Used by ALL five extractors — edit with care
│ │ ├── serializer.js # JSON + Markdown export formatting
│ │ └── filewriter.js # Download via Chrome API + blob fallback
│ ├── extractors/
│ │ ├── claude.js # Claude.ai extraction (Pass 0 + direct fallback)
│ │ ├── chatgpt.js # ChatGPT extraction (Pass 0 + direct fallback)
│ │ ├── gemini.js # Gemini extraction (Pass 0 + direct fallback)
│ │ ├── grok.js # Grok (grok.com) extraction
│ │ └── grok-x.js # Grok (x.com/i/grok) — different DOM, separate extractor
│ └── heuristics/
│ └── pass1.js # Pass 1 structural classification (7 heuristic rules)
└── docs/
├── chat-archive-architecture.md # Full system design and DOM analysis
└── DOM_STRATEGY_ANALYSIS.md # Platform selector strategy and fragility notes
⚠️ Important:src/utils/constants.jsis the shared utilities module for the entire codebase. Despite its name, it contains DOM helpers, scroll handling, clipboard utilities, and button-finding logic used by all five platform extractors. When patching or replacing this file, always diff against the existing version first — stripping it to just constants will break extraction on all platforms. See: v0.2.1 post-mortem for context.
# Make the build script executable (first time only — Mac/Linux)
chmod +x build.sh
# Build content.js from src/ files
./build.sh
# On Windows — Git Bash is required (comes with Git for Windows)
& "C:\Program Files\Git\bin\bash.exe" build.shBuild discipline: Always run build.sh before committing. The root content.js
is what Chrome loads — committing src/ changes without rebuilding leaves the repo
in an inconsistent state where source and built artifact are out of sync.
After rebuilding, reload the extension by fully removing and re-adding it in
chrome://extensions/ or edge://extensions/. The reload button can cache stale builds.
- Make changes to files in
src/ - Run
build.shto rebuild rootcontent.js - Remove Chat Archive from
chrome://extensions/entirely - Load unpacked → select the repo root folder
- Confirm the version number on the extension card matches
manifest.json - Run the ARCHIVETEST conversation on target platform (see architecture doc)
- Export both JSON and Markdown, validate both files
- Check browser console for
[Chat Archive]log messages
| Version | Date | Change |
|---|---|---|
| v0.2.1 | Feb 2026 | Fix: escape HTML entities in Markdown serializer. Patch: restore shared utilities stripped from constants.js during initial patch. |
| v0.2.0 | Feb 2026 | Phase 2: 5-platform clipboard extraction, Markdown export, heuristic classification |
| v0.1.0 | Feb 2026 | Phase 1: Claude.ai only, direct text extraction, JSON export |
- ✅ Pass 0 (clipboard) + Pass 1 (heuristics) extraction
- ✅ 5 platform support (Claude, ChatGPT, Gemini, Grok, Grok-X)
- ✅ JSON and Markdown export
- ✅ Safety limits and integrity checks
- ✅ Cross-platform validation (Windows/Mac)
- User resolution UI for uncertain classifications
- Batch export (multiple conversations from platform history pages)
- Firefox port (minimal namespace polyfill for Manifest V3)
- Optional ML micro-classifier (<500KB, explicit user consent)
- Enhanced metadata (timestamps, model versions, regeneration tracking)
- Artifact extraction (Claude's code/document artifacts)
- Image description preservation (multimodal conversations)
- Custom export templates (user-defined JSON schemas)
- Conversation diffing (compare versions, track edits)
- Safari Web Extension (macOS/iOS if API support improves)
- Encrypted export (optional password protection)
- Cloud sync integration (Google Drive, Dropbox, GitHub Gists)
- Conversation search and indexing
Contributions are welcome! This project is in active development.
-
Platform Testing
- Test on different browsers (Edge, Brave, Opera, Vivaldi)
- Validate on different OS versions
- Report DOM changes when platforms update their UI
-
New Platform Support
- Perplexity.ai
- Pi (Inflection)
- Character.AI
- Poe (multiple models)
-
Feature Development
- User resolution UI (flagged turns)
- Batch export from conversation history pages
- Firefox compatibility layer
-
Documentation
- Video tutorials
- Troubleshooting guide
- Translation to other languages
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Make your changes
- Run tests (create ARCHIVETEST conversation, export, validate)
- Commit (
git commit -m 'Add amazing feature') - Push to your fork (
git push origin feature/amazing-feature) - Open a Pull Request
When platforms update their UI and extraction breaks:
-
Open an issue with:
- Platform name and URL
- Browser and OS version
- Error message from browser console (look for
[Chat Archive]logs) - Screenshots of the page structure (right-click → Inspect Element)
-
Include a minimal reproduction:
- Create a simple 2-turn conversation
- Attempt export
- Share what happened vs. what you expected
[Choose your license - MIT, Apache 2.0, GPL, etc.]
This project is licensed under the License - see the LICENSE file for details.
- Inspired by the need for durable conversation archives in the age of AI assistants
- DOM analysis methodology informed by Chat Export for Claude and similar projects
- Built with love for the AI power-user community
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [john@fxops.ai]
Remember: This extension operates in archive state, not conversation state. The conversations you export become addressable artifacts with destinations beyond the chat interface. Use them wisely.
Chat Archive v0.1.0 - Extracting ephemeral conversations into durable knowledge.