A Cloudflare Worker that extracts the main content of any web page and returns clean Markdown. Built on top of Defuddle with special handling for X/Twitter posts, including text, media, polls, quotes, and long-form Articles.
🔗 Live demo: defuddle.thieunv.workers.dev
Examples:
# Regular web page
https://defuddle.thieunv.workers.dev/vividkit.dev
# X/Twitter post
https://defuddle.thieunv.workers.dev/x.com/thieunguyen_it/status/2021461660310044828
# X Article (long-form with multiple media types)
https://defuddle.thieunv.workers.dev/x.com/trq212/status/2024574133011673516

| Platform | Method | Details |
|---|---|---|
| Any web page | Defuddle + Turndown | Smart content extraction, strips ads/nav/footers |
| X / Twitter | FxTwitter API | Posts, articles, media, polls, quotes, threads |
| Facebook | Custom extractor | Public posts and content |
| Substack | Defuddle built-in | Newsletter articles (new in Defuddle 0.15) |
| YouTube | Defuddle built-in | Video metadata, transcripts |
| | Defuddle built-in | Posts and comments |
| GitHub | Defuddle built-in | Issues, READMEs, discussions |
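
The table suggests a dispatch step: the request path carries the target URL without its scheme, and the hostname decides which extractor handles it. Here is a minimal sketch of that idea. The names `toTargetUrl`, `pickExtractor`, and the `Extractor` labels are hypothetical and not taken from the project's source, and the hostname list is an assumption.

```typescript
// Hypothetical extractor labels; the real project may name these differently.
type Extractor = "x-twitter" | "facebook" | "web-page";

// Turn a Worker request path such as "/x.com/user/status/1" into a full URL.
// Assumes the scheme is omitted from the path and defaults to https.
function toTargetUrl(pathname: string): URL {
  const raw = pathname.replace(/^\/+/, "");
  const withScheme = /^https?:\/\//i.test(raw) ? raw : `https://${raw}`;
  return new URL(withScheme);
}

// Choose an extractor from the hostname, mirroring the platform table.
function pickExtractor(target: URL): Extractor {
  // Strip common subdomain prefixes so "www.facebook.com" matches "facebook.com".
  const host = target.hostname.replace(/^(www|m|mobile)\./, "");
  if (host === "x.com" || host === "twitter.com") return "x-twitter";
  if (host === "facebook.com") return "facebook";
  // Everything else is assumed to go through the generic Defuddle path,
  // which covers the "Defuddle built-in" rows.
  return "web-page";
}
```

With this sketch, `pickExtractor(toTargetUrl("/x.com/thieunguyen_it/status/2021461660310044828"))` selects the X/Twitter path, while `/vividkit.dev` falls through to the generic web page extractor.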
- Any web page → Markdown via Defuddle + Turndown
- X/Twitter posts → rich Markdown via the FxTwitter API
  - Tweet text with `t.co` link expansion
  - Photos, videos, GIFs with thumbnails & duration
  - X Articles (long-form DraftJS content with inline media)
  - Quote tweets with media
  - Polls with visual progress bars
  - Engagement stats (likes, retweets, replies, views)
  - Community notes, replying-to context, broadcasts
  - External media (YouTube embeds, etc.)
- Facebook posts → Markdown with custom extractor
- JSON and Markdown output formats
- CORS support
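
To illustrate the "polls with visual progress bars" feature, here is one way a poll could be rendered as plain-text bars in Markdown. This is a sketch under assumptions: the `PollChoice` shape and the `renderPoll` helper are hypothetical, and neither reflects the FxTwitter API schema or the project's actual renderer.

```typescript
// Hypothetical shape for a poll option; not the FxTwitter API's actual schema.
interface PollChoice {
  label: string;
  votes: number;
}

// Render each choice as a fixed-width text bar followed by its percentage.
function renderPoll(choices: PollChoice[], width = 20): string {
  const total = choices.reduce((sum, c) => sum + c.votes, 0);
  return choices
    .map((c) => {
      // Avoid dividing by zero when nobody has voted yet.
      const share = total === 0 ? 0 : c.votes / total;
      const filled = Math.round(share * width);
      const bar = "█".repeat(filled) + "░".repeat(width - filled);
      return `- ${c.label}: ${bar} ${(share * 100).toFixed(1)}%`;
    })
    .join("\n");
}
```

Each choice becomes one list item, so the result stays readable in any Markdown viewer without images or HTML.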
# Get any web page as Markdown
curl https://<your-worker>.workers.dev/medium.com/@richardhightower/claude-code-todos-to-tasks-5a1b0e351a1c
# Get an X/Twitter post
curl https://<your-worker>.workers.dev/x.com/thieunguyen_it/status/2021461660310044828
# Get an X Article (long-form with multiple media types)
curl https://<your-worker>.workers.dev/x.com/trq212/status/2024574133011673516
# Get JSON output
curl -H 'Accept: application/json' https://<your-worker>.workers.dev/x.com/thieunguyen_it/status/2021461660310044828

- Node.js ≥ 18
- A Cloudflare account (free tier works)
# Clone the repo
git clone <repo-url>
cd defuddle
# Install dependencies
npm install
# Start local dev server
npm run dev

The worker will be available at http://localhost:8787.
# Test locally
curl http://localhost:8787/x.com/thieunguyen_it/status/2021461660310044828

npm test

- Log in to the Cloudflare CLI

  npx wrangler login

- Deploy

  npm run deploy

  This runs `wrangler deploy`, which:
  - Bundles the TypeScript source
  - Uploads to Cloudflare Workers
  - Assigns a `*.workers.dev` subdomain

- Verify

  curl https://defuddle.<your-subdomain>.workers.dev/example.com

- Go to Cloudflare Dashboard → Workers & Pages → `defuddle` → Settings → Domains & Routes
- Add a custom domain (must be on Cloudflare DNS) or a route pattern
The worker config is in wrangler.jsonc:
Key settings:
- `nodejs_compat`: required for the `linkedom` DOM parser used by Defuddle
- `observability.enabled`: enables Workers logs in the dashboard
src/
├── index.ts # Worker entry point, request routing
├── convert.ts # Orchestrator: routes URLs to extractors
├── convert-types.ts # Shared types (ConvertResult, ConvertOptions)
├── web-page-extractor.ts # Generic web page extraction (Defuddle + Turndown)
├── x-twitter-fetcher.ts # X/Twitter post extraction via FxTwitter API
├── x-twitter-types.ts # X/Twitter type definitions
├── x-twitter-media-renderer.ts # X/Twitter media to markdown
├── x-twitter-text-processor.ts # X/Twitter text processing utilities
├── draftjs-to-markdown-converter.ts # DraftJS → Markdown for X Articles
├── facebook-fetcher.ts # Facebook post extraction
└── polyfill.ts # Workers runtime polyfills for DOM APIs
Extracts content from the given URL.
Response formats:
- `text/markdown` (default): Markdown with YAML frontmatter
- `application/json`: set the `Accept: application/json` header
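
The choice between the two formats comes down to reading the `Accept` header. A minimal sketch of that negotiation follows, assuming JSON is returned only when it is explicitly requested and Markdown otherwise. `negotiateFormat` and `contentTypeFor` are hypothetical helper names, not functions from the project.

```typescript
type OutputFormat = "markdown" | "json";

// Pick the response format from an Accept header value.
// Assumption: anything other than an explicit application/json falls back to Markdown.
function negotiateFormat(accept: string | null): OutputFormat {
  if (!accept) return "markdown";
  const types = accept
    .split(",")
    // Drop parameters such as ";q=0.9" and normalize case and whitespace.
    .map((part) => part.split(";")[0].trim().toLowerCase());
  return types.includes("application/json") ? "json" : "markdown";
}

// Map the chosen format to a Content-Type response header value.
function contentTypeFor(format: OutputFormat): string {
  return format === "json"
    ? "application/json; charset=utf-8"
    : "text/markdown; charset=utf-8";
}
```

In a Worker's `fetch` handler this could be called as `negotiateFormat(request.headers.get("Accept"))`. The sketch ignores `q` weights, which is a simplification.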
Frontmatter fields:
| Field | Description |
|---|---|
| `title` | Page/tweet title |
| `author` | Author name |
| `published` | Publication date |
| `source` | Original URL |
| `domain` | Source domain |
| `description` | Page description or tweet preview |
| `word_count` | Content word count |
| `likes` | ❤️ (X/Twitter only) |
| `retweets` | 🔁 (X/Twitter only) |
| `replies` | 💬 (X/Twitter only) |
| `views` | 👁 (X/Twitter only) |
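
As an illustration of how these fields could end up in the Markdown response, the sketch below serializes a metadata object into a YAML frontmatter block ahead of the body. The field names come from the table. The value types, the `Frontmatter` interface, and the `withFrontmatter` helper are assumptions, not the project's actual code.

```typescript
// Field names follow the frontmatter table; the value types are assumptions.
interface Frontmatter {
  title?: string;
  author?: string;
  published?: string;
  source?: string;
  domain?: string;
  description?: string;
  word_count?: number;
  likes?: number;
  retweets?: number;
  replies?: number;
  views?: number;
}

// Serialize the defined fields as a YAML frontmatter block placed before the body.
function withFrontmatter(meta: Frontmatter, body: string): string {
  const lines = Object.entries(meta)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) =>
      // JSON.stringify yields a double-quoted, escaped string that is also valid YAML.
      typeof value === "string" ? `${key}: ${JSON.stringify(value)}` : `${key}: ${value}`
    );
  return ["---", ...lines, "---", "", body].join("\n");
}
```

Fields that are not set, such as the engagement counts on a regular web page, are simply omitted from the block.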
MIT

{
  "name": "defuddle", // Worker name (= subdomain)
  "main": "src/index.ts", // Entry point
  "compatibility_date": "2026-03-01",
  "compatibility_flags": ["nodejs_compat"] // Required for linkedom
}