🔐 Authenticated Scrape

Scrape any authenticated website in 2 minutes with Claude Code

┌─────────────────────────────────────────────────────────────┐
│  🌐 Open Site → 🔑 Login → 📡 Auto-Capture → 💾 Extract    │
│                                                              │
│  No curl copying. No manual headers. Just pure automation.  │
└─────────────────────────────────────────────────────────────┘

Getting Started • Examples • Contributing

✨ What It Does

Scrape data from any authenticated website in minutes using Chrome DevTools automation. No more manually copying curl commands or dealing with complex auth flows.

The magic: Log in once in your browser, and this skill automatically captures all API requests with auth headers included.

Why This Exists

😫 The Old Way

1. Open DevTools
2. Navigate to Network tab
3. Refresh page
4. Find the right request
5. Right-click → Copy as cURL
6. Paste into terminal
7. Parse the response
8. Extract data manually
9. Repeat for every session 🔁

✨ With This Skill

1. Run /authenticated-scrape
2. Log in once
3. Done! ✅

Claude handles everything:
- Captures requests
- Parses responses
- Extracts data
- Creates reusable scripts

🎯 Use Cases

📊 Export data from SaaS dashboards
🔒 Download your private analytics
🏢 Automate data extraction from business tools
🎨 Scrape authenticated content you own
🔄 Create reusable scripts for recurring data needs

🚀 Quick Start

Installation

1. Install the skill:

   mkdir -p ~/.claude/skills/authenticated-scrape
   cd ~/.claude/skills/authenticated-scrape
   curl -O https://raw.githubusercontent.com/Salberg87/authenticated-scrape/main/SKILL.md

2. Restart Claude Code.
3. Invoke the skill:

   /authenticated-scrape

📖 How It Works

graph LR
    A[🚀 Start Skill] --> B[🌐 Open Browser]
    B --> C[🔑 Login]
    C --> D[📡 Capture Network]
    D --> E[🎯 Select Request]
    E --> F[📦 Extract Data]
    F --> G[💾 Save/Script]

    style A fill:#7C3AED,stroke:#5B21B6,color:#fff
    style G fill:#10B981,stroke:#059669,color:#fff

Step-by-Step Process

Step	Action	What Happens
1	🌐 Launch	Skill opens Chrome and navigates to your target site
2	🔑 Authenticate	You log in normally (or let Claude automate it)
3	📡 Capture	All API calls with auth headers captured automatically
4	🎯 Select	Choose which request contains your data
5	📦 Extract	Parse JSON/HTML and extract specific fields
6	💾 Reuse	Generate Node.js/Python/curl scripts for future use
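
To make the Capture and Select steps concrete, here is a minimal Python sketch that summarizes captured traffic and shortlists the requests most likely to hold your data. It assumes the traffic is available as a HAR export (the JSON format Chrome DevTools can save from the Network tab); how the skill stores captures internally is not documented here, and the function names and the list of auth header names are illustrative assumptions.

```python
# Header names treated as "auth" here are an assumption; extend as needed.
AUTH_HEADER_NAMES = {"authorization", "cookie", "x-api-key"}


def summarize_entries(har):
    """Flatten HAR entries into small dicts: method, URL, status, type, size, auth."""
    rows = []
    for entry in har["log"]["entries"]:
        request, response = entry["request"], entry["response"]
        content = response.get("content", {})
        auth_headers = {
            header["name"]: header["value"]
            for header in request.get("headers", [])
            if header["name"].lower() in AUTH_HEADER_NAMES
        }
        rows.append({
            "method": request["method"],
            "url": request["url"],
            "status": response["status"],
            "mime_type": content.get("mimeType", ""),
            "size": content.get("size", 0),
            "auth_headers": auth_headers,
        })
    return rows


def json_candidates(rows):
    """Keep 200 responses with a JSON body that carried auth headers, largest first."""
    hits = [
        row for row in rows
        if row["status"] == 200 and "json" in row["mime_type"] and row["auth_headers"]
    ]
    return sorted(hits, key=lambda row: row["size"], reverse=True)
```

Sorting by size is only a heuristic: the largest authenticated JSON response is often, but not always, the one with the data you want.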

💡 Example Workflow

User: /authenticated-scrape

Claude: What website do you want to scrape?

User: my company dashboard at dashboard.example.com

Claude: Opening browser... Please log in.

[You log in normally]

Claude: Authenticated! Navigate to the page with data you need.

[You click around to load the data]

Claude: Captured 8 requests:
1. GET /api/users - 200 OK (JSON, 45KB) ← Contains user data
2. GET /api/analytics - 200 OK (JSON, 12KB)
3. GET /api/settings - 200 OK (JSON, 2KB)

Which one has your data?

User: The first one

Claude: Found Authorization: Bearer eyJ...
Response has 200 users with: id, email, name, created_at

What data do you want?

User: All emails

Claude: Extracted 200 emails. Save to emails.json?

I can also create a reusable Node.js script with the auth headers. Want that?
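
A reusable script of the kind offered in this conversation might look roughly like the Python sketch below. The URL, the token placeholder, and the response shape (a JSON list of user objects, possibly wrapped in a `users` key) are assumptions taken from the example, not the skill's actual output.

```python
import json
import urllib.request

# Hypothetical values: replace them with the ones from your captured request.
API_URL = "https://dashboard.example.com/api/users"
AUTH_HEADERS = {
    "Authorization": "Bearer <paste-captured-token-here>",
    "Accept": "application/json",
}


def build_request(url, headers):
    """Create a GET request that carries the captured auth headers."""
    return urllib.request.Request(url, headers=headers, method="GET")


def extract_emails(payload):
    """Return the email of every user record that has one.

    Accepts a bare list of users or an object with a "users" key,
    because the real response shape is not known here.
    """
    users = payload.get("users", []) if isinstance(payload, dict) else payload
    return [user["email"] for user in users if "email" in user]


def fetch_emails(url=API_URL, headers=AUTH_HEADERS):
    """Replay the captured request and return the extracted emails."""
    with urllib.request.urlopen(build_request(url, headers), timeout=30) as response:
        return extract_emails(json.load(response))


def save_emails(emails, path="emails.json"):
    """Write the extracted emails to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(emails, handle, indent=2)
```

Calling `save_emails(fetch_emails())` would repeat the extraction for as long as the captured token stays valid.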

🛠️ Technical Details

Under the Hood

🔧 Technology

Chrome DevTools Protocol via MCP

⚡ Performance

Real-time network capture

🔒 Security

Local execution
No cloud storage

Supported Auth Types

Auth Method	Status	Notes
🍪 Cookie-based	✅ Supported	Most common web apps
🎫 JWT/Bearer	✅ Supported	Modern APIs
🔐 OAuth 2.0	✅ Supported	Social logins, enterprise
🛡️ Custom Headers	✅ Supported	API keys, tokens
🔑 Multi-Factor Auth	✅ Supported	You handle MFA manually

Output Formats

┌─────────────────────────────────────────┐
│  📝 JSON         → Structured data      │
│  📊 CSV          → Spreadsheet ready    │
│  🐍 Python       → pip install requests │
│  📦 Node.js      → Production ready     │
│  🔧 cURL         → Command line tool    │
└─────────────────────────────────────────┘
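
As an illustration of the JSON to CSV path, the sketch below turns a list of extracted records into CSV text using only the Python standard library. The function name and the sample field names are hypothetical; the skill's own CSV output may be produced differently.

```python
import csv
import io


def records_to_csv(records, fields=None):
    """Render a list of dicts as CSV text.

    When no field list is given, columns are the keys in first-seen order.
    Missing values become empty cells; keys outside `fields` are ignored.
    """
    if fields is None:
        fields = []
        for record in records:
            for key in record:
                if key not in fields:
                    fields.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Passing an explicit `fields` list, for example `["id", "email"]`, keeps the column order stable across runs even if the API adds new keys.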

⚡ Advanced Usage

Automating Login

If you want to automate the login process:

User: /authenticated-scrape

Claude: What site?

User: example.com - can you auto-login? Username is user@email.com

Claude: I'll automate the login for you...

Handling Pagination

The skill can help identify pagination parameters:

Claude: I see the request has a 'page' parameter.
Want me to create a script that fetches all pages?
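
A script that fetches all pages could be sketched as follows. It assumes the endpoint takes a 1-based `page` query parameter, returns a JSON list per page, and signals the end with an empty list; real APIs vary (cursors, offsets, `next` links), so treat the function names and stopping rule as illustrative.

```python
import json
import time
import urllib.parse
import urllib.request


def page_url(base_url, page):
    """Return base_url with its 'page' query parameter set to `page`."""
    parts = urllib.parse.urlsplit(base_url)
    query = dict(urllib.parse.parse_qsl(parts.query))
    query["page"] = str(page)
    return urllib.parse.urlunsplit(parts._replace(query=urllib.parse.urlencode(query)))


def http_get_json(url, headers):
    """Fetch one page with the captured auth headers and decode the JSON body."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def fetch_all_pages(base_url, headers, fetch=http_get_json,
                    delay_seconds=1.0, max_pages=100):
    """Collect items page by page until an empty page comes back.

    delay_seconds pauses between requests to stay polite;
    max_pages is a safety cap against endless loops.
    """
    items = []
    for page in range(1, max_pages + 1):
        if page > 1:
            time.sleep(delay_seconds)
        batch = fetch(page_url(base_url, page), headers)
        if not batch:
            break
        items.extend(batch)
    return items
```

The `fetch` parameter is injectable so the loop can be tried against canned data before it touches the real site.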

Session Reuse

Save auth tokens for repeated use:

Claude: I'll save these headers to auth-headers.json
You can reuse them until the token expires.
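
The sketch below shows one way saved headers could be stored and checked before reuse. It assumes the headers are a flat JSON object in auth-headers.json and that the bearer token, if present, is a JWT with an `exp` claim; cookies and opaque tokens carry no readable expiry, so the check cannot judge them. The helper names are hypothetical.

```python
import base64
import json
import os
import time


def save_headers(headers, path="auth-headers.json"):
    """Write headers to disk, owner read/write only where the OS supports it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(headers, handle, indent=2)


def load_headers(path="auth-headers.json"):
    """Read previously saved headers back into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def jwt_expiry(token):
    """Return the 'exp' claim (Unix seconds) of a JWT, or None if unreadable.

    This only decodes the payload; it does not verify the signature.
    """
    try:
        payload = token.split(".")[1]
        padded = payload + "=" * (-len(payload) % 4)
        claims = json.loads(base64.urlsafe_b64decode(padded))
    except (IndexError, ValueError):
        return None
    return claims.get("exp") if isinstance(claims, dict) else None


def is_expired(headers, now=None):
    """True only when a JWT bearer token is present and its 'exp' has passed."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False  # No readable expiry, so the check cannot tell.
    expiry = jwt_expiry(auth[len("Bearer "):])
    if expiry is None:
        return False
    return expiry <= (time.time() if now is None else now)
```

Because the file holds live session credentials, keep it out of version control and delete it once the token is no longer needed.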

⚠️ Known Limitations & Troubleshooting

Automated Login Detection

Many major platforms detect automated browsers and block login attempts. This is a security feature, not a bug.

Sites that typically block automation:

❌ GitHub, Google, Microsoft
❌ Banking and financial sites
❌ Social media platforms
❌ Sites with aggressive bot detection

Recommended Approach

✅ Best Practice: Manual Login + Network Capture

1. User logs in manually in the browser window
2. Skill captures the authenticated session
3. Extracts requests with auth tokens
4. Generates reusable scripts

This approach works on most sites, including those that block scripted logins, because you enter the credentials yourself.

Common Issues & Solutions

"Could not log in - This browser may not be secure"

Cause: The site detected the automated browser (Chrome DevTools Protocol)

Solution:

- Have the user log in manually instead of automating the login
- Focus on capturing the already-authenticated session
- Data extraction still works once the session is authenticated

No network requests captured

Cause: The page hasn't loaded its data yet

Solution:

- Wait for the page to load fully
- Navigate to the actual page that loads the data
- Check "All" request types, not just XHR/Fetch

Getting HTML instead of JSON

Cause: The endpoint requires authentication or returns an error page

Solution:

- Verify you're authenticated on the site
- Check that the endpoint URL is correct
- Test the request in the DevTools Network tab first
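
A generated script can guard against this case by checking the response before parsing it. The following Python sketch is one possible check, assuming you already have the status code, the Content-Type header, and the raw body; the function name and error messages are hypothetical.

```python
import json


def parse_json_response(status, content_type, body):
    """Decode a JSON body, or raise ValueError with a hint about what went wrong."""
    text = body.decode("utf-8", errors="replace") if isinstance(body, bytes) else body
    if status in (401, 403):
        raise ValueError(
            f"HTTP {status}: the session is probably missing or expired; "
            "recapture the auth headers"
        )
    looks_like_html = (
        "text/html" in content_type.lower()
        or text.lstrip().lower().startswith(("<!doctype", "<html"))
    )
    if looks_like_html:
        raise ValueError(
            "Got HTML instead of JSON: likely a login or error page; "
            "check the auth headers and the endpoint URL"
        )
    return json.loads(text)
```

Sniffing the body as well as the header helps with servers that send a login page while still labeling it as JSON.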

What Actually Works

Feature	Status	Notes
Network request capture	✅ Always works	Core functionality
Auth header extraction	✅ Always works	Cookies, tokens, etc.
JSON/HTML parsing	✅ Always works	Any response format
Script generation	✅ Always works	Node.js, Python, cURL
Manual login	✅ Always works	User types credentials
Automated login	⚠️ Site-dependent	Blocked by major platforms

🔒 Security & Ethics

Important Notes

🔐 Your auth tokens are sensitive - Scripts contain your session credentials
⏰ Tokens expire - You may need to recapture after timeout
✅ Only your data - This is for scraping your own authenticated sessions
📜 Respect ToS - Check if scraping is allowed by the service terms
🚦 Rate limiting - Be respectful with automated requests

What This Skill Does NOT Do

❌ Bypass security measures
❌ Scrape other users' data
❌ Break authentication systems
❌ Violate terms of service

🤝 Contributing

Contributions welcome! Feel free to:

🐛 Report bugs
💡 Suggest features
🔧 Submit pull requests
📖 Improve documentation

📄 License

MIT License - see LICENSE file for details

👤 Author

Salberg

GitHub: @Salberg87

🙏 Acknowledgments

- Built for Claude Code
- Uses Chrome DevTools Protocol
- Inspired by the "copy as cURL" workflow

❓ FAQ

Do I need to know how to code?

No! This skill guides you through everything. Claude handles all the technical work.

Is my auth token safe?

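The capture runs locally and nothing is uploaded to cloud storage. Generated scripts do contain your session credentials, though, so treat them like passwords: keep them out of version control and share them with no one.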

What if my session expires?

Just run the skill again to capture fresh tokens. Claude will notify you if requests fail due to expiration.

Can I scrape any website?

You should only scrape websites where you have an authenticated account and proper authorization. Always respect terms of service.

Does this work with single sign-on (SSO)?

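Yes, provided you complete the sign-in yourself. Whether it's Google OAuth, Microsoft, Okta, or any other SSO provider, the skill captures the final authenticated session. Automated login is often blocked by these providers (see Known Limitations & Troubleshooting).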

⭐ If this skill saved you time, give it a star!

Created by Salberg

Made with ❤️ for the Claude Code community
