**Open Site → Login → Auto-Capture → Extract**

*No curl copying. No manual headers. Just pure automation.*
Getting Started • Examples • Contributing
Scrape data from any authenticated website in minutes using Chrome DevTools automation. No more manually copying curl commands or dealing with complex auth flows.
The magic: Log in once in your browser, and this skill automatically captures all API requests with auth headers included.
**The Old Way**

1. Open DevTools
2. Navigate to the Network tab
3. Refresh the page
4. Find the right request
5. Right-click → Copy as cURL
6. Paste into terminal
7. Parse the response
8. Extract data manually
9. Repeat for every session

**With This Skill**

1. Run `/authenticated-scrape`
2. Log in once
3. Done!

Claude handles everything:

- Captures requests
- Parses responses
- Extracts data
- Creates reusable scripts
- Export data from SaaS dashboards
- Download your private analytics
- Automate data extraction from business tools
- Scrape authenticated content you own
- Create reusable scripts for recurring data needs
1. Install the skill:

   ```bash
   mkdir -p ~/.claude/skills/authenticated-scrape
   cd ~/.claude/skills/authenticated-scrape
   curl -O https://raw.githubusercontent.com/Salberg87/authenticated-scrape/main/SKILL.md
   ```

2. Restart Claude Code

3. Invoke the skill:

   ```
   /authenticated-scrape
   ```

```mermaid
graph LR
    A[Start Skill] --> B[Open Browser]
    B --> C[Login]
    C --> D[Capture Network]
    D --> E[Select Request]
    E --> F[Extract Data]
    F --> G[Save/Script]
    style A fill:#7C3AED,stroke:#5B21B6,color:#fff
    style G fill:#10B981,stroke:#059669,color:#fff
```
| Step | Action | What Happens |
|---|---|---|
| 1 | Launch | Skill opens Chrome and navigates to your target site |
| 2 | Authenticate | You log in normally (or let Claude automate it) |
| 3 | Capture | All API calls with auth headers are captured automatically |
| 4 | Select | Choose which request contains your data |
| 5 | Extract | Parse JSON/HTML and extract specific fields |
| 6 | Reuse | Generate Node.js/Python/curl scripts for future use |
```
User: /authenticated-scrape
Claude: What website do you want to scrape?
User: my company dashboard at dashboard.example.com
Claude: Opening browser... Please log in.
[You log in normally]
Claude: Authenticated! Navigate to the page with data you need.
[You click around to load the data]
Claude: Captured 8 requests:
  1. GET /api/users - 200 OK (JSON, 45KB)  ← Contains user data
  2. GET /api/analytics - 200 OK (JSON, 12KB)
  3. GET /api/settings - 200 OK (JSON, 2KB)
  Which one has your data?
User: The first one
Claude: Found Authorization: Bearer eyJ...
  Response has 200 users with: id, email, name, created_at
  What data do you want?
User: All emails
Claude: Extracted 200 emails. Save to emails.json?
  I can also create a reusable Node.js script with the auth headers. Want that?
```
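For a session like the one above, the generated replay-and-extract script might look something like this Python sketch. The endpoint, token, and `email` field come from the example transcript and are placeholders, not real values:

```python
import json
import urllib.request

# Placeholder token captured during the authenticated session (from the transcript).
AUTH_HEADERS = {"Authorization": "Bearer eyJ..."}

def fetch_users(base_url):
    """Replay the captured GET /api/users request with the saved auth headers."""
    req = urllib.request.Request(f"{base_url}/api/users", headers=AUTH_HEADERS)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

def extract_emails(users):
    """Pull the email field from each user record in the response."""
    return [u["email"] for u in users if "email" in u]

# Usage (replays the captured request, then saves the extracted field):
#   users = fetch_users("https://dashboard.example.com")
#   with open("emails.json", "w") as f:
#       json.dump(extract_emails(users), f, indent=2)
```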
- Chrome DevTools Protocol via MCP
- Real-time network capture
- Local execution, no cloud storage
| Auth Method | Status | Notes |
|---|---|---|
| Cookie-based | ✅ Supported | Most common web apps |
| JWT/Bearer | ✅ Supported | Modern APIs |
| OAuth 2.0 | ✅ Supported | Social logins, enterprise |
| Custom Headers | ✅ Supported | API keys, tokens |
| Multi-Factor Auth | ✅ Supported | You handle MFA manually |
| Format | Notes |
|---|---|
| JSON | Structured data |
| CSV | Spreadsheet ready |
| Python | `pip install requests` |
| Node.js | Production ready |
| cURL | Command-line tool |

If you want to automate the login process:
```
User: /authenticated-scrape
Claude: What site?
User: example.com - can you auto-login? Username is user@email.com
Claude: I'll automate the login for you...
```
The skill can help identify pagination parameters:
```
Claude: I see the request has a 'page' parameter.
  Want me to create a script that fetches all pages?
```
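A page-walking script generated from that hint could follow this shape. This is a minimal sketch: the idea of "fetch until an empty page" is one common pagination pattern, and the `fetch_page` callable stands in for the captured request with a `page` parameter appended:

```python
def fetch_all_pages(fetch_page, start=1):
    """Call fetch_page(page_number) until a page comes back empty,
    collecting every item along the way."""
    items, page = [], start
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        items.extend(batch)
        page += 1
    return items
```

In a generated script, `fetch_page` would issue the captured request with `?page=N` and the saved auth headers, returning the decoded list of items.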
Save auth tokens for repeated use:
```
Claude: I'll save these headers to auth-headers.json
  You can reuse them until the token expires.
```
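Saved headers can then be reloaded in any later script. A minimal sketch of that round trip, assuming the `auth-headers.json` filename mentioned above:

```python
import json
from pathlib import Path

def save_headers(headers, path="auth-headers.json"):
    """Persist captured auth headers as JSON for later reuse."""
    Path(path).write_text(json.dumps(headers, indent=2))

def load_headers(path="auth-headers.json"):
    """Reload previously captured headers (valid until the token expires)."""
    return json.loads(Path(path).read_text())
```

A later script would call `load_headers()` and pass the result straight to its HTTP client.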
Many major platforms detect automated browsers and block login attempts. This is a security feature, not a bug.
Sites that typically block automation:
- GitHub, Google, Microsoft
- Banking and financial sites
- Social media platforms
- Sites with aggressive bot detection
**Best Practice: Manual Login + Network Capture**
1. User logs in manually in the browser window
2. Skill captures the authenticated session
3. Extracts requests with auth tokens
4. Generates reusable scripts
This approach works on any website!
"Could not log in - This browser may not be secure"
Cause: The site detected the automated browser (Chrome DevTools Protocol)
Solution:
- Have user log in manually instead of automating
- Focus on capturing the already-authenticated session
- The skill still works perfectly for data extraction!
No network requests captured
Cause: Page hasn't loaded data yet
Solution:
- Wait for page to fully load
- Navigate to the actual page that loads data
- Check "All" request types, not just XHR/Fetch
Getting HTML instead of JSON
Cause: Endpoint requires authentication or returns error page
Solution:
- Verify you're authenticated on the site
- Check the endpoint URL is correct
- Test the request in DevTools Network tab first
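Before parsing, a quick sniff can tell an HTML error page (such as a login redirect) apart from the JSON you expected. A hedged helper along these lines, not part of the skill itself:

```python
def looks_like_html(body: str) -> bool:
    """True if a response body is an HTML page (e.g. a login redirect)
    rather than the JSON you expected."""
    head = body.lstrip().lower()
    return head.startswith(("<!doctype", "<html"))
```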
| Feature | Status | Notes |
|---|---|---|
| Network request capture | ✅ Always works | Core functionality |
| Auth header extraction | ✅ Always works | Cookies, tokens, etc. |
| JSON/HTML parsing | ✅ Always works | Any response format |
| Script generation | ✅ Always works | Node.js, Python, cURL |
| Manual login | ✅ Always works | User types credentials |
| Automated login | ⚠️ Limited | Blocked by major platforms |
- **Your auth tokens are sensitive** - Scripts contain your session credentials
- **Tokens expire** - You may need to recapture after a timeout
- **Only your data** - This is for scraping your own authenticated sessions
- **Respect ToS** - Check whether scraping is allowed by the service's terms
- **Rate limiting** - Be respectful with automated requests
Do not use this skill to:

- Bypass security measures
- Scrape other users' data
- Break authentication systems
- Violate terms of service
Contributions welcome! Feel free to:
- Report bugs
- Suggest features
- Submit pull requests
- Improve documentation
MIT License - see LICENSE file for details
Salberg
- GitHub: @Salberg87
- Built for Claude Code
- Uses Chrome DevTools Protocol
- Inspired by the "copy as cURL" workflow
Do I need to know how to code?
No! This skill guides you through everything. Claude handles all the technical work.
Is my auth token safe?
Yes. Everything runs locally. Tokens are only used in your scripts and never sent to external servers.
What if my session expires?
Just run the skill again to capture fresh tokens. Claude will notify you if requests fail due to expiration.
Can I scrape any website?
You should only scrape websites where you have an authenticated account and proper authorization. Always respect terms of service.
Does this work with single sign-on (SSO)?
Yes! Whether it's Google OAuth, Microsoft, Okta, or any other SSO provider, the skill captures the final authenticated session.