Automated Google search → visit results → extract cleaned page content locally.
Powered by Playwright for browsing and llm-scraper for LLM-guided extraction. Choose a provider (OpenAI, Anthropic, or Google) per run.
- Google search via Playwright, filters ads/internal links
- Iterates top N organic results
- Uses llm-scraper to extract title/description/main content
- Saves JSON under output/
- Local web UI to pick provider, key, query, headless
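
The link filtering and top-N selection in the list above might look roughly like the sketch below. It is not taken from the project's source: `isOrganicLink`, `pickTopLinks`, and the `INTERNAL_HOST_SUFFIXES` list are hypothetical names, and the real tool may detect ads differently (for example by inspecting the page's DOM rather than the URL).

```typescript
// Hypothetical helpers, assumed for illustration only.
// Hosts treated as Google-internal or ad-related (assumed list; extend as needed).
const INTERNAL_HOST_SUFFIXES = [
  "google.com",
  "googleusercontent.com",
  "googleadservices.com",
  "gstatic.com",
];

// True when href is an absolute http(s) URL that does not point at a Google-owned host.
function isOrganicLink(href: string): boolean {
  let url: URL;
  try {
    url = new URL(href);
  } catch {
    return false; // relative or malformed hrefs are skipped
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  return !INTERNAL_HOST_SUFFIXES.some(
    (suffix) => host === suffix || host.endsWith("." + suffix),
  );
}

// Keeps the first maxLinks unique organic links, preserving result order.
function pickTopLinks(hrefs: string[], maxLinks: number): string[] {
  const seen = new Set<string>();
  const picked: string[] = [];
  for (const href of hrefs) {
    if (picked.length >= maxLinks) break;
    if (!isOrganicLink(href) || seen.has(href)) continue;
    seen.add(href);
    picked.push(href);
  }
  return picked;
}
```

In a Playwright run, the `hrefs` array would typically come from the anchors collected on the results page, and `maxLinks` from the `MAX_LINKS` setting.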
npm install
npm run playwright:install
# export the matching API key env for chosen provider
export PROVIDER=openai
export OPENAI_API_KEY=sk-...
npm start -- "best laptops 2025"
# Limit links and run headed
MAX_LINKS=3 HEADLESS=false npm start -- "vector databases"
npm run serve
# open http://localhost:3000
Pick provider, paste API key, enter query, press Start. Results display on the page and persist to output/.
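
As a rough illustration of how the `PROVIDER`, `MAX_LINKS`, and `HEADLESS` variables from the commands above could be read, here is a minimal sketch. It is an assumption-laden example, not the project's actual code: `loadConfig` and `RunConfig` are hypothetical names, only `OPENAI_API_KEY` appears in this README (the Anthropic and Google variable names are guesses), and the default of 5 links is invented for the example.

```typescript
const PROVIDERS = ["openai", "anthropic", "google"] as const;
type Provider = (typeof PROVIDERS)[number];

// OPENAI_API_KEY comes from the README; the other two names are assumptions.
const API_KEY_ENV: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  google: "GOOGLE_GENERATIVE_AI_API_KEY",
};

interface RunConfig {
  provider: Provider;
  apiKey: string;
  maxLinks: number;
  headless: boolean;
}

function isProvider(value: string): value is Provider {
  return (PROVIDERS as readonly string[]).includes(value);
}

// Takes the environment as an argument so it can be exercised without touching process.env.
function loadConfig(env: Record<string, string | undefined>): RunConfig {
  const provider = (env.PROVIDER ?? "").toLowerCase();
  if (!isProvider(provider)) {
    throw new Error(`PROVIDER must be one of: ${PROVIDERS.join(", ")}`);
  }

  const keyName = API_KEY_ENV[provider];
  const apiKey = env[keyName];
  if (!apiKey) {
    throw new Error(`Missing ${keyName} for provider "${provider}"`);
  }

  // Assumed default of 5 links when MAX_LINKS is unset.
  const maxLinks =
    env.MAX_LINKS === undefined ? 5 : Number.parseInt(env.MAX_LINKS, 10);
  if (!Number.isInteger(maxLinks) || maxLinks < 1) {
    throw new Error("MAX_LINKS must be a positive integer");
  }

  // Headless unless HEADLESS is explicitly "false", matching the headless default.
  const headless = env.HEADLESS !== "false";

  return { provider, apiKey, maxLinks, headless };
}
```

In a Node entry point this would be called as `loadConfig(process.env)`; failing early on a missing key avoids launching a browser for a run that cannot complete.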
- Respect each website's terms of service and robots.txt.
- This tool is intended for research and consent-based scraping.
- Playwright runs headless by default; toggle this in the UI.