A local-first agentic web automation project that scrapes a website, builds a small retrieval knowledge base (Chroma), plans actions using an Ollama chat model, and executes the plan with Playwright. It is designed for simple tasks like navigating pages, filling forms, clicking buttons, and taking screenshots.
- Website scraping to collect page/form context
- RAG over scraped content using Chroma + local embeddings
- Action planning using a local LLM (Ollama)
- Playwright-based execution: navigate, fill, click, wait, screenshot
- Validation layer with heuristics (and optional LLM check)
- Logs + screenshots for debugging and proof
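
The execution step listed above (navigate, fill, click, wait, screenshot) can be pictured as a small dispatcher. This is a minimal sketch, not the project's actual executor: the action dictionary keys (`action`, `url`, `selector`, `value`, `path`) are hypothetical, and `page` is assumed to behave like a Playwright sync `Page`.

```python
from typing import Any, Dict


def execute_action(page: Any, action: Dict[str, Any], timeout_ms: int = 30000) -> None:
    """Run one planned step against a Playwright-style page object.

    The action schema used here is an assumption made for illustration.
    """
    kind = action["action"]
    if kind == "navigate":
        page.goto(action["url"], timeout=timeout_ms)
    elif kind == "fill":
        page.fill(action["selector"], action["value"], timeout=timeout_ms)
    elif kind == "click":
        page.click(action["selector"], timeout=timeout_ms)
    elif kind == "wait":
        # Wait until the element matching the selector appears.
        page.wait_for_selector(action["selector"], timeout=timeout_ms)
    elif kind == "screenshot":
        page.screenshot(path=action["path"])
    else:
        # Fail loudly so the planner's output can be corrected.
        raise ValueError(f"Unsupported action: {kind!r}")
```

Raising on unknown actions, rather than skipping them, keeps a planner that invents a new verb from silently producing an incomplete run.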
- Python
- LangGraph (workflow orchestration)
- LangChain Ollama (LLM interface)
- Playwright (browser automation)
- ChromaDB (vector database)
- Sentence-Transformers / local embeddings (`all-MiniLM-L6-v2`)
- `src/main.py`: entry point
- `src/agent/`: LangGraph agent (planner / executor / validator / graph)
- `src/browser/`: Playwright adapter
- `src/scraper/`: website scraper
- `src/knowledge_base/`: embeddings + Chroma store + retriever
- `requirements.txt`: dependencies
- `.env`: local secrets (do not commit)
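
To show how these modules might fit together, here is a rough sketch of a retrieve, plan, execute, validate loop with retries. It uses plain Python callables as stand-ins rather than LangGraph, and every name in it (`run_task`, `retrieve`, `plan`, `execute`, `validate`) is hypothetical. It also assumes that a retry count of 2 means up to three attempts in total, which the project may define differently.

```python
from typing import Any, Callable, Dict, List, Tuple

Plan = List[Dict[str, Any]]


def run_task(
    task: str,
    retrieve: Callable[[str], str],
    plan: Callable[[str, str, str], Plan],
    execute: Callable[[Plan], Any],
    validate: Callable[[str, Any], Tuple[bool, str]],
    max_retries: int = 2,
) -> bool:
    """Plan and execute a task, re-planning with feedback when validation fails."""
    # Look up scraped page/form context relevant to the task (the RAG step).
    context = retrieve(task)
    feedback = ""
    # One initial attempt plus up to max_retries further attempts (assumption).
    for _attempt in range(max_retries + 1):
        steps = plan(task, context, feedback)
        result = execute(steps)
        ok, feedback = validate(task, result)
        if ok:
            return True
    return False
```

Passing the validator's feedback back into the planner is one possible design; it gives the model a chance to change its plan instead of repeating the same failing steps.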
Windows (PowerShell):
python -m venv venv
.\venv\Scripts\activate

macOS / Linux:
python3 -m venv venv
source venv/bin/activate

pip install -r requirements.txt
playwright install chromium

Install Ollama: https://ollama.com/
Pull a model (example):
ollama pull llama3.2

Create a .env file in the project root (never commit this file).
Example:
# Ollama
CHAT_MODEL=llama3.2
OLLAMA_BASE_URL=http://localhost:
# Optional: LangSmith tracing (only if you use it)
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=YOUR_LANGSMITH_KEY
LANGCHAIN_PROJECT=agentic-web-automation
# Optional: tuning
BROWSER_TIMEOUT=30000
MAX_RETRIES=2

Notes:
- If you do not use LangSmith, remove the `LANGCHAIN_*` variables or set `LANGCHAIN_TRACING_V2=false`.
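
As an illustration of how these variables could be read at startup, the sketch below parses them into a typed settings object. It assumes the `.env` values are already present in the process environment (for example, loaded with a package such as python-dotenv). The `Settings` class and `load_settings` function are hypothetical names, and the fallback values simply mirror the example above. No fallback is given for `OLLAMA_BASE_URL`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    chat_model: str
    ollama_base_url: Optional[str]
    browser_timeout_ms: int
    max_retries: int
    tracing_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, using the example values as fallbacks."""
    return Settings(
        chat_model=env.get("CHAT_MODEL", "llama3.2"),
        # Left as None when unset; no default URL is assumed here.
        ollama_base_url=env.get("OLLAMA_BASE_URL"),
        browser_timeout_ms=int(env.get("BROWSER_TIMEOUT", "30000")),
        max_retries=int(env.get("MAX_RETRIES", "2")),
        # Tracing counts as enabled only when the value is literally "true".
        tracing_enabled=env.get("LANGCHAIN_TRACING_V2", "false").strip().lower() == "true",
    )
```

Accepting a mapping instead of reading `os.environ` directly makes the loader easy to exercise with a plain dictionary, for example `load_settings({"MAX_RETRIES": "5"})`.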
From the project root:
python -m src.main

- Change the `domain` and `task` in `src/main.py` (or wherever your entry point defines them).
- Run once to scrape and rebuild the knowledge base for that domain.
- Ensure the planner is dynamic (no hardcoded selectors) and the executor supports the actions produced by the planner.
- For logins, ensure your task includes credentials clearly, for example:
  - "Log in. Username is `user123`. Password is `pass123`. Then take a screenshot."
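
One way to confirm that the executor supports everything the planner produces is to check each plan before running it. The sketch below is only an illustration: the plan format (a list of dictionaries with an `action` key) and the required fields per action are assumptions, not the project's actual schema.

```python
from typing import Any, Dict, List

# Hypothetical schema: supported actions and the fields each one needs.
REQUIRED_FIELDS = {
    "navigate": {"url"},
    "fill": {"selector", "value"},
    "click": {"selector"},
    "wait": {"selector"},
    "screenshot": {"path"},
}


def find_plan_problems(plan: List[Dict[str, Any]]) -> List[str]:
    """Return a human-readable problem for each step the executor could not run."""
    problems = []
    for index, step in enumerate(plan):
        kind = step.get("action")
        if kind not in REQUIRED_FIELDS:
            problems.append(f"step {index}: unsupported action {kind!r}")
            continue
        missing = sorted(REQUIRED_FIELDS[kind] - set(step))
        if missing:
            problems.append(f"step {index}: {kind} is missing {', '.join(missing)}")
    return problems
```

An empty list means the plan is safe to hand to the executor; a non-empty list could be fed back to the planner as a correction prompt.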
Important limitations:
- CAPTCHA / Cloudflare checks are not supported.
- 2FA flows require extra handling (pause/wait for code entry).
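
The "pause/wait for code entry" handling for 2FA could look roughly like the sketch below. It is not part of the project: `complete_two_factor` and both selector parameters are hypothetical, and `page` is assumed to behave like a Playwright sync `Page`. The code is read from the terminal by default, so a person has to be present during the run.

```python
from typing import Any, Callable


def complete_two_factor(
    page: Any,
    code_selector: str,
    submit_selector: str,
    read_code: Callable[[str], str] = input,
) -> None:
    """Pause until a human supplies the 2FA code, then submit it."""
    # Wait for the code field so the prompt only appears once the page is ready.
    page.wait_for_selector(code_selector)
    code = read_code("Enter the 2FA code: ").strip()
    if not code:
        raise ValueError("No 2FA code was entered")
    page.fill(code_selector, code)
    page.click(submit_selector)
```

Taking `read_code` as a parameter leaves room to swap the terminal prompt for another source later without touching the browser logic.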