A practical web perception layer for AI agents that replaces brittle scraping with structured web data, so an LLM can summarize results into clean JSON quickly and reliably. This repo includes multiple pipelines, a Streamlit UI, and agentic flows that combine search, scrape, and parser tools for real-world web research tasks.
- Build AI agents that use structured web data instead of fragile HTML parsing.
- Power search, scrape, and parser-driven workflows with Olostep.
- Normalize results into consistent JSON outputs for downstream apps.
- Prototype quickly with Streamlit or run scripts from the CLI.
User Query
|
v
WebPerceptionAgent (LangChain + OpenAI)
|
v
OlostepTool (search + scrape)
|
v
OlostepClient (API or deterministic mock)
|
v
Structured Results
|
v
LLM Summarization + Normalization
|
v
JSON Output
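The last two stages of the flow above (structured results, then normalization into JSON) can be sketched roughly as follows. This is a minimal illustration, not the repo's implementation: the function names `compress_results` and `to_json`, the field names (`title`, `url`, `link`, `snippet`, `description`), and the use of a character cap as a stand-in for a token budget are all assumptions. The LLM summarization step is left out.

```python
import json
from typing import Any, Dict, List


def compress_results(
    results: List[Dict[str, Any]],
    max_items: int = 5,
    max_chars: int = 200,
) -> List[Dict[str, str]]:
    """Keep the first few results and trim each one to a fixed shape.

    Hypothetical helper: the character cap is only a rough proxy for a
    token limit, and the accepted field names are assumptions.
    """
    compact = []
    for item in results[:max_items]:
        snippet = str(item.get("snippet") or item.get("description") or "")
        compact.append(
            {
                "title": str(item.get("title") or "").strip(),
                "url": str(item.get("url") or item.get("link") or "").strip(),
                "snippet": snippet.strip()[:max_chars],
            }
        )
    # Drop entries that have no URL, since they cannot be cited downstream.
    return [entry for entry in compact if entry["url"]]


def to_json(query: str, results: List[Dict[str, str]]) -> str:
    """Wrap normalized results in one consistent JSON envelope."""
    payload = {"query": query, "count": len(results), "results": results}
    return json.dumps(payload, indent=2, ensure_ascii=False)
```

Giving every pipeline the same output envelope means downstream apps can consume job, product, and research results with one parser.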
- Job discovery and ranking (e.g., software engineering roles in Berlin).
- Product research and price comparison (e.g., laptops under $1,000 on Amazon).
- Agentic web research with iterative scraping and parser selection.
Each demo reads a query from environment variables (with defaults) and prints structured output.
- Job search pipeline (single search + summarize):
python examples/run_job_demo.py
- Agentic web perception (search, scrape, plan, optional parsers, summarize):
python examples/run_agentic_web_demo.py
- Product price comparison (search + Amazon product parser + summarize):
python examples/run_product_price_demo.py
- Multi-pipeline demo (job + agentic web + product price):
python examples/run_multi_demo.py
- Streamlit UI:
streamlit run streamlit_app.py
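Each demo takes its query from an environment variable and falls back to a default. A minimal sketch of that pattern is shown here; the helper `read_query` and the default string are hypothetical and may differ from what the demo scripts actually use.

```python
import os

# Hypothetical default; the real demos define their own default queries.
DEFAULT_QUERY = "software engineering roles in Berlin"


def read_query(var_name: str = "DEMO_QUERY", default: str = DEFAULT_QUERY) -> str:
    """Return the query from the environment, or the default if unset or blank."""
    value = os.environ.get(var_name, "").strip()
    return value or default
```

With this pattern, a one-off override looks like `DEMO_QUERY="laptops under $1,000" python examples/run_job_demo.py` in a POSIX shell.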
- Python 3.10+
- pip
- OLOSTEP_API_KEY (required for real data)
- OPENAI_API_KEY (required for LLM summaries)
- Internet access for API requests
- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Configure environment by copying .env.example to .env and setting keys.
- Run a demo (see above).
- OPENAI_API_KEY: OpenAI key for LLM calls.
- OPENAI_MODEL: Model name (default: gpt-4o-mini).
- OLOSTEP_API_KEY: Olostep API key for real data.
- OLOSTEP_BASE_URL: Olostep base URL.
- LOG_LEVEL: Logging verbosity (e.g., INFO).
- DEMO_QUERY: Demo query string.
- JOB_DEMO_QUERY: Query for the job pipeline in the multi demo.
- AGENTIC_DEMO_QUERY: Query for the agentic web pipeline in the multi demo.
- PRICE_DEMO_QUERY: Query for the product price pipeline in the multi demo.
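One way to gather these variables in a single place is a small settings object, sketched below. The `Settings` class and `load_settings` function are hypothetical and not taken from the repo. The `gpt-4o-mini` default comes from the list above, while the `INFO` default for `LOG_LEVEL` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the environment variables listed above."""

    openai_api_key: Optional[str]
    openai_model: str
    olostep_api_key: Optional[str]
    olostep_base_url: Optional[str]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, treating empty strings as unset."""
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        openai_model=env.get("OPENAI_MODEL") or "gpt-4o-mini",
        olostep_api_key=env.get("OLOSTEP_API_KEY") or None,
        olostep_base_url=env.get("OLOSTEP_BASE_URL") or None,
        # Assumed default; the source only gives INFO as an example value.
        log_level=(env.get("LOG_LEVEL") or "INFO").upper(),
    )
```

Passing the mapping as a parameter keeps the function easy to test without touching the real process environment.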
- https://www.olostep.com
- https://docs.olostep.com
- https://docs.olostep.com/searches/searches
- https://docs.olostep.com/features/scrapes/scrapes
- https://www.olostep.com/dashboard/api-keys
├── agents/
│ ├── tools.py
│ └── web_agent.py
├── services/
│ └── olostep_client.py
├── pipelines/
│ ├── agentic_web_pipeline.py
│ ├── job_pipeline.py
│ └── product_price_pipeline.py
├── examples/
│ ├── run_agentic_web_demo.py
│ ├── run_job_demo.py
│ ├── run_multi_demo.py
│ └── run_product_price_demo.py
├── utils/
│ ├── parser.py
│ └── perception.py
├── tests/
│ └── test_agent.py
├── assets/
│ ├── Agentic Search.png
│ └── Thumbnail.png
├── .env.example
├── requirements.txt
├── README.md
├── AGENTS.md
├── streamlit_app.py
└── setup.py
- If OLOSTEP_API_KEY is not set, the Olostep client returns deterministic mock data.
- If OPENAI_API_KEY is missing, the agent will attempt to run but LLM calls can fail at runtime.
- The perception utilities filter and compress results to keep prompts within token limits.
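The mock fallback described in these notes could look something like the sketch below. It is an illustration under stated assumptions rather than the code in services/olostep_client.py: the functions `mock_search` and `search`, the result fields, and the example.com URLs are all hypothetical, and the real API call is deliberately left out.

```python
import hashlib
import os
from typing import Dict, List, Optional


def mock_search(query: str, limit: int = 3) -> List[Dict[str, str]]:
    """Return the same fake results for the same query, with no network access."""
    # Hashing the query makes the output stable across runs and machines.
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return [
        {
            "title": f"Mock result {i + 1} for {query}",
            "url": f"https://example.com/{digest[:8]}/{i + 1}",
            "snippet": f"Placeholder snippet {i + 1}.",
        }
        for i in range(limit)
    ]


def search(query: str, api_key: Optional[str] = None) -> List[Dict[str, str]]:
    """Use mock data when no Olostep key is available (hypothetical wrapper)."""
    api_key = api_key or os.environ.get("OLOSTEP_API_KEY")
    if not api_key:
        return mock_search(query)
    # The real request is intentionally omitted from this sketch.
    raise NotImplementedError("Real Olostep API call not shown here.")
```

Deterministic mock output lets tests and demos run offline and produce repeatable results.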
Olostep, web perception, AI agents, structured web data, web scraping alternative, LangChain, OpenAI, Streamlit, Python, search and scrape pipelines, agentic web research, JSON normalization.

