A local AI content workflow that turns a rough idea into a researched outline and source-grounded draft. Built on Olostep for web research and page extraction, with a FastAPI backend, React frontend, and OpenAI-driven orchestration.
If you want to see how web research and page extraction fit into a real application flow, not just isolated scraping scripts, this repo is a small, composable reference you can run, read, and adapt.
An open-source AI content engine with:
- a FastAPI backend for orchestration and WebSocket streaming
- a React frontend for chat-style interaction and live updates
- OpenAI for workflow orchestration and drafting
- Olostep search for natural-language web research and source discovery
- Olostep Scrapes for clean page-level content extraction
It is designed as a reference implementation, not a polished product. The goal is a small, readable architecture that developers can plug into their own research, writing, or content automation pipelines.
For each content workflow, the app:
- Collects the brief through a chat-style interaction
- Turns the brief into a structured outline
- Lets the user review or approve the outline before continuing
- Researches the topic when source grounding is needed
- Scrapes relevant pages for cleaner source material
- Drafts the article grounded in collected sources
- Supports revision cycles for both outline and draft
- Streams live workflow updates to the frontend
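The steps above can be read as a small state machine with review loops. The following is a minimal sketch of that idea, not the repo's actual implementation: the `Stage` names, the `TRANSITIONS` table, and the `advance` helper are all hypothetical and chosen only to mirror the list.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical workflow stages mirroring the list above."""
    BRIEF = "brief"
    OUTLINE = "outline"
    OUTLINE_REVIEW = "outline_review"
    RESEARCH = "research"  # search plus page scraping
    DRAFT = "draft"
    DRAFT_REVIEW = "draft_review"
    DONE = "done"


# Allowed transitions. Review stages can loop back for revisions, and
# research is optional: an approved outline may go straight to drafting.
TRANSITIONS = {
    Stage.BRIEF: {Stage.OUTLINE},
    Stage.OUTLINE: {Stage.OUTLINE_REVIEW},
    Stage.OUTLINE_REVIEW: {Stage.OUTLINE, Stage.RESEARCH, Stage.DRAFT},
    Stage.RESEARCH: {Stage.DRAFT},
    Stage.DRAFT: {Stage.DRAFT_REVIEW},
    Stage.DRAFT_REVIEW: {Stage.DRAFT, Stage.DONE},
    Stage.DONE: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the target stage if the move is allowed, else raise."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Keeping the transitions in one table makes the revision loops explicit and gives the backend a single place to reject out-of-order requests.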
The backend runs web research through Olostep's search capability and, in the current implementation, sends that research task to POST /v1/answers. This is how the app surfaces relevant sources and keeps research attached to the writing flow.
Use search when you want structured result discovery from a topic or question and do not know in advance which URLs to target.
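A research call of this kind might look roughly like the sketch below, which uses only the standard library. The `POST /v1/answers` path comes from this README; the base URL, the Bearer-token header, and the `task` body field are assumptions to verify against the Olostep docs, and the helper names are hypothetical.

```python
import json
import os
import urllib.request

# Assumed base URL; confirm against the Olostep documentation.
OLOSTEP_BASE_URL = "https://api.olostep.com"


def build_answers_request(task: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/answers request."""
    # The "task" field name is an assumption, not confirmed by this README.
    body = json.dumps({"task": task}).encode("utf-8")
    return urllib.request.Request(
        f"{OLOSTEP_BASE_URL}/v1/answers",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_research(task: str) -> dict:
    """Send the research task and return the decoded JSON response."""
    request = build_answers_request(task, os.environ["OLOSTEP_API_KEY"])
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Separating request construction from the network call keeps the payload easy to inspect and test without an API key.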
The backend calls the Olostep Scrapes API to extract clean content from URLs surfaced during research. Source material is converted to markdown and passed into downstream drafting steps.
Use Scrapes when you already have a URL and need clean, AI-ready content from that page.
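As a rough illustration of page extraction, the sketch below builds a scrape request and pulls markdown out of a response. The `POST /v1/scrapes` path is from this README; the base URL, the `url_to_scrape` and `formats` body fields, and the `result.markdown_content` response shape are assumptions to check against the Olostep Scrapes API reference. The function names are hypothetical.

```python
import json
import os
import urllib.request

# Assumed endpoint URL; confirm against the Olostep documentation.
SCRAPES_URL = "https://api.olostep.com/v1/scrapes"


def build_scrape_request(url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/scrapes request for one page."""
    # Field names below are assumptions about the request schema.
    body = json.dumps({"url_to_scrape": url, "formats": ["markdown"]})
    return urllib.request.Request(
        SCRAPES_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_markdown(payload: dict) -> str:
    """Return the page markdown, or an empty string if it is absent."""
    # Assumed response shape: {"result": {"markdown_content": "..."}}
    result = payload.get("result") or {}
    return result.get("markdown_content") or ""


def scrape_markdown(url: str) -> str:
    """Scrape one URL and return its content as markdown."""
    request = build_scrape_request(url, os.environ["OLOSTEP_API_KEY"])
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_markdown(json.load(response))
```

Returning an empty string for a missing field lets the drafting step skip a failed page instead of crashing mid-workflow.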
Olostep supports broader web scraping and crawling workflows across its platform. This repo uses only the parts needed for research and page-level extraction in a content workflow.
The integration is straightforward to follow:
| File | What it does |
|---|---|
| `blog_agent/tools/search.py` | Calls Olostep `POST /v1/answers` for web research |
| `blog_agent/tools/scrape.py` | Calls Olostep `POST /v1/scrapes` for page extraction |
| `blog_agent/tools/tools.py` | Wraps both calls behind a shared tool provider |
| `blog_agent/agent/source_registry.py` | Stores source metadata so results can be reused across steps |
| `blog_agent/agent/blog_agent.py` | Orchestrates the brief -> outline -> draft -> revision flow |
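To make the tool-provider and source-registry roles concrete, here is a minimal sketch of how the two could fit together. It is not the code in `tools.py` or `source_registry.py`: the `Source`, `SourceRegistry`, and `ToolProvider` classes, their fields, and the shape of a search hit (`url`, `title`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Source:
    """Hypothetical record for one discovered page."""
    url: str
    title: str = ""
    markdown: Optional[str] = None  # filled in once the page is scraped


@dataclass
class SourceRegistry:
    """Keeps one Source per URL so later steps can reuse it."""
    _by_url: dict[str, Source] = field(default_factory=dict)

    def add(self, url: str, title: str = "") -> Source:
        # Return the existing entry rather than overwriting it.
        source = self._by_url.get(url)
        if source is None:
            source = Source(url=url, title=title)
            self._by_url[url] = source
        return source

    def all(self) -> list[Source]:
        return list(self._by_url.values())


@dataclass
class ToolProvider:
    """Wraps search and scrape callables behind one interface."""
    search: Callable[[str], list[dict]]
    scrape: Callable[[str], str]
    registry: SourceRegistry = field(default_factory=SourceRegistry)

    def research(self, query: str) -> list[Source]:
        # Register every hit so the drafting step can cite it later.
        hits = self.search(query)
        return [self.registry.add(h["url"], h.get("title", "")) for h in hits]

    def fetch(self, url: str) -> str:
        # Scrape each URL at most once; reuse the stored markdown after that.
        source = self.registry.add(url)
        if source.markdown is None:
            source.markdown = self.scrape(url)
        return source.markdown
```

Injecting `search` and `scrape` as callables means the same provider works with real API clients in production and with simple fakes in tests.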
- Python 3.11+
- Node.js 18+
- npm
- Docker and Docker Compose (optional, for containerized setup)
- An `OPENAI_API_KEY`
- An `OLOSTEP_API_KEY` (required for web research and page extraction)
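Set the following environment variables (for example, in a `.env` file in the repository root):

    OPENAI_API_KEY=your_openai_api_key_here
    OLOSTEP_API_KEY=your_olostep_api_key_here
    OPENAI_MODEL=gpt-4.1-mini
    LOG_LEVEL=INFO

Install the backend dependencies and start the API server:

    pip install -r requirements.txt
    uvicorn blog_agent.main:app --reload --host 0.0.0.0 --port 8000

Install the frontend dependencies and start the dev server:

    cd frontend
    npm install
    npm run dev

Open the local URL printed by Vite, usually http://localhost:5173.

To run both services with Docker Compose instead:

    docker compose up --build

When running with Docker Compose: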
- Frontend: http://localhost:3000
- Backend health check: http://localhost:8000/health
- Environment variables are read from the root `.env` file
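One way a backend could validate these variables at startup is sketched below. It assumes the process environment is already populated (by Docker Compose or a dotenv loader) and does not parse the `.env` file itself. The `Settings` class and `load_settings` function are hypothetical, and treating `gpt-4.1-mini` and `INFO` as defaults is an assumption based on the example values in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables this README lists."""
    openai_api_key: str
    olostep_api_key: str
    openai_model: str = "gpt-4.1-mini"  # assumed default
    log_level: str = "INFO"  # assumed default


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment and fail fast on missing keys."""
    env = os.environ if env is None else env
    required = ("OPENAI_API_KEY", "OLOSTEP_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        olostep_api_key=env["OLOSTEP_API_KEY"],
        openai_model=env.get("OPENAI_MODEL") or "gpt-4.1-mini",
        log_level=env.get("LOG_LEVEL") or "INFO",
    )
```

Failing at startup with the names of the missing keys is usually easier to debug than an authentication error halfway through a research step.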
    blog_agent/          Backend workflow, prompts, models, tools, and WebSocket server
    frontend/            React + Vite frontend
    Dockerfile           Backend container image
    docker-compose.yml   Local multi-service setup
What is this repo for?
It is an open-source reference implementation of a source-grounded AI content workflow. It demonstrates how to combine web research and page-level data extraction inside a complete application, from brief to outline to draft.
Can I reuse this for other agentic content extraction workflows?
Yes. The same orchestration, tool-wrapper, and source-registry patterns apply to research automation, report generation, lead enrichment, and other source-grounded AI systems.
- Olostep homepage
- Welcome to Olostep docs
- Olostep Scrapes API - page-level data extraction
- Olostep blog
- Olostep Web Data API for AI Agents & RAG Pipelines
- Web Scraping vs Web Crawling: What's the Difference
- How to Extract Table Data From a Website Without Breakage
