A web UI that combines web scraping with local AI models (via Ollama) to analyze web content and answer questions about it.
- 🌐 Web Scraping: Enter any URL to scrape website content.
- 🤖 AI Integration: Process scraped content with local LLMs through Ollama.
- 💬 Chat Interface: Ask follow-up questions about scraped content in a conversational format.
- 📋 Model Selection: Choose from available Ollama models to customize the AI's behavior.
- 💾 Local Storage: Conversations are stored in your browser for easy access.
Before setting up the project, ensure you have the following installed:
- Node.js (v14+)
- Python (v3.9+)
- Ollama installed and running locally
- Install as many models as you want using `ollama pull <model-name>`.
- Install Ollama on your machine by following the instructions at ollama.ai.
- Pull at least one model to use with the application:

      ollama pull mistral  # or any other model you prefer

- Start the Ollama server:

      ollama serve
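Before moving on, you can confirm the Ollama server is reachable by querying its local HTTP API; the `/api/tags` endpoint lists the models you have pulled. A minimal sketch — the default port and the response shape shown in the helper are what current Ollama versions use, but treat them as assumptions:

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local port (assumption)

def model_names(tags_response: dict) -> list[str]:
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return [m["name"] for m in tags_response.get("models", [])]

if __name__ == "__main__":
    # Requires a running Ollama server; prints the models you have pulled.
    with urlopen(f"{OLLAMA_URL}/api/tags") as resp:
        print(model_names(json.load(resp)))
```

If this prints an empty list, pull a model first; if it fails to connect, `ollama serve` is not running.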
- Navigate to the backend directory:

      cd backend

- Create and activate a virtual environment (optional but recommended):

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Start the backend server:

      uvicorn server:app --reload

  The backend server will run at http://localhost:8000.
- Navigate to the frontend directory:

      cd frontend

- Install dependencies:

      npm install

- Start the development server:

      npm start

  The frontend will be available at http://localhost:3000.
- Open the application in your browser at http://localhost:3000.
- Select an Ollama model from the dropdown in the sidebar.
- Enter a URL in the input box and click Scrape.
- Wait for the scraping process to complete.
- Ask questions about the scraped content in the chat interface.
- `GET /api/models`: fetch a list of available Ollama models.
- `POST /api/scrape`: scrape content from a website.

  Request body:

      { "url": "https://example.com" }

- `POST /api/ollama`: send content to Ollama for processing.

  Request body:

      { "model": "mistral", "content": "Scraped content here...", "prompt": "Summarize this content." }
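The endpoints above can also be exercised outside the UI. The sketch below builds the documented request bodies and posts them with the standard library; the shape of each *response* is not specified here, so the example simply prints whatever the backend returns:

```python
import json
from urllib.request import Request, urlopen

API_BASE = "http://localhost:8000"  # backend address from the setup steps

def scrape_payload(url: str) -> dict:
    """Request body for POST /api/scrape, as documented above."""
    return {"url": url}

def ollama_payload(model: str, content: str, prompt: str) -> dict:
    """Request body for POST /api/ollama, as documented above."""
    return {"model": model, "content": content, "prompt": prompt}

def post(path: str, payload: dict) -> dict:
    req = Request(
        f"{API_BASE}{path}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Requires the backend (and Ollama) to be running locally.
    scraped = post("/api/scrape", scrape_payload("https://example.com"))
    print(post("/api/ollama", ollama_payload("mistral", str(scraped), "Summarize this content.")))
```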
The frontend is built with React and uses:
- CSS for styling.
- Local Storage for persisting conversations.
The backend is built with FastAPI and uses:
- BeautifulSoup for web scraping.
- Playwright for rendering JavaScript-heavy websites.
- HTTPX for making HTTP requests.
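The core of the scraping step is fetching a page and reducing its markup to visible text. The backend uses BeautifulSoup for this; here is a stdlib-only sketch of the same idea (the class and function names are illustrative, not the backend's actual code):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> blocks."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```

BeautifulSoup handles malformed markup far more robustly, which is why the backend relies on it instead of a hand-rolled parser like this.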
- Ollama must be running locally for the application to work.
- Some websites may block web scraping attempts.
- Complex web pages (SPAs, JavaScript-heavy sites) may not scrape properly.
- Large web pages will be truncated to avoid overwhelming the LLM.
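The truncation mentioned in the last point amounts to clipping scraped text before it is sent to the model. A minimal sketch — the character limit here is illustrative, since the real cutoff depends on the chosen model's context window:

```python
MAX_CHARS = 8_000  # illustrative limit, not the backend's actual value

def truncate_for_llm(content: str, limit: int = MAX_CHARS) -> str:
    """Clip scraped content, cutting at a word boundary where possible."""
    if len(content) <= limit:
        return content
    clipped = content[:limit]
    # Avoid ending mid-word if there is a space to cut at.
    if " " in clipped:
        clipped = clipped.rsplit(" ", 1)[0]
    return clipped + " [truncated]"
```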
- Enhance scraping capabilities for JavaScript-heavy websites. [High Priority]
- Support document uploads (e.g., PDFs, DOCs).
- Implement vector search for better content retrieval.
- Add multi-language support for broader usability.
Contributions are welcome and would be greatly appreciated!
If you'd like to contribute, please fork the repository and submit a pull request.
If you encounter any issues or have questions, feel free to open an issue in the repository or contact the project maintainers.
