
The Scraper

An interactive, AI-driven research assistant that gathers, filters, and summarizes information on a topic.

How it Works

The assistant uses a local AI model to generate search queries, browse websites, and extract relevant content. You interactively select sources and the output format before the AI creates a live summary.
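The "browse and extract" step can be pictured as fetching a page and reducing it to readable text. The snippet below is a simplified illustration using requests and BeautifulSoup; it is a sketch only, not the actual code in scraper.py, and the dependencies are assumptions about what requirements.txt covers.

```python
# Simplified illustration of the fetch-and-extract step; not the exact code in scraper.py.
import requests
from bs4 import BeautifulSoup

def extract_text(url: str) -> str:
    """Fetch a web page and return its visible text for the LLM to filter and summarize."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    # Drop script and style tags so only readable content remains.
    for tag in soup(["script", "style"]):
        tag.decompose()
    return soup.get_text(" ", strip=True)

print(extract_text("https://example.com")[:500])
```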

Features

- Interactive Control: select sources and the output format.
- AI-Driven: research and summarization handled by a local LLM.
- Intelligent Caching: accelerates repeated searches.
- Live Summary: real-time output in the terminal.
- Progress Indicators: visual feedback during long operations.
- Robust Ollama Communication: automatic retries (see the sketch below).
- Strict Content Filtering: by domain, language, and length.
- Easy Configuration & Validation: config.json is checked.
- Fully Automatic Setup: dependencies and Ollama models.
- 100% Local: full data control.
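As an example of the retry behavior, a call to a local Ollama server can simply be repeated with backoff when it fails. The sketch below uses Ollama's standard /api/generate REST endpoint; the model name, retry count, and backoff policy are placeholders, not the script's actual defaults.

```python
# Minimal sketch of retrying a request to a local Ollama server.
# Model name and retry policy are illustrative assumptions.
import time
import requests

def ask_ollama(prompt: str, model: str = "llama3", retries: int = 3) -> str:
    for attempt in range(1, retries + 1):
        try:
            resp = requests.post(
                "http://localhost:11434/api/generate",
                json={"model": model, "prompt": prompt, "stream": False},
                timeout=120,
            )
            resp.raise_for_status()
            return resp.json()["response"]
        except requests.RequestException:
            if attempt == retries:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff before the next attempt

print(ask_ollama("Summarize why local LLMs are useful, in two sentences."))
```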

Prerequisites

๐Ÿ Python 3.x ๐Ÿณ Ollama: A running local Ollama server.

Installation & Usage

  1. Download the files: place scraper.py, requirements.txt, and config.json in the same directory.
  2. Run the script:
    python scraper.py
  3. First start: config.json is created, the required Python packages are installed, and the Ollama model is checked (and offered for download if missing).
  4. Interactive process: enter a topic, select sources, choose a format, and watch the summary being generated.

The final result is saved in output.txt.

Configuration (config.json)

Adjust the assistant's behavior via the config.json file. Important settings include Ollama details, search parameters, filters, and caching options.
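A quick way to sanity-check an edited configuration is to load it and confirm the expected top-level sections are present. The key names in this sketch are illustrative assumptions, not the real schema; refer to the config.json generated on first start for the actual names.

```python
# Illustrative validation sketch; the key names are assumptions, not the real schema.
import json
from pathlib import Path

EXPECTED_KEYS = {"ollama", "search", "filters", "cache"}  # hypothetical top-level keys

config = json.loads(Path("config.json").read_text(encoding="utf-8"))
missing = EXPECTED_KEYS - config.keys()
if missing:
    raise SystemExit(f"config.json is missing keys: {', '.join(sorted(missing))}")
print("config.json looks complete.")
```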
