This repository was archived by the owner on Nov 22, 2025. It is now read-only.

helplanes/scraperAI

ScraperAI

A web interface that combines web scraping with local AI models (via Ollama) to analyze web content and answer questions about it.

🚀 Features

  • 🌐 Web Scraping: Enter any URL to scrape website content.
  • 🤖 AI Integration: Process scraped content with local LLMs through Ollama.
  • 💬 Chat Interface: Ask follow-up questions about scraped content in a conversational format.
  • 📋 Model Selection: Choose from available Ollama models to customize the AI's behavior.
  • 💾 Local Storage: Conversations are stored in your browser for easy access.

🛠️ Prerequisites

Before setting up the project, ensure you have the following installed:

  • Node.js (v14+)
  • Python (v3.9+)
  • Ollama installed and running locally
    • Install as many models as you want using ollama pull <model-name>.

📦 Setup and Installation

Step 1: Install Ollama

  1. Install Ollama on your machine by following the instructions at ollama.ai.

  2. Pull at least one model to use with the application:

    ollama pull mistral
    # or any other model you prefer
  3. Start the Ollama server:

    ollama serve

Step 2: Set Up the Backend

  1. Navigate to the backend directory:

    cd backend
  2. Create and activate a virtual environment (optional but recommended):

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Start the backend server:

    uvicorn server:app --reload

    The backend server will run at http://localhost:8000.


Step 3: Set Up the Frontend

  1. Navigate to the frontend directory:

    cd frontend
  2. Install dependencies:

    npm install
  3. Start the development server:

    npm start

    The frontend will be available at http://localhost:3000.


🖥️ Usage

  1. Open the application in your browser at http://localhost:3000.
  2. Select an Ollama model from the dropdown in the sidebar.
  3. Enter a URL in the input box and click Scrape.
  4. Wait for the scraping process to complete.
  5. Ask questions about the scraped content in the chat interface.

📡 API Endpoints

Backend Endpoints

  • GET /api/models
    Fetch a list of available Ollama models.

  • POST /api/scrape
    Scrape content from a website.
    Request Body:

    {
      "url": "https://example.com"
    }
  • POST /api/ollama
    Send content to Ollama for processing.
    Request Body:

    {
      "model": "mistral",
      "content": "Scraped content here...",
      "prompt": "Summarize this content."
    }
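Assuming the backend is running at http://localhost:8000, the documented endpoints can be exercised from Python with only the standard library. This is an illustrative sketch; the helper names (`ollama_payload`, `post_json`) are not part of the project.

```python
"""Minimal client sketch for the documented backend endpoints.

Assumes the backend runs at http://localhost:8000; the helper names
here are illustrative and not part of the project itself.
"""
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def ollama_payload(model: str, content: str, prompt: str) -> dict:
    # Request body documented for POST /api/ollama.
    return {"model": model, "content": content, "prompt": prompt}


def post_json(path: str, payload: dict) -> dict:
    # POST a JSON body and decode the JSON response.
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Scrape a page, then ask the model about it.
    scraped = post_json("/api/scrape", {"url": "https://example.com"})
    answer = post_json(
        "/api/ollama",
        ollama_payload("mistral", scraped.get("content", ""), "Summarize this content."),
    )
    print(answer)
```

The network calls only run under `__main__`, so the payload helper can be reused or tested without a live server.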

🛠️ Stack

Frontend

The frontend is built with React and uses:

  • CSS for styling.
  • Local Storage for persisting conversations.

Backend

The backend is built with FastAPI and uses:

  • BeautifulSoup for web scraping.
  • Playwright for rendering JavaScript-heavy websites.
  • HTTPX for making HTTP requests.
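The extraction step can be approximated as follows. This is a simplified stand-in using only the standard library's html.parser, not the project's actual BeautifulSoup-based code, which may differ.

```python
"""Simplified stand-in for the HTML text-extraction step, using only
the standard library. The real backend uses BeautifulSoup and its
exact logic may differ."""
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # >0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Return the page's visible text as a single whitespace-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```

For JavaScript-heavy pages this kind of static parsing sees only the initial HTML, which is why the project falls back to Playwright to render the page first.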

⚠️ Limitations

  • Ollama must be running locally for the application to work.
  • Some websites may block web scraping attempts.
  • Complex web pages (SPAs, JavaScript-heavy sites) may not scrape properly.
  • Large web pages will be truncated to avoid overwhelming the LLM.
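The truncation behavior can be sketched as a simple character cap. The limit value and function name below are assumptions for illustration, not the project's actual settings.

```python
# Sketch of the page-truncation limitation above. MAX_CHARS is an
# assumed value, not the project's actual limit.
MAX_CHARS = 8000


def truncate_for_llm(content: str, limit: int = MAX_CHARS) -> str:
    """Cap scraped content so the prompt stays within the model's context."""
    if len(content) <= limit:
        return content
    # Cut at the last whitespace before the limit to avoid splitting a word.
    cut = content.rfind(" ", 0, limit)
    return content[: cut if cut > 0 else limit] + " …[truncated]"
```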

🌟 Future Improvements

  • Enhance scraping capabilities for JavaScript-heavy websites. [High Priority]
  • Support document uploads (e.g., PDFs, DOCs).
  • Implement vector search for better content retrieval.
  • Add multi-language support for broader usability.

🧑‍💻 Contributing

Contributions are welcome and would be greatly appreciated!
If you'd like to contribute, please fork the repository and submit a pull request.

📞 Support

If you encounter any issues or have questions, feel free to open an issue in the repository or contact the project maintainers.

