Skip to content

Sashanksurya/discoverai-endee

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

11 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

🎬 DiscoverAI β€” Agentic Movie Recommendation Engine with Endee

An Agentic AI Workflow that understands natural language queries and autonomously searches for semantic movie recommendations β€” powered by Endee as the vector database.


🧠 Project Overview & Problem Statement

Finding the right movie to watch is hard. Traditional keyword search fails to understand mood, theme, and narrative style β€” searching "dark thriller" returns everything tagged as thriller, not what actually feels dark and tense.

DiscoverAI solves this using semantic vector search. Instead of matching keywords, it converts both movie descriptions and user queries into dense numerical vectors and finds movies that are semantically closest β€” meaning they share similar themes, tone, and narrative style β€” even if they share no common words.

The system is built as a multi-agent agentic workflow:

  • A Planner Agent interprets what the user really means
  • A Movie Agent searches Endee's vector database semantically
  • A Synthesis Agent explains why each movie matches

πŸ—οΈ System Design & Technical Approach

Architecture

User Query
    β”‚
    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Planner Agent     β”‚  ← Uses Groq LLaMA 3.3 to refine query
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β–Ό
    Movie Agent           ← Embeds query β†’ searches Endee
         β”‚
    β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚  Endee Vector   β”‚  ← HNSW semantic similarity search
    β”‚  Database       β”‚     (cosine distance, INT8 precision)
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
    β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚ Synthesis Agent β”‚  ← Uses Groq LLaMA 3.3 to write response
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
    🎬 Movie Recommendations

Key Features

  • πŸ€– Agentic workflow β€” 3 specialized agents working in a pipeline
  • πŸ” Semantic search β€” Natural language mapped to 384-dim dense vectors
  • ⚑ Endee vector DB β€” High-performance HNSW indexing, up to 1B vectors
  • πŸ†“ Fully free β€” Groq free tier + Endee cloud free tier
  • 🎬 Rich metadata β€” genre, director, rating, year stored alongside vectors
  • 🐍 Pure Python β€” Simple setup, easy to extend

πŸ”§ How Endee Is Used

Endee is the core vector database of this project. Here is exactly how it is integrated:

1. Index Creation

A single movies index is created in Endee cloud with these parameters:

Index Dimension Space Precision Algorithm
movies 384 cosine INT8 HNSW

2. Data Ingestion (seed_data.py)

Each movie's description is converted into a 384-dimensional vector using sentence-transformers/all-MiniLM-L6-v2 and upserted into Endee along with metadata:

client.create_index(name="movies", dimension=384, space_type="cosine", precision=Precision.INT8)
index.upsert([{ "id": "mov_001", "vector": [...384 floats...], "meta": { "title": "Inception", ... } }])

3. Semantic Search (movie_agent.py)

At query time, the user's refined query is embedded and sent to Endee which returns the top-K most similar movies using HNSW approximate nearest neighbor search:

vector = embedder.embed("psychological thriller with a twist ending")
results = index.query(vector=vector, top_k=5)

Endee returns results ranked by cosine similarity β€” movies whose semantic meaning is closest to the query.


πŸš€ Quick Start

Prerequisites

1. Clone the Repository

git clone https://github.com/Sashanksurya/discoverai-endee.git
cd discoverai-endee

2. Install Python Dependencies

pip install -r requirements.txt

3. Create your .env file

cp .env.example .env

Fill in your keys:

GROQ_API_KEY=your_groq_api_key_here
ENDEE_BASE_URL=https://your-cluster.endee.io/api/v1
ENDEE_AUTH_TOKEN=your_endee_auth_token_here
  • GROQ_API_KEY β†’ console.groq.com β†’ API Keys β†’ Create Key
  • ENDEE_BASE_URL β†’ dapp.endee.io β†’ Your Project β†’ Getting Started
  • ENDEE_AUTH_TOKEN β†’ dapp.endee.io β†’ Auth Tokens β†’ Create Token

4. Seed the Vector Database

python seed_data.py

This creates the movies Endee index and loads 10 movies with embeddings.

5. Run the Agent

python main.py

πŸ’¬ Example Queries

You: I want something thrilling with time travel, like Interstellar.

You: Show me dark psychological thrillers.

You: Find me movies about survival in the wild.

You: Something with a mind-bending plot and great cinematography.

πŸ“ Project Structure

discoverai-endee/
β”œβ”€β”€ docker-compose.yml      # Endee local setup (alternative to cloud)
β”œβ”€β”€ requirements.txt        # Python dependencies
β”œβ”€β”€ main.py                 # Entry point β€” interactive agent loop
β”œβ”€β”€ seed_data.py            # Loads movie data into Endee
β”œβ”€β”€ .env.example            # Environment variable template
β”œβ”€β”€ config/
β”‚   └── settings.py         # Reads API keys from .env
β”œβ”€β”€ agents/
β”‚   β”œβ”€β”€ planner.py          # Planner Agent β€” refines user queries via Groq
β”‚   β”œβ”€β”€ movie_agent.py      # Movie Agent β€” queries Endee vector database
β”‚   └── synthesizer.py      # Synthesis Agent β€” writes recommendations via Groq
β”œβ”€β”€ tools/
β”‚   β”œβ”€β”€ endee_client.py     # Endee Python SDK wrapper
β”‚   └── embedder.py         # Sentence-transformers embedding utility
└── data/
    └── sample_data.py      # 10 movies with descriptions and metadata

πŸ€– Agent Details

Planner Agent (agents/planner.py)

  • Model: Groq LLaMA 3.3 70B
  • Input: Raw user query (e.g. "dark thriller")
  • Output: Rich semantic query (e.g. "psychological thriller with complex morally ambiguous characters in a dark urban setting")

Movie Agent (agents/movie_agent.py)

  • Embedding: all-MiniLM-L6-v2 (384 dimensions)
  • Search: Endee HNSW cosine similarity, top-5 results
  • Output: Ranked list of movies with similarity scores and metadata

Synthesis Agent (agents/synthesizer.py)

  • Model: Groq LLaMA 3.3 70B
  • Input: User query + ranked movie results
  • Output: Natural language recommendation explaining why each film matches

πŸ“¦ Tech Stack

Component Technology
Vector Database Endee Cloud (HNSW, cosine, INT8)
Embedding Model sentence-transformers/all-MiniLM-L6-v2
LLM Groq β€” LLaMA 3.3 70B (free tier)
Language Python 3.9+
SDK endee, groq, sentence-transformers

🀝 Contributing

Pull requests are welcome! Ideas for extension:

  • Add more movies to the dataset
  • Add a Streamlit or Gradio web UI
  • Implement user preference memory across sessions
  • Add hybrid search (vector + keyword filters on genre/year)

πŸ“„ License

MIT License

About

Agentic AI movie recommendation engine using Endee vector database + Groq LLaMA 3.3

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages