Agentic RAG Notebooks

A collection of agentic Retrieval-Augmented Generation (RAG) notebooks built with LangChain, LangGraph, and several vector store backends.

📚 Project Overview

This repository demonstrates different approaches to building RAG systems with autonomous agents. Each notebook showcases a distinct implementation strategy: local vector search, hybrid search, a cloud-hosted vector store, and knowledge-graph retrieval.

📔 Notebooks

1. Agentic RAG (FAISS Vector Store)

Description: A foundational RAG system combining document retrieval with an intelligent agent. This notebook demonstrates:

Document Loading & Processing: Load PDF documents and split them into manageable chunks using recursive character splitting
Vector Store Setup: Create a FAISS (Facebook AI Similarity Search) vector database for fast semantic similarity search
RAG Tool Implementation: Define a retrieval tool that fetches relevant document context
LangGraph Agent: Build an agentic workflow that decides when to retrieve documents and generates contextual responses
System-Guided Responses: Uses system prompts to ensure answers are based on provided documents with proper citations
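To make the chunking step concrete, here is a minimal pure-Python sketch of recursive character splitting. It is not the notebook's code: the function name `recursive_split`, the separator order, and the default `chunk_size` are assumptions chosen for illustration, and the sketch omits the chunk overlap that LangChain's RecursiveCharacterTextSplitter supports.

```python
def recursive_split(text, chunk_size=200, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters.

    The coarsest separator is tried first; finer separators are used
    only for pieces that are still too long.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut at fixed character offsets.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        # Greedily pack pieces into the current chunk while they fit.
        candidate = current + sep + piece if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: split it with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


sample = "alpha beta gamma\n\ndelta epsilon zeta eta theta"
for chunk in recursive_split(sample, chunk_size=20):
    print(repr(chunk))
```

Paragraph breaks are preferred as split points, so related sentences tend to stay in the same chunk; word and character boundaries are used only when a paragraph exceeds the size limit.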

Key Components:

LLM: ChatGroq (GPT OSS 120B)
Embeddings: OllamaEmbeddings (nomic-embed-text:v1.5)
Vector Store: FAISS
Framework: LangChain + LangGraph
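The retrieval tool ultimately returns the chunks whose embeddings are closest to the query embedding. The sketch below shows that idea as a brute-force search in pure Python. It is an illustration only: FAISS uses optimized index structures, the 3-dimensional vectors are made up, and cosine similarity is an assumption (FAISS indexes commonly use L2 distance or inner product).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the k best (score, chunk_text) pairs.

    index is a list of (chunk_text, vector) pairs.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy index with hypothetical embeddings.
toy_index = [
    ("chunk about FAISS", [1.0, 0.0, 0.0]),
    ("chunk about MongoDB", [0.0, 1.0, 0.0]),
    ("chunk about both", [1.0, 1.0, 0.0]),
]
for score, text in top_k([1.0, 0.0, 0.0], toy_index):
    print(f"{score:.3f}  {text}")
```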

2. Agentic RAG - Hybrid Search

Description: An advanced RAG system that combines dense (semantic) and sparse (keyword) search for comprehensive document retrieval. This notebook showcases:

Hybrid Search Strategy: Combines semantic embeddings with BM25-based keyword search for better recall
Pinecone Integration: Leverages Pinecone cloud vector database for scalable vector operations
BM25 Encoder: Fits a BM25 encoder on the corpus to produce sparse vectors for keyword matching (BM25 is a TF-IDF-style ranking function)
Dense Embeddings: Uses Google Gemini embeddings for semantic understanding (3072 dimensions)
LangGraph Workflow: Implements multi-step agent logic with both retrieval and response generation
Production-Ready: Demonstrates enterprise-grade RAG architecture
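For readers unfamiliar with BM25, the following sketch scores documents against a query using the textbook Okapi BM25 formula. It is not Pinecone's BM25Encoder, which produces sparse vectors rather than scores; the whitespace tokenizer and the parameter values `k1=1.2` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document (docs must be non-empty)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [
    "vector search with faiss",
    "keyword search with bm25",
    "graph database queries",
]
print(bm25_scores("bm25 keyword", docs))
```

Rare terms get a higher weight than common ones, which is why exact keyword matches can surface documents that dense embeddings rank lower.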

Key Components:

LLM: ChatGroq (GPT OSS 120B)
Dense Embeddings: GoogleGenerativeAIEmbeddings (Gemini)
Sparse Search: BM25Encoder (Pinecone)
Vector Store: Pinecone (Cloud-based)
Framework: LangChain + LangGraph
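One common way to combine dense and sparse results is a weighted sum of normalized scores. The sketch below shows that scheme in pure Python. It is an assumption about how fusion can work, not a description of what the notebook or Pinecone does internally; `hybrid_rank`, `alpha`, and the sample scores are hypothetical.

```python
def hybrid_rank(dense, sparse, alpha=0.5):
    """Fuse two score dicts (doc_id -> score) into one ranking.

    alpha=1.0 uses only dense scores, alpha=0.0 only sparse scores.
    """
    def normalize(scores):
        # Min-max scale to [0, 1] so the two score ranges are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {doc: ((val - lo) / span if span else 1.0) for doc, val in scores.items()}

    dense_n, sparse_n = normalize(dense), normalize(sparse)
    fused = {
        doc: alpha * dense_n.get(doc, 0.0) + (1 - alpha) * sparse_n.get(doc, 0.0)
        for doc in set(dense_n) | set(sparse_n)
    }
    # Sort by score descending, then by id so ties are deterministic.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


dense_scores = {"a": 0.9, "b": 0.5, "c": 0.1}
sparse_scores = {"b": 8.0, "c": 2.0, "d": 5.0}
for doc_id, score in hybrid_rank(dense_scores, sparse_scores, alpha=0.5):
    print(doc_id, round(score, 3))
```

Document "b" ranks first here because it scores reasonably well on both signals, even though it is not the top dense match.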

3. Agentic RAG - MongoDB Vector Store

Description: A cloud-native RAG implementation using MongoDB Atlas for vector storage and retrieval. This notebook demonstrates:

MongoDB Atlas Setup: Connect to MongoDB Atlas cluster for managed vector storage
Vector Search Index: Create and manage vector search indices in MongoDB (768-dimensional vectors)
Cloud Document Storage: Store both documents and their embeddings in MongoDB
Similarity Search: Leverage MongoDB's native similarity search capabilities
ReAct Agent Pattern: Implements the Reasoning + Acting agent pattern for dynamic tool usage
Multi-turn Conversations: Uses MemorySaver for maintaining conversation state across multiple turns
Scalable Architecture: Built for production deployment with proper checkpointing
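A vector search index in Atlas is described by a small JSON definition. The sketch below builds such a definition for 768-dimensional vectors as a Python dict. The field path `embedding` and the `cosine` similarity are assumptions; only the dimension count comes from this README, so check the notebook for the actual values.

```python
import json

ALLOWED_SIMILARITIES = {"cosine", "euclidean", "dotProduct"}


def vector_index_definition(path="embedding", dimensions=768, similarity="cosine"):
    """Build an Atlas Vector Search index definition as a dict.

    path is the document field holding the embedding (assumed name).
    """
    if similarity not in ALLOWED_SIMILARITIES:
        raise ValueError(f"unsupported similarity: {similarity}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dimensions,
                "similarity": similarity,
            }
        ]
    }


print(json.dumps(vector_index_definition(), indent=2))
```

The `numDimensions` value must match the embedding model's output size, which is why this notebook pairs a 768-dimensional index with nomic-embed-text.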

Key Components:

LLM: ChatGroq (GPT OSS 120B)
Embeddings: OllamaEmbeddings (nomic-embed-text:v1.5, 768 dimensions)
Vector Store: MongoDB Atlas
Database: MongoDB Atlas Cloud
Framework: LangChain (create_agent utility)
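Multi-turn memory boils down to storing message history under a conversation identifier and reloading it on each turn. The toy sketch below illustrates that idea in pure Python. It is a stand-in, not LangGraph's MemorySaver: the class `InMemoryCheckpointer`, the function `chat_turn`, and the `respond` callable that plays the role of the agent are all hypothetical.

```python
from collections import defaultdict


class InMemoryCheckpointer:
    """Toy checkpointer: keeps one message history per thread id."""

    def __init__(self):
        self._threads = defaultdict(list)

    def load(self, thread_id):
        # Return a copy so callers cannot mutate stored state by accident.
        return list(self._threads[thread_id])

    def save(self, thread_id, messages):
        self._threads[thread_id] = list(messages)


def chat_turn(checkpointer, thread_id, user_message, respond):
    """Run one turn: load history, call the agent stand-in, save history."""
    history = checkpointer.load(thread_id)
    history.append(("user", user_message))
    reply = respond(history)
    history.append(("assistant", reply))
    checkpointer.save(thread_id, history)
    return reply


def count_user_messages(history):
    """Hypothetical agent stand-in that reports how much context it sees."""
    n = sum(1 for role, _ in history if role == "user")
    return f"seen {n} user message(s)"


memory = InMemoryCheckpointer()
print(chat_turn(memory, "thread-1", "What is RAG?", count_user_messages))
print(chat_turn(memory, "thread-1", "And hybrid search?", count_user_messages))
print(chat_turn(memory, "thread-2", "Hello", count_user_messages))
```

Each thread id has its own history, so separate conversations do not leak context into each other.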

4. Agentic Knowledge Graph RAG

Description: An advanced RAG system that combines knowledge graphs with vector search for enhanced reasoning and relationship discovery. This notebook demonstrates:

Neo4j Graph Database: Connect to Neo4j for storing and querying structured knowledge relationships
Graph Data Loading: Import CSV data to create Article, Researcher, and Topic nodes with relationships
Hybrid Vector Index: Combine semantic embeddings with graph structure for comprehensive search
Dual Tool Architecture: Separate tools for semantic retrieval and graph-based queries
Cypher Query Generation: Automatic translation of natural language to Cypher queries
ReAct Agent Pattern: Intelligent tool selection between vector search and graph queries
Relationship Analysis: Discover connections between entities, authors, and research topics

Key Components:

LLM: ChatGroq (GPT OSS 20B)
Embeddings: GoogleGenerativeAIEmbeddings (Gemini, 3072 dimensions)
Graph Database: Neo4j
Vector Store: Neo4j Vector Index (hybrid search)
Framework: LangChain + Neo4j Integration
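The graph loading step can be pictured as turning each CSV row into a parameterized Cypher statement. The sketch below does this in pure Python without connecting to a database. The node labels Article, Researcher, and Topic come from this README; the CSV column names, the property names, and the relationship types `PUBLISHED` and `IN_TOPIC` are hypothetical and may differ from the notebook.

```python
import csv
import io

# MERGE creates a node or relationship only if it does not already exist,
# so loading the same row twice does not duplicate data.
MERGE_ROW = (
    "MERGE (a:Article {title: $title}) "
    "MERGE (r:Researcher {name: $researcher}) "
    "MERGE (t:Topic {name: $topic}) "
    "MERGE (r)-[:PUBLISHED]->(a) "
    "MERGE (a)-[:IN_TOPIC]->(t)"
)


def rows_to_statements(csv_text):
    """Return a list of (cypher_query, parameters) pairs, one per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (
            MERGE_ROW,
            {
                "title": row["title"],
                "researcher": row["researcher"],
                "topic": row["topic"],
            },
        )
        for row in reader
    ]


sample_csv = "title,researcher,topic\nGraph RAG Basics,Ada,Retrieval\n"
for query, params in rows_to_statements(sample_csv):
    print(query)
    print(params)
```

Passing values as parameters instead of formatting them into the query string avoids quoting problems and Cypher injection.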

📋 Prerequisites

Python 3.9+
API Keys:
- GROQ_API_KEY - For ChatGroq LLM
- GOOGLE_API_KEY - For Google Gemini embeddings (Hybrid Search & Knowledge Graph notebooks)
- PINECONE_API_KEY - For Pinecone vector database (Hybrid Search notebook)
- MONGODB_ATLAS_CLUSTER_URI - For MongoDB Atlas connection (MongoDB notebook)
- NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD - For Neo4j graph database (Knowledge Graph notebook)
Installed Dependencies (see pyproject.toml)
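Since each notebook needs a different subset of these keys, a quick check before running can save a failed cell later. The sketch below lists which variables are missing for a given notebook. The short notebook names are hypothetical labels, and the check assumes the variables are already in the process environment (for example after loading the .env file).

```python
import os

# Key requirements as listed in the prerequisites above.
REQUIRED_KEYS = {
    "faiss": ["GROQ_API_KEY"],
    "hybrid_search": ["GROQ_API_KEY", "GOOGLE_API_KEY", "PINECONE_API_KEY"],
    "mongodb": ["GROQ_API_KEY", "MONGODB_ATLAS_CLUSTER_URI"],
    "knowledge_graph": [
        "GROQ_API_KEY",
        "GOOGLE_API_KEY",
        "NEO4J_URI",
        "NEO4J_USERNAME",
        "NEO4J_PASSWORD",
    ],
}


def missing_keys(notebook, env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS[notebook] if not env.get(key)]


for name in REQUIRED_KEYS:
    missing = missing_keys(name)
    status = "ready" if not missing else "missing: " + ", ".join(missing)
    print(f"{name}: {status}")
```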

🛠️ Installation

# Clone the repository
git clone https://github.com/Saad-Shakeel/Agentic-RAG-Notebooks.git
cd Agentic-RAG-Notebooks

# Install dependencies (requires uv to be installed)
uv sync

📝 Environment Setup

Copy the environment template file:

# Copy the example environment file
cp .env.example .env

Add your API keys to the .env file:

# Groq API - Get from https://console.groq.com
GROQ_API_KEY=your_groq_api_key

# Google Gemini API - Get from https://makersuite.google.com/app/apikey
GOOGLE_API_KEY=your_google_api_key

# Pinecone API - Get from https://app.pinecone.io
PINECONE_API_KEY=your_pinecone_api_key

# MongoDB Atlas Connection - Get from https://cloud.mongodb.com
MONGODB_ATLAS_CLUSTER_URI=mongodb+srv://username:password@cluster.mongodb.net/database

# Neo4j Graph Database - Get from https://neo4j.com/cloud/aura/
NEO4J_URI=neo4j+s://your-instance.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password

Place your PDF documents in the data/ directory.

🎯 Quick Start

Running Locally

# Start Jupyter
jupyter notebook

# Open the desired notebook and run its cells in order

Running on Google Colab

Click the "Open in Colab" badge at the top of each notebook to run directly in your browser without local setup.

📚 Resources

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

⭐ Star History

If you find this repository helpful, please consider giving it a star! Your support motivates continued development.

📧 Support

For issues, questions, or suggestions, please open an issue on GitHub or contact the maintainer.
