📚 Study RAG Assistant

test.mov

Uploading Screen Recording 2026-01-08 at 16.01.53.mov…

A context-aware Research Assistant for deep technical document analysis. This tool is designed to aid the consumption of documents for students and researchers studying dense materials by providing structural awareness and persistent conversation memory.

🌟 Key Technical Features

Structural Context Buffer: Unlike standard RAG systems, this assistant extracts and caches the first 30 pages of documents to understand the "global structure" (Table of Contents, Introduction, and Chapters), allowing it to answer high-level structural questions.
Persistent Multi-Book Workspace: Switch between different research subjects in your library without losing your specific chat history for each book.
High-Speed Streaming: Optimized "typewriter-style" response delivery using Gemini 2.5 Flash for a seamless, ChatGPT-like user experience.
Security-First: Fully integrated with python-dotenv to ensure API keys remain private and are never leaked to version control.

🛠 Tech Stack

LLM: Google Gemini 2.5 Flash
Embeddings: Google text-embedding-004
Vector Store: FAISS (Facebook AI Similarity Search)
Orchestration: LangChain
UI Framework: Streamlit

🚀 Quick Start

1. Clone & Setup

git clone [https://github.com/cyrexez/study-rag-assistant.git](https://github.com/cyrexez/study-rag-assistant.git)
cd study-rag-assistant
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

2. Configure API Key

Create a .env file in the root directory: GOOGLE_API_KEY=your_gemini_api_key_here

3. Run the Assistant

python -m streamlit run app.py

📂 Project Structure

app.py: Streamlit frontend and session state management.

rag_backend.py: Core RAG logic, structural indexing, and retrieval chains.

data/: Local directory for your academic PDFs (excluded from Git).

vectorstore/: Local FAISS index storage (excluded from Git).

💡 Credits & Inspiration

This project was inspired by the original RAG-with-Langchain-and-FastAPI repository by Ana Rojo-Echeburúa.

I have modified and extended the original concept to better suit academic research needs by implementing:

Streamlit-based Interactive UI: Replaced the FastAPI backend with a dedicated researcher dashboard.
Google Gemini 2.5 Flash Integration: Transitioned from OpenAI to Gemini.
Structural Context Retrieval: Added specialized logic to index Table of Contents and document structure for better global awareness.
Persistent Chat History: Enabled per-book session management to keep multiple research threads organized.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
app.py		app.py
install_deps.py		install_deps.py
rag.py		rag.py
rag_backend.py		rag_backend.py
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

📚 Study RAG Assistant

🌟 Key Technical Features

🛠 Tech Stack

🚀 Quick Start

1. Clone & Setup

2. Configure API Key

3. Run the Assistant

📂 Project Structure

💡 Credits & Inspiration

About

Uh oh!

Releases

Packages

Languages

cyrexez/study-rag-assistant

Folders and files

Latest commit

History

Repository files navigation

📚 Study RAG Assistant

🌟 Key Technical Features

🛠 Tech Stack

🚀 Quick Start

1. Clone & Setup

2. Configure API Key

3. Run the Assistant

📂 Project Structure

💡 Credits & Inspiration

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages