A Retrieval-Augmented Generation (RAG) system designed as an AI-powered onboarding assistant for the "Learning Thoughts" company. The system helps new employees get answers to onboarding questions by retrieving relevant information from company documents.
Core Architecture:
- Indexer: Processes and indexes company documents (PDFs) into vector embeddings
- Retriever: Searches indexed documents for relevant context based on user queries
- Generator: Uses retrieved context with LLM to generate accurate, contextual responses
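The three stages above can be sketched without any external dependencies. This is a toy illustration only: in the real system the embedding is Vertex AI, the store is ChromaDB, and the generator is Gemini; here a bag-of-words vector, an in-memory list, and a string template stand in for them.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for Vertex AI embeddings)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class Indexer:
    """Stores (embedding, chunk) pairs; stand-in for the ChromaDB vector store."""
    def __init__(self):
        self.store = []

    def add(self, chunks):
        for c in chunks:
            self.store.append((embed(c), c))

class Retriever:
    """Returns the k chunks most similar to the query."""
    def __init__(self, indexer: Indexer):
        self.indexer = indexer

    def search(self, query: str, k: int = 2):
        scored = [(cosine(embed(query), e), c) for e, c in self.indexer.store]
        return [c for _, c in sorted(scored, key=lambda t: t[0], reverse=True)[:k]]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for the LLM call: the real system prompts Gemini with this context."""
    return f"Q: {query}\nContext:\n" + "\n".join(f"- {c}" for c in context)

# Hypothetical handbook snippets, for illustration only.
indexer = Indexer()
indexer.add([
    "Leave policy: employees accrue 1.5 days of paid leave per month.",
    "IT setup: laptops are issued on day one by the IT helpdesk.",
])
retriever = Retriever(indexer)
answer = generate("How much paid leave do I get?", retriever.search("paid leave", k=1))
```

The design point the sketch preserves is the separation of concerns: the indexer writes, the retriever reads, and the generator only ever sees retrieved context, which is what keeps answers grounded in company documentation.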
Technology Stack:
- LangChain framework for RAG pipeline
- ChromaDB for vector storage
- Google Vertex AI for embeddings and LLM (Gemini 2.5 Flash Lite)
- PyPDF for document processing
- SQLite for record management
Data Sources:
- Employee handbook (PDF format) stored in the data/ directory
- Configurable to handle multiple document types and sources
Key Features:
- Incremental indexing with cleanup management
- Customizable chunking strategies (1000 chars, 100 overlap)
- Source citation in responses
- India-specific context awareness
- Similarity and MMR search options
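In the real pipeline the chunking is done by LangChain's text splitters; as a rough illustration of what the configured strategy (1000 characters, 100 overlap) means, a minimal character-level chunker might look like:

```python
def chunk(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Fixed-size character chunking with overlap.

    Each chunk starts (size - overlap) characters after the previous one,
    so consecutive chunks share `overlap` characters of context across the
    boundary. This reduces the chance that a fact is cut in half at a
    chunk edge and lost to retrieval.
    """
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "x" * 2500
chunks = chunk(doc, size=1000, overlap=100)  # three chunks: 1000, 1000, 700 chars
```

On the search side, plain similarity search returns the k nearest chunks, while MMR (maximal marginal relevance) additionally penalizes candidates that are too similar to chunks already selected, trading a little relevance for diversity in the retrieved context.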
The system processes company documents, creates searchable embeddings, and provides an interactive Q&A interface where employees can ask onboarding-related questions and receive accurate, source-cited answers based solely on company documentation.
Setup:
- uv sync: install dependencies
- uv run onboarding: launch the interactive Q&A assistant