# @aid-on/memory-rag

A lightweight, provider-agnostic in-memory RAG (Retrieval-Augmented Generation) library with seamless Vercel AI SDK integration.
## Features

- **In-Memory Vector Store**: Lightning-fast similarity search without external dependencies
- **Multi-Provider Support**: Works with OpenAI, Anthropic, Google, Cohere, and more via the Vercel AI SDK
- **Zero Configuration**: Get started with sensible defaults, customize when needed
- **Modular Architecture**: Clean separation between vector storage, RAG service, and providers
- **TypeScript First**: Complete type safety with full IntelliSense support
- **Session Isolation**: Manage multiple independent knowledge bases per user/session
- **Vercel AI SDK Native**: Built-in streaming, tools, and edge runtime support
- **Smart Chunking**: Automatic document chunking with configurable size and overlap
- **Flexible API**: Use high-level helpers or low-level components directly
## Installation

```bash
npm install @aid-on/memory-rag
# Install optional peer dependencies based on your needs:
npm install @ai-sdk/openai # For OpenAI models (default)
npm install @ai-sdk/anthropic # For Claude models
npm install @ai-sdk/google # For Gemini models
npm install @ai-sdk/cohere # For Cohere models
```

## Quick Start

```typescript
import { createSimpleRAG } from '@aid-on/memory-rag';
// Create a RAG instance with OpenAI (default)
const rag = createSimpleRAG();
// Add documents to the knowledge base
await rag.addDocument('RAG combines retrieval and generation for better AI responses.');
await rag.addDocument('Vector embeddings capture semantic meaning of text.');
// Search and generate an answer
const response = await rag.search('What is RAG?', 3);
console.log(response.answer);
// Output: RAG (Retrieval-Augmented Generation) combines retrieval and generation...
```
### Mixing Providers

Combine one provider's embeddings with another's language model:

```typescript
import { InMemoryVectorStore, RAGService } from '@aid-on/memory-rag';
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
// Mix and match providers for embeddings and LLM
const store = new InMemoryVectorStore(
  openai.embedding('text-embedding-3-large')
);
const service = new RAGService(
  anthropic('claude-3-haiku-20240307')
);
// Add documents with metadata
await store.addDocument(
  'Advanced RAG techniques include hybrid search and reranking.',
  { source: 'docs', topic: 'rag-advanced' }
);
// Search with answer generation
const results = await service.search(store, 'advanced RAG', 5, true);
console.log(results.answer);
```

### Session Isolation

Maintain independent knowledge bases per user or session:

```typescript
import { getStore, RAGService } from '@aid-on/memory-rag';
// Create isolated stores for different users/sessions
const userStore = getStore('user-123');
const adminStore = getStore('admin-456');
// Each session maintains its own knowledge base
await userStore.addDocument('User dashboard shows personal metrics.');
await adminStore.addDocument('Admin panel includes system monitoring.');
// Queries only search within the session's knowledge
const service = new RAGService();
const userResults = await service.search(userStore, 'dashboard features');
// Returns only user-specific results
```
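Session stores can be inspected and reset like any other store. A small sketch, assuming `getStore` returns an `InMemoryVectorStore` (see the API reference below):

```typescript
// Assumption: getStore hands back an InMemoryVectorStore, so the store API applies
console.log(userStore.size()); // documents in this session only

// Drop a session's knowledge base when the session ends
userStore.clear();
```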
## API Reference

### createInMemoryRAG(options)

Factory function for creating a complete RAG system.

```typescript
import { createInMemoryRAG } from '@aid-on/memory-rag';

const rag = createInMemoryRAG({
  llmProvider: 'openai',       // or 'anthropic', 'google', 'cohere'
  embeddingProvider: 'openai', // or any supported provider
  llmModel: 'gpt-4o-mini',     // optional: specific model
  embeddingModel: 'text-embedding-3-small', // optional
  config: {
    vectorStore: {
      maxDocuments: 1000, // max documents to store
      chunkSize: 500,     // characters per chunk
      chunkOverlap: 50    // overlap between chunks
    },
    search: {
      defaultTopK: 5, // default results to return
      minScore: 0.5   // minimum similarity score
    }
  }
});
```
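A usage sketch for the returned instance, assuming the factory exposes the same `addDocument`/`search` surface as `createSimpleRAG` in the Quick Start:

```typescript
// Assumption: the factory's return type mirrors createSimpleRAG's
await rag.addDocument('Reranking reorders retrieved chunks by relevance.');
const { answer } = await rag.search('what is reranking?');
console.log(answer);
```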
### createSimpleRAG()

Quick start function with OpenAI defaults; see the Quick Start above for usage.

### InMemoryVectorStore

In-memory vector storage with similarity search.

```typescript
class InMemoryVectorStore {
  constructor(embeddingProvider?: EmbeddingProvider | EmbeddingModel | string);
  async addDocument(content: string, metadata?: DocumentMetadata): Promise<string>;
  async removeDocument(id: string): Promise<boolean>;
  async search(query: string, topK?: number): Promise<SearchResult[]>;
  clear(): void;
  size(): number;
  getStats(): StoreStats;
}
```
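A minimal sketch of driving the store directly, assuming the string overload of the constructor selects a built-in provider:

```typescript
import { InMemoryVectorStore } from '@aid-on/memory-rag';

// 'openai' leans on the constructor's string overload shown above
const store = new InMemoryVectorStore('openai');

const id = await store.addDocument('Cosine similarity compares embedding vectors.');
const hits = await store.search('how are embeddings compared?', 3);

console.log(store.size(), store.getStats());
await store.removeDocument(id);
store.clear();
```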
### RAGService

Orchestrates RAG operations with LLM integration.

```typescript
class RAGService {
  constructor(llmProvider?: LLMProvider | LanguageModel | string);
  async search(
    store: IVectorStore,
    query: string,
    topK?: number,
    generateAnswer?: boolean
  ): Promise<RAGSearchResult>;
  async addDocument(
    store: IVectorStore,
    content: string,
    metadata?: DocumentMetadata,
    useChunks?: boolean,
    chunkSize?: number
  ): Promise<AddDocumentResult>;
}
```
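A retrieval-only sketch built on the signatures above; passing `false` for `generateAnswer` is assumed to return matches without calling the LLM:

```typescript
import { InMemoryVectorStore, RAGService } from '@aid-on/memory-rag';

const store = new InMemoryVectorStore();
const service = new RAGService();

await service.addDocument(store, 'Chunk overlap preserves context across boundaries.');

// topK = 3, generateAnswer = false: fetch matches without generating an answer
const retrieval = await service.search(store, 'chunk overlap', 3, false);
```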
## Vercel AI SDK Integration

### Streaming Responses

Perfect for chat applications with real-time streaming:

```typescript
import { streamRAGResponse } from '@aid-on/memory-rag';
// In your API route or server action
const stream = await streamRAGResponse({
  messages: [
    { role: 'user', content: 'Explain vector embeddings' }
  ],
  sessionId: 'user-123',
  enableRAG: true,      // Enable RAG context
  topK: 3,              // Number of documents to retrieve
  model: 'gpt-4o-mini', // LLM model
  temperature: 0.7      // Response creativity
// Return stream to client
return new Response(stream);
```
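For context, a complete Next.js App Router handler might look like this; the route path and request shape are illustrative, not part of the library:

```typescript
// app/api/chat/route.ts (illustrative wiring, not a prescribed file layout)
import { streamRAGResponse } from '@aid-on/memory-rag';

export async function POST(req: Request) {
  const { messages, sessionId } = await req.json();

  const stream = await streamRAGResponse({
    messages,
    sessionId,
    enableRAG: true,
  });

  return new Response(stream);
}
```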
### RAG as a Tool

Integrate RAG with Vercel AI SDK's tool system:

```typescript
import { createRAGTool } from '@aid-on/memory-rag';
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';
const ragTool = createRAGTool('session-123');
const result = await generateText({
  model: openai('gpt-4'),
  tools: {
    searchKnowledge: ragTool.search,
    addKnowledge: ragTool.add
  },
  prompt: 'Help me understand our documentation'
});
```
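To let the model act on retrieved results before answering, allow multiple steps; `maxSteps` is a standard `generateText` option in recent versions of the Vercel AI SDK:

```typescript
// With maxSteps > 1 the model can call searchKnowledge, read the results,
// and then produce a grounded answer in a follow-up step.
const grounded = await generateText({
  model: openai('gpt-4'),
  tools: { searchKnowledge: ragTool.search },
  maxSteps: 3,
  prompt: 'What do our docs say about session isolation?',
});
```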
## Configuration

### Environment Variables

Configure defaults via environment variables:

```bash
# Provider selection
MEMORY_RAG_LLM_PROVIDER=openai
MEMORY_RAG_EMBEDDING_PROVIDER=openai
# Model selection
MEMORY_RAG_MODEL=gpt-4o-mini
MEMORY_RAG_EMBEDDING_MODEL=text-embedding-3-small
# API Keys (if not set elsewhere)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```

### Programmatic Configuration

```typescript
import { setConfig } from '@aid-on/memory-rag';
setConfig({
  defaultProvider: {
    llm: 'anthropic',
    embedding: 'openai', // Mix providers
  },
  vectorStore: {
    maxDocuments: 10000, // Increase capacity
    chunkSize: 1000,     // Larger chunks
    chunkOverlap: 100    // More context overlap
  },
  search: {
    defaultTopK: 10, // Return more results
    minScore: 0.7    // Higher quality threshold
  },
});
```

## Advanced Usage

### Automatic Chunking

Large documents are split automatically when chunking is enabled:

```typescript
const service = new RAGService();
// Automatically chunks large documents
const result = await service.addDocument(
  store,
  longArticle, // 10,000+ characters
  { source: 'blog', author: 'John' },
  true, // Enable auto-chunking
  1000  // Characters per chunk
);
console.log(`Added ${result.documentIds.length} chunks`);
```
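Each new chunk advances by `chunkSize - chunkOverlap` characters, so chunk count can be estimated up front. A back-of-the-envelope helper, assuming a simple sliding window (the library's exact boundaries may differ):

```typescript
// Hypothetical helper: estimates chunk count for a sliding-window chunker
function estimateChunks(totalChars: number, chunkSize: number, overlap: number): number {
  const step = chunkSize - overlap; // each chunk starts this far after the previous one
  return Math.max(1, Math.ceil((totalChars - overlap) / step));
}

estimateChunks(10_000, 1000, 100); // ≈ 11 chunks for the article above
```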
### Bulk Import

```typescript
const documents = [
  { content: 'Getting started guide...', metadata: { type: 'tutorial' } },
  { content: 'API reference...', metadata: { type: 'reference' } },
  { content: 'Best practices...', metadata: { type: 'guide' } },
];
// Efficiently add multiple documents
const results = await service.bulkAddDocuments(store, documents);
console.log(`Imported ${results.documentIds.length} documents`);
```

### Custom Providers

```typescript
import {
  registerLanguageModelProvider,
  registerEmbeddingModelProvider
} from '@aid-on/memory-rag';
// Register a custom provider
registerLanguageModelProvider('custom-llm', (model) => {
  return {
    async generateText({ messages }) {
      // Your custom implementation
      return 'Generated response';
    }
  };
});
// Use the custom provider
const service = new RAGService('custom-llm');
```
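The embedding side presumably mirrors this pattern. The exact embedding-provider shape is not documented above, so the `embed` method below is an assumption to verify against the exported `EmbeddingProvider` type:

```typescript
// Assumed interface: a single embed(text) returning a numeric vector.
// Check the library's EmbeddingProvider type before relying on this shape.
registerEmbeddingModelProvider('custom-embeddings', (model) => {
  return {
    async embed(text: string) {
      return new Array(384).fill(0); // replace with your model's embedding
    }
  };
});
```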
### Document Metadata

```typescript
// Add documents with rich metadata
await store.addDocument('Python tutorial', {
  language: 'python',
  level: 'beginner',
  updated: '2024-01'
});
// Future: Query with metadata filters
// const results = await store.search('tutorial', {
//   filter: { language: 'python', level: 'beginner' }
// });
```
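Until filters land, results can be narrowed after retrieval, assuming each `SearchResult` carries the document's metadata (check the exported type):

```typescript
// Assumption: each SearchResult exposes the stored metadata
const hits = await store.search('tutorial', 10);
const beginnerPython = hits.filter(
  (hit: any) => hit.metadata?.language === 'python' && hit.metadata?.level === 'beginner'
);
```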
## Use Cases

- Build chatbots that remember context and provide accurate, grounded responses.
- Create an AI that can answer questions about your codebase, API, or product docs.
- Implement intelligent search that understands intent, not just keywords.
- Deploy AI agents that can access your knowledge base to resolve customer queries.
- Generate articles, summaries, or reports augmented with factual information.
- Build personalized learning assistants with access to course materials.
## Architecture

```
@aid-on/memory-rag
├── types/                 # TypeScript interfaces & types
├── providers/             # Provider abstraction layer
│   ├── factory.ts         # Provider factory pattern
│   ├── base.ts            # Base provider classes
│   └── vercel-ai.ts       # Vercel AI SDK adapter
├── stores/                # Vector storage layer
│   └── in-memory.ts       # In-memory vector store
├── services/              # Business logic
│   ├── rag-service.ts     # RAG orchestration
│   └── store-manager.ts   # Session management
├── integrations/          # Framework integrations
│   └── vercel-ai.ts       # Vercel AI SDK tools
└── index.ts               # Public API exports
```
- Provider Agnostic: Swap LLM/embedding providers without changing code
- Memory Efficient: Optimized for in-memory operations
- Type Safe: Full TypeScript with strict typing
- Modular: Use only what you need
- Edge Ready: Works in serverless and edge environments
## Development

```bash
# Install dependencies
npm install
# Run tests
npm test
# Run tests in watch mode
npm run test:watch
# Generate coverage report
npm run test:coverage
# Build the library
npm run build
# Type checking
npm run type-check
# Linting
npm run lint
```

## Performance

- Fast Embedding: ~50ms per document (varies by provider)
- Instant Search: <10ms for 1000 documents
- Low Memory: ~1MB per 100 documents
- Zero Cold Start: No external services to initialize
## Security & Privacy

- No data persistence by default
- Session isolation for multi-tenant apps
- Provider API keys stay on your server
- Works in secure edge environments
## License

MIT © Aid-On
## Contributing

We welcome contributions! Please see our Contributing Guide for details.
Built with ❤️ by the Aid-On team