perf: Share SentenceTransformer between RAGIndexer and ContextRetriever #151

@omsherikar

Scope

refactron/rag/indexer.py (RAGIndexer) and refactron/rag/retriever.py (ContextRetriever)

Problem

Both classes construct SentenceTransformer(embedding_model) in __init__. When both are instantiated in one process, the model weights are loaded twice, which duplicates memory use and startup latency. This is common in workflows that index and then query, and would also affect repeated CLI invocations if the embedder is ever kept warm.

Suggested direction

  • Introduce a small factory or module-level LRU cache keyed by (model_name, device), or allow callers to inject a shared SentenceTransformer instance into both classes.
  • Keep backward-compatible defaults.

Acceptance

  • Single-model workflows only load weights once; public API still works without callers passing a custom instance.
