Make embedding provider, model, and dimensions configurable #133

@100yenadmin

Description

Summary

GBrain currently hardcodes OpenAI text-embedding-3-large at 1536 dimensions in the shared embedding service.

Why this matters

  • Users may want to run a different provider, such as Voyage AI, without forking the code.
  • Different embedding models have different native or preferred output dimensions.
  • Making provider/model/dimensions config-driven keeps the default path intact while enabling cheaper, higher-quality, or infrastructure-specific setups.
  • This also helps future engine work because chunk metadata can track the chosen embedding model cleanly.

Proposed direction

  • Keep the default behavior exactly as-is: OpenAI text-embedding-3-large, 1536 dims.
  • Add explicit env-driven overrides such as EMBEDDING_PROVIDER, EMBEDDING_MODEL, and EMBEDDING_DIMENSIONS.
  • Support Voyage via VOYAGE_API_KEY.
  • Keep docs minimal and upstream-friendly.
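The override scheme above could be sketched roughly as follows. This is a hypothetical illustration, not GBrain's actual embedding service code; the function and constant names are assumptions, and only the env var names (EMBEDDING_PROVIDER, EMBEDDING_MODEL, EMBEDDING_DIMENSIONS, VOYAGE_API_KEY) come from this proposal:

```python
import os

# Current hardcoded behavior becomes the default path (unchanged when no env vars are set).
DEFAULT_PROVIDER = "openai"
DEFAULT_MODEL = "text-embedding-3-large"
DEFAULT_DIMENSIONS = 1536

# Hypothetical mapping from provider to its API key env var;
# VOYAGE_API_KEY is the one named in this issue.
API_KEY_ENV = {
    "openai": "OPENAI_API_KEY",
    "voyage": "VOYAGE_API_KEY",
}

def load_embedding_config():
    """Read provider/model/dimensions from the environment, falling back to today's defaults."""
    provider = os.environ.get("EMBEDDING_PROVIDER", DEFAULT_PROVIDER)
    model = os.environ.get("EMBEDDING_MODEL", DEFAULT_MODEL)
    dimensions = int(os.environ.get("EMBEDDING_DIMENSIONS", DEFAULT_DIMENSIONS))

    key_var = API_KEY_ENV.get(provider)
    if key_var is None:
        raise ValueError(f"Unsupported embedding provider: {provider}")

    return {
        "provider": provider,
        "model": model,
        "dimensions": dimensions,
        "api_key_env": key_var,
    }
```

With no env vars set, this returns the existing OpenAI defaults, so the change stays compatibility-preserving; setting, e.g., EMBEDDING_PROVIDER=voyage switches the key lookup to VOYAGE_API_KEY.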

Notes

This is intentionally a small compatibility-preserving change, not a broader embedding subsystem redesign.
