
feat(embedding): pluggable EmbeddingProvider layer (OpenAI + Ollama)#172

Open
nbzy1995 wants to merge 4 commits into garrytan:master from nbzy1995:feat/embedding-provider-layer

Conversation

@nbzy1995

Summary

Promotes the env-var shim in src/core/embedding.ts to a proper EmbeddingProvider interface so embedding backends slot in without touching call sites or the schema. Ships OpenAIProvider (default, identical behavior to before) and OllamaProvider (any model on a local Ollama daemon, e.g. nomic-embed-text @ 768d). Templates the vector(N) schema dim by the resolved provider so gbrain init --provider=ollama --model=nomic-embed-text actually creates a vector(768) brain.

Why

Today embedding.ts reads EMBEDDING_MODEL / EMBEDDING_DIMENSIONS / EMBEDDING_BASE_URL env vars and gates the Matryoshka dimensions param via model.startsWith('text-embedding-3'). The schema, however, hardcodes vector(1536) and 'text-embedding-3-large', so setting those env vars produces a brain whose vector column doesn't match what the embedder writes — a silent footgun.

This patch finishes the abstraction: provider quirks (Matryoshka, error normalization, dim inference) live in concrete provider classes; schema dim flows from the resolved provider at gbrain init time and is persisted to ~/.gbrain/config.json. After init, connectEngine() hydrates the persisted provider before any command runs, so embed, query, import, serve all use the brain's frozen choice — env-var changes can't silently corrupt vectors mid-life.
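The flow described above can be sketched in a few lines. This is a hedged illustration, not the PR's actual code: the `EmbeddingProvider` name comes from the PR, but the interface shape, the dim-registry values, and `vectorColumn` are assumptions.

```typescript
// Hypothetical sketch of the provider abstraction this PR describes.
// Signatures are assumed; only the names EmbeddingProvider / nomic-embed-text
// (768d) come from the PR text.
interface EmbeddingProvider {
  readonly model: string;
  readonly dimensions: number;
  embed(texts: string[]): Promise<number[][]>;
}

// Known-model registry used for Ollama dim inference (illustrative).
const OLLAMA_DIMS: Record<string, number> = { "nomic-embed-text": 768 };

function inferOllamaDims(model: string): number {
  const d = OLLAMA_DIMS[model];
  if (d === undefined) throw new Error(`unknown Ollama model: ${model}`);
  return d;
}

// The schema dim then flows from the resolved provider, not from a hardcoded const:
function vectorColumn(p: { dimensions: number }): string {
  return `vector(${p.dimensions})`;
}

console.log(vectorColumn({ dimensions: inferOllamaDims("nomic-embed-text") }));
// → vector(768)
```

The point is that `vector(N)` is derived from the same resolved provider object that later performs the embedding, so the column width and the embedder can no longer disagree.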

Scope (4 commits, ~870 LOC, all behavior-preserving by default)

1. 8b206b8: Extract EmbeddingProvider interface, OpenAIProvider, OllamaProvider, factory, and service. embedding.ts becomes a backward-compat shim. 15 new provider tests.
2. 0abd57c: Template pglite-schema.ts and schema-embedded.ts by (dimensions, defaultModel). engine.initSchema(opts?) plumbs them through. init.ts parses --provider/--model/--dimensions/--base-url, persists embedding: {...} to config, and refuses re-init on dim mismatch. cli.ts --version prints the active provider. 11 new schema-templating tests.
3. 7081e12: parseEmbeddingFlags accepts both --flag value and --flag=value forms.
4. 41469e1: cli.ts connectEngine() hydrates the provider from config.embedding before any command runs, so embed/query don't fall back to OpenAI defaults when the brain was inited for Ollama.

Tests

  • 883 pass / 0 fail (was 861) — added 26 new tests across test/embedding/provider.test.ts and test/schema-templating.test.ts
  • All default behavior preserved: gbrain init with no flags → identical SQL as before, OpenAIProvider instantiated, all existing tests unchanged
  • Ran end-to-end on a real personal corpus: 447 pages, 925 chunks, 742 embeddings via Ollama nomic-embed-text. Verified gbrain query returns 1.0 cosine scores on topic-aligned questions

Backward compatibility

  • embedding.ts still exports embed and embedBatch with unchanged signatures (re-exports from the new embedding/service.ts). No call sites need editing.
  • PGLITE_SCHEMA_SQL and SCHEMA_SQL const aliases preserved (evaluate the schema function with default opts → identical SQL).
  • engine.initSchema() with no args defaults to (1536, 'text-embedding-3-large') — existing test harnesses keep working.
  • EMBEDDING_* env vars still honored as the resolution fallback when no CLI flag and no persisted config.
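The fallback order in the last bullet (explicit config > EMBEDDING_* env vars > defaults) can be sketched as a merge where later layers never clobber earlier ones with `undefined`. The real `resolveConfig` lives in the factory; this shape is an assumption:

```typescript
// Hedged sketch of the resolution order: explicit values > EMBEDDING_* env
// vars > defaults. Field names mirror the CLI flags; the real signature in
// src/core/embedding/factory.ts may differ.
interface EmbeddingConfig {
  provider: string;
  model: string;
  dimensions: number;
  baseUrl?: string;
}

const DEFAULTS: EmbeddingConfig = {
  provider: "openai",
  model: "text-embedding-3-large",
  dimensions: 1536,
};

function resolveConfig(
  explicit: Partial<EmbeddingConfig>,
  env: Record<string, string | undefined> = process.env,
): EmbeddingConfig {
  const fromEnv: Partial<EmbeddingConfig> = {
    model: env["EMBEDDING_MODEL"],
    dimensions: env["EMBEDDING_DIMENSIONS"] ? Number(env["EMBEDDING_DIMENSIONS"]) : undefined,
    baseUrl: env["EMBEDDING_BASE_URL"],
  };
  // Drop undefined entries so a later spread never overwrites a set value.
  const defined = (o: Partial<EmbeddingConfig>) =>
    Object.fromEntries(Object.entries(o).filter(([, v]) => v !== undefined));
  return { ...DEFAULTS, ...defined(fromEnv), ...defined(explicit) } as EmbeddingConfig;
}
```

With no flags and no env vars this yields the pre-PR defaults (openai, text-embedding-3-large, 1536), which is what keeps the behavior-preserving guarantee.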

Example usage

# Default (unchanged): OpenAI text-embedding-3-large at 1536d
gbrain init

# Local Ollama, free, offline, 768d
ollama pull nomic-embed-text
gbrain init --provider=ollama --model=nomic-embed-text --base-url=http://localhost:11434/v1

# Any OpenAI-compatible endpoint (vLLM, LiteLLM, etc.)
gbrain init --provider=openai --model=text-embedding-3-large --base-url=https://my-proxy.example/v1

Notes for maintainer

  • Two follow-ups not in this PR (happy to send separately if you want):
    • gbrain config show renders nested objects as [object Object] — predates this PR but newly visible because embedding is the first nested config field
    • Some chunks failed to embed via Ollama with "input length exceeds the context length" — nomic-embed-text's native context is ~2K tokens, but the chunker can produce larger chunks. Worth gating chunker max size by provider.maxInputChars or the model's context window
  • This PR is independent of feat(expansion): OpenAI-compat JSON mode replaces Anthropic tool use #165 (query expansion / Anthropic SDK swap) — different code path
  • Happy to split into two PRs (provider-layer first, schema-templating second) if that's easier to review
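For the context-length follow-up mentioned above, one minimal shape would be clamping chunk length to a per-provider budget before embedding. `maxInputChars` is not a real field in this PR; this is purely a suggestion sketch:

```typescript
// Hypothetical follow-up sketch: clamp chunk length to an assumed provider
// budget (maxInputChars does not exist in this PR) so models with small
// contexts, e.g. nomic-embed-text at ~2K tokens, don't reject oversized input.
function clampChunk(text: string, maxInputChars: number): string {
  return text.length <= maxInputChars ? text : text.slice(0, maxInputChars);
}
```

A character budget is only a proxy for tokens; gating by the model's actual context window would be more precise but needs a tokenizer per model.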

🤖 Generated with Claude Code

nbzy1995 and others added 4 commits April 16, 2026 22:58
…lama implementations

The embedding service was a monolithic OpenAI-specific module. This extracts a
provider interface so new backends (Ollama, vLLM, LiteLLM, Voyage) slot in without
touching callers.

Changes:
- Add src/core/embedding/provider.ts — EmbeddingProvider interface + ProviderConfig type
- Add src/core/embedding/providers/openai.ts — OpenAIProvider with Matryoshka dim param
  gated to text-embedding-3 family
- Add src/core/embedding/providers/ollama.ts — OllamaProvider over /v1/embeddings,
  infers dim from known model registry, normalizes errors for retry
- Add src/core/embedding/factory.ts — createProvider(config) + resolveConfig that merges
  explicit config > EMBEDDING_* env vars > defaults
- Add src/core/embedding/service.ts — provider-agnostic batching, retry, truncation
- Add src/core/embedding/index.ts — public surface
- Keep src/core/embedding.ts as a thin re-export shim so existing imports work unchanged
- Add test/embedding/provider.test.ts — 15 tests covering both providers, factory, env resolution

Default behavior is preserved: no flags, no env vars → OpenAI text-embedding-3-large
at 1536 dimensions. The full existing test suite (861 tests) passes without changes.

The schema still hardcodes vector(1536); provider-driven schema templating lands in
the follow-up commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Until now the PGLite and Postgres schemas hardcoded vector(1536) and
text-embedding-3-large — the result of the v0.6 env-var shim stopping at
embedding.ts without reaching the schema layer. This patch finishes the
abstraction: a brain's embedding dim and default model are chosen at
init time from the resolved EmbeddingProvider, templated into the schema,
and persisted to ~/.gbrain/config.json.

Changes:
- Convert PGLITE_SCHEMA_SQL const to pgliteSchema({dimensions, defaultModel})
  function; keep the const as a backward-compat alias that evaluates defaults.
- Same shape for postgresSchema in src/core/schema-embedded.ts; SCHEMA_SQL
  alias preserved.
- Engine.initSchema() now takes optional opts (same shape), passes through
  to the schema function. Default behavior unchanged when called with no args.
- Add embedding: {provider, model, dimensions, base_url} field to GBrainConfig.
- init.ts: parse --provider / --model / --dimensions / --base-url; resolve
  via createProvider() (validates + infers Ollama dims); dim-mismatch guard
  refuses re-init against an existing brain with different dimensions; pass
  opts to initSchema; persist the chosen provider to config.
- cli.ts: --version also prints active provider when a config is loadable.
- test/schema-templating.test.ts — 11 new unit tests covering default fallback,
  partial opts, Postgres dollar-quote preservation, and const-alias parity.

Example usage:
  gbrain init --provider=ollama --model=nomic-embed-text  # 768d brain
  gbrain init --provider=openai                           # 1536d brain (default)
  gbrain init --provider=openai --dimensions=3072         # full text-embedding-3-large
  gbrain init                                             # defaults (openai 1536d)

All 861 existing tests still pass; 11 new schema tests added (872 total).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
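The const-to-function conversion this commit describes can be sketched as follows. The table and column names here are illustrative only; the real schema is far larger, and only the `pgliteSchema` / `PGLITE_SCHEMA_SQL` names and default values come from the commit text:

```typescript
// Minimal sketch of the schema templating: a function of (dimensions,
// defaultModel) whose no-arg evaluation backs the backward-compat const.
// Table/column names are made up for illustration.
interface SchemaOpts {
  dimensions?: number;
  defaultModel?: string;
}

function pgliteSchema({
  dimensions = 1536,
  defaultModel = "text-embedding-3-large",
}: SchemaOpts = {}): string {
  return [
    "CREATE TABLE IF NOT EXISTS chunks (",
    "  id bigserial PRIMARY KEY,",
    `  embedding vector(${dimensions}),`,
    `  model text NOT NULL DEFAULT '${defaultModel}'`,
    ");",
  ].join("\n");
}

// Backward-compat alias: evaluates the function with defaults, so existing
// imports see byte-identical SQL.
const PGLITE_SCHEMA_SQL = pgliteSchema();
```

Because the alias is just the default evaluation, the const-alias parity tests reduce to a string equality check.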
`gbrain init --provider=ollama --model=nomic-embed-text --base-url=http://...`
was silently falling through to defaults because parseEmbeddingFlags only
handled `--flag value` (space-separated) form. Supporting both forms is
standard CLI behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
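A parser accepting both forms, as this fix describes, can be sketched like this. The flag set mirrors the PR's flags; the internals of the real `parseEmbeddingFlags` are assumed:

```typescript
// Sketch of a parser handling both `--flag value` and `--flag=value`.
// Flag names come from the PR; everything else is illustrative.
const EMBEDDING_FLAGS = new Set(["--provider", "--model", "--dimensions", "--base-url"]);

function parseEmbeddingFlags(argv: string[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    const eq = arg.indexOf("=");
    if (eq !== -1 && EMBEDDING_FLAGS.has(arg.slice(0, eq))) {
      out[arg.slice(2, eq)] = arg.slice(eq + 1); // --flag=value form
    } else if (EMBEDDING_FLAGS.has(arg) && i + 1 < argv.length) {
      out[arg.slice(2)] = argv[++i]; // --flag value form
    }
  }
  return out;
}
```

Note the original bug mode: with only the space-separated branch, `--provider=ollama` matches neither case and silently falls through, which is exactly the default-fallback behavior the commit fixes.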
connectEngine() only loaded the database config. Commands that trigger
embedding (embed, import, query, search) fell back to the service's
default provider (OpenAI) regardless of what the brain was initialized
with, causing 401s when the brain was configured for Ollama.

Now connectEngine reads config.embedding, builds the matching provider,
and installs it via setProvider before any command runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
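The control flow of that fix can be sketched in isolation. `setProvider` and `createProvider` exist in the PR; these stand-ins only model the hydration order and are otherwise assumptions:

```typescript
// Self-contained sketch of the connectEngine() hydration step. The real
// createProvider validates config and infers Ollama dims; this stand-in
// only models the control flow.
interface EmbeddingConfig {
  provider: string;
  model: string;
  dimensions: number;
}

let activeProvider: EmbeddingConfig | null = null; // service default would be OpenAI

const createProvider = (c: EmbeddingConfig): EmbeddingConfig => c;
const setProvider = (p: EmbeddingConfig): void => {
  activeProvider = p;
};

// connectEngine() now runs this before dispatching any command, so
// embed/import/query/search see the brain's persisted provider choice.
function hydrateFromConfig(config: { embedding?: EmbeddingConfig }): void {
  if (config.embedding) setProvider(createProvider(config.embedding));
}
```

Without the `config.embedding` read, `activeProvider` stays at the service default, which is the 401-against-OpenAI failure the commit message describes.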