
feat(embedding): pluggable EmbeddingProvider layer (OpenAI + Ollama)#172

Open
nbzy1995 wants to merge 4 commits into garrytan:master from nbzy1995:feat/embedding-provider-layer

Conversation

@nbzy1995

Summary

Promotes the env-var shim in src/core/embedding.ts to a proper EmbeddingProvider interface so embedding backends slot in without touching call sites or the schema. Ships OpenAIProvider (default, identical behavior to before) and OllamaProvider (any model on a local Ollama daemon, e.g. nomic-embed-text @ 768d). Templates the vector(N) schema dim by the resolved provider so gbrain init --provider=ollama --model=nomic-embed-text actually creates a vector(768) brain.

Why

Today embedding.ts reads EMBEDDING_MODEL / EMBEDDING_DIMENSIONS / EMBEDDING_BASE_URL env vars and gates the Matryoshka dimensions param via model.startsWith('text-embedding-3'). The schema, however, hardcodes vector(1536) and 'text-embedding-3-large', so setting those env vars produces a brain whose vector column doesn't match what the embedder writes — a silent footgun.

This patch finishes the abstraction: provider quirks (Matryoshka, error normalization, dim inference) live in concrete provider classes; schema dim flows from the resolved provider at gbrain init time and is persisted to ~/.gbrain/config.json. After init, connectEngine() hydrates the persisted provider before any command runs, so embed, query, import, serve all use the brain's frozen choice — env-var changes can't silently corrupt vectors mid-life.
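The flow described above can be sketched in a few lines. This is a hedged illustration, not the PR's actual code: the `EmbeddingProvider` name comes from the PR, but the interface shape, the dim-registry values, and `vectorColumn` are assumptions.

```typescript
// Hypothetical sketch of the provider abstraction this PR describes.
// Signatures are assumed; only the names EmbeddingProvider / nomic-embed-text
// (768d) come from the PR text.
interface EmbeddingProvider {
  readonly model: string;
  readonly dimensions: number;
  embed(texts: string[]): Promise<number[][]>;
}

// Known-model registry used for Ollama dim inference (illustrative).
const OLLAMA_DIMS: Record<string, number> = { "nomic-embed-text": 768 };

function inferOllamaDims(model: string): number {
  const d = OLLAMA_DIMS[model];
  if (d === undefined) throw new Error(`unknown Ollama model: ${model}`);
  return d;
}

// The schema dim then flows from the resolved provider, not from a hardcoded const:
function vectorColumn(p: { dimensions: number }): string {
  return `vector(${p.dimensions})`;
}

console.log(vectorColumn({ dimensions: inferOllamaDims("nomic-embed-text") }));
// → vector(768)
```

The point is that `vector(N)` is derived from the same resolved provider object that later performs the embedding, so the column width and the embedder can no longer disagree.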

Scope (4 commits, ~870 LOC, all behavior-preserving by default)

1. 8b206b8: Extract EmbeddingProvider interface, OpenAIProvider, OllamaProvider, factory, and service. embedding.ts becomes a backward-compat shim. 15 new provider tests.
2. 0abd57c: Template pglite-schema.ts and schema-embedded.ts by (dimensions, defaultModel). engine.initSchema(opts?) plumbs them through. init.ts parses --provider/--model/--dimensions/--base-url, persists embedding: {...} to config, and refuses re-init on dim mismatch. cli.ts --version prints the active provider. 11 new schema-templating tests.
3. 7081e12: parseEmbeddingFlags accepts both --flag value and --flag=value forms.
4. 41469e1: cli.ts connectEngine() hydrates the provider from config.embedding before any command runs, so embed/query don't fall back to OpenAI defaults when the brain was inited for Ollama.

Tests

  • 883 pass / 0 fail (was 861) — added 26 new tests across test/embedding/provider.test.ts and test/schema-templating.test.ts
  • All default behavior preserved: gbrain init with no flags → identical SQL as before, OpenAIProvider instantiated, all existing tests unchanged
  • Ran end-to-end on a real personal corpus: 447 pages, 925 chunks, 742 embeddings via Ollama nomic-embed-text. Verified gbrain query returns 1.0 cosine scores on topic-aligned questions

Backward compatibility

  • embedding.ts still exports embed and embedBatch with unchanged signatures (re-exports from the new embedding/service.ts). No call sites need editing.
  • PGLITE_SCHEMA_SQL and SCHEMA_SQL const aliases preserved (evaluate the schema function with default opts → identical SQL).
  • engine.initSchema() with no args defaults to (1536, 'text-embedding-3-large') — existing test harnesses keep working.
  • EMBEDDING_* env vars still honored as the resolution fallback when no CLI flag and no persisted config.
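The fallback order in the last bullet (explicit config > EMBEDDING_* env vars > defaults) can be sketched as a merge where later layers never clobber earlier ones with `undefined`. The real `resolveConfig` lives in the factory; this shape is an assumption:

```typescript
// Hedged sketch of the resolution order: explicit values > EMBEDDING_* env
// vars > defaults. Field names mirror the CLI flags; the real signature in
// src/core/embedding/factory.ts may differ.
interface EmbeddingConfig {
  provider: string;
  model: string;
  dimensions: number;
  baseUrl?: string;
}

const DEFAULTS: EmbeddingConfig = {
  provider: "openai",
  model: "text-embedding-3-large",
  dimensions: 1536,
};

function resolveConfig(
  explicit: Partial<EmbeddingConfig>,
  env: Record<string, string | undefined> = process.env,
): EmbeddingConfig {
  const fromEnv: Partial<EmbeddingConfig> = {
    model: env["EMBEDDING_MODEL"],
    dimensions: env["EMBEDDING_DIMENSIONS"] ? Number(env["EMBEDDING_DIMENSIONS"]) : undefined,
    baseUrl: env["EMBEDDING_BASE_URL"],
  };
  // Drop undefined entries so a later spread never overwrites a set value.
  const defined = (o: Partial<EmbeddingConfig>) =>
    Object.fromEntries(Object.entries(o).filter(([, v]) => v !== undefined));
  return { ...DEFAULTS, ...defined(fromEnv), ...defined(explicit) } as EmbeddingConfig;
}
```

With no flags and no env vars this yields the pre-PR defaults (openai, text-embedding-3-large, 1536), which is what keeps the behavior-preserving guarantee.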

Example usage

# Default (unchanged): OpenAI text-embedding-3-large at 1536d
gbrain init

# Local Ollama, free, offline, 768d
ollama pull nomic-embed-text
gbrain init --provider=ollama --model=nomic-embed-text --base-url=http://localhost:11434/v1

# Any OpenAI-compatible endpoint (vLLM, LiteLLM, etc.)
gbrain init --provider=openai --model=text-embedding-3-large --base-url=https://my-proxy.example/v1

Notes for maintainer

  • Two follow-ups not in this PR (happy to send separately if you want):
    • gbrain config show renders nested objects as [object Object] — predates this PR but newly visible because embedding is the first nested config field
    • Some chunks failed to embed via Ollama with "input length exceeds the context length" — nomic-embed-text's native context is ~2K tokens, but the chunker can produce larger chunks. Worth gating chunker max size by provider.maxInputChars or the model's context window
  • This PR is independent of feat(expansion): OpenAI-compat JSON mode replaces Anthropic tool use #165 (query expansion / Anthropic SDK swap) — different code path
  • Happy to split into two PRs (provider-layer first, schema-templating second) if that's easier to review
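For the context-length follow-up mentioned above, one minimal shape would be clamping chunk length to a per-provider budget before embedding. `maxInputChars` is not a real field in this PR; this is purely a suggestion sketch:

```typescript
// Hypothetical follow-up sketch: clamp chunk length to an assumed provider
// budget (maxInputChars does not exist in this PR) so models with small
// contexts, e.g. nomic-embed-text at ~2K tokens, don't reject oversized input.
function clampChunk(text: string, maxInputChars: number): string {
  return text.length <= maxInputChars ? text : text.slice(0, maxInputChars);
}
```

A character budget is only a proxy for tokens; gating by the model's actual context window would be more precise but needs a tokenizer per model.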

🤖 Generated with Claude Code

nbzy1995 and others added 4 commits April 16, 2026 22:58
…lama implementations

The embedding service was a monolithic OpenAI-specific module. This extracts a
provider interface so new backends (Ollama, vLLM, LiteLLM, Voyage) slot in without
touching callers.

Changes:
- Add src/core/embedding/provider.ts — EmbeddingProvider interface + ProviderConfig type
- Add src/core/embedding/providers/openai.ts — OpenAIProvider with Matryoshka dim param
  gated to text-embedding-3 family
- Add src/core/embedding/providers/ollama.ts — OllamaProvider over /v1/embeddings,
  infers dim from known model registry, normalizes errors for retry
- Add src/core/embedding/factory.ts — createProvider(config) + resolveConfig that merges
  explicit config > EMBEDDING_* env vars > defaults
- Add src/core/embedding/service.ts — provider-agnostic batching, retry, truncation
- Add src/core/embedding/index.ts — public surface
- Keep src/core/embedding.ts as a thin re-export shim so existing imports work unchanged
- Add test/embedding/provider.test.ts — 15 tests covering both providers, factory, env resolution

Default behavior is preserved: no flags, no env vars → OpenAI text-embedding-3-large
at 1536 dimensions. The full existing test suite (861 tests) passes without changes.

The schema still hardcodes vector(1536); provider-driven schema templating lands in
the follow-up commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Until now the PGLite and Postgres schemas hardcoded vector(1536) and
text-embedding-3-large — the result of the v0.6 env-var shim stopping at
embedding.ts without reaching the schema layer. This patch finishes the
abstraction: a brain's embedding dim and default model are chosen at
init time from the resolved EmbeddingProvider, templated into the schema,
and persisted to ~/.gbrain/config.json.

Changes:
- Convert PGLITE_SCHEMA_SQL const to pgliteSchema({dimensions, defaultModel})
  function; keep the const as a backward-compat alias that evaluates defaults.
- Same shape for postgresSchema in src/core/schema-embedded.ts; SCHEMA_SQL
  alias preserved.
- Engine.initSchema() now takes optional opts (same shape), passes through
  to the schema function. Default behavior unchanged when called with no args.
- Add embedding: {provider, model, dimensions, base_url} field to GBrainConfig.
- init.ts: parse --provider / --model / --dimensions / --base-url; resolve
  via createProvider() (validates + infers Ollama dims); dim-mismatch guard
  refuses re-init against an existing brain with different dimensions; pass
  opts to initSchema; persist the chosen provider to config.
- cli.ts: --version also prints active provider when a config is loadable.
- test/schema-templating.test.ts — 11 new unit tests covering default fallback,
  partial opts, Postgres dollar-quote preservation, and const-alias parity.

Example usage:
  gbrain init --provider=ollama --model=nomic-embed-text  # 768d brain
  gbrain init --provider=openai                           # 1536d brain (default)
  gbrain init --provider=openai --dimensions=3072         # full text-embedding-3-large
  gbrain init                                             # defaults (openai 1536d)

All 861 existing tests still pass; 11 new schema tests added (872 total).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
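The const-to-function conversion this commit describes can be sketched as follows. The table and column names here are illustrative only; the real schema is far larger, and only the `pgliteSchema` / `PGLITE_SCHEMA_SQL` names and default values come from the commit text:

```typescript
// Minimal sketch of the schema templating: a function of (dimensions,
// defaultModel) whose no-arg evaluation backs the backward-compat const.
// Table/column names are made up for illustration.
interface SchemaOpts {
  dimensions?: number;
  defaultModel?: string;
}

function pgliteSchema({
  dimensions = 1536,
  defaultModel = "text-embedding-3-large",
}: SchemaOpts = {}): string {
  return [
    "CREATE TABLE IF NOT EXISTS chunks (",
    "  id bigserial PRIMARY KEY,",
    `  embedding vector(${dimensions}),`,
    `  model text NOT NULL DEFAULT '${defaultModel}'`,
    ");",
  ].join("\n");
}

// Backward-compat alias: evaluates the function with defaults, so existing
// imports see byte-identical SQL.
const PGLITE_SCHEMA_SQL = pgliteSchema();
```

Because the alias is just the default evaluation, the const-alias parity tests reduce to a string equality check.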
`gbrain init --provider=ollama --model=nomic-embed-text --base-url=http://...`
was silently falling through to defaults because parseEmbeddingFlags only
handled `--flag value` (space-separated) form. Supporting both forms is
standard CLI behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
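A parser accepting both forms, as this fix describes, can be sketched like this. The flag set mirrors the PR's flags; the internals of the real `parseEmbeddingFlags` are assumed:

```typescript
// Sketch of a parser handling both `--flag value` and `--flag=value`.
// Flag names come from the PR; everything else is illustrative.
const EMBEDDING_FLAGS = new Set(["--provider", "--model", "--dimensions", "--base-url"]);

function parseEmbeddingFlags(argv: string[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    const eq = arg.indexOf("=");
    if (eq !== -1 && EMBEDDING_FLAGS.has(arg.slice(0, eq))) {
      out[arg.slice(2, eq)] = arg.slice(eq + 1); // --flag=value form
    } else if (EMBEDDING_FLAGS.has(arg) && i + 1 < argv.length) {
      out[arg.slice(2)] = argv[++i]; // --flag value form
    }
  }
  return out;
}
```

Note the original bug mode: with only the space-separated branch, `--provider=ollama` matches neither case and silently falls through, which is exactly the default-fallback behavior the commit fixes.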
connectEngine() only loaded the database config. Commands that trigger
embedding (embed, import, query, search) fell back to the service's
default provider (OpenAI) regardless of what the brain was initialized
with, causing 401s when the brain was configured for Ollama.

Now connectEngine reads config.embedding, builds the matching provider,
and installs it via setProvider before any command runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
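The control flow of that fix can be sketched in isolation. `setProvider` and `createProvider` exist in the PR; these stand-ins only model the hydration order and are otherwise assumptions:

```typescript
// Self-contained sketch of the connectEngine() hydration step. The real
// createProvider validates config and infers Ollama dims; this stand-in
// only models the control flow.
interface EmbeddingConfig {
  provider: string;
  model: string;
  dimensions: number;
}

let activeProvider: EmbeddingConfig | null = null; // service default would be OpenAI

const createProvider = (c: EmbeddingConfig): EmbeddingConfig => c;
const setProvider = (p: EmbeddingConfig): void => {
  activeProvider = p;
};

// connectEngine() now runs this before dispatching any command, so
// embed/import/query/search see the brain's persisted provider choice.
function hydrateFromConfig(config: { embedding?: EmbeddingConfig }): void {
  if (config.embedding) setProvider(createProvider(config.embedding));
}
```

Without the `config.embedding` read, `activeProvider` stays at the service default, which is the 401-against-OpenAI failure the commit message describes.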