feat(embedding): pluggable EmbeddingProvider layer (OpenAI + Ollama) #172
Open
nbzy1995 wants to merge 4 commits into garrytan:master from
Conversation
…lama implementations

The embedding service was a monolithic OpenAI-specific module. This extracts a provider interface so new backends (Ollama, vLLM, LiteLLM, Voyage) slot in without touching callers.

Changes:
- Add src/core/embedding/provider.ts — EmbeddingProvider interface + ProviderConfig type
- Add src/core/embedding/providers/openai.ts — OpenAIProvider with Matryoshka dim param gated to text-embedding-3 family
- Add src/core/embedding/providers/ollama.ts — OllamaProvider over /v1/embeddings, infers dim from known model registry, normalizes errors for retry
- Add src/core/embedding/factory.ts — createProvider(config) + resolveConfig that merges explicit config > EMBEDDING_* env vars > defaults
- Add src/core/embedding/service.ts — provider-agnostic batching, retry, truncation
- Add src/core/embedding/index.ts — public surface
- Keep src/core/embedding.ts as a thin re-export shim so existing imports work unchanged
- Add test/embedding/provider.test.ts — 15 tests covering both providers, factory, env resolution

Default behavior is preserved: no flags, no env vars → OpenAI text-embedding-3-large at 1536 dimensions. The full existing test suite (861 tests) passes without changes. The schema still hardcodes vector(1536); provider-driven schema templating lands in the follow-up commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
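The provider seam described above can be sketched roughly like this. Only the names `EmbeddingProvider`, `ProviderConfig`, `resolveConfig`, and the merge precedence come from the commit message; the field names and exact merge logic are assumptions:

```typescript
// Sketch only: field names and the exact merge logic are assumptions,
// not the actual factory.ts code.
interface ProviderConfig {
  provider: 'openai' | 'ollama';
  model: string;
  dimensions: number;
  baseUrl?: string;
}

interface EmbeddingProvider {
  readonly model: string;
  readonly dimensions: number;
  embed(texts: string[]): Promise<number[][]>;
}

// Merge precedence per the commit message: explicit config > EMBEDDING_*
// env vars > defaults (openai / text-embedding-3-large / 1536).
function resolveConfig(explicit: Partial<ProviderConfig> = {}): ProviderConfig {
  const envDims = process.env.EMBEDDING_DIMENSIONS;
  return {
    provider: explicit.provider ?? 'openai',
    model: explicit.model ?? process.env.EMBEDDING_MODEL ?? 'text-embedding-3-large',
    dimensions: explicit.dimensions ?? (envDims ? Number(envDims) : 1536),
    baseUrl: explicit.baseUrl ?? process.env.EMBEDDING_BASE_URL,
  };
}
```

The point of the shape is that callers depend only on `embed()` plus the two readonly fields, so provider quirks (Matryoshka gating, dim inference) stay inside the concrete classes.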
Until now the PGLite and Postgres schemas hardcoded vector(1536) and
text-embedding-3-large — the result of the v0.6 env-var shim stopping at
embedding.ts without reaching the schema layer. This patch finishes the
abstraction: a brain's embedding dim and default model are chosen at
init time from the resolved EmbeddingProvider, templated into the schema,
and persisted to ~/.gbrain/config.json.
Changes:
- Convert PGLITE_SCHEMA_SQL const to pgliteSchema({dimensions, defaultModel})
function; keep the const as a backward-compat alias that evaluates defaults.
- Same shape for postgresSchema in src/core/schema-embedded.ts; SCHEMA_SQL
alias preserved.
- Engine.initSchema() now takes optional opts (same shape), passes through
to the schema function. Default behavior unchanged when called with no args.
- Add embedding: {provider, model, dimensions, base_url} field to GBrainConfig.
- init.ts: parse --provider / --model / --dimensions / --base-url; resolve
via createProvider() (validates + infers Ollama dims); dim-mismatch guard
refuses re-init against an existing brain with different dimensions; pass
opts to initSchema; persist the chosen provider to config.
- cli.ts: --version also prints active provider when a config is loadable.
- test/schema-templating.test.ts — 11 new unit tests covering default fallback,
partial opts, Postgres dollar-quote preservation, and const-alias parity.
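The schema-function shape described in the first two bullets might look roughly like this. Only `pgliteSchema`, the `{dimensions, defaultModel}` options, and the `PGLITE_SCHEMA_SQL` alias are named above; the SQL body here is an illustrative stub, not the real schema:

```typescript
// Sketch of the templating shape; the SQL is a stand-in for the real
// (much larger) schema in pglite-schema.ts.
interface SchemaOpts {
  dimensions?: number;
  defaultModel?: string;
}

function pgliteSchema(opts: SchemaOpts = {}): string {
  const { dimensions = 1536, defaultModel = 'text-embedding-3-large' } = opts;
  return `
CREATE TABLE IF NOT EXISTS chunks (
  id        TEXT PRIMARY KEY,
  content   TEXT NOT NULL,
  model     TEXT NOT NULL DEFAULT '${defaultModel}',
  embedding vector(${dimensions})
);`;
}

// Backward-compat alias: evaluating with default opts must yield the
// exact SQL the old const contained.
const PGLITE_SCHEMA_SQL = pgliteSchema();
```

The const-alias parity test in the bullet list reduces to checking that `PGLITE_SCHEMA_SQL === pgliteSchema()`.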
Example usage:
gbrain init --provider=ollama --model=nomic-embed-text # 768d brain
gbrain init --provider=openai # 1536d brain (default)
gbrain init --provider=openai --dimensions=3072 # full text-embedding-3-large
gbrain init # defaults (openai 1536d)
All 861 existing tests still pass; 11 new schema tests added (872 total).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
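The dim-mismatch guard mentioned in the init.ts bullet might look roughly like this; the function name and error wording are assumptions, not the actual code:

```typescript
// Illustrative re-init guard: refuse to re-initialize a brain whose
// persisted vector dimension differs from the newly requested one, since
// the vector(N) column cannot hold embeddings of a different width.
interface ExistingBrain {
  dimensions: number;
}

function checkDimMatch(existing: ExistingBrain | null, requested: number): void {
  if (existing && existing.dimensions !== requested) {
    throw new Error(
      `brain already initialized with vector(${existing.dimensions}); ` +
        `re-initializing with ${requested} dimensions would corrupt existing embeddings`
    );
  }
}
```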
`gbrain init --provider=ollama --model=nomic-embed-text --base-url=http://...` was silently falling through to defaults because parseEmbeddingFlags only handled `--flag value` (space-separated) form. Supporting both forms is standard CLI behavior. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
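A sketch of flag parsing that accepts both forms, as the fix describes; the real `parseEmbeddingFlags` lives in the CLI and handles validation beyond this:

```typescript
// Accept both `--flag value` and `--flag=value` for the embedding flags.
function parseEmbeddingFlags(argv: string[]): Record<string, string> {
  const known = new Set(['provider', 'model', 'dimensions', 'base-url']);
  const out: Record<string, string> = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (!arg.startsWith('--')) continue;
    const eq = arg.indexOf('=');
    if (eq !== -1) {
      // --flag=value form: split on the first '='
      const name = arg.slice(2, eq);
      if (known.has(name)) out[name] = arg.slice(eq + 1);
    } else {
      // --flag value form: consume the next token as the value
      const name = arg.slice(2);
      if (known.has(name) && i + 1 < argv.length) out[name] = argv[++i];
    }
  }
  return out;
}
```

With only the space-separated branch, `--provider=ollama` fails the `known.has('provider=ollama')` check and falls through silently, which is exactly the bug the commit fixes.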
connectEngine() only loaded the database config. Commands that trigger embedding (embed, import, query, search) fell back to the service's default provider (OpenAI) regardless of what the brain was initialized with, causing 401s when the brain was configured for Ollama. Now connectEngine reads config.embedding, builds the matching provider, and installs it via setProvider before any command runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
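The hydration step can be sketched as below. The `embedding` config shape matches the `GBrainConfig` field added earlier in the series; `hydrateProvider` itself is an illustrative name, not the actual cli.ts code:

```typescript
// Sketch of connectEngine's provider hydration: if the brain persisted an
// embedding config at init time, install the matching provider before any
// command runs; otherwise the service keeps its default resolution.
interface EmbeddingConfig {
  provider: string;
  model: string;
  dimensions: number;
  base_url?: string;
}

interface GBrainConfigLike {
  embedding?: EmbeddingConfig;
}

// Returns true when a persisted provider was installed.
function hydrateProvider(
  config: GBrainConfigLike,
  install: (cfg: EmbeddingConfig) => void
): boolean {
  if (!config.embedding) return false;
  install(config.embedding);
  return true;
}
```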
This was referenced Apr 17, 2026
Summary
Promotes the env-var shim in `src/core/embedding.ts` to a proper `EmbeddingProvider` interface so embedding backends slot in without touching call sites or the schema. Ships `OpenAIProvider` (default, identical behavior to before) and `OllamaProvider` (any model on a local Ollama daemon, e.g. `nomic-embed-text` @ 768d). Templates the `vector(N)` schema dim by the resolved provider so `gbrain init --provider=ollama --model=nomic-embed-text` actually creates a `vector(768)` brain.

Why
Today `embedding.ts` reads `EMBEDDING_MODEL` / `EMBEDDING_DIMENSIONS` / `EMBEDDING_BASE_URL` env vars and gates the Matryoshka `dimensions` param via `model.startsWith('text-embedding-3')`. The schema, however, hardcodes `vector(1536)` and `'text-embedding-3-large'`, so setting those env vars produces a brain whose vector column doesn't match what the embedder writes — a silent footgun.

This patch finishes the abstraction: provider quirks (Matryoshka, error normalization, dim inference) live in concrete provider classes; schema dim flows from the resolved provider at `gbrain init` time and is persisted to `~/.gbrain/config.json`. After init, `connectEngine()` hydrates the persisted provider before any command runs, so `embed`, `query`, `import`, `serve` all use the brain's frozen choice — env-var changes can't silently corrupt vectors mid-life.

Scope (4 commits, ~870 LOC, all behavior-preserving by default)
- 8b206b8 — `EmbeddingProvider` interface, `OpenAIProvider`, `OllamaProvider`, factory, service. `embedding.ts` becomes a backward-compat shim. 15 new provider tests.
- 0abd57c — template `pglite-schema.ts` and `schema-embedded.ts` by `(dimensions, defaultModel)`. `engine.initSchema(opts?)` plumbs them through. `init.ts` parses `--provider` / `--model` / `--dimensions` / `--base-url`, persists `embedding: {...}` to config, refuses re-init on dim mismatch. `cli.ts --version` prints the active provider. 11 new schema-templating tests.
- 7081e12 — `parseEmbeddingFlags` accepts both `--flag value` and `--flag=value`.
- 41469e1 — `cli.ts connectEngine()` hydrates the provider from `config.embedding` before any command — so embed/query don't fall back to OpenAI defaults when the brain was inited for Ollama.

Tests
- `test/embedding/provider.test.ts` and `test/schema-templating.test.ts`
- `gbrain init` with no flags → identical SQL as before, `OpenAIProvider` instantiated, all existing tests unchanged
- `gbrain query` returns 1.0 cosine scores on topic-aligned questions

Backward compatibility
- `embedding.ts` still exports `embed` and `embedBatch` with unchanged signatures (re-exports from the new `embedding/service.ts`). No call sites need editing.
- `PGLITE_SCHEMA_SQL` and `SCHEMA_SQL` const aliases preserved (evaluate the schema function with default opts → identical SQL).
- `engine.initSchema()` with no args defaults to `(1536, 'text-embedding-3-large')` — existing test harnesses keep working.
- `EMBEDDING_*` env vars still honored as the resolution fallback when no CLI flag and no persisted config.

Example usage
gbrain init --provider=ollama --model=nomic-embed-text   # 768d brain
gbrain init --provider=openai                            # 1536d brain (default)
gbrain init --provider=openai --dimensions=3072          # full text-embedding-3-large
gbrain init                                              # defaults (openai 1536d)
Notes for maintainer
gbrain config showrenders nested objects as[object Object]— predates this PR but newly visible becauseembeddingis the first nested config fieldnomic-embed-text's native context is ~2K tokens, but the chunker can produce larger chunks. Worth gating chunker max size byprovider.maxInputCharsor the model's context windowprovider-layerfirst,schema-templatingsecond) if that's easier to review🤖 Generated with Claude Code