
feat(gemini): add GeminiDenseEmbedder text embedding provider#751

Open
chethanuk wants to merge 7 commits into volcengine:main from chethanuk:feat/gemini-landing

Conversation


@chethanuk (Contributor) commented Mar 18, 2026

Description

Phase 1 of #566: Adds GeminiDenseEmbedder — a production-ready, opt-in text-only dense embedding provider backed by Google's google-genai SDK. Volcengine remains the default; Gemini is enabled via provider: "gemini" in ov.conf. No existing behaviour changes.

This PR implements Phase 1 (text-only). Phase 2 (multimodal embedding) is tracked in #566 — the issue remains open.

Phased Implementation

  • Phase 1 (this PR) — Text-only dense embedding: 8 task types, MRL dimensions 1-3072, SDK retry, async batching, non-symmetric routing via the #702 pattern
  • Phase 2 (planned) — Multimodal embedding: image/audio/video via the Parts API, cross-modal retrieval, aligned with the #718 approach

Key capabilities

  • Text-only — supports_multimodal = False; multimodal is Phase 2
  • Non-symmetric routing — get_query_embedder() / get_document_embedder() route query_param/document_param to Gemini task types via the factory (follows feat(embedding): combine document embedder and query embedder to avoi… #702 pattern)
  • Per-call task_type + title — override at call time; all 8 Gemini task types supported
  • Case-insensitive config — task_type, query_param, and document_param are auto-uppercased during config normalization
  • SDK-native retry — HttpRetryOptions(attempts=3, exp_base=2, max_delay=30s) with an import guard for SDK < 0.8
  • Async concurrent batching — async_embed_batch() dispatches 100-text chunks in parallel via anyio
  • Empty-text guard — returns zero-vectors with warning log; preserves index alignment
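The non-symmetric routing above can be sketched roughly as follows. This is an illustrative stand-in, not the actual PR code: `route_task_type` and `GEMINI_TASK_TYPES` are hypothetical names, though the config keys (`query_param`, `document_param`), the case-insensitive handling, and the eight task types follow the PR description.

```python
# Hypothetical sketch of the query/document task-type routing described
# above; function and constant names are illustrative, not the PR's API.

GEMINI_TASK_TYPES = {
    "RETRIEVAL_QUERY", "RETRIEVAL_DOCUMENT", "SEMANTIC_SIMILARITY",
    "CLASSIFICATION", "CLUSTERING", "QUESTION_ANSWERING",
    "FACT_VERIFICATION", "CODE_RETRIEVAL_QUERY",
}

def route_task_type(config: dict, *, is_query: bool) -> str:
    """Pick the Gemini task type for query vs. document embedding."""
    key = "query_param" if is_query else "document_param"
    default = "RETRIEVAL_QUERY" if is_query else "RETRIEVAL_DOCUMENT"
    task = config.get(key, default).upper()  # case-insensitive config
    if task not in GEMINI_TASK_TYPES:
        raise ValueError(f"unknown task_type: {task}")
    return task
```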
flowchart LR
    subgraph Config["ov.conf / EmbeddingModelConfig"]
        CFG["provider: gemini\napi_key: sk-…\ntask_type: RETRIEVAL_DOCUMENT\ndimension: 1536"]
    end

    subgraph Embedder["GeminiDenseEmbedder(DenseEmbedderBase)"]
        direction TB
        BC["_build_config(task_type?, title?)"]
        E["embed(text, task_type?, title?)"]
        EB["embed_batch(texts, titles?)"]
        AEB["async_embed_batch(texts)"]
        E --> BC
        EB --> BC
        AEB --> BC
    end

    subgraph SDK["google-genai SDK"]
        direction TB
        HTTP["HttpOptions\n(retry: 3× exp backoff)"]
        SYNC["models.embed_content()"]
        ASYNC["aio.models.embed_content()\n(semaphore-bounded)"]
        HTTP --> SYNC
        HTTP --> ASYNC
    end

    Config --> Embedder
    BC --> SYNC
    AEB --> ASYNC
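Written out as a config fragment, the block in the diagram might look like the following. The key/value pairs mirror the diagram and PR description; the surrounding structure and the api_key value are placeholders, not the exact `examples/ov.conf.example` contents.

```yaml
# Hypothetical ov.conf fragment enabling the Gemini provider
# (keys mirror the diagram above; api_key is a placeholder)
embedding:
  provider: gemini
  api_key: "sk-..."               # Google API key
  task_type: RETRIEVAL_DOCUMENT   # any case accepted; auto-uppercased
  dimension: 1536                 # MRL truncation target (1-3072)
```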

Related Issue

Partial implementation of #566 (Phase 1: text-only dense embedding; Phase 2: multimodal — issue stays open)

Type of Change

  • Bug fix
  • New feature (non-breaking, opt-in via config)
  • Breaking change
  • Documentation update
  • Refactoring
  • Performance improvement
  • Test update

Changes Made

  • openviking/models/embedder/gemini_embedders.py — GeminiDenseEmbedder(DenseEmbedderBase): _build_config(), per-call task_type/title, SDK retry, __repr__, empty-text guard with warning log
  • openviking_cli/utils/config/embedding_config.py — registers "gemini" provider; validates + auto-uppercases task_type/query_param/document_param; factory routes context for non-symmetric mode
  • openviking/models/embedder/__init__.py — exports GeminiDenseEmbedder; annotation: "text-only; multimodal planned"
  • examples/ov.conf.example — Gemini config snippet
  • pyproject.toml — google-genai>=0.8 dep; optional gemini-async extra for anyio
  • tests/unit/test_gemini_embedder.py — 41 unit tests (mock-only, no API key needed)
  • tests/unit/test_embedding_config_gemini.py — 18 config validation tests (incl. case-insensitive task_type)
  • tests/integration/ — 49 integration tests (auto-skip without GOOGLE_API_KEY)

Testing

Suite                             Tests   Live key required
test_gemini_embedder.py             41    No
test_embedding_config_gemini.py     18    No
test_gemini_e2e.py                   7    Yes
test_gemini_embedding_it.py         22    Yes
test_gemini_openviking_it.py        20    Yes

108 passed, 0 skipped  ← with GOOGLE_API_KEY
 59 passed, 49 skipped ← without GOOGLE_API_KEY (CI)

Checklist

  • Code follows project coding style (ruff + black formatted)
  • Self-review completed
  • Tests added and passing
  • Tested on Linux and macOS

- Adds GeminiDenseEmbedder backed by google-genai>=0.8.0 SDK
- Supports all 8 task types, MRL dimensions 1-3072, async batch via anyio
- Wires gemini provider into EmbeddingModelConfig factory + context routing
- Adds unit tests (mocked, no API key) and IT tests (auto-skip without GOOGLE_API_KEY)
- Updates pyproject.toml with google-genai, anyio, pytest-asyncio, pytest-cov deps
…t from IT test

- GeminiDenseEmbedder.supports_multimodal = False (text-only provider)
- Remove ModalContent/Vectorize import from test_gemini_e2e.py (doesn't exist in gem-v1)
- Remove TestGeminiE2EMultimodalEmbedding class (multimodal out of scope for text-only slice)
- embed_batch() fallback: pass is_query through to self.embed()
- Remove redundant _l2_normalize() calls in embed/embed_batch/async_embed_batch;
  Gemini API already returns L2-normalized vectors, truncate_and_normalize is sufficient
- Remove now-dead _l2_normalize() function and unused math import
- embedding_config.py: fix typo "Secretfor" → "Secret for" in sk field description
- conftest.py: narrow broad except Exception to (ImportError, ModuleNotFoundError, AttributeError)
- test_gemini_embedding_it.py: construct fake API key dynamically to avoid hardcoded secret pattern
- test_gemini_openviking_it.py: add dimension assertion in test_dimension_variant_add_search;
  fix tautology assertion (>= 0 → > 0) in test_session_search_smoke
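The truncate-then-normalize step the commit above relies on might look roughly like this. The helper name `truncate_and_normalize` comes from the commit message; the body is an assumption about what such a helper does (MRL truncation to the configured dimension, then L2 renormalization).

```python
import math

# Illustrative sketch of MRL truncate-then-renormalize; the helper name
# is from the commit message, the implementation is an assumption.
def truncate_and_normalize(vector: list[float], dim: int) -> list[float]:
    v = vector[:dim]  # MRL: keep the leading `dim` components
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v
```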
…dd pytest-xdist

- embed()/embed_batch(): return zero vector immediately for empty/whitespace-only
  text, avoiding non-descriptive Gemini API errors
- embed_batch(): filter empty strings pre-API call; reassemble results in original
  order so callers receive a zero vector at each empty position
- Remove _l2_normalize() function and all call sites (embed, embed_batch,
  async_embed_batch): Gemini API already returns L2-normalized vectors;
  truncate_and_normalize() is sufficient
- Remove unused `import math` (was only used by _l2_normalize)
- Add pytest-xdist>=3.0 to [dependency-groups].dev for parallel test runs
- Add 2 unit tests: test_embed_empty_string_returns_zero_vector,
  test_embed_batch_skips_empty_strings (50 unit tests total)
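The filter-and-reassemble logic described in the commit above can be sketched as below. `embed_batch_with_empty_guard` and `call_api` are illustrative stand-ins for the real embed_batch() and the Gemini batch call; the zero-vector-per-empty-slot behaviour follows the commit message.

```python
# Sketch of embed_batch()'s empty-text handling: filter empty strings
# before the API call, then reassemble results in the original order so
# each empty position receives a zero vector. Names are illustrative.
def embed_batch_with_empty_guard(texts, dimension, call_api):
    nonempty = [(i, t) for i, t in enumerate(texts) if t and t.strip()]
    results = [[0.0] * dimension for _ in texts]  # zero vector per slot
    if nonempty:
        vectors = call_api([t for _, t in nonempty])
        for (i, _), vec in zip(nonempty, vectors):
            results[i] = vec
    return results
```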
…ry, repr

- Add _build_config() to merge per-call task_type/title with instance defaults
- embed() gains keyword-only task_type= and title= overrides
- embed_batch() gains keyword-only task_type= and titles=; falls back to
  per-item embed() calls when titles is provided (per-document metadata)
- async_embed_batch() switches to _build_config() (no new args)
- Client constructed with HttpOptions(retry_options=HttpRetryOptions(
  attempts=3, initial_delay=1.0, max_delay=30.0, exp_base=2.0))
  when SDK >= 0.8.0; falls back to plain Client on ImportError
- Add __repr__: GeminiDenseEmbedder(model=..., dim=..., task_type=...)
- Remove now-dead self._embed_config from __init__
- 8 new unit tests in TestBuildConfig; fix test_init_stores_fields
  assertion to allow http_options kwarg
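The import-guarded retry construction described above might be structured like this. The HttpOptions/HttpRetryOptions names and parameters are taken from the commit message; the wrapper function is a hypothetical sketch, with None signalling "construct a plain Client".

```python
# Sketch of the SDK-version guard: build retry-enabled HttpOptions when
# google-genai >= 0.8 exposes HttpRetryOptions, else fall back to None
# (i.e. a plain Client). The wrapper name is illustrative.
def build_http_options():
    try:
        from google.genai.types import HttpOptions, HttpRetryOptions
    except ImportError:
        return None  # SDK < 0.8 or not installed: no retry options
    return HttpOptions(
        retry_options=HttpRetryOptions(
            attempts=3, initial_delay=1.0, max_delay=30.0, exp_base=2.0
        )
    )
```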
@chethanuk
Contributor Author

@qin-ctx Please review and merge this


@ZaynJarvis
Collaborator

#702 is merged for query and document embedding support.

get_query_embedder() is not the expected usage.

btw check if you can improve based on #718


@qin-ptr left a comment


Review Summary

This PR adds Google Gemini as a text embedding provider with excellent test coverage (107 tests) and solid error handling. However, there are three blocking issues that need to be addressed:

  1. Design mismatch: Issue #566 explicitly requests multimodal retrieval support, but this PR only implements text-only embedding (supports_multimodal = False). The PR description should clarify this is a phased implementation with a roadmap for multimodal support, or change the issue reference to indicate partial completion.

  2. Documentation inconsistency: Code comment claims "multimodal" but implementation is text-only.

  3. CI failure: lint job failed due to formatting issues in test_gemini_embedder.py.

Once these are resolved, the PR will be ready to merge. The implementation quality is strong, with comprehensive testing, graceful error handling, and proper SDK retry integration.

🤖 I am a bot owned by @qin-ctx.

- OpenAI: Dense only
- Volcengine: Dense, Sparse, Hybrid
- Jina AI: Dense only
- Google Gemini: Dense only (multimodal)

[Bug] (blocking)

Comment says "Google Gemini: Dense only (multimodal)" but the actual implementation has GeminiDenseEmbedder.supports_multimodal = False. This is contradictory.

Should either:

  • Change to "Google Gemini: Dense only (text-only; multimodal planned)"
  • Or remove the "(multimodal)" annotation entirely until multimodal support is actually implemented

# text-embedding-004: 768 fixed-dim legacy model, does not support MRL truncation
# Future gemini-embedding-*: default 3072 via _default_dimension() fallback
# Future text-embedding-*: default 768 via _default_dimension() prefix rule
supports_multimodal: bool = False # text-only; multimodal planned separately

[Design] (blocking)

This PR claims to close issue #566, which explicitly requests "first-class Gemini Embedding 2 support for multimodal retrieval". The issue emphasizes:

  • "it should not reduce everything to plain text before embedding"
  • "multimodal inputs handled as multimodal, not flattened by default"

However, this implementation sets supports_multimodal = False and only handles text. This is a fundamental mismatch between the issue requirement and the PR implementation.

Recommendation: Either:

  1. Update the PR description to clarify this is phase 1 (text-only) with a roadmap for phase 2 (multimodal), and change the issue reference to "Partial implementation of [Feature]: Add first-class Gemini Embedding 2 support for multimodal retrieval #566", keeping the issue open
  2. Or implement multimodal support in this PR as originally requested by the issue

The current "Closes: #566" statement is misleading and will cause users to expect multimodal functionality that doesn't exist.

@@ -0,0 +1,479 @@
# Copyright (c) 2026 Beijing Volcano Engine Technology Co., Ltd.

[Bug] (blocking)

CI lint job failed because this file needs formatting:

Would reformat: tests/unit/test_gemini_embedder.py

Please run black tests/unit/test_gemini_embedder.py and commit the formatted version.

if backend is not None and provider is None:
data["provider"] = backend
for key in ("query_param", "document_param"):
for key in ("input_type",):

[Suggestion] (non-blocking)

The for key in ("input_type",): loop normalizes input_type to lowercase, but the new Gemini provider uses task_type instead of input_type. Consider also normalizing task_type here, or document that task_type must be uppercase (which is currently enforced in the Gemini validator at line 176).

task_type: Optional[str] = None,
title: Optional[str] = None,
) -> EmbedResult:
if not text or not text.strip():

[Suggestion] (non-blocking)

Returning a zero vector for empty text is a reasonable fallback, but it happens silently. Consider logging a warning so callers know their empty input was handled specially:

if not text or not text.strip():
    logger.warning("Empty text passed to embed(), returning zero vector")
    return EmbedResult(dense_vector=[0.0] * self._dimension)

This helps with debugging when unexpected empty strings appear in production.

…fig normalization, empty-text warning

- Fix "(multimodal)" annotation to "(text-only; multimodal planned)" to match supports_multimodal=False (B1)
- Run black formatter on test_gemini_embedder.py (B3)
- Add auto-uppercase normalization for task_type/query_param/document_param in sync_provider_backend so lowercase config values pass validation (NB1)
- Add logger.warning on empty text in embed() for production debuggability (NB2)
- Add test_gemini_task_type_case_insensitive test (NB1)
- Fix ruff import sorting in __init__.py
@chethanuk
Contributor Author

Thanks for the thorough review! All issues addressed in commit bcfd612:

@qin-ptr — blocking fixes

@qin-ptr — non-blocking (adopted both)

  • NB1 task_type normalization: Added auto-uppercase in sync_provider_backend for task_type, query_param, document_param. Users can now write task_type: "retrieval_document" (any case) in config. Added test test_gemini_task_type_case_insensitive.
  • NB2 Empty text warning: Added logger.warning("Empty text passed to embed(), returning zero vector")
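The NB1 normalization can be sketched as follows. The key names match the PR; `normalize_gemini_keys` is an illustrative stand-in for the logic added to sync_provider_backend, not the actual function.

```python
# Sketch of the auto-uppercase normalization (NB1): lowercase config
# values for these keys now pass validation. Function name is a stand-in.
def normalize_gemini_keys(data: dict) -> dict:
    for key in ("task_type", "query_param", "document_param"):
        value = data.get(key)
        if isinstance(value, str):
            data[key] = value.upper()
    return data
```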

@ZaynJarvis — #702 pattern + #718

Verification

ruff check: All checks passed!
pytest: 59 passed (41 embedder + 18 config, incl. 1 new normalization test)
black --check: 1 file would be left unchanged

CI uses ruff format (not black); re-ran ruff format to match.
@chethanuk force-pushed the feat/gemini-landing branch from 5ac0ebc to 1586f65 on March 19, 2026 at 10:12