[Feature]: About non-symmetric embedding support #655

@ZaynJarvis

Description

Problem Statement

We are seeing many PRs that add support for non-symmetric embeddings:
#608, #624, #635, #636, etc.

Proposed Solution

My suggestion for now:

  1. Don't expose config options.
  2. Carefully check that the embedding context is right (query_embedder is used only in query scenarios, never misused elsewhere).
  3. Set the context on the embedder or on the embedding request: contextual_embedder(context) -> XXXEmbedder(context).
  4. For an Embedder that supports non-symmetric embeddings, map the context to a static value of the provider's parameter (type, input_type, task, etc.). If changing it is required, use an environment variable; don't bloat the config file with provider-specific settings.
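
Points 3 and 4 could be sketched roughly as below. Everything in it is hypothetical and only meant to illustrate the idea: the `EmbedContext` enum, the `resolve_task_param` helper, and the `EMBED_<PROVIDER>_<CONTEXT>` environment variable naming are not existing code in this project. The default values in the table are assumptions about each provider's request parameter and should be checked against the provider's documentation.

```python
import os
from enum import Enum


class EmbedContext(str, Enum):
    """Where the embedding is used (hypothetical type for this sketch)."""
    QUERY = "query"
    DOCUMENT = "document"


# Assumed static defaults: provider -> (request field, value per context).
# Verify the field names and values against each provider's documentation.
_DEFAULTS = {
    "jina": ("task", {
        EmbedContext.QUERY: "retrieval.query",
        EmbedContext.DOCUMENT: "retrieval.passage",
    }),
    "gemini": ("task_type", {
        EmbedContext.QUERY: "RETRIEVAL_QUERY",
        EmbedContext.DOCUMENT: "RETRIEVAL_DOCUMENT",
    }),
}


def resolve_task_param(provider, context, env=None):
    """Return (field, value) for a provider and context.

    An environment variable such as EMBED_JINA_QUERY (hypothetical naming)
    overrides the static default, so nothing is added to the config file.
    """
    env = os.environ if env is None else env
    field, values = _DEFAULTS[provider]
    override = env.get(f"EMBED_{provider.upper()}_{context.value.upper()}")
    return field, override or values[context]
```

With this shape, an embedder would call `resolve_task_param("jina", context)` once in its constructor and attach the resulting field to every request, so callers only ever pass `context`.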

Alternatives Considered

No response

Feature Area

Model Integration

Use Case

To provide long-term task type (query / document) support for embedding models: Jina, OpenAI, Minimax, Gemini, etc.

Example API (Optional)

> Set at the request level if the embedder is used in both query and document contexts.
            return OpenAIDenseEmbedder(
                model_name=self.dense.model,
                api_key=self.dense.api_key,
                api_base=self.dense.api_base,
                dimension=self.dense.dimension,
                context=context,
            )
            ...
            return JinaDenseEmbedder(
                model_name=self.dense.model,
                api_key=self.dense.api_key,
                api_base=self.dense.api_base,
                dimension=self.dense.dimension,
                context=context,
            )
---

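In the XXXEmbedder implementation:

  class OpenAIDenseEmbedder:
    ...
    # "x" / "y" are placeholders for the provider's query / document values
    input_type = "x" if context == "query" else "y"


  class JinaDenseEmbedder:
    ...
    # "a" / "b" are placeholders for the provider's query / document values
    task = "a" if context == "query" else "b"


Would this be better?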

Additional Context

No response

Contribution

  • I am willing to contribute to implementing this feature

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request

    Type

    No type

    Projects

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
