Open
Labels
enhancement (New feature or request)
Description
Problem Statement
We are seeing many PRs that add support for non-symmetric embeddings, where queries and documents are embedded differently:
#608 #624 #635 #636 etc.
Proposed Solution
My suggestion for now:
- Don't add config options.
- Carefully check that the embedding context is right (query_embedder is used only in the query scenario and not misused elsewhere).
- Set the context on the embedder or on the embedding request: contexual_embedder(context) -> XXXEmbedder(context).
- For an Embedder that supports non-symmetric embeddings, map the context to a static value of the provider's parameter (type, input_type, task, etc.). If changing it is required, use an environment variable; don't bloat the config file with provider-specific settings.
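To make the last point concrete, here is a rough sketch of how a context could map to a static provider value with an optional environment-variable override. Everything named here is hypothetical and not part of the existing codebase: `EmbeddingContext`, `resolve_task`, and the `JINA_EMBEDDING_TASK_*` variable names. The `retrieval.query` / `retrieval.passage` defaults are what I understand Jina's v3 `task` parameter to accept, so treat them as an assumption to verify.

```python
import os
from enum import Enum


class EmbeddingContext(str, Enum):
    """Hypothetical context flag passed to an embedder."""

    QUERY = "query"
    DOCUMENT = "document"


# Static defaults per context (assumed values for Jina's `task` parameter).
_DEFAULT_TASKS = {
    EmbeddingContext.QUERY: "retrieval.query",
    EmbeddingContext.DOCUMENT: "retrieval.passage",
}


def resolve_task(context, env=None):
    """Return the provider value for a context.

    An environment variable (hypothetical name) overrides the static default,
    so no provider-specific key is needed in the config file.
    """
    env = os.environ if env is None else env
    # Raises ValueError for anything other than "query" or "document".
    context = EmbeddingContext(context)
    override = env.get(f"JINA_EMBEDDING_TASK_{context.name}")
    return override or _DEFAULT_TASKS[context]
```

An embedder would call `resolve_task(context)` once in its constructor. Rejecting unknown contexts early is one cheap way to catch a query embedder being wired into a document path, or the reverse.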
Alternatives Considered
No response
Feature Area
Model Integration
Use Case
To provide long-term support for task_type (query / document) across embedding models: Jina, OpenAI, Minimax, Gemini, etc.
Example API (Optional)
> Set at the request level if the embedder is used in both query and document contexts
return OpenAIDenseEmbedder(
    model_name=self.dense.model,
    api_key=self.dense.api_key,
    api_base=self.dense.api_base,
    dimension=self.dense.dimension,
    context=context,
)
...
return JinaDenseEmbedder(
    model_name=self.dense.model,
    api_key=self.dense.api_key,
    api_base=self.dense.api_base,
    dimension=self.dense.dimension,
    context=context,
)
---
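In the XXXEmbedder implementation:
class OpenAIDenseEmbedder:
    ...
    # "x" / "y" stand in for the provider-specific values
    input_type = "x" if context == "query" else "y"
class JinaDenseEmbedder:
    ...
    # "a" / "b" stand in for the provider-specific values
    task = "a" if context == "query" else "b"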
Would this be better?
Additional Context
No response
Contribution
- I am willing to contribute to implementing this feature
Status
Backlog