Open
Labels
bug (Something isn't working)
Description
Bug Description
We are running OV 0.2.6 with the bce-embedding-base_v1-f16.gguf embedding model. This model's maximum input length is 512 tokens, and the server's physical batch size is 1024. After an openclaw conversation ends, several more embedding calls are still scheduled, and the inputs to those calls can reach several thousand tokens (e.g. 8129), which triggers the error below.
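A possible client-side workaround, sketched below, is to split oversized text into chunks that fit the server's batch limit and average the per-chunk vectors. This is a minimal illustration only: the 1024-token budget, the whitespace "tokenization", and the averaging strategy are assumptions for the sketch, not OpenViking's actual tokenizer or embedding API.

```python
# Hypothetical workaround sketch: chunk oversized inputs before embedding
# so no single request exceeds the server's physical batch size.
# Whitespace splitting is a rough stand-in for real tokenization.

def chunk_text(text: str, max_tokens: int = 1024) -> list[str]:
    """Split `text` into pieces of at most `max_tokens` whitespace tokens."""
    tokens = text.split()
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]

def embed_safely(embed_fn, text: str, max_tokens: int = 1024) -> list[float]:
    """Embed each chunk separately and average the resulting vectors,
    so no single call sends more than `max_tokens` tokens."""
    chunks = chunk_text(text, max_tokens)
    vectors = [embed_fn(chunk) for chunk in chunks]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
```

Averaging chunk embeddings loses some precision compared to a model that handles the full context, but it keeps every request under the batch limit.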
Steps to Reproduce
Bug Info:
2026-03-18 11:02:50,993 - openviking.storage.queuefs.semantic_processor - INFO - Completed semantic generation for: viking://user/man/memories/preferences
2026-03-18 11:02:52,458 - openviking.storage.collection_schemas - ERROR - Error processing embedding message: OpenAI API error: Error code: 500 - {'error': {'code': 500, 'message': 'input (8129 tokens) is too large to process. increase the physical batch size (current batch size: 1024)', 'type': 'server_error'}}
Traceback (most recent call last):
File "/usr/local/lib64/python3.11/site-packages/openviking/models/embedder/openai_embedders.py", line 99, in embed
response = self.client.embeddings.create(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/openai/resources/embeddings.py", line 132, in create
return self._post(
^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1294, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1067, in request
raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Error code: 500 - {'error': {'code': 500, 'message': 'input (8129 tokens) is too large to process. increase the physical batch size (current batch size: 1024)', 'type': 'server_error'}}
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib64/python3.11/site-packages/openviking/storage/collection_schemas.py", line 197, in on_dequeue
result: EmbedResult = await asyncio.to_thread(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/asyncio/threads.py", line 25, in to_thread
return await loop.run_in_executor(None, func_call)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib64/python3.11/site-packages/openviking/models/embedder/openai_embedders.py", line 104, in embed
raise RuntimeError(f"OpenAI API error: {e.message}") from e
RuntimeError: OpenAI API error: Error code: 500 - {'error': {'code': 500, 'message': 'input (8129 tokens) is too large to process. increase the physical batch size (current batch size: 1024)', 'type': 'server_error'}}
Expected Behavior
No Error.
Actual Behavior
The embedding request fails with a 500 InternalServerError: the input (8129 tokens) exceeds the server's physical batch size (1024), and the semantic-processing queue logs the error above.
Minimal Reproducible Example
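A sketch of how the oversized request could be reproduced directly against an OpenAI-compatible embeddings endpoint. The base URL and model name are placeholder assumptions for a local llama.cpp-style server, not values confirmed by this report.

```python
# Hypothetical repro sketch: build a single embeddings request whose input
# is far larger than a 1024-token physical batch. Endpoint and model name
# are placeholders.
import json
import urllib.request

def build_embedding_request(base_url: str, model: str, text: str) -> urllib.request.Request:
    """Construct a POST to the OpenAI-compatible /v1/embeddings endpoint."""
    payload = json.dumps({"model": model, "input": text}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Roughly 8000 whitespace tokens, well past a 1024-token physical batch.
    oversized = "token " * 8000
    req = build_embedding_request(
        "http://localhost:8080", "bce-embedding-base_v1", oversized
    )
    # Sending this (urllib.request.urlopen(req)) against a server whose
    # physical batch size is 1024 is expected to return the HTTP 500
    # "input (...) is too large to process" error quoted above.
```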
Error Logs
OpenViking Version
0.2.6
Python Version
3.11.0
Operating System
Linux
Model Backend
Other
Additional Context
No response
Status
In progress