Summary
When OpenViking generates directory .overview.md content for large folders, the overview prompt can grow without a clear input budget. In practice this can produce extremely large prompts, trigger VLM timeouts, and even leave the upstream VLM service in a degraded or semi-broken state for subsequent requests.
This is not just a performance issue. It can break the memory / semantic generation pipeline for otherwise valid data.
Environment
- Repository: OpenViking
- Branch base used for testing: local `main`, rebased from upstream `main`
- Deployment: local source checkout + systemd service
- Storage backend: local AGFS + local vectordb
- VLM provider: OpenAI-compatible endpoint behind a local router
- Model route: `ov-llm`
- Relevant config during mitigation: `auto_generate_l1=false` (workaround currently in use), `vlm.max_concurrent=1`
Actual behavior
For large directories, `_generate_overview()` assembles the overview prompt by concatenating:
- all file summaries in the directory
- all child directory abstracts
without a strong budget guard.
This can create very large prompts. In our case, the prompt reached approximately 117,954 tokens before the VLM call. The result was:
- VLM timeout / failure on overview generation
- the `ov-llm` path becoming unstable for follow-up requests
- semantic generation and related memory workflows becoming unreliable until the service recovered
Expected behavior
Directory overview generation should remain bounded and fail-safe:
- large folders should not generate unbounded prompts
- one oversized folder should not destabilize later VLM calls
- OpenViking should degrade gracefully when overview input exceeds a safe threshold
Reproduction
A general reproduction path is:
- Prepare a directory with many files and/or many child directories.
- Ensure semantic processing attempts to generate L1 overview content for that directory.
- Use file summaries / child abstracts large enough that the assembled prompt becomes very large.
- Trigger semantic generation.
- Observe timeout / failure in the VLM call and degraded behavior afterward.
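The unbounded growth can be demonstrated in isolation with a small synthetic script. The assembly shape below is an assumption about the pattern, not the actual OpenViking code, and the 4-chars-per-token estimate is a rough heuristic:

```python
# Synthetic demonstration: concatenating per-file summaries and child
# abstracts without a budget makes the prompt grow linearly with
# directory size. Hypothetical assembly shape, not OpenViking's code.
def assemble_overview_prompt(file_summaries, child_abstracts):
    parts = ["Directory contents:\n"]
    parts += [f"- file: {s}\n" for s in file_summaries]
    parts += [f"- subdir: {a}\n" for a in child_abstracts]
    return "".join(parts)

# ~2000 files with ~500-char summaries: a plausible "large folder".
summaries = ["x" * 500 for _ in range(2000)]
abstracts = ["y" * 500 for _ in range(200)]
prompt = assemble_overview_prompt(summaries, abstracts)

# Rough token estimate: ~4 characters per token for English-like text.
est_tokens = len(prompt) // 4
print(f"prompt chars={len(prompt)}, est. tokens={est_tokens}")
```

At this synthetic scale the estimate already exceeds the ~118k tokens we observed in practice, so the failure does not require unusual input.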
Suspected root cause
The issue appears to come from the overview assembly logic in:
- File: `openviking/storage/queuefs/semantic_processor.py`
- Method: `_generate_overview()`
The current logic concatenates the full list of file summaries and child directory abstracts into one prompt. For sufficiently large directories, this creates an oversized request.
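A minimal pre-assembly guard would cap the input before it ever reaches the model. The sketch below is illustrative only: `MAX_OVERVIEW_CHARS`, `estimate_tokens`, and `clip_to_budget` are hypothetical names, not OpenViking identifiers, and a real implementation would ideally use the model's own tokenizer:

```python
# Hedged sketch of an input-budget guard for overview assembly.
# All names here are illustrative, not actual OpenViking identifiers.
MAX_OVERVIEW_CHARS = 60_000          # ~15k tokens at ~4 chars/token

def estimate_tokens(text: str) -> int:
    # Crude heuristic; a real implementation could use the model tokenizer.
    return len(text) // 4

def clip_to_budget(items: list[str], budget_chars: int) -> list[str]:
    """Keep items in order until the character budget is exhausted."""
    kept, used = [], 0
    for item in items:
        if used + len(item) > budget_chars:
            break
        kept.append(item)
        used += len(item)
    return kept

# Split the budget between file summaries and child abstracts.
file_summaries = ["summary " * 100] * 500
child_abstracts = ["abstract " * 100] * 100
half = MAX_OVERVIEW_CHARS // 2
prompt = "\n".join(
    clip_to_budget(file_summaries, half) + clip_to_budget(child_abstracts, half)
)
print(estimate_tokens(prompt))
```

Dropping trailing items is lossy, so this guard is best paired with the staged-generation idea below rather than used alone.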
Why this matters
This is especially problematic because overview generation is an internal maintenance / semantic task. Users may hit it indirectly while doing normal memory or resource workflows, but the failure mode is severe enough to affect unrelated follow-up requests.
Current workaround
We mitigated the problem locally by disabling automatic directory overview generation:
`auto_generate_l1=false`
This avoids the problematic path, but it is a workaround rather than a product-level fix.
Suggested fixes
I think OpenViking should consider one or more of the following:
1. Input budget guard before VLM call
   - estimate prompt size before calling the model
   - if over budget, truncate or summarize inputs first
2. Hierarchical / staged overview generation
   - summarize file summaries in batches
   - summarize child abstracts in batches
   - then combine the batch summaries into the final overview
3. Configurable hard limits
   - max number of file summaries included
   - max number of child abstracts included
   - max characters or tokens per overview input
4. Fail-safe degradation
   - if the budget is exceeded, skip L1 generation and keep L0 only
   - or generate a short fallback overview instead of failing the whole path
5. Observability
   - log prompt size / estimated tokens for overview generation
   - log when budget guards trigger
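The staged-generation idea can be sketched as a map-reduce over bounded batches. Here `summarize` is a stand-in for the real VLM call, and all names are hypothetical:

```python
# Hedged sketch of staged (map-reduce style) overview generation.
# `summarize` stands in for the real VLM call, which is not shown here.
def summarize(texts: list[str]) -> str:
    # Placeholder: a real implementation would call the VLM with a
    # bounded prompt built from `texts`.
    return f"[summary of {len(texts)} inputs]"

def staged_overview(file_summaries, child_abstracts, batch_size=50):
    """Summarize inputs in bounded batches, then combine batch summaries."""
    def batches(items):
        return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

    batch_summaries = [summarize(b) for b in batches(file_summaries)]
    batch_summaries += [summarize(b) for b in batches(child_abstracts)]
    # Final pass: every VLM call sees at most `batch_size` inputs.
    return summarize(batch_summaries)

overview = staged_overview(["f"] * 220, ["d"] * 30)
print(overview)
```

Under this shape, prompt size per call is bounded by `batch_size` regardless of directory size, at the cost of extra VLM round trips for very large folders.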
Additional note
Even if a bounded assembly strategy is added, I still think the system should protect the VLM path from a single oversized semantic task causing downstream instability.