
Directory overview generation can build oversized prompts and destabilize VLM calls #529

@dddgogogo

Description


Summary

When OpenViking generates directory .overview.md content for large folders, the overview prompt can grow without a clear input budget. In practice this can produce extremely large prompts, trigger VLM timeouts, and even leave the upstream VLM service in a degraded or semi-broken state for subsequent requests.

This is not just a performance issue. It can break the memory / semantic generation pipeline for otherwise valid data.

Environment

  • Repository: OpenViking
  • Branch base used for testing: local main rebased from upstream main
  • Deployment: local source checkout + systemd service
  • Storage backend: local AGFS + local vectordb
  • VLM provider: OpenAI-compatible endpoint behind a local router
  • Model route: ov-llm
  • Relevant config during mitigation:
    • auto_generate_l1=false (workaround currently in use)
    • vlm.max_concurrent=1

Actual behavior

For large directories, _generate_overview() assembles the overview prompt by concatenating:

  • all file summaries in the directory
  • all child directory abstracts

without a strong budget guard.

That can create very large prompts. In our case, the prompt reached approximately 117,954 tokens before the VLM call. The result was:

  • VLM timeout / failure on overview generation
  • the ov-llm path becoming unstable for follow-up requests
  • semantic generation and related memory workflows becoming unreliable until the service recovered

Expected behavior

Directory overview generation should remain bounded and fail-safe:

  • large folders should not generate unbounded prompts
  • one oversized folder should not destabilize later VLM calls
  • OpenViking should degrade gracefully when overview input exceeds a safe threshold

Reproduction

A general reproduction path is:

  1. Prepare a directory with many files and/or many child directories.
  2. Ensure semantic processing attempts to generate L1 overview content for that directory.
  3. Use file summaries / child abstracts large enough that the assembled prompt becomes very large.
  4. Trigger semantic generation.
  5. Observe timeout / failure in the VLM call and degraded behavior afterward.
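As a convenience for step 1, the following hedged helper builds a directory tree with many subdirectories and large files. All names and sizes here are illustrative and should be adjusted to your OpenViking data root; nothing in this snippet comes from the OpenViking codebase.

```python
import os

def make_large_tree(root: str, n_files: int = 500, n_dirs: int = 50,
                    summary_chars: int = 4000) -> None:
    """Create n_dirs subdirectories, each holding n_files // n_dirs text
    files whose content is large enough that per-file summaries add up
    when concatenated into one overview prompt."""
    filler = "lorem ipsum " * (summary_chars // 12)
    for d in range(n_dirs):
        sub = os.path.join(root, f"dir_{d:03d}")
        os.makedirs(sub, exist_ok=True)
        for f in range(n_files // n_dirs):
            with open(os.path.join(sub, f"file_{f:03d}.txt"), "w") as fh:
                fh.write(filler)
```

Pointing semantic processing at the resulting tree should be enough to make the assembled overview input very large.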

Suspected root cause

The issue appears to come from the overview assembly logic in:

  • openviking/storage/queuefs/semantic_processor.py
  • method: _generate_overview()

The current logic concatenates the full list of file summaries and child directory abstracts into one prompt. For sufficiently large directories, this creates an oversized request.
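To make the failure mode concrete, here is an illustrative reconstruction of the unbounded assembly pattern. This is not the actual `_generate_overview()` source; the function and parameter names are hypothetical, but the shape matches the behavior described above: everything is joined into one string with no size check before the VLM call.

```python
def build_overview_prompt(file_summaries: list[str],
                          child_abstracts: list[str]) -> str:
    """Hypothetical sketch of the problematic pattern."""
    parts = ["Summarize this directory.", "Files:"]
    parts.extend(file_summaries)      # unbounded: every file summary
    parts.append("Subdirectories:")
    parts.extend(child_abstracts)     # unbounded: every child abstract
    return "\n".join(parts)           # size grows linearly with dir size
```

Prompt size grows linearly with the number (and size) of summaries, so a sufficiently large directory produces an oversized request.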

Why this matters

This is especially problematic because overview generation is an internal maintenance / semantic task. Users may hit it indirectly while doing normal memory or resource workflows, but the failure mode is severe enough to affect unrelated follow-up requests.

Current workaround

We mitigated the problem locally by disabling automatic directory overview generation:

  • auto_generate_l1=false

This avoids the problematic path, but it is a workaround rather than a product-level fix.

Suggested fixes

I think OpenViking should consider one or more of the following:

  1. Input budget guard before VLM call

    • estimate prompt size before calling the model
    • if over budget, truncate or summarize inputs first
  2. Hierarchical / staged overview generation

    • summarize file summaries in batches
    • summarize child abstracts in batches
    • then combine the batch summaries into the final overview
  3. Configurable hard limits

    • max number of file summaries included
    • max number of child abstracts included
    • max characters or tokens per overview input
  4. Fail-safe degradation

    • if the budget is exceeded, skip L1 generation and keep L0 only
    • or generate a short fallback overview instead of failing the whole path
  5. Observability

    • log prompt size / estimated tokens for overview generation
    • log when budget guards trigger
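A minimal sketch combining fixes 1, 4, and 5 could look like the following. The 4-characters-per-token heuristic and the budget value are assumptions, not OpenViking settings, and the function names are hypothetical; a real implementation would use the tokenizer matching the deployed model.

```python
import logging

logger = logging.getLogger("openviking.semantic")

# Assumed budget; a real value would come from configuration (fix 3).
MAX_PROMPT_TOKENS = 24_000

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return len(text) // 4

def guarded_overview_input(file_summaries, child_abstracts,
                           budget=MAX_PROMPT_TOKENS):
    """Greedily keep inputs until the estimated budget is reached.
    Returns (kept_items, truncated) so the caller can fall back to a
    short overview, or skip L1 generation, when truncation occurs."""
    kept, used, truncated = [], 0, False
    for item in list(file_summaries) + list(child_abstracts):
        cost = estimate_tokens(item)
        if used + cost > budget:
            truncated = True
            break
        kept.append(item)
        used += cost
    if truncated:
        logger.warning("overview budget hit: kept %d items, ~%d tokens",
                       len(kept), used)
    return kept, truncated
```

The staged approach (fix 2) composes naturally with this guard: batch summaries produced upstream would simply arrive here as smaller items that fit the budget.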

Additional note

Even if a bounded assembly strategy is added, I still think the system should protect the VLM path from a single oversized semantic task causing downstream instability.
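One way to provide that protection is a per-route circuit breaker in front of the VLM client: after repeated failures, stop sending traffic to the route for a cooldown period so a single bad semantic task cannot keep destabilizing follow-up requests. The sketch below is a generic pattern, not an OpenViking API, and the thresholds are illustrative.

```python
import time

class VLMCircuitBreaker:
    """Open after max_failures consecutive errors; allow a probe request
    again after cooldown_s seconds (a simple half-open state)."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 60.0):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = 0.0

    def allow(self) -> bool:
        if self.failures < self.max_failures:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown_s:
            self.failures = 0   # half-open: let one probe request through
            return True
        return False

    def record(self, ok: bool) -> None:
        if ok:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
```

Combined with `vlm.max_concurrent=1`, this would bound both the blast radius and the retry pressure of an oversized overview task.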

Metadata

Labels: bug (Something isn't working), enhancement (New feature or request)
Status: Done