feat: context-window guard for LLM document context #133
Conversation
- Add estimate_token_count() and truncate_context_to_token_limit() helpers in filehandling.py (use tiktoken with char-based fallback)
- Add Config.context_window_max_tokens = 100,000 in __init__.py
- In process_llm_async: proactively truncate full_pdf_context and write a non-error warning into assay.status_context when truncation occurs
- In generate_answer: handle openai.BadRequestError explicitly with a logger.error call so context-length failures are logged, not silently swallowed
- Add test_context_window.py with unit tests for the utilities and integration tests for the warning behaviour

Agent-Logs-Url: https://github.com/johannehouweling/ToxTempAssistant/sessions/8064d2c3-5696-4c13-9f7c-6f0d4adf28f1
Co-authored-by: johannehouweling <95692173+johannehouweling@users.noreply.github.com>
- Extract _TRUNCATION_SAFETY_MARGIN = 0.95 named constant in filehandling.py
- Log tiktoken fallback in estimate_token_count (ImportError vs generic)
- Use logger.exception (with traceback) for BadRequestError in generate_answer

Agent-Logs-Url: https://github.com/johannehouweling/ToxTempAssistant/sessions/8064d2c3-5696-4c13-9f7c-6f0d4adf28f1
Co-authored-by: johannehouweling <95692173+johannehouweling@users.noreply.github.com>
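A minimal sketch of what the two helpers described in these commits could look like, assuming the signatures named above; the exact code in filehandling.py may differ:

```python
import logging

logger = logging.getLogger(__name__)

# Use only 95% of the budget to absorb token-estimation error.
_TRUNCATION_SAFETY_MARGIN = 0.95


def estimate_token_count(text: str) -> int:
    """Estimate tokens with tiktoken (cl100k_base), falling back to chars/4."""
    try:
        import tiktoken
        encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))
    except ImportError:
        logger.warning("tiktoken not installed; using char-based token estimate")
    except Exception:
        logger.warning("tiktoken failed; using char-based token estimate", exc_info=True)
    # Rough heuristic: ~4 characters per token for English-like text.
    return len(text) // 4


def truncate_context_to_token_limit(text: str, max_tokens: int) -> tuple[str, bool]:
    """Proportionally trim text to fit max_tokens; return (text, was_truncated)."""
    estimated = estimate_token_count(text)
    if estimated <= max_tokens:
        return text, False
    # Keep a proportional slice of the text, scaled down by the safety margin.
    keep_ratio = (max_tokens / estimated) * _TRUNCATION_SAFETY_MARGIN
    truncated = text[: int(len(text) * keep_ratio)]
    return truncated + "\n[... context truncated ...]", True
```

The safety margin compensates for the char-based fallback overshooting: trimming to exactly max_tokens on an estimate could still leave the real token count over the limit.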
We now have a context-window flag on the .env model TAGS, so we have access to per-model context-window information.
@copilot can you update the PR to take into account the newest changes to the repo in #138? We now have context-window information per model accessible in the .env model TAGS. So instead of hard-coding the context-window size in __init__.py, we have to define a headroom token size that reserves space in the context window (for prompt, output, etc.), and the difference we can give to the PDF context.
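For concreteness, the arithmetic being proposed here, using the values the PR ends up adopting (10k-token headroom, 128k fallback, and a model tagged context-window:131072):

```python
# Model with a context-window tag: the PDF context gets window minus headroom.
budget = 131_072 - 10_000           # 121_072 tokens for full_pdf_context
# Model without a tag: fall back to the conservative default.
fallback_budget = 128_000 - 10_000  # 118_000 tokens
```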
…-window-upload

Co-authored-by: johannehouweling <95692173+johannehouweling@users.noreply.github.com>
Replace the blanket context_window_max_tokens with a per-model budget derived from ModelEntry.context_window (set via the TAGS context-window: key) minus a configurable headroom. Fall back to context_window_fallback_tokens when the active model has no context-window tag.

- __init__.py: replace context_window_max_tokens with context_window_headroom_tokens (10k) and context_window_fallback_tokens (128k)
- views.py: resolve the ModelEntry for the llm_model key and compute the budget as context_window - headroom; fall back to fallback - headroom with a warning
- Merge upstream changes from main (ModelEntry.context_window property, context-window KNOWN_TAG_KEYS entry, context_window in llm_signature)
- tests: update integration tests to patch the new config fields and add a test that exercises the per-model context-window resolution path

Agent-Logs-Url: https://github.com/johannehouweling/ToxTempAssistant/sessions/b1bcef97-dba0-4c5d-998a-8d322da78b35
Co-authored-by: johannehouweling <95692173+johannehouweling@users.noreply.github.com>
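A sketch of the resolution path this commit describes. `_resolve_context_budget` is a hypothetical helper name (the PR computes the budget inside process_llm_async), and the config and ModelEntry attribute names are taken from the commit message:

```python
import logging

logger = logging.getLogger(__name__)


def _resolve_context_budget(model_entry, config) -> int:
    """Token budget for PDF context: per-model window minus headroom."""
    headroom = config.context_window_headroom_tokens
    # ModelEntry.context_window is populated from the TAGS context-window: key
    # and is None when the active model carries no such tag.
    if model_entry is not None and model_entry.context_window:
        return model_entry.context_window - headroom
    logger.warning(
        "Active model has no context-window tag; falling back to %d tokens",
        config.context_window_fallback_tokens,
    )
    return config.context_window_fallback_tokens - headroom
```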
Done in commit
In `budget = model_entry.context_window - config.context_window_headroom_tokens`: if the model has no …
Uploading too many files silently produced empty answers: oversized context caused the LLM API to raise `BadRequestError`, which the generic exception handler caught and returned `""` from, with no user-visible indication that anything had gone wrong.

Changes
- `filehandling.py`: two new utilities:
  - `estimate_token_count(text)`: tiktoken (cl100k_base) with a `len // 4` char-based fallback
  - `truncate_context_to_token_limit(text, max_tokens) -> (str, bool)`: proportional trim with a `_TRUNCATION_SAFETY_MARGIN = 0.95` buffer; appends a visible `[... context truncated ...]` marker to the returned text
- `__init__.py`: two new config values replacing the blanket token limit:
  - `context_window_headroom_tokens = 10_000`: tokens reserved for system prompts, per-question instructions, and the model's output
  - `context_window_fallback_tokens = 128_000`: conservative fallback used when the active model has no `context-window` tag
- `views.py`:
  - `process_llm_async`: resolves the active model's `context_window` tag (set via `AZURE_E{n}_TAGS_{TAG}=...,context-window:131072,...`) and computes the PDF context budget as `model_context_window - headroom`; falls back to `fallback - headroom` with a log warning when the tag is absent; on truncation, writes a non-error warning to `assay.status_context` (visible in the UI) advising the user to upload fewer or shorter files
  - `generate_answer`: adds a dedicated `except BadRequestError` block using `logger.exception` so any API-level rejection is logged with a full traceback rather than silently swallowed
- `tests/test_context_window.py`: unit tests for both utilities; integration tests asserting that `process_llm_async` populates `status_context` when over-limit (fallback path and per-model path) and leaves it clean when the context fits
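For illustration, a sketch of the dedicated `BadRequestError` block described above; the `generate_answer` signature and the re-raise behaviour are assumptions, not the PR's exact code:

```python
import logging

from openai import BadRequestError

logger = logging.getLogger(__name__)


def generate_answer(client, model: str, messages: list[dict]) -> str:
    try:
        response = client.chat.completions.create(model=model, messages=messages)
        return response.choices[0].message.content
    except BadRequestError:
        # Context-length overflows surface here as API-level rejections.
        # logger.exception records the full traceback instead of letting a
        # generic handler swallow the error into an empty answer.
        logger.exception("LLM request rejected by the API (possible context overflow)")
        raise
```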