feat: add configurable RAG context retrieval settings for assistants by gdmrino · Pull Request #231 · eneo-ai/eneo

gdmrino · 2026-02-17T23:34:25Z

Changes

Add per-assistant settings to control how much knowledge is retrieved during RAG.
Users can choose between four modes:

Default (50% of model context)
Automatic relevance-based filtering (using autocut from v1)
Custom percentage of context window
Fixed number of chunks
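
For reviewers who want a concrete picture, the four modes could be modeled roughly as below. This is a minimal sketch, not the code in this PR: only `RagContextType`, `rag_context_type`, and `rag_context_value` come from the description. The enum member names, their string values, the `RagContextSettings` dataclass, and the validation ranges are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RagContextType(str, Enum):
    # Member names and string values are assumed, not taken from the PR.
    DEFAULT = "default"                # 50% of the model context
    AUTO_RELEVANCE = "auto_relevance"  # relevance-based filtering (autocut)
    PERCENTAGE = "percentage"          # custom share of the context window
    FIXED_CHUNKS = "fixed_chunks"      # fixed number of chunks


@dataclass(frozen=True)
class RagContextSettings:
    """Hypothetical container for the two new per-assistant fields."""

    rag_context_type: RagContextType = RagContextType.DEFAULT
    rag_context_value: Optional[int] = None

    def __post_init__(self) -> None:
        mode, value = self.rag_context_type, self.rag_context_value
        if mode is RagContextType.PERCENTAGE:
            # Assumed range: a whole percentage between 1 and 100.
            if value is None or not 1 <= value <= 100:
                raise ValueError("percentage mode needs a value between 1 and 100")
        elif mode is RagContextType.FIXED_CHUNKS:
            if value is None or value < 1:
                raise ValueError("fixed-chunks mode needs a positive chunk count")
        # DEFAULT and AUTO_RELEVANCE values are left unvalidated in this sketch.
```

Keeping the default as `DEFAULT` with no value mirrors the stated goal that existing assistants keep their current behavior.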

Backend:

New RagContextType enum and rag_context_type/rag_context_value fields on the Assistant domain entity, API models, DB table, and repository
_get_retrieval_params() method replaces hardcoded chunk calculation
Thread autocut_cutoff through ReferencesService for relevance mode
Alembic migration adding columns to assistants table
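
To illustrate the idea behind `_get_retrieval_params()`, here is a standalone sketch of how a mode and value might map to a chunk count and an autocut cutoff. It is written under assumptions: the function signature, the `RetrievalParams` tuple, the enum members, and all constants (tokens per chunk, the chunk cap in relevance mode, the default cutoff) are hypothetical and may differ from the actual implementation.

```python
from enum import Enum
from typing import NamedTuple, Optional


class RagContextType(str, Enum):
    # Member names and values are assumed for illustration.
    DEFAULT = "default"
    AUTO_RELEVANCE = "auto_relevance"
    PERCENTAGE = "percentage"
    FIXED_CHUNKS = "fixed_chunks"


class RetrievalParams(NamedTuple):
    num_chunks: int
    autocut_cutoff: Optional[int]  # only set in relevance mode


# Assumed constants; the real values are not stated in the PR description.
TOKENS_PER_CHUNK = 200
DEFAULT_PERCENTAGE = 50
AUTO_RELEVANCE_MAX_CHUNKS = 30
DEFAULT_AUTOCUT_CUTOFF = 1


def get_retrieval_params(
    context_type: RagContextType,
    value: Optional[int],
    model_context_tokens: int,
) -> RetrievalParams:
    """Translate an assistant's RAG context setting into retrieval parameters."""
    if context_type is RagContextType.FIXED_CHUNKS:
        # The value is the chunk count itself; fall back to 1 if unset.
        return RetrievalParams(max(1, value or 1), None)
    if context_type is RagContextType.AUTO_RELEVANCE:
        # Fetch up to a cap and let the autocut cutoff trim by relevance.
        return RetrievalParams(AUTO_RELEVANCE_MAX_CHUNKS, value or DEFAULT_AUTOCUT_CUTOFF)
    # DEFAULT and PERCENTAGE both derive the chunk count from a token budget.
    use_custom = context_type is RagContextType.PERCENTAGE and value is not None
    percentage = value if use_custom else DEFAULT_PERCENTAGE
    budget_tokens = model_context_tokens * percentage // 100
    return RetrievalParams(max(1, budget_tokens // TOKENS_PER_CHUNK), None)
```

Treating DEFAULT as a 50% budget in the same code path as PERCENTAGE is one way to keep existing assistants on their previous behavior without a special case.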

Frontend:

New RagContextSettings Svelte component with mode selector and value inputs (percentage slider, chunk count, or auto-relevance)
Wired into the assistant edit page with change tracking and revert
i18n strings for English and Swedish
TypeScript schema types updated for the new fields

Why

Previously, retrieval during RAG was fixed to use 50% of the model context window, which did not provide enough flexibility for different use cases. This was especially limiting when working with open or smaller-context models, where consuming half the context for retrieval can significantly reduce space available for the actual response.
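
A quick back-of-the-envelope calculation makes the trade-off concrete. The context sizes and the 20% alternative below are illustrative assumptions, not measurements from this PR, and `response_headroom` is a hypothetical helper.

```python
def response_headroom(context_tokens: int, retrieval_percentage: int) -> int:
    """Tokens left for system prompt, chat history, and the answer."""
    retrieval_budget = context_tokens * retrieval_percentage // 100
    return context_tokens - retrieval_budget


# Compare a small-context model with a large-context one (assumed sizes).
for context_tokens in (8_192, 128_000):
    for percentage in (50, 20):
        left = response_headroom(context_tokens, percentage)
        print(f"{context_tokens} tokens, {percentage}% retrieval: {left} tokens left")
```

With an 8,192-token window, a 50% retrieval budget leaves 4,096 tokens for everything else, while a 20% budget leaves 6,554.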

Testing

Tested both backend and frontend behavior:

Backend:

Verified new database fields are created correctly via migration
Confirmed each RagContextType produces the expected retrieval parameters
Ensured default behavior remains unchanged for existing assistants
Validated relevance mode correctly threads the autocut cutoff through retrieval

Frontend:

Confirmed all four modes render and switch correctly in the assistant editor
Verified percentage slider, chunk count input, and relevance mode settings persist after save
Checked revert/change-tracking behavior works as expected

End-to-end:

Created assistants using each mode and confirmed retrieval size matches configuration during queries

Screenshots


gdmrino added 2 commits February 18, 2026 00:28

Update RagContextSettings.svelte

bc8d9f8

gdmrino wants to merge 2 commits into eneo-ai:develop from gdmrino:feat/rag-ctx-settings-pr
