Feature: Response Strictness Setting for Assistants (RAG Grounding Control) #112

## Summary

Add a response strictness setting to the assistant create/edit page that controls how strictly the agent grounds its responses in the retrieved knowledge base documents. The setting is presented as a slider with three modes.

## Motivation

Different assistants have different trust requirements. A policy FAQ bot should only answer from its documents and say "I don't know" otherwise. A research assistant should use documents as a starting point but supplement with general knowledge. Giving the assistant creator control over this builds user trust and makes assistants more fit-for-purpose.

## Proposed Behavior

### Three Strictness Modes

| Mode | Label | Behavior |
|---|---|---|
| `strict` | **Only use documents** | The agent must answer exclusively from retrieved knowledge base content. If the answer is not in the documents, it should say so explicitly rather than speculate. |
| `balanced` | **Prefer documents** | The agent uses retrieved documents as the primary source but may supplement with general knowledge when the documents are insufficient. It should clearly distinguish between sourced and general information. |
| `flexible` | **Use documents as context** | The agent treats retrieved documents as helpful context alongside its general knowledge. Documents inform but do not constrain the response. |

### Default

`balanced`, which matches the current behavior most closely: the existing RAG prompt says "Use this information to answer the user's question accurately and comprehensively" without strict grounding constraints.

## Implementation

### Data Model

New field on the assistant:

```
ragStrictness: STRING (enum: "strict" | "balanced" | "flexible", default: "balanced")
```

**`Assistant`** model (`shared/assistants/models.py`):
```python
rag_strictness: Literal["strict", "balanced", "flexible"] = Field("balanced", alias="ragStrictness")
```

Add the same field to `CreateAssistantRequest`, `UpdateAssistantRequest`, and `AssistantResponse`.

### RAG Prompt Modification

The strictness mode changes the instruction text in `augment_prompt_with_context()` (`shared/assistants/rag_service.py`). Currently the function hardcodes:

```
The following context is retrieved from the assistant's knowledge base.
Use this information to answer the user's question accurately and comprehensively.
```

This becomes mode-dependent:

**`strict`:**
```
The following context is retrieved from the assistant's knowledge base.
You MUST answer ONLY using the provided context. Do not use any outside knowledge.
If the answer cannot be found in the context below, respond with:
"I don't have enough information in my knowledge base to answer that question."
Do not speculate or infer beyond what is explicitly stated in the context.
```

**`balanced`:**
```
The following context is retrieved from the assistant's knowledge base.
Use this information as your primary source to answer the user's question accurately.
You may supplement with general knowledge when the context is insufficient,
but clearly prioritize the provided documents.
```

**`flexible`:**
```
The following context is retrieved from the assistant's knowledge base.
Use this information as helpful context alongside your general knowledge
to provide the most comprehensive and accurate answer possible.
```
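The mode-dependent selection above could be sketched as a lookup table keyed by mode. This is a minimal sketch, not the existing `augment_prompt_with_context()` implementation: the helper `build_rag_instructions`, the assumption that `context_chunks` is a list of strings, the final prompt layout, and the fallback to `balanced` for unrecognized values are all illustrative assumptions.

```python
from typing import Dict, Sequence

# Shared first line, taken from the prompt text in this issue.
_CONTEXT_HEADER = (
    "The following context is retrieved from the assistant's knowledge base."
)

# Mode-specific instruction text, copied from the three blocks above.
_INSTRUCTIONS: Dict[str, str] = {
    "strict": (
        "You MUST answer ONLY using the provided context. "
        "Do not use any outside knowledge.\n"
        "If the answer cannot be found in the context below, respond with:\n"
        "\"I don't have enough information in my knowledge base "
        "to answer that question.\"\n"
        "Do not speculate or infer beyond what is explicitly stated "
        "in the context."
    ),
    "balanced": (
        "Use this information as your primary source to answer the user's "
        "question accurately.\n"
        "You may supplement with general knowledge when the context is "
        "insufficient,\n"
        "but clearly prioritize the provided documents."
    ),
    "flexible": (
        "Use this information as helpful context alongside your general "
        "knowledge\n"
        "to provide the most comprehensive and accurate answer possible."
    ),
}


def build_rag_instructions(strictness: str = "balanced") -> str:
    """Return the instruction text for a mode.

    Assumption: unknown values fall back to "balanced" rather than raising.
    """
    body = _INSTRUCTIONS.get(strictness, _INSTRUCTIONS["balanced"])
    return f"{_CONTEXT_HEADER}\n{body}"


def augment_prompt_with_context(
    user_message: str,
    context_chunks: Sequence[str],  # assumed: plain text chunks
    strictness: str = "balanced",
) -> str:
    """Prepend mode-specific instructions and retrieved context to the message."""
    context = "\n\n".join(context_chunks)
    # Hypothetical layout; the real function may format the prompt differently.
    return (
        f"{build_rag_instructions(strictness)}\n\n"
        f"{context}\n\n"
        f"User question: {user_message}"
    )
```

Keeping the three texts in one mapping means adding or rewording a mode touches a single place. How the function should behave when retrieval returns no chunks is not specified in this issue and is left unhandled here.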

### Where the Strictness is Applied

In `inference_api/chat/routes.py`, the assistant is already loaded before `augment_prompt_with_context` is called. Pass the strictness mode through:

```python
augmented_message = augment_prompt_with_context(
    user_message=input_data.message,
    context_chunks=context_chunks,
    strictness=assistant.rag_strictness,  # new parameter
)
```

Apply the same pattern in `app_api/chat/routes.py` if RAG augmentation also happens there.

## Frontend Changes

### Assistant Create/Edit Form

- Add a **3-position slider** (or segmented control) labeled "Response Strictness"
- Positions: **Only use documents** | **Prefer documents** | **Use documents as context**
- Default position: center ("Prefer documents")
- Brief helper text below the slider explaining the selected mode:
  - Strict: "The assistant will only answer from its uploaded documents"
  - Balanced: "The assistant will prioritize documents but may use general knowledge"
  - Flexible: "The assistant will use documents as context alongside general knowledge"

### Placement

Below the instructions/system prompt field, near the other assistant behavior settings (alongside the citation toggles from #111).

## Migration

- Existing assistants have no `ragStrictness` attribute in DynamoDB
- The Pydantic default of `"balanced"` handles this, so existing assistants continue to behave as they do today
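
To illustrate what the default means for items stored before this change, here is a small stdlib-only sketch of the equivalent lookup. It is not the Pydantic model itself: `read_rag_strictness` is a hypothetical helper, and `item` is assumed to be an already-deserialized plain dict rather than DynamoDB's typed attribute format.

```python
from typing import Any, Mapping

VALID_STRICTNESS = ("strict", "balanced", "flexible")
DEFAULT_STRICTNESS = "balanced"


def read_rag_strictness(item: Mapping[str, Any]) -> str:
    """Return the stored mode, or the default when the attribute is absent."""
    value = item.get("ragStrictness", DEFAULT_STRICTNESS)
    if value not in VALID_STRICTNESS:
        # Reject anything outside the three defined modes.
        raise ValueError(f"Unsupported ragStrictness: {value!r}")
    return value


# An item written before this feature has no ragStrictness attribute.
legacy_item = {"assistantId": "a1"}
# An item saved after the feature carries an explicit mode.
new_item = {"assistantId": "a2", "ragStrictness": "strict"}
```

With this lookup, `legacy_item` resolves to `"balanced"` and `new_item` to `"strict"`, so no backfill of existing records is needed.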

## Out of Scope

- Per-conversation strictness override by end users
- Strictness levels beyond the three defined modes
- Custom prompt template editing (could be a future "advanced" option)
