## Problem
All queries go through the full search + RAG pipeline, even ones that don't need web search
(e.g., "explain recursion", "what is 2+2", "hello").
## Expected Behavior

Classify intent before routing:
| Intent | Action |
|---|---|
| Factual / Current Events | Full search + RAG |
| Conceptual / Definitional | LLM-only (skip search) |
| Math / Code | Structured generation |
| Greeting / Chitchat | Lightweight response |
| Ambiguous | Search with low-confidence flag |
## Why It Matters
- Reduces unnecessary SearXNG calls (critical since we self-host the search backend)
- Faster responses for non-search queries
- Lower Groq token usage
## Suggested Approach

- Use `llama-3.1-8b-instant` as a fast intent classifier (single prompt, JSON output)
- Gate the search pipeline on intent result
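A minimal sketch of the classify-then-gate step. The intent labels mirror the table above; the route names and the injected `llm_call` hook are illustrative assumptions, not existing identifiers in this repo. In production, `llm_call` would wrap a Groq chat-completion request to `llama-3.1-8b-instant` with JSON output; the stub below keeps the routing logic testable without network access.

```python
import json

# Map each classified intent to a downstream pipeline.
# Route names are hypothetical placeholders, not existing code.
ROUTES = {
    "factual": "search_rag",             # Factual / Current Events
    "conceptual": "llm_only",            # Conceptual / Definitional
    "math_code": "structured",           # Math / Code
    "chitchat": "lightweight",           # Greeting / Chitchat
    "ambiguous": "search_rag_low_conf",  # Ambiguous -> low-confidence flag
}

CLASSIFIER_PROMPT = (
    "Classify the query into exactly one of: factual, conceptual, "
    'math_code, chitchat, ambiguous. Reply with JSON: {"intent": "<label>"}.'
    "\n\nQuery: "
)

def classify_intent(query: str, llm_call) -> str:
    """llm_call(prompt) returns the model's raw JSON string.

    In practice this would be a single llama-3.1-8b-instant call
    (e.g. Groq's JSON mode); any unparseable or unknown output
    falls back to "ambiguous" so we fail open into search.
    """
    raw = llm_call(CLASSIFIER_PROMPT + query)
    try:
        intent = json.loads(raw).get("intent", "ambiguous")
    except (json.JSONDecodeError, AttributeError):
        intent = "ambiguous"  # fail open: still search, with the flag set
    return intent if intent in ROUTES else "ambiguous"

def route(query: str, llm_call) -> str:
    """Gate the pipeline: pick a route before any SearXNG call happens."""
    return ROUTES[classify_intent(query, llm_call)]

# Stubbed usage (no network): swap the lambda for a real Groq call.
print(route("hello", lambda p: '{"intent": "chitchat"}'))  # lightweight
```

Failing open to the search path on bad classifier output keeps the gate safe: a misbehaving 8B model can only cost an extra search, never a wrong skipped one.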
## Labels

`enhancement` `performance` `good first issue`