Welcome to your new TanStack app!
To run this application:
```bash
pnpm install
pnpm start
```

To build this application for production:

```bash
pnpm build
```

This project uses Vitest for testing. You can run the tests with:

```bash
pnpm test
```

This project uses Tailwind CSS for styling.
This project uses Biome for linting and formatting. The following scripts are available:
```bash
pnpm lint
pnpm format
pnpm check
```

- Set the `VITE_WORKOS_CLIENT_ID` in your `.env.local`.
- Set the `VITE_CONVEX_URL` and `CONVEX_DEPLOYMENT` environment variables in your `.env.local`. (Or run `npx convex init` to set them automatically.)
- Run `npx convex dev` to start the Convex server.
Add components using the latest version of Shadcn.
```bash
pnpx shadcn@latest add button
```

- You can use T3Env to add type safety to your environment variables.
- Add environment variables to the `src/env.mjs` file.
- Use the environment variables in your code.

```ts
import { env } from "@/env";

console.log(env.VITE_APP_TITLE);
```

# Next-Generation Document Retrieval: 3-5x Faster & 60-70% Cheaper Than Traditional RAG
Production-ready intelligent search platform that goes beyond traditional RAG with adaptive compression, multi-modal OCR, speculative execution, hybrid vector+graph storage, and real-time streaming. Built with human-in-the-loop learning, multi-model orchestration, and continuous fine-tuning.
Where traditional RAG falls short:

- Static Retrieval: Single-pass embedding lookup, no reasoning
- Token Waste: Retrieves full documents even when only snippets needed
- No Validation: Blindly trusts retrieved information
- Fixed Context: Limited by embedding window size
- Single-Modal: Text only, struggles with images/tables/diagrams
- No Learning: Cannot improve from user feedback
- Slow: Sequential retrieve → rank → generate pipeline
What this system does instead:

- Dynamic Reasoning: Multi-step think→act→think cycles with query segmentation
- Adaptive Compression: 10x OCR compression with DeepSeek Vision for multimodal understanding
- Adversarial Validation: ADD discriminators ensure quality before serving results
- Hierarchical Context: Multiple compression levels (paragraph, section, document)
- Multi-Modal: Images, PDFs, tables, charts via DeepSeek Vision OCR
- Self-Improving: Logs every interaction for continuous fine-tuning
- Parallel Execution: Speculative prefetching + concurrent segment processing
- Hybrid Storage: LanceDB vectors + knowledge graphs + BM25 keyword search
- Real-Time Streaming: Progressive results while still processing
The payoff:

- Speed: 3-5x faster via parallel execution + speculative prefetching
- Cost: 60-70% reduction through adaptive compression + smart model routing
- Accuracy: Higher quality via adversarial validation + human feedback
- Context: 10-15x more effective context through hierarchical compression
✅ All Critical Bugs Fixed:
- TanStack devtools menu removed
- CSRF 403 errors resolved with token endpoint
- Infinite Ollama detection loop fixed with useRef guard
- Hydration-warning suppression turned off
```bash
# Required for Convex backend
VITE_CONVEX_URL=your_convex_deployment_url
CONVEX_DEPLOYMENT=your_convex_deployment

# Required for WorkOS authentication
VITE_WORKOS_CLIENT_ID=your_workos_client_id

# Optional: AI providers (or use UI to configure encrypted keys)
ANTHROPIC_API_KEY=your_anthropic_api_key
OPENAI_API_KEY=your_openai_api_key

# Optional: Error tracking
VITE_SENTRY_DSN=your_sentry_dsn
```

- 🔍 Real-Time Visibility: Watch every search step as it happens - see which sources are being queried, documents being pulled, and AI reasoning in real-time
- ⏯️ Full Control: Pause, resume, or stop searches at any time. Adjust scope mid-execution to refine results
- 🎛️ Dynamic Scope Adjustment: Enable/disable sources (Firecrawl, academic, news), change result limits, toggle reasoning features - all while searching
- 📊 Progress Tracking: View tokens used, documents found, confidence scores, and execution time for each step
- ✅ Step Approval: Review and approve individual steps for critical searches. Modify parameters based on intermediate results
- 🔐 Secure API Keys: All API keys stored in Convex with server-side encryption, never exposed to browser
Replaces RAG Black Box: Unlike traditional RAG systems, you can see and control exactly what's being researched and pulled. Perfect for understanding AI decision-making and ensuring quality results.
- 🎯 Multi-Modal OCR: DeepSeek Vision processes images, tables, charts, diagrams
- 📦 Adaptive Compression: Content-aware (legal: 3-5x, news: 10-15x, code: 2-3x)
- 🌳 Hierarchical Storage: Summaries at paragraph, section, and document levels
- 🔍 Hybrid Search: LanceDB vectors + BM25 keywords + knowledge graphs
- ⚡ Speculative Execution: Prefetch likely documents during reasoning
- 📊 Incremental Indexing: Only re-index changed document sections
- 🎭 Layout-Aware Extraction: Preserves tables, columns, annotations
- 🔄 Progressive Loading: Stream OCR results as they process
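The content-aware compression targets above can be expressed as a simple routing table. The sketch below is illustrative only — `targetRatio` and the content-type names are assumptions, not the project's actual API — but it mirrors the ranges listed (legal 3-5x, news 10-15x, code 2-3x):

```typescript
// Illustrative sketch: pick a target compression ratio by detected
// content type, mirroring the ranges in the feature list above.
type ContentType = "legal" | "news" | "code" | "default";

const COMPRESSION_TARGETS: Record<ContentType, { min: number; max: number }> = {
  legal: { min: 3, max: 5 },   // dense, low-redundancy text: compress gently
  news: { min: 10, max: 15 },  // highly redundant prose: compress aggressively
  code: { min: 2, max: 3 },    // syntax-sensitive: compress least
  default: { min: 5, max: 8 }, // assumed fallback for unclassified content
};

function targetRatio(type: ContentType): number {
  const { min, max } = COMPRESSION_TARGETS[type];
  return (min + max) / 2; // aim for the middle of the range
}
```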
- 🧠 Query Segmentation: Break complex queries into parallel sub-tasks
- 🔗 Interleaved Reasoning: Think→Act→Think cycles with context compression
- 🎯 Intent Prediction: Start processing before user finishes typing
- 📝 Query Rewriting: Spelling correction, entity recognition, expansion
- 🌍 Multi-Language: Translate queries for better coverage
- 🎪 Confidence Routing: Escalate to stronger models only when needed
- 🔄 Ensemble Predictions: Combine multiple models for critical queries
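Query segmentation with parallel execution can be sketched as follows. Both `segmentQuery` and `runSegment` are hypothetical stand-ins — the real system uses an LLM planner and a full retrieve→validate→compress pipeline — but the shape of the concurrency is the point:

```typescript
// Sketch: break a query into sub-tasks and process them concurrently.
interface Segment {
  id: number;
  text: string;
}

function segmentQuery(query: string): Segment[] {
  // Naive placeholder split; the real system uses an LLM-backed planner.
  return query.split(" and ").map((text, id) => ({ id, text: text.trim() }));
}

async function runSegment(seg: Segment): Promise<string> {
  // Placeholder for retrieve → validate → compress on one segment.
  return `result for "${seg.text}"`;
}

async function search(query: string): Promise<string[]> {
  const segments = segmentQuery(query);
  // All segments run concurrently instead of in a sequential pipeline.
  return Promise.all(segments.map(runSegment));
}
```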
- 🧠 Interactive Segmentation: AI proposes query segments, user approves/modifies before execution
- 📊 Search History Browsing: View past searches, results, quality scores, re-run queries
- 🎯 Reasoning Step Control: Validate AI reasoning at each step, provide corrections
- 📈 Training Data Collection: All interactions stored for model fine-tuning
- 🔄 Comparison Dashboard: Side-by-side results from different segment approaches
- 🏷️ Result Annotations: Tag and comment on findings
- 📋 Search Templates: Save and share successful patterns
- 🤖 13 Providers: OpenAI, Anthropic, Google, DeepSeek, Moonshot, Kimi, Ollama, LM Studio, vLLM, GGUF, ONNX, Azure
- 🔍 Auto-Detection: Finds local Ollama models at localhost:11434
- 🔐 Encrypted Keys: Web Crypto API + Convex encrypted storage (AES-256-GCM)
- ⚡ Smart Routing: Dynamic model selection per segment type
- 📝 Rich Formatting: Markdown with syntax highlighting, streaming responses
- 🎯 Model Escalation: Automatically upgrade to stronger models on failure
- 💾 Semantic Caching: Match similar queries, not just exact duplicates
- 🗄️ Multi-Tier: Memory → Redis → LanceDB → S3
- 🔄 Partial Results: Cache segment results independently
- ⏱️ Smart Expiration: Keep frequently accessed items longer
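Semantic caching — matching similar queries rather than exact duplicates — amounts to a nearest-neighbor lookup over cached query embeddings. This is a minimal sketch, not the project's implementation: the similarity threshold is an assumed tuning knob, and a real embedding model would supply the vectors:

```typescript
// Minimal semantic-cache sketch: hit on embedding similarity, not
// exact string equality.
interface CacheEntry {
  vector: number[];
  result: string;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

class SemanticCache {
  private entries: CacheEntry[] = [];
  constructor(private threshold = 0.9) {}

  get(vector: number[]): string | undefined {
    // Return the closest cached result at or above the threshold.
    let best: CacheEntry | undefined;
    let bestSim = this.threshold;
    for (const e of this.entries) {
      const sim = cosine(vector, e.vector);
      if (sim >= bestSim) {
        bestSim = sim;
        best = e;
      }
    }
    return best?.result;
  }

  set(vector: number[], result: string): void {
    this.entries.push({ vector, result });
  }
}
```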
- 📊 LanceDB Vectors: 100x faster than traditional vector databases
- 🕸️ Knowledge Graphs: Entity relationships and semantic connections
- 📈 Temporal Indexing: Track how information changes over time
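Merging the vector, keyword, and graph rankings requires some fusion step. The project doesn't specify its method, so the following is a generic sketch of reciprocal rank fusion (RRF), one common choice for combining heterogeneous rankings:

```typescript
// Reciprocal rank fusion (RRF): merge several ranked lists of document
// IDs into one. Documents ranked highly in any list accumulate more
// score; k dampens the influence of top ranks (60 is a common default).
function rrfMerge(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, rank) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}
```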
- 🌊 Progressive Enhancement: Show results as they arrive
- 🎭 Stream-First: Display incomplete but useful information immediately
- 🛑 User Interruption: Stop/redirect mid-search
- 📊 Live Metrics: Token usage, confidence scores, execution time
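Stream-first delivery maps naturally onto an async generator: each pipeline stage yields a partial result the moment it finishes, so the UI can render before later stages complete. The stage names and payloads below are illustrative, not the project's actual protocol:

```typescript
// Sketch: an async generator yields partial results per pipeline stage.
interface PartialResult {
  stage: "retrieved" | "validated" | "final";
  content: string;
}

async function* streamSearch(query: string): AsyncGenerator<PartialResult> {
  // Each yield reaches the consumer before later stages run.
  yield { stage: "retrieved", content: `raw matches for "${query}"` };
  yield { stage: "validated", content: "matches that passed validation" };
  yield { stage: "final", content: "compressed, ranked answer" };
}

async function collect(query: string): Promise<string[]> {
  const stages: string[] = [];
  for await (const part of streamSearch(query)) {
    stages.push(part.stage); // a real UI would render part.content here
  }
  return stages;
}
```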
- 🔒 CSRF Protection: HttpOnly cookies with X-CSRF-Token headers
- 🔑 Key Encryption: Client-side Web Crypto API + server-side Convex backup
- 📊 Quality Metrics: ADD discriminator scores, user approval rates
- 🎭 PII Detection: Automatic anonymization of sensitive data
- 🚨 Sentry Monitoring: Error tracking and performance monitoring
- 📈 LangSmith Integration: Deep observability with LangSmith tracing
- 🔭 OpenTelemetry: Distributed tracing across all services
- Frontend: TanStack Start, TanStack Router, TanStack Store
- Backend: Convex (real-time database + functions)
- Vector Storage: LanceDB (fast hybrid search with SQL)
- Caching: Redis (multi-tier with semantic matching)
- Knowledge Graphs: Neo4j or similar for relationships
- Object Storage: AWS S3 for raw documents
- Auth: WorkOS (enterprise SSO)
- Styling: Tailwind CSS + Shadcn/ui
- AI Providers: OpenAI, Anthropic, Google, DeepSeek, Moonshot, Kimi, Ollama, LM Studio, vLLM, GGUF, ONNX, Azure
- Observability: Sentry + LangSmith + OpenTelemetry
- Deployment: Cloudflare Pages at mikepfunk.com
- EnhancedModelSelector: Multi-provider model selection with auto-detection
- AgenticChat: Chat interface with CSRF protection and streaming
- SegmentApprovalModal: Interactive UI for approving AI query segments (pending)
- SearchHistoryPage: Browse past searches with results (pending)
- ComparisonDashboard: Side-by-side segment results (pending)
For complete system architecture, see SYSTEM_ARCHITECTURE.md
Key Docs:
- CHANGELOG - Recent bug fixes and system updates
- Project Plan - Full roadmap and current status
- System Architecture - Complete technical design
- Convex Schema - Database tables for human-in-the-loop learning
This project uses TanStack Router. The initial setup is a file-based router, which means that the routes are managed as files in `src/routes`.
To add a new route to your application, just add a new file in the `./src/routes` directory.
TanStack will automatically generate the content of the route file for you.
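For instance, a generated route file typically has the following shape. `createFileRoute` is TanStack Router's actual API; the `/about` path and the component body are just illustrative placeholders:

```tsx
// src/routes/about.tsx — the shape of a file-based route.
import { createFileRoute } from "@tanstack/react-router";

export const Route = createFileRoute("/about")({
  component: About,
});

function About() {
  return <div>Hello from About!</div>;
}
```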
Now that you have two routes you can use a Link component to navigate between them.
To use SPA (Single Page Application) navigation you will need to import the Link component from @tanstack/react-router.
```tsx
import { Link } from "@tanstack/react-router";
```

Then anywhere in your JSX you can use it like so:

```tsx
<Link to="/about">About</Link>
```

This will create a link that will navigate to the /about route.
More information on the Link component can be found in the Link documentation.
In the File Based Routing setup the layout is located in `src/routes/__root.tsx`. Anything you add to the root route will appear in all the routes. The route content will appear in the JSX where you use the `<Outlet />` component.
Here is an example layout that includes a header:
```tsx
import { Outlet, createRootRoute, Link } from "@tanstack/react-router";
import { TanStackRouterDevtools } from "@tanstack/react-router-devtools";

export const Route = createRootRoute({
  component: () => (
    <>
      <header>
        <nav>
          <Link to="/">Home</Link>
          <Link to="/about">About</Link>
        </nav>
      </header>
      <Outlet />
      <TanStackRouterDevtools />
    </>
  ),
});
```

The `<TanStackRouterDevtools />` component is not required, so you can remove it if you don't want it in your layout.
More information on layouts can be found in the Layouts documentation.
There are multiple ways to fetch data in your application. You can use TanStack Query to fetch data from a server. But you can also use the loader functionality built into TanStack Router to load the data for a route before it's rendered.
For example:
```tsx
const peopleRoute = createRoute({
  getParentRoute: () => rootRoute,
  path: "/people",
  loader: async () => {
    const response = await fetch("https://swapi.dev/api/people");
    return response.json() as Promise<{
      results: {
        name: string;
      }[];
    }>;
  },
  component: () => {
    const data = peopleRoute.useLoaderData();
    return (
      <ul>
        {data.results.map((person) => (
          <li key={person.name}>{person.name}</li>
        ))}
      </ul>
    );
  },
});
```

Loaders simplify your data fetching logic dramatically. Check out more information in the Loader documentation.
React-Query is an excellent addition or alternative to route loading, and integrating it into your application is a breeze.
First add your dependencies:

```bash
pnpm add @tanstack/react-query @tanstack/react-query-devtools
```

Next we'll need to create a query client and provider. We recommend putting those in `main.tsx`.
```tsx
import { QueryClient, QueryClientProvider } from "@tanstack/react-query";

// ...

const queryClient = new QueryClient();

// ...

if (!rootElement.innerHTML) {
  const root = ReactDOM.createRoot(rootElement);
  root.render(
    <QueryClientProvider client={queryClient}>
      <RouterProvider router={router} />
    </QueryClientProvider>
  );
}
```

You can also add TanStack Query Devtools to the root route (optional).
```tsx
import { ReactQueryDevtools } from "@tanstack/react-query-devtools";

const rootRoute = createRootRoute({
  component: () => (
    <>
      <Outlet />
      <ReactQueryDevtools buttonPosition="top-right" />
      <TanStackRouterDevtools />
    </>
  ),
});
```

Now you can use `useQuery` to fetch your data.
```tsx
import { useQuery } from "@tanstack/react-query";
import "./App.css";

function App() {
  const { data } = useQuery({
    queryKey: ["people"],
    queryFn: () =>
      fetch("https://swapi.dev/api/people")
        .then((res) => res.json())
        .then((data) => data.results as { name: string }[]),
    initialData: [],
  });

  return (
    <div>
      <ul>
        {data.map((person) => (
          <li key={person.name}>{person.name}</li>
        ))}
      </ul>
    </div>
  );
}

export default App;
```

You can find out everything you need to know on how to use React-Query in the React-Query documentation.
Another common requirement for React applications is state management. There are many options for state management in React. TanStack Store provides a great starting point for your project.
First you need to add TanStack Store as a dependency:
```bash
pnpm add @tanstack/store
```

Now let's create a simple counter in the `src/App.tsx` file as a demonstration.
```tsx
import { useStore } from "@tanstack/react-store";
import { Store } from "@tanstack/store";
import "./App.css";

const countStore = new Store(0);

function App() {
  const count = useStore(countStore);
  return (
    <div>
      <button onClick={() => countStore.setState((n) => n + 1)}>
        Increment - {count}
      </button>
    </div>
  );
}

export default App;
```

One of the many nice features of TanStack Store is the ability to derive state from other state. That derived state will update when the base state updates.
Let's check this out by doubling the count using derived state.
```tsx
import { useStore } from "@tanstack/react-store";
import { Store, Derived } from "@tanstack/store";
import "./App.css";

const countStore = new Store(0);

const doubledStore = new Derived({
  fn: () => countStore.state * 2,
  deps: [countStore],
});
doubledStore.mount();

function App() {
  const count = useStore(countStore);
  const doubledCount = useStore(doubledStore);
  return (
    <div>
      <button onClick={() => countStore.setState((n) => n + 1)}>
        Increment - {count}
      </button>
      <div>Doubled - {doubledCount}</div>
    </div>
  );
}

export default App;
```

We use the `Derived` class to create a new store that is derived from another store. The `Derived` class has a `mount` method that will start the derived store updating.
Once we've created the derived store we can use it in the App component just like we would any other store using the useStore hook.
You can find out everything you need to know on how to use TanStack Store in the TanStack Store documentation.
Issue: Unwanted settings panel appearing on page
Solution: Removed `TanStackDevtools` component from `__root.tsx`, added `suppressHydrationWarning`
Issue: POST /api/chat failing with 403 due to missing CSRF token
Solution:
- Created `/api/csrf-token` GET endpoint to set HttpOnly cookie
- Modified `useCsrfToken` hook to fetch token on mount
- Added `isReady` state to `AgenticChat` to disable chat until token ready
Issue: `http://localhost:11434/api/tags` fetched repeatedly in an infinite loop
Solution:
- Wrapped `modelOptions` in `useMemo` to prevent recreation
- Added `useRef` guard (`hasDetected`) to ensure detection runs once
- Updated `useEffect` dependency array correctly
Files prefixed with demo can be safely deleted. They are there to provide a starting point for you to play around with the features you've installed.