126 changes: 83 additions & 43 deletions README.md
Expand Up @@ -42,44 +42,86 @@ Context-Engine is a plug-and-play MCP retrieval stack that unifies code indexing

> **See [docs/IDE_CLIENTS.md](docs/IDE_CLIENTS.md) for detailed configuration examples.**

## Getting Started

### Option 1: Deploy & Connect (Recommended)

Deploy Context-Engine once, connect any IDE. No need to clone this repo into your project.

**1. Start the stack** (on your dev machine or a server):
```bash
git clone https://github.com/m1rl0k/Context-Engine.git && cd Context-Engine
docker compose up -d
```

**2. Index your codebase** (point to any project):
```bash
HOST_INDEX_PATH=/path/to/your/project docker compose run --rm indexer
```

**3. Connect your IDE** — add to your MCP config:
```json
{
"mcpServers": {
"context-engine": { "url": "http://localhost:8001/sse" }
}
}
```

> See [docs/IDE_CLIENTS.md](docs/IDE_CLIENTS.md) for Cursor, Windsurf, Cline, Codex, and other client configs.

### Option 2: Remote Deployment

Run Context-Engine on a server and connect from anywhere.

**Docker on a server:**
```bash
# On server (e.g., context.yourcompany.com)
git clone https://github.com/m1rl0k/Context-Engine.git && cd Context-Engine
docker compose up -d
```

**Index from your local machine:**
```bash
# VS Code extension (recommended) - install, set server URL, click "Upload Workspace"
# Or CLI:
scripts/remote_upload_client.py --server http://context.yourcompany.com:9090 --path /your/project
```

**Connect IDE to remote:**
```json
{ "mcpServers": { "context-engine": { "url": "http://context.yourcompany.com:8001/sse" } } }
```

**Kubernetes:** See [deploy/kubernetes/README.md](deploy/kubernetes/README.md) for Kustomize deployment.

### Option 3: Full Development Setup

For contributors, or for advanced customization with the local LLM decoder:

```bash
INDEX_MICRO_CHUNKS=1 MAX_MICRO_CHUNKS_PER_FILE=200 make reset-dev-dual
```

### Default Endpoints

| Service | Port | Use |
|---------|------|-----|
| Indexer MCP | 8001 (SSE), 8003 (RMCP) | Code search, context retrieval |
| Memory MCP | 8000 (SSE), 8002 (RMCP) | Knowledge storage |
| Qdrant | 6333 | Vector database |
| llama.cpp | 8080 | Local LLM decoder |
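
After `docker compose up -d`, you can sanity-check these ports with a small probe. This is an illustrative helper, not part of the repo; it assumes the default port layout from the table above and that Qdrant exposes `GET /collections` and llama.cpp exposes `GET /health`:

```python
"""Probe the default Context-Engine ports. Illustrative only; assumes the
default port layout from the table above."""
import urllib.error
import urllib.request

# (service, port, path) for the assumed default stack layout
DEFAULT_ENDPOINTS = [
    ("memory-sse", 8000, "/sse"),
    ("indexer-sse", 8001, "/sse"),
    ("qdrant", 6333, "/collections"),
    ("llama.cpp", 8080, "/health"),
]

def endpoint_url(port: int, path: str, host: str = "localhost") -> str:
    return f"http://{host}:{port}{path}"

def probe(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers at all (any HTTP status)."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True   # server is up, just returned a non-2xx status
    except OSError:
        return False  # connection refused / timed out

if __name__ == "__main__":
    for name, port, path in DEFAULT_ENDPOINTS:
        url = endpoint_url(port, path)
        status = "up" if probe(url) else "down"
        print(f"{name:12s} {url} -> {status}")
```

A "down" llama.cpp with everything else "up" usually just means you are running without the local decoder.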

**Stack behavior:**
- Single `codebase` collection — search across all indexed repos
- Health checks auto-detect and fix cache/collection sync
- Live file watching with automatic reindexing

### Transport Modes
- **SSE** (default): `http://localhost:8001/sse` — Cursor, Cline, Windsurf, Augment
- **RMCP**: `http://localhost:8003/mcp` — Codex, Qodo
- **Dual**: Both SSE + RMCP simultaneously (`make reset-dev-dual`)
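
Clients that only speak stdio can still reach the SSE endpoint through a bridge such as `mcp-remote`. A hedged example config, assuming Node/npm and the `mcp-remote` npm package are available on your machine:

```json
{
  "mcpServers": {
    "context-engine": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:8001/sse"]
    }
  }
}
```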

### Environment Setup

Expand Down Expand Up @@ -131,19 +173,17 @@ docker compose up -d --force-recreate mcp_indexer mcp_indexer_http llamacpp
This re-enables the `llamacpp` container and resets `.env` to `http://llamacpp:8080`.

### Make targets (quick reference)
- **Setup**: `reset-dev`, `reset-dev-codex`, `reset-dev-dual` - Full stack with SSE, RMCP, or both
- **Lifecycle**: `up`, `down`, `logs`, `ps`, `restart`, `rebuild`
- **Indexing**: `index`, `reindex`, `reindex-hard`, `index-here`, `index-path`
- **Watch**: `watch` (local), `watch-remote` (upload to remote server)
- **Maintenance**: `prune`, `prune-path`, `warm`, `health`, `decoder-health`
- **Search**: `hybrid`, `rerank`, `rerank-local`
- **LLM**: `llama-model`, `tokenizer`, `llamacpp-up`, `setup-reranker`, `quantize-reranker`
- **MCP Tools**: `qdrant-status`, `qdrant-list`, `qdrant-prune`, `qdrant-index-root`
- **Remote**: `dev-remote-up`, `dev-remote-down`, `dev-remote-bootstrap`
- **Router**: `route-plan`, `route-run`, `router-eval`, `router-smoke`
- **CLI**: `ctx Q="your question"` - Prompt enhancement with repo context


### CLI: ctx prompt enhancer
6 changes: 6 additions & 0 deletions docs/ARCHITECTURE.md
Expand Up @@ -122,6 +122,12 @@ Context Engine is a production-ready MCP (Model Context Protocol) retrieval stac
- **Local LLM Integration**: llama.cpp for offline expansion
- **Caching**: Expanded query results cached for reuse

#### MCP Router (`scripts/mcp_router.py`)
- **Intent Classification**: Determines which MCP tool to call based on query
- **Tool Orchestration**: Routes to search, answer, memory, or index tools
- **HTTP Execution**: Executes tools via RMCP/HTTP without extra dependencies
- **Plan Mode**: Preview tool selection without execution

## Data Flow Architecture

### Search Request Flow
15 changes: 11 additions & 4 deletions docs/DEVELOPMENT.md
Expand Up @@ -73,15 +73,22 @@ Context-Engine/
├── scripts/ # Core application code
│ ├── mcp_memory_server.py # Memory MCP server implementation
│ ├── mcp_indexer_server.py # Indexer MCP server implementation
│ ├── mcp_router.py # Intent-based tool routing
│ ├── hybrid_search.py # Search algorithm implementation
│ ├── ctx.py # CLI prompt enhancer
│ ├── cache_manager.py # Unified caching system
│ ├── async_subprocess_manager.py # Process management
│ ├── deduplication.py # Request deduplication
│ ├── semantic_expansion.py # Query expansion
│ ├── collection_health.py # Cache/collection sync checks
│ ├── utils.py # Shared utilities
│ ├── ingest_code.py # Code indexing logic
│ ├── watch_index.py # File system watcher
│ ├── upload_service.py # Remote upload HTTP service
│ ├── remote_upload_client.py # Remote sync client
│ ├── memory_backup.py # Memory export
│ ├── memory_restore.py # Memory import
│ └── logger.py # Structured logging
├── tests/ # Test suite
│ ├── conftest.py # Test configuration
│ ├── test_*.py # Unit and integration tests
46 changes: 45 additions & 1 deletion docs/IDE_CLIENTS.md
@@ -1,20 +1,39 @@
# IDE & Client Configuration

Connect your IDE to a running Context-Engine stack. No need to clone this repo into your project.

**Documentation:** [README](../README.md) · [Configuration](CONFIGURATION.md) · [IDE Clients](IDE_CLIENTS.md) · [MCP API](MCP_API.md) · [ctx CLI](CTX_CLI.md) · [Memory Guide](MEMORY_GUIDE.md) · [Architecture](ARCHITECTURE.md) · [Multi-Repo](MULTI_REPO_COLLECTIONS.md) · [Kubernetes](../deploy/kubernetes/README.md) · [VS Code Extension](vscode-extension.md) · [Troubleshooting](TROUBLESHOOTING.md) · [Development](DEVELOPMENT.md)

---

**On this page:**
- [Quick Start](#quick-start)
- [Supported Clients](#supported-clients)
- [SSE Clients](#sse-clients-port-80008001)
- [RMCP Clients](#rmcp-clients-port-80028003)
- [Mixed Transport](#mixed-transport-examples)
- [Remote Server](#remote-server)
- [Verification](#verification)

---

## Quick Start

**Prerequisites:** Context-Engine running somewhere (localhost, remote server, or Kubernetes).

**Minimal config** — add to your IDE's MCP settings:
```json
{
"mcpServers": {
"context-engine": { "url": "http://localhost:8001/sse" }
}
}
```

Replace `localhost` with your server IP/hostname for remote setups.

---

## Supported Clients

| Client | Transport | Notes |
Expand Down Expand Up @@ -169,6 +188,31 @@ url = "http://127.0.0.1:8003/mcp"

---

## Remote Server

When Context-Engine runs on a remote server (e.g., `context.yourcompany.com`):

```json
{
"mcpServers": {
"context-engine": { "url": "http://context.yourcompany.com:8001/sse" }
}
}
```

**Indexing your local project to the remote server:**
```bash
# Using VS Code extension (recommended)
# Install vscode-context-engine, configure server URL, click "Upload Workspace"

# Using CLI
scripts/remote_upload_client.py --server http://context.yourcompany.com:9090 --path /your/project
```

> See [docs/MULTI_REPO_COLLECTIONS.md](MULTI_REPO_COLLECTIONS.md) for multi-repo and Kubernetes deployment.

---

## Important Notes for IDE Agents

- **Do not send null values** to MCP tools. Omit the field or pass an empty string "" instead.
11 changes: 10 additions & 1 deletion docs/MEMORY_GUIDE.md
Expand Up @@ -167,5 +167,14 @@ Different hash lengths for different workspace types:

## Backup and Migration

### Memory Backup/Restore Scripts

```bash
# Export memories to JSON
python scripts/memory_backup.py --collection codebase --output memories.json

# Restore memories from backup
python scripts/memory_restore.py --input memories.json --collection codebase
```
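
If you want to trim a backup before restoring it, the JSON export can be filtered offline. The record shape used below (a list of objects with a `payload` dict containing `tags`) is an assumption for illustration; inspect your actual `memories.json` before relying on it:

```python
"""Filter a memory backup before restore. The record shape below is an
assumption for illustration; inspect your actual memories.json first."""
import json

def filter_memories(records, predicate):
    """Keep only the records the predicate accepts."""
    return [r for r in records if predicate(r)]

def has_tag(record, tag):
    """Assumed shape: {"payload": {"tags": [...]}}; missing keys -> False."""
    return tag in record.get("payload", {}).get("tags", [])

if __name__ == "__main__":
    with open("memories.json") as f:
        records = json.load(f)
    kept = filter_memories(records, lambda r: has_tag(r, "architecture"))
    with open("memories.filtered.json", "w") as f:
        json.dump(kept, f, indent=2)
    print(f"kept {len(kept)} of {len(records)} records")
```

Point `memory_restore.py` at the filtered file instead of the full export.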

For production-grade backup/migration strategies, see the official Qdrant documentation for snapshots and export/import. For local development, rely on Docker volumes and reindexing when needed.
81 changes: 81 additions & 0 deletions docs/vscode-extension.md
Expand Up @@ -7,19 +7,74 @@ Context Engine Uploader extension for automatic workspace sync and Prompt+ integ
---

**On this page:**
- [Quick Start](#quick-start)
- [Features](#features)
- [Workflow Examples](#workflow-examples)
- [Installation](#installation)
- [Configuration](#configuration)
- [Commands](#commands-and-lifecycle)

---

## Quick Start

1. **Install**: Build the `.vsix` and install in VS Code (see [Installation](#installation))
2. **Configure server**: Settings → `contextEngineUploader.endpoint` → `http://localhost:9090` (or remote server)
3. **Index workspace**: Click status bar button or run `Context Engine Uploader: Start`
4. **Use Prompt+**: Select code, click `Prompt+` in status bar to enhance with AI

## Features

- **Auto-sync**: Force sync on startup + watch mode keeps your workspace indexed
- **Prompt+ button**: Status bar button to enhance selected text with unicorn mode
- **Output channel**: Real-time logs for force-sync and watch operations
- **GPU decoder support**: Configure llama.cpp, Ollama, or GLM as decoder backend
- **Remote server support**: Index to any Context-Engine server (local, remote, Kubernetes)

## Workflow Examples

### Local Development
Context-Engine running on same machine:
```
Endpoint: http://localhost:9090
Target Path: (leave empty - uses current workspace)
```
Open any project → extension auto-syncs → MCP tools have your code context.

### Remote Server
Context-Engine on a team server:
```
Endpoint: http://context.yourcompany.com:9090
Target Path: /Users/you/projects/my-app
```
Your local code is indexed to the shared server. Team members search across all indexed repos.

### Multi-Project Workflow
Index multiple projects to the same server:
1. Open Project A → auto-syncs to `codebase` collection
2. Open Project B → auto-syncs to same collection
3. MCP tools search across both projects seamlessly

### Prompt+ Enhancement
1. Select code or write a prompt in your editor
2. Click `Prompt+` in status bar (or run command)
3. Extension runs `ctx.py --unicorn` with your selection
4. Enhanced prompt replaces selection with code-grounded context

**Example input:**
```
Add error handling to the upload function
```

**Example output:**
```
Looking at upload_service.py lines 120-180, the upload_file() function currently lacks error handling. Add try/except blocks to handle:
1. Network timeouts (requests.exceptions.Timeout)
2. Invalid file paths (FileNotFoundError)
3. Server errors (HTTP 5xx responses)

Reference the existing error patterns in remote_upload_client.py lines 45-67 which use structured logging via logger.error().
```

## Installation

Expand Down Expand Up @@ -85,3 +140,29 @@ All settings live under `Context Engine Uploader` in the VS Code settings UI or
- `Context Engine Uploader: Prompt+ (Unicorn Mode)` — runs `scripts/ctx.py --unicorn` on your current selection and replaces it with the enhanced prompt (status bar button).

The extension logs all subprocess output to the **Context Engine Upload** output channel so you can confirm uploads without leaving VS Code. The watch process shuts down automatically when VS Code exits or when you run the Stop command.

## Troubleshooting

### Extension not syncing
1. Check **Context Engine Upload** output channel for errors
2. Verify `endpoint` setting points to running upload service
3. Ensure Python 3.8+ is available at the configured `pythonPath`

### Prompt+ not working
1. Verify decoder is running: `curl http://localhost:8081/health`
2. Check `decoderUrl` setting matches your decoder (llama.cpp, Ollama, or GLM)
3. For GPU decoder: enable `useGpuDecoder` setting

### Connection refused
```bash
# Verify upload service is running
curl http://localhost:9090/health

# Check Docker logs
docker compose logs upload_service
```

### Remote server issues
1. Ensure port 9090 is accessible from your machine
2. Check firewall rules allow inbound connections
3. Verify server's `upload_service` container is running