diff --git a/.claude/skills/rpg/SKILL.md b/.claude/skills/rpg/SKILL.md
index 760d7c8..01c31df 100644
--- a/.claude/skills/rpg/SKILL.md
+++ b/.claude/skills/rpg/SKILL.md
@@ -1,10 +1,37 @@
 ---
 name: rpg
-description: 'Build and query semantic code graphs using RPG-Encoder. Use when the user wants to understand codebase structure, search for code by intent, explore dependencies, analyze change impact, or perform semantic lifting.'
+description: 'Build and query semantic code graphs using RPG-Encoder. Use BEFORE grep/cat/find for any question about code structure, behavior, relationships, impact, dependencies, or cross-file patterns.'
 allowed-tools: Bash(rpg-encoder *), Read, Glob, Grep
 ---
 
-# RPG-Encoder CLI Skill
+# RPG-Encoder — USE BEFORE grep/cat/find
+
+## Reach for RPG first, shell commands second
+
+Whenever the user asks about code structure, behavior, relationships, impact,
+dependencies, or cross-file patterns, reach for RPG commands BEFORE falling
+back to `grep`, `cat`, `find`, `wc`, or `Read`. RPG is indexed, semantically
+organized, and answers structural questions in one call that would otherwise
+require dozens of chained text searches.
+
+| If you'd otherwise... | Use this instead |
+|---|---|
+| `grep -r` by intent | `rpg-encoder search "describe what it does"` |
+| `grep -r` by name | `rpg-encoder search "name" --mode snippets` |
+| `cat file` for a function | `rpg-encoder fetch "file:func"` |
+| chained greps for callers | `rpg-encoder explore ENTITY --direction up` |
+| chained greps for callees | `rpg-encoder explore ENTITY --direction down` |
+| `wc -l` / `find` / `tree` | `rpg-encoder info` |
+| reading many files | Use the MCP `semantic_snapshot` tool if available |
+
+Fall back to `grep` / `cat` / `Read` only when the query is about literal text
+(string search, comments, TODOs, log messages) — not structure or semantics.
+
+If you have the RPG MCP server connected, prefer its tools (`search_node`,
+`fetch_node`, `explore_rpg`, `impact_radius`, `plan_change`, `semantic_snapshot`,
+`context_pack`) over the CLI — they're faster and return structured data.
+
+---
 
 You have access to `rpg-encoder`, a CLI tool that builds semantic code graphs (Repository Planning Graphs) from any codebase. Use it to understand code structure, search by intent, trace dependencies, and perform autonomous semantic lifting.
 
diff --git a/.gemini/extensions/rpg/CONTEXT.md b/.gemini/extensions/rpg/CONTEXT.md
index e661f1e..4891809 100644
--- a/.gemini/extensions/rpg/CONTEXT.md
+++ b/.gemini/extensions/rpg/CONTEXT.md
@@ -2,6 +2,19 @@
 
 RPG-Encoder builds semantic code graphs (Repository Planning Graphs) for AI-assisted code understanding.
 
+## Use RPG before grep / cat / find
+
+For any question about code structure, behavior, relationships, impact, or
+cross-file patterns, reach for RPG tools (MCP or CLI) before shell commands.
+RPG answers structural questions in one call that would otherwise require
+dozens of chained text searches.
+
+Fall back to grep / cat / file reads only when the query is about literal
+text (string search, comments, TODOs, log messages).
+
+See the MCP server instructions for the full mapping of shell patterns to
+RPG tools — it's loaded automatically when the extension is active.
+
 ## CLI Commands
 
 - `rpg-encoder build` — Index codebase, build graph (run once)
diff --git a/.gemini/extensions/rpg/commands/rpg-lift.toml b/.gemini/extensions/rpg/commands/rpg-lift.toml
index 96ddf43..b50ec62 100644
--- a/.gemini/extensions/rpg/commands/rpg-lift.toml
+++ b/.gemini/extensions/rpg/commands/rpg-lift.toml
@@ -18,5 +18,5 @@ The lift pipeline:
 
 Progress is saved after each batch. If interrupted, re-running continues from where it stopped.
 
-After completion, run `rpg-encoder info` to show the updated graph statistics.
+After completion, run `rpg-encoder info` to show the updated graph statistics. If an MCP RPG server is currently connected to your editor, also call its `reload_rpg` tool so it picks up the new graph from disk.
 """
diff --git a/.gemini/extensions/rpg/gemini-extension.json b/.gemini/extensions/rpg/gemini-extension.json
index 6d05a36..fb5093e 100644
--- a/.gemini/extensions/rpg/gemini-extension.json
+++ b/.gemini/extensions/rpg/gemini-extension.json
@@ -1,6 +1,6 @@
 {
   "name": "rpg-encoder",
-  "version": "0.8.2",
+  "version": "0.8.3",
   "description": "Build and query semantic code graphs (Repository Planning Graphs) for AI-assisted code understanding. Provides entity search, dependency exploration, and autonomous LLM-driven semantic lifting.",
   "mcpServers": {
     "rpg-encoder": {
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7d9c6a0..178a222 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,185 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/),
 and this project adheres to [Semantic Versioning](https://semver.org/).
 
+## [0.8.3] - 2026-04-14
+
+### Added
+
+- `lifting_status` now tracks stale-feature drift across calls. A
+  persistent per-server set records entities whose source was modified
+  after they were lifted; the dashboard reports `stale_features: N
+  entities modified since last lift` and the NEXT STEP state machine
+  prompts re-lift even when coverage is 100%.
+- `get_entities_for_lifting(scope="*")` now returns stale entities
+  alongside unlifted ones, so a single call covers both "never lifted"
+  and "lifted-but-outdated" work.
+- `lifting_status` emits a sub-agent dispatch recommendation when the
+  remaining work (unlifted + stale) is ≥100 entities. The response
+  contains `LOOP` / `DISPATCH` / `FALLBACK` blocks so callers delegate
+  directly without first loading a batch of source into their own
+  context.
+- `get_entities_for_lifting` batch-0 emits a one-line dispatch hint
+  when ≥10 token-aware batches are queued, pointing back to
+  `lifting_status` for the full recommendation.
+- `submit_lift_results` NEXT action is now scale-aware — when remaining
+  work ≥100 entities, it redirects the caller to `lifting_status`
+  instead of encouraging another foreground batch.
+- `build_rpg` response now emits an action-oriented NEXT STEP directing
+  the agent to lift immediately. Small scope lifts inline; large scope
+  dispatches a sub-agent with the LOOP pattern embedded.
+- New `USE RPG FIRST` top section in `server_instructions.md` with a
+  mapping table from shell patterns (`grep -r`, `cat`, `find`, `wc -l`,
+  chained greps) to the RPG tool that replaces them.
+- New `DRIFT MAINTENANCE` section in `server_instructions.md`
+  explaining the three auto-sync notice variants and framing re-lift
+  as part of definition-of-done for any task that wrote code.
+- Tool descriptions for `search_node`, `fetch_node`, `explore_rpg`,
+  `rpg_info`, `semantic_snapshot`, `context_pack`, `impact_radius`,
+  `plan_change`, `analyze_health`, `detect_cycles`, `find_paths`, and
+  `slice_between` now open with a `PREFER THIS OVER ...` marker naming
+  the shell command or workflow they replace.
+- `.claude/skills/rpg/SKILL.md` and `README.md` carry the same
+  RPG-first mapping table as the server prompt.
+- Crate-visible `LARGE_SCOPE_ENTITIES` (100) and `LARGE_SCOPE_BATCHES`
+  (10) constants replace duplicated magic numbers across
+  `server.rs`/`tools.rs`; their doc comments describe the
+  heuristic-vs-authoritative relationship.
+- Canonical lock-order invariant documented on the `RpgServer` struct
+  so reviewers don't have to re-derive it from scattered call sites.
+- `reload_config_with_warning` helper on `RpgServer` that distinguishes
+  missing `.rpg/config.toml` (silent default) from malformed TOML
+  (stderr warning, keeps previous in-memory config).
+
+### Changed
+
+- `lifting_status` NEXT STEP is runtime-neutral. No specific runtime's
+  dispatch syntax appears in the response; callers use whatever
+  sub-agent or cheaper-model mechanism their runtime exposes. Explicit
+  fallbacks: scoped lifting for callers with no delegation mechanism,
+  and `rpg-encoder lift --provider anthropic|openai` for callers with
+  an API key and no sub-agent support.
+- Batch-size estimates in NEXT STEP messages read from the live
+  `encoding.max_batch_tokens` config instead of a hard-coded `~12K`
+  figure, so the estimate scales when the user overrides the budget.
+- `NEXT STEP:` remains a single parseable line; dispatch detail is
+  emitted in labeled blocks immediately below.
+- Auto-sync notice now prescribes an action: it distinguishes per-update
+  delta from pre-existing backlog and separates new-entity drift from
+  stale-feature drift, so an agent that commits code and sees the
+  notice is told to re-lift rather than merely shown a count.
+- CLI fallback in large-scope guidance is gated to cases with actual
+  unlifted work, with a note that `rpg-encoder lift` does not re-lift
+  stale entities (it filters to entities with no features).
+- `get_routing_candidates` response header no longer includes the
+  graph revision hash — it moved to the NEXT_ACTION block at the
+  bottom. Keeps the stable preamble (instructions + entity table)
+  cache-eligible while still surfacing the revision where the agent
+  needs to read it back.
+- `server_instructions.md` LIFTING FLOW sub-section rewritten and
+  shortened.
+- `set_project_root` on project switch now loads the new project's
+  `.rpg/config.toml` independently; on parse failure it falls back to
+  `RpgConfig::default()` rather than preserving the previous project's
+  config. Same-project `reload_rpg` preserves the previous config on
+  parse failure (different flow, different correctness requirement).
+
+### Fixed
+
+- `lifting_status` previously reported `Graph is complete` as soon as
+  every entity had some features, ignoring stale features from
+  modified sources. The state machine now considers
+  `remaining + stale_features` combined.
+- `get_entities_for_lifting(scope="*")` previously returned zero
+  entities when all drift was stale (features present, sources
+  modified) because `resolve_scope` filters to entities with no
+  features. It now augments the resolved scope with tracked stale
+  entities and routes them through the LLM loop.
+- Auto-sync notice previously conflated per-update delta with global
+  backlog, so a one-line edit on a partially-lifted repo could claim
+  "50 new entities unlifted" when only 1 was actually new.
+- `finalize_lifting` fallback guidance previously said to call it
+  after each scoped subtree. That auto-routes pending entities
+  against incomplete signals and locks the hierarchy early. Guidance
+  now says to call `finalize_lifting` once after all scopes complete.
+- `rpg-encoder lift --provider ...` (CLI fallback) left the MCP
+  server on a stale in-memory graph. All docs that mention the CLI
+  fallback now specify that the caller must call `reload_rpg`
+  afterward.
+- `set_project_root` and `reload_rpg` previously used
+  `unwrap_or_default()` on config loads, collapsing missing-config
+  and malformed-TOML into identical silent behavior.
+- `set_project_root` failed to refresh `self.config` on project
+  switch; the server kept serving the previous project's encoding
+  settings.
+- `lifting_status` large-scope recommendation previously ran off raw
+  unlifted count, before auto-lift had reduced the set. On repos full
+  of trivial entities (getters, setters, constructors) it could
+  recommend delegation for ~0 LLM calls. The large-scope branch now
+  signals likely-large and defers the authoritative check to the
+  post-auto-lift batch count in `get_entities_for_lifting`.
+- `rpg_info` error wording ("No RPG found") was miscited as a
+  friendly status string; corrected to "any RPG tool returns 'No RPG
+  found'".
+- `build_rpg` NEXT STEP and `lifted: X/Y` header previously counted
+  `Module` entities against the unlifted total, while
+  `lifting_status` and `get_entities_for_lifting` exclude them. The
+  two could disagree by hundreds of entities on large codebases,
+  tripping the delegation threshold in `build_rpg` when
+  `lifting_status` would still recommend foreground lifting. Both
+  paths now use `lifting_coverage()` (non-module) for the count, and
+  the header reads `liftable_entities: X/Y`.
+- `submit_lift_results` previously emitted `DONE` as soon as coverage
+  reached 100%, which could terminate a stale-only re-lift loop after
+  batch 1 while later batches were still queued. The NEXT/DONE
+  branch now counts unlifted + stale remaining.
+- `update_rpg` now feeds `summary.modified_entity_ids` into the
+  stale-tracking set so its `needs_relift: N` reply aligns with what
+  `lifting_status` and `get_entities_for_lifting(scope="*")` report.
+- Server startup auto-update now feeds `modified_entity_ids` into
+  the stale-tracking set and seeds the auto-sync changeset hash for a
+  clean workdir. Previously modifications between the last lift and
+  a session restart silently dropped off the dashboard.
+- `auto_lift` on a non-`*` scope now drains stale entries for every
+  in-scope entity. The pipeline freshens features for each
+  regardless of existing state, so stale markers for those IDs are
+  invalid after the call; the unconditional drain also handles the
+  identical-features case where a cosmetic edit re-lifts to the
+  same output.
+- Auto-lifted features for entities previously flagged stale now
+  drain the stale-tracking set inline in
+  `get_entities_for_lifting`, so the count doesn't inflate when the
+  auto-lifter writes fresh features directly.
+- `reload_rpg` now prunes the stale-tracking set against the newly
+  loaded graph rather than clearing it wholesale. The CLI / isolated
+  sub-agent re-lift flow only refreshes entities with no features —
+  stale entities survive it, so clearing would let `lifting_status`
+  report 100% coverage while re-lift work remained.
+- `reload_rpg` drift-tracking reset now sits on the success path,
+  after `storage::load` returns `Ok`. Transient read errors no longer
+  wipe the backlog while leaving the previous graph in memory.
+- `build_rpg` now prunes the stale-tracking set against the newly
+  built graph so dead entity IDs don't accumulate across rebuilds.
+- `build_semantic_hierarchy` sharded init no longer acquires
+  `hierarchy_session.write()` before `graph.read()`. The original
+  order formed a deadlock cycle with `update_rpg`'s
+  graph-then-session order under concurrent scheduling. The init
+  path now collapses decision + snapshot into a single
+  `hierarchy_session.write()` held under `graph.read()` and packages
+  the work into an `Action` enum so there's no peek-then-trust.
+- `build_batch_0_domain_discovery` and `build_cluster_batch` now take
+  `&RPGraph` and (where applicable) clusters as parameters instead
+  of re-reading `self.graph` / `self.hierarchy_session`. Closes two
+  TOCTOU windows: a session-clear race where a concurrent
+  `build_rpg`/`update_rpg` could panic on `session.as_mut().unwrap()`,
+  and a graph-replace race where a concurrent `set_project_root`
+  could panic on `graph.as_ref().unwrap()`.
+- `set_project_root` tool description is no longer Claude-Code-specific
+  in its example; reads runtime-neutral.
+- `get_entities_for_lifting` batch-0 dispatch NOTE no longer
+  references "batches 2..N" (off-by-one against the 0-based
+  `batch_index` parameter). Reads "do not request further batches in
+  this context".
+
 ## [0.8.2] - 2026-04-14
 
 ### Added
diff --git a/Cargo.lock b/Cargo.lock
index dff9862..3e069de 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -2679,7 +2679,7 @@ dependencies = [
 
 [[package]]
 name = "rpg-cli"
-version = "0.8.2"
+version = "0.8.3"
 dependencies = [
  "anyhow",
  "chrono",
@@ -2700,7 +2700,7 @@ dependencies = [
 
 [[package]]
 name = "rpg-core"
-version = "0.8.2"
+version = "0.8.3"
 dependencies = [
  "anyhow",
  "chrono",
@@ -2716,7 +2716,7 @@ dependencies = [
 
 [[package]]
 name = "rpg-encoder"
-version = "0.8.2"
+version = "0.8.3"
 dependencies = [
  "anyhow",
  "chrono",
@@ -2736,7 +2736,7 @@ dependencies = [
 
 [[package]]
 name = "rpg-lift"
-version = "0.8.2"
+version = "0.8.3"
 dependencies = [
  "globset",
  "indicatif 0.18.3",
@@ -2753,7 +2753,7 @@ dependencies = [
 
 [[package]]
 name = "rpg-mcp"
-version = "0.8.2"
+version = "0.8.3"
 dependencies = [
  "anyhow",
  "globset",
@@ -2774,7 +2774,7 @@ dependencies = [
 
 [[package]]
 name = "rpg-nav"
-version = "0.8.2"
+version = "0.8.3"
 dependencies = [
  "anyhow",
  "criterion",
@@ -2793,7 +2793,7 @@ dependencies = [
 
 [[package]]
 name = "rpg-parser"
-version = "0.8.2"
+version = "0.8.3"
 dependencies = [
  "anyhow",
  "criterion",
diff --git a/Cargo.toml b/Cargo.toml
index 209878e..2e12004 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -11,7 +11,7 @@ members = [
 ]
 
 [workspace.package]
-version = "0.8.2"
+version = "0.8.3"
 edition = "2024"
 license = "MIT"
 authors = ["userFRM"]
diff --git a/README.md b/README.md
index 38c6ef0..4823aa3 100644
--- a/README.md
+++ b/README.md
@@ -37,6 +37,8 @@ Then open any repo and tell your agent:
 
 Your agent handles everything: indexes entities (seconds), reads each function and adds intent-level features (a few minutes), organizes them into a semantic hierarchy, and commits `.rpg/graph.json` for your team.
 
+For repos with ~100+ entities, `lifting_status` will tell your agent to delegate the lifting loop to a sub-agent or a cheaper model — feature extraction is pattern-matching, not novel reasoning. If your runtime has no sub-agent mechanism, run `rpg-encoder lift --provider anthropic|openai` from the terminal with an API key — the CLI drives an external LLM directly with no agent involvement. After the CLI finishes, call `reload_rpg` in your session to load the updated graph. The CLI lifts entities with no features; re-lifting stale entities (features present but outdated after code changes) is handled by the in-session MCP flow, not the CLI.
+
 Once lifted, try:
 
 - *"What handles authentication?"* — finds code even when nothing is named "auth"
@@ -45,6 +47,30 @@ Once lifted, try:
 
 ---
 
+## Use RPG before `grep`, `cat`, `find`
+
+The server instructions tell your agent to reach for RPG tools FIRST for any
+question about code structure or behavior. That reflex matters — `grep`, `cat`,
+and ad-hoc file reads burn tokens and miss semantic relationships RPG already
+knows.
+
+| If you'd otherwise reach for... | Use this instead |
+|---|---|
+| `grep -r` / `rg` (by intent) | `search_node(query="...")` |
+| `grep -r` / `rg` (by name) | `search_node(query="...", mode="snippets")` |
+| `cat` / reading a function | `fetch_node(entity_id="file:name")` |
+| chained greps for callers/callees | `explore_rpg(entity_id="...", direction="...")` |
+| recursive grep for "what depends on X" | `impact_radius(entity_id="...")` |
+| `wc -l` / `find` / `tree` | `rpg_info` |
+| reading many files for context | `semantic_snapshot` |
+| manual search → fetch → explore chains | `context_pack(query="...")` |
+| "how do I refactor X safely" | `plan_change(goal="...")` |
+
+Fall back to `grep`, `cat`, or file reads only when the query is about literal text
+(string search, comments, TODOs, log messages) — not about structure.
+
+---
+
 ## How It Works
 
 <p align="center">
diff --git a/crates/rpg-mcp/src/hierarchy_helpers.rs b/crates/rpg-mcp/src/hierarchy_helpers.rs
index 4fc775b..5f0ee31 100644
--- a/crates/rpg-mcp/src/hierarchy_helpers.rs
+++ b/crates/rpg-mcp/src/hierarchy_helpers.rs
@@ -1,18 +1,26 @@
 //! Helper functions for sharded hierarchy construction workflow.
 
 use crate::server::RpgServer;
-use rpg_core::graph::{EntityKind, normalize_path};
+use rpg_core::graph::{EntityKind, RPGraph, normalize_path};
 
 impl RpgServer {
-    /// Build batch 0: Domain discovery from representative files across all clusters
+    /// Build batch 0: Domain discovery from representative files across all clusters.
+    ///
+    /// Takes `graph` and `clusters` as parameters rather than re-reading
+    /// `self.graph` / `self.hierarchy_session` inside this function. The
+    /// caller holds the relevant locks at decision time and passes
+    /// snapshots through; that closes two races:
+    ///   1. session-clear: a concurrent build_rpg/update_rpg could clear
+    ///      hierarchy_session between the caller's decision and our re-read.
+    ///   2. graph-replace: a concurrent set_project_root could swap
+    ///      self.graph to `None` between the caller's decision and our
+    ///      re-read, panicking on `unwrap()`.
     pub(crate) async fn build_batch_0_domain_discovery(
         &self,
-        total_clusters: usize,
+        graph: &RPGraph,
+        clusters: &[rpg_encoder::hierarchy::FileCluster],
     ) -> Result<String, String> {
-        let guard = self.graph.read().await;
-        let graph = guard.as_ref().unwrap();
-        let session_guard = self.hierarchy_session.read().await;
-        let session = session_guard.as_ref().unwrap();
+        let total_clusters = clusters.len();
 
         let root = self.project_root().await;
         let repo_name = root
@@ -22,7 +30,7 @@ impl RpgServer {
 
         // Collect representative files from all clusters
         let mut representative_features = String::new();
-        for cluster in &session.clusters {
+        for cluster in clusters {
             for file in &cluster.representatives {
                 // Find Module entity for this file
                 for entity in graph.entities.values() {
@@ -79,17 +87,21 @@ impl RpgServer {
         Ok(output)
     }
 
-    /// Build file assignment batch for a specific cluster
+    /// Build file assignment batch for a specific cluster.
+    ///
+    /// Takes `graph` as a parameter for the same reasons as
+    /// `build_batch_0_domain_discovery`: the caller holds `graph.read()`
+    /// at decision time and passes the reference through, so a concurrent
+    /// `set_project_root` can't swap `self.graph` to `None` and trigger a
+    /// panic during rendering.
     pub(crate) async fn build_cluster_batch(
         &self,
+        graph: &RPGraph,
         batch_num: usize,
         total_batches: usize,
         cluster: &rpg_encoder::hierarchy::FileCluster,
         functional_areas: &[String],
     ) -> Result<String, String> {
-        let guard = self.graph.read().await;
-        let graph = guard.as_ref().unwrap();
-
         let root = self.project_root().await;
         let repo_name = root
             .file_name()
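The snapshot-passing pattern described in the doc comments of this file can be sketched in isolation. This is a minimal illustration only: it uses the synchronous `std::sync::RwLock` rather than the async lock the server uses, and `Server`, `Graph`, `render`, and `handle` are hypothetical names chosen for the sketch, not the crate's API.

```rust
use std::sync::RwLock;

// Hypothetical stand-in for the real graph type.
struct Graph {
    entities: Vec<String>,
}

// Hypothetical stand-in for the server; the real one uses async locks.
struct Server {
    graph: RwLock<Option<Graph>>,
}

impl Server {
    // The helper receives the snapshot as a parameter and never re-reads
    // `self.graph`, so it contains no `unwrap()` that a concurrent swap
    // to `None` could trip.
    fn render(&self, graph: &Graph) -> String {
        format!("{} entities", graph.entities.len())
    }

    // The caller inspects the Option once, under the read guard, and keeps
    // the guard alive for the whole render.
    fn handle(&self) -> Result<String, String> {
        let guard = self.graph.read().map_err(|e| e.to_string())?;
        let graph = guard.as_ref().ok_or_else(|| "No RPG found".to_string())?;
        Ok(self.render(graph))
    }
}
```

Because the decision and the render share one guard, a missing graph surfaces as an `Err` at the call site instead of a panic inside the helper.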
diff --git a/crates/rpg-mcp/src/main.rs b/crates/rpg-mcp/src/main.rs
index 1ee462b..abd1a7b 100644
--- a/crates/rpg-mcp/src/main.rs
+++ b/crates/rpg-mcp/src/main.rs
@@ -9,6 +9,26 @@ mod server;
 mod tools;
 mod types;
 
+/// Entity count at or above which `lifting_status` and similar dashboards
+/// switch to recommending sub-agent delegation. **This is a heuristic gate,
+/// not the authoritative dispatch decision.** The authoritative signal is the
+/// batch-0 NOTE in `get_entities_for_lifting`, which sees the post-auto-lift
+/// queue and uses the actual token-aware batch count. With user-tuned
+/// `max_batch_tokens` or unusually small/large entities, the two can
+/// diverge — when they do, the batch-0 NOTE wins. Both messages defer to
+/// each other: dashboard says "check the NOTE", batch-0 NOTE is silent
+/// when delegation isn't warranted.
+pub(crate) const LARGE_SCOPE_ENTITIES: usize = 100;
+
+/// Batch count at or above which `get_entities_for_lifting` emits the batch-0
+/// dispatch note. Derived from `LARGE_SCOPE_ENTITIES` assuming ~10 entities
+/// per token-aware batch at default config (batch_size=25,
+/// max_batch_tokens=8000). Kept as a separate constant because the
+/// auto-lifter shrinks the LLM-needed set before batching, so the ratio is
+/// conservative. Authoritative for the dispatch decision (see
+/// `LARGE_SCOPE_ENTITIES` for why).
+pub(crate) const LARGE_SCOPE_BATCHES: usize = 10;
+
 use anyhow::Result;
 use rmcp::ServiceExt;
 use rpg_core::storage;
@@ -73,6 +93,21 @@ async fn main() -> Result<()> {
                     Ok(s) => {
                         graph.metadata.paradigms = paradigm_names;
                         let _ = storage::save(&server.project_root().await, graph);
+                        // Persist stale entity IDs from the startup sync so
+                        // lifting_status sees them on the first query. Every
+                        // other path that produces a summary already feeds
+                        // `modified_entity_ids` into `stale_entity_ids`
+                        // (`auto_sync_if_stale`, `update_rpg`). Without the
+                        // same step here, entities modified between the last
+                        // lift and this startup would be silently dropped
+                        // across the session boundary.
+                        {
+                            let mut stale = server.stale_entity_ids.write().await;
+                            for id in &s.modified_entity_ids {
+                                stale.insert(id.clone());
+                            }
+                            stale.retain(|id| graph.entities.contains_key(id));
+                        }
                         eprintln!(
                             "  Auto-update complete: +{} -{} ~{}",
                             s.entities_added, s.entities_removed, s.entities_modified
@@ -83,9 +118,17 @@ async fn main() -> Result<()> {
             } else {
                 eprintln!("  Graph is up to date.");
             }
-            // Seed auto-sync HEAD so the first query doesn't redundantly re-sync
+            // Seed auto-sync markers for a clean post-startup workdir so
+            // the first query short-circuits at server.rs's (HEAD,
+            // changeset) match instead of redundantly re-running the
+            // workdir diff. Must use the real empty-workdir changeset
+            // hash (not an empty string) for the match to fire.
+            let project_root = server.project_root().await;
             *server.last_auto_sync_head.write().await =
-                rpg_encoder::evolution::get_head_sha(&server.project_root().await).ok();
+                rpg_encoder::evolution::get_head_sha(&project_root).ok();
+            *server.last_auto_sync_changeset.write().await =
+                Some(RpgServer::compute_changeset_hash(&[], &project_root));
+            *server.last_auto_sync_workdir_paths.write().await = std::collections::HashSet::new();
         }
     }
 
diff --git a/crates/rpg-mcp/src/prompts/server_instructions.md b/crates/rpg-mcp/src/prompts/server_instructions.md
index 6375bcc..4610b25 100644
--- a/crates/rpg-mcp/src/prompts/server_instructions.md
+++ b/crates/rpg-mcp/src/prompts/server_instructions.md
@@ -1,6 +1,65 @@
 RPG-Encoder: Repository Planning Graph — semantic understanding of any codebase.
 No API keys or local LLMs needed. YOU are the LLM — you analyze the code directly.
 
+## USE RPG FIRST — BEFORE grep / cat / find / file-reads
+
+Any user question about code structure, behavior, relationships, impact,
+dependencies, or cross-file patterns — reach for RPG tools BEFORE falling back
+to shell commands or file reads. RPG is indexed, semantically organized, and
+gives one-call answers to questions that would otherwise require dozens of
+chained greps.
+
+| If you'd otherwise reach for... | Use this instead |
+|---|---|
+| `grep -r` / `rg` (by intent) | `search_node(query="...")` — finds code by what it DOES |
+| `grep -r` / `rg` (by name/path) | `search_node(query="...", mode="snippets")` |
+| `cat file` / reading a function | `fetch_node(entity_id="file:name")` |
+| chained greps for "what calls X" | `explore_rpg(entity_id="...", direction="upstream")` |
+| recursive grep for "what depends on X" | `impact_radius(entity_id="...")` — with edge paths |
+| `wc -l` / `find` / `tree` | `rpg_info` — counts, hierarchy, inter-area connectivity |
+| reading the whole repo | `semantic_snapshot` — whole-repo view in one call |
+| multi-step search + read + trace | `context_pack(query="...")` — 1 call instead of 3-5 |
+| "how do I refactor X safely" | `plan_change(goal="...")` — ordered entities + blast radius |
+| "find circular dependencies" | `detect_cycles` |
+| "find god objects / unstable code" | `analyze_health` |
+| "shortest path between A and B" | `find_paths(source, target)` |
+| "minimal subgraph connecting these" | `slice_between(entity_ids=[...])` |
+
+**Fall back to grep / cat / file-reads only when the query is about LITERAL TEXT**
+(string search, comments, TODO markers, log messages, license headers) — not
+about structure or semantics. This holds even if your training predisposes you
+toward shell tools; the RPG is cheaper, more accurate, and more complete for
+every structural question.
+
+If a graph does not exist yet (RPG tools error with messages like "No RPG
+found" or "graph: not built"), run `build_rpg` first. If entities are
+unlifted and the scope is large, see the LIFTING FLOW below for delegation
+guidance.
+
+## DRIFT MAINTENANCE — re-lift after code changes
+
+Lifting is stateful. The auto-sync notice at the top of every navigation
+response tells you when entities have drifted:
+
+- `[auto-synced: ... ; N new entities unlifted ...]` — code was added,
+  semantic features are missing for it. `search_node` and `semantic_snapshot`
+  will not surface those entities.
+- `[auto-synced: ... ; N entities have stale features ...]` — code was
+  modified after lift, the cached features no longer reflect the current
+  source. Search results may mislead.
+- `[auto-synced: ... ; N new + M stale features ...]` — both.
+
+**Treat any drift notice as a deferred re-lift task.** Lifting is part of
+"definition of done" for any change that adds or modifies code, the same
+way running tests is. If you write code, you re-lift before reporting
+complete. If a tool response shows drift you didn't introduce (e.g.,
+external commits), call `lifting_status` for the recommended re-lift
+pattern.
+
+The fastest discipline: at the END of any task that wrote code, dispatch
+a sub-agent to re-lift in the background — it costs ~zero of your context
+and keeps semantic search accurate for the next user request.
+
 ## LIFTING FLOW (step by step)
 
 1. `build_rpg` — index the codebase (if no graph exists)
@@ -82,22 +141,44 @@ Process all batches in the current conversation:
 - Call `finalize_lifting` then `get_files_for_synthesis` + `submit_file_syntheses`.
 - Call `build_semantic_hierarchy` + `submit_hierarchy`.
 
-### Large scope (100+ entities): Dispatch parallel subagents
-A single conversation CANNOT lift a large repo — context will overflow.
-Instead, split the work across fresh subagent conversations:
+### Large scope (100+ entities): Delegate, do not lift directly
+
+Each batch returns a large chunk of source code that stays in your context. At the
+default config, ~10 batches already burn ~80K tokens on grunt work. Feature
+extraction is pattern-matching — a cheaper or delegated model handles it fine.
+
+Delegated worker loop (run in a fresh context):
 
-1. Call `lifting_status` to see per-area coverage and unlifted files.
-2. For each area, dispatch a **foreground** subagent (Task tool, subagent_type="general-purpose"):
-   - Each subagent scope: a file glob like `"crates/rpg-core/**"` or `"src/auth/**"`
-   - Each subagent runs: get_entities_for_lifting -> analyze -> submit_lift_results (loop)
-   - Each subagent gets a FRESH context window — no accumulation across areas
-3. After all subagents complete, call `lifting_status` to verify coverage.
-4. Call `finalize_lifting` then `get_files_for_synthesis` + `submit_file_syntheses`.
-5. Call `build_semantic_hierarchy` + `submit_hierarchy`.
+```
+get_entities_for_lifting(scope="*") -> analyze -> submit_lift_results  (repeat)
+finalize_lifting
+```
 
-**Why subagents?** Each `get_entities_for_lifting` batch returns source code that stays in the
-conversation. After ~300 entities, the context fills up and the chat breaks. Subagents solve
-this by giving each chunk its own fresh context window.
+Use whatever sub-agent or cheaper-model mechanism your runtime exposes. The graph
+persists to disk after every submit, so the worker's writes survive. **After the
+worker returns, call `reload_rpg`** to refresh the caller's in-memory graph —
+required if the runtime gave the worker an isolated MCP session, no-op if it
+shared yours.
+
+Fallbacks when no delegation mechanism is available:
+- **Scoped lifting**: narrow each call, e.g. `get_entities_for_lifting(scope="src/auth/**")`,
+  and submit features per batch. Each scope fits in foreground context. Call
+  `finalize_lifting` ONCE after all scopes are complete — calling it mid-flow
+  auto-routes pending entities against incomplete signals and locks the
+  hierarchy in early.
+- **CLI autonomous lift** (unlifted entities only): `rpg-encoder lift --provider anthropic`
+  (or `openai`) drives an external LLM directly via an API key, with no agent involvement.
+  **After the CLI finishes, call `reload_rpg` in this session** so the server picks up the
+  updated `.rpg/graph.json`; otherwise subsequent queries will still see the pre-lift state.
+  Note: the CLI lifts entities with no features; it does not re-lift stale entities
+  (features present but outdated after source edits). For stale-entity re-lifting use
+  the MCP loop above (sub-agent dispatch or foreground scoped lifting).
+
+After delegation returns, call `get_files_for_synthesis` + `submit_file_syntheses`,
+then `build_semantic_hierarchy` + `submit_hierarchy`.
+
+Call `lifting_status` whenever you need a concrete NEXT STEP recommendation
+for the current state.
 
 ## ERROR RECOVERY
 
@@ -130,7 +211,7 @@ When using the RPG to understand or navigate a codebase (after lifting is comple
 - Use `fetch_node(fields="features,deps")` to skip source code (~80% smaller output)
 - Use `explore_rpg(format="compact")` for ID-preserving machine-readable rows (enables direct follow-up calls)
 - Use `explore_rpg(max_results=N)` to cap large dependency trees
-- Use `context_pack` instead of search→fetch→explore chains (1 call vs 3-5, ~44% fewer tokens)
+- Use `context_pack` instead of search→fetch→explore chains (1 call vs 3-5)
 - Use `impact_radius` for richer reachability analysis with edge paths (1 call vs multi-step explore)
 
 ## HEALTH ANALYSIS
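The delegation rule in the SKILL.md changes above (large scopes go to a worker because every batch costs caller context) comes down to a little arithmetic. A minimal Rust sketch under stated assumptions: the threshold of 100 is taken from the "Large scope (100+ entities)" heading, the 8,000-token batch size is inferred from "~10 batches is already ~80K tokens", and the names `LiftPlan`, `choose_plan`, and `estimated_context_tokens` are hypothetical, not part of rpg-encoder.

```rust
/// Assumption: mirrors the "100+ entities" heading. The real constant
/// (`LARGE_SCOPE_ENTITIES`) is not shown in this diff.
const ASSUMED_LARGE_SCOPE_ENTITIES: usize = 100;

/// Assumption: inferred from "~10 batches is already ~80K tokens".
const ASSUMED_BATCH_TOKENS: usize = 8_000;

/// Rough context cost of running `batches` lift batches in the foreground.
fn estimated_context_tokens(batches: usize, batch_tokens: usize) -> usize {
    batches.saturating_mul(batch_tokens)
}

/// Hypothetical outcome of the "delegate or lift directly" decision.
#[derive(Debug, PartialEq)]
enum LiftPlan {
    /// Nothing is unlifted or stale.
    Nothing,
    /// Small enough to lift in the current context.
    Foreground,
    /// Large enough that a worker should run the loop.
    Delegate,
}

/// Work that needs an LLM pass is unlifted entities plus stale ones.
fn choose_plan(unlifted: usize, stale: usize, threshold: usize) -> LiftPlan {
    let work = unlifted + stale;
    if work == 0 {
        LiftPlan::Nothing
    } else if work >= threshold {
        LiftPlan::Delegate
    } else {
        LiftPlan::Foreground
    }
}
```

With these assumed values, `choose_plan(60, 50, ASSUMED_LARGE_SCOPE_ENTITIES)` selects `Delegate`, since 110 entities of combined work crosses the threshold even though neither count does on its own.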
diff --git a/crates/rpg-mcp/src/server.rs b/crates/rpg-mcp/src/server.rs
index 4d56a37..9132753 100644
--- a/crates/rpg-mcp/src/server.rs
+++ b/crates/rpg-mcp/src/server.rs
@@ -35,6 +35,34 @@ impl PromptVersions {
 }
 
 /// The RPG MCP server state.
+///
+/// # Lock order invariant
+///
+/// Several of the fields below are `Arc<RwLock<T>>`. When a code path holds
+/// more than one of them at the same time, locks must be acquired in the
+/// following order (outermost first):
+///
+/// 1. `graph`
+/// 2. `lifting_session` / `hierarchy_session`
+/// 3. `stale_entity_ids`
+/// 4. `pending_routing`
+/// 5. `last_auto_sync_head` / `last_auto_sync_changeset` / `last_auto_sync_workdir_paths`
+/// 6. `config`
+/// 7. `embedding_index`
+/// 8. `project_root_cell`
+///
+/// Paths that touch only one lock at a time are unaffected. Paths that
+/// acquire several locks but release each before acquiring the next
+/// (statement-per-lock pattern in `set_project_root` and `reload_rpg`)
+/// are also unaffected — at no moment do they hold two locks, so no
+/// cycle can form.
+///
+/// The invariant is needed because tokio's `RwLock` is not re-entrant
+/// and a waiting writer blocks later readers: two tasks that each hold
+/// one lock and wait for the lock the other holds would deadlock.
+/// Keeping `graph` as the outermost held lock everywhere ensures that
+/// any two nested paths serialize through `graph` and never form a
+/// cycle on the inner locks.
 #[derive(Clone)]
 pub(crate) struct RpgServer {
     /// Active project root. Mutable at runtime via the `set_project_root` tool
@@ -56,6 +84,12 @@ pub(crate) struct RpgServer {
     pub(crate) prompt_versions: PromptVersions,
     /// Last git HEAD SHA at which auto-sync ran. Prevents redundant updates.
     pub(crate) last_auto_sync_head: Arc<RwLock<Option<String>>>,
+    /// Entity IDs whose source was modified after their features were lifted.
+    /// Populated by `auto_sync_if_stale` (from `summary.modified_entity_ids`),
+    /// drained by `submit_lift_results` as entities get re-lifted. Lets
+    /// `lifting_status` surface stale-feature drift even though those entities
+    /// still appear "lifted" in the coverage count.
+    pub(crate) stale_entity_ids: Arc<RwLock<std::collections::HashSet<String>>>,
     /// Hash of the last-synced workdir changeset (dirty files + their stat).
     /// Combined with `last_auto_sync_head` to detect when a re-sync is needed
     /// for uncommitted/staged/unstaged changes.
@@ -84,6 +118,35 @@ impl RpgServer {
         self.project_root_cell.read().await.clone()
     }
 
+    /// Reload `.rpg/config.toml` into the given config slot.
+    /// - File missing → silently use defaults (the no-config-yet case).
+    /// - File present but malformed → log a warning, keep the existing config.
+    /// - File present and valid → swap.
+    pub(crate) async fn reload_config_with_warning(
+        slot: &Arc<RwLock<RpgConfig>>,
+        project_root: &std::path::Path,
+    ) {
+        let config_path = project_root.join(".rpg/config.toml");
+        if !config_path.exists() {
+            // Missing is a normal state — use defaults silently.
+            *slot.write().await = RpgConfig::default();
+            return;
+        }
+        match RpgConfig::load(project_root) {
+            Ok(cfg) => {
+                *slot.write().await = cfg;
+            }
+            Err(e) => {
+                eprintln!(
+                    "rpg: failed to parse {} ({}); keeping previous in-memory config",
+                    config_path.display(),
+                    e
+                );
+                // Do NOT overwrite — leave the previous (working) config in place.
+            }
+        }
+    }
+
     /// Create a new server, loading graph and config from `project_root` if present.
     pub(crate) fn new(project_root: PathBuf) -> Self {
         let graph = storage::load(&project_root).ok();
@@ -107,6 +170,7 @@ impl RpgServer {
             tool_router: Self::create_tool_router(),
             prompt_versions: PromptVersions::new(),
             last_auto_sync_head: Arc::new(RwLock::new(initial_head)),
+            stale_entity_ids: Arc::new(RwLock::new(std::collections::HashSet::new())),
             last_auto_sync_changeset: Arc::new(RwLock::new(None)),
             last_auto_sync_workdir_paths: Arc::new(RwLock::new(std::collections::HashSet::new())),
             lift_in_progress: Arc::new(std::sync::atomic::AtomicBool::new(false)),
@@ -258,6 +322,25 @@ impl RpgServer {
             Ok(summary) => {
                 graph.metadata.paradigms = paradigm_names;
                 let _ = storage::save(&project_root, graph);
+
+                // Record stale entity IDs in memory so lifting_status can surface
+                // stale-feature drift in subsequent calls. These entities
+                // still count as "lifted" by coverage(), so without this
+                // set, lifting_status would report "100% coverage" while
+                // search_node returns outdated features.
+                //
+                // Each inner write below is statement-per-lock — no two
+                // inner locks are held at once while we're also holding
+                // graph.write(), so the order between them is irrelevant
+                // for correctness (see the lock-order doc on RpgServer).
+                {
+                    let mut stale = self.stale_entity_ids.write().await;
+                    for id in &summary.modified_entity_ids {
+                        stale.insert(id.clone());
+                    }
+                    // Prune entries for entities that no longer exist
+                    stale.retain(|id| graph.entities.contains_key(id));
+                }
                 *self.last_auto_sync_head.write().await = Some(current_head);
                 *self.last_auto_sync_changeset.write().await = Some(current_changeset);
                 *self.last_auto_sync_workdir_paths.write().await = current_paths;
@@ -270,15 +353,50 @@ impl RpgServer {
                 }
 
                 let (lifted, total) = graph.lifting_coverage();
-                let needs_lifting = total - lifted;
-                let needs_relift = summary.modified_entity_ids.len();
+                let total_unlifted = total.saturating_sub(lifted);
+                let added_now = summary.entities_added;
+                let stale_now = summary.modified_entity_ids.len();
+                let aggregate_drift = total_unlifted + stale_now;
 
                 let mut notice = format!(
                     "[auto-synced: +{} -{} ~{} entities",
                     summary.entities_added, summary.entities_removed, summary.entities_modified,
                 );
-                if needs_lifting > 0 || needs_relift > 0 {
-                    notice.push_str(&format!("; {} need lifting", needs_lifting + needs_relift));
+
+                if aggregate_drift > 0 {
+                    // Per-update delta — what THIS sync changed
+                    let mut parts: Vec<String> = Vec::new();
+                    if added_now > 0 {
+                        parts.push(format!("+{} added unlifted", added_now));
+                    }
+                    if stale_now > 0 {
+                        parts.push(format!("~{} stale features", stale_now));
+                    }
+                    if !parts.is_empty() {
+                        notice.push_str("; ");
+                        notice.push_str(&parts.join(", "));
+                    }
+                    // Pre-existing backlog (entities that were already unlifted before this update)
+                    let pre_existing = total_unlifted.saturating_sub(added_now);
+                    if pre_existing > 0 {
+                        if parts.is_empty() {
+                            notice.push_str(&format!("; {} unlifted total", pre_existing));
+                        } else {
+                            notice.push_str(&format!(" (+{} pre-existing)", pre_existing));
+                        }
+                    }
+                    // Active recommendation — graded by aggregate severity. The
+                    // batch-0 NOTE in get_entities_for_lifting is authoritative
+                    // for the dispatch decision; this is a heuristic gate.
+                    if aggregate_drift >= crate::LARGE_SCOPE_ENTITIES {
+                        notice.push_str(
+                            " — semantic search is incomplete; call lifting_status for re-lift dispatch guidance",
+                        );
+                    } else {
+                        notice.push_str(
+                            " — semantic search is incomplete; call lifting_status to refresh",
+                        );
+                    }
                 }
                 notice.push_str("]\n\n");
                 notice
@@ -315,7 +433,7 @@ impl RpgServer {
     /// added/modified/renamed file. Deleted files hash their path only.
     /// Same changeset + same stat = same hash = no re-sync. Second save of the
     /// same file changes mtime → different hash → re-sync fires.
-    fn compute_changeset_hash(
+    pub(crate) fn compute_changeset_hash(
         changes: &[rpg_encoder::evolution::FileChange],
         project_root: &std::path::Path,
     ) -> String {
@@ -494,6 +612,19 @@ impl RpgServer {
             ),
         };
 
+        // Stale-feature drift — entities still counted as "lifted" because
+        // they have features, but those features are out of date because
+        // the source was modified after lifting. Tracked across syncs by
+        // auto_sync_if_stale.
+        let stale_features_count = {
+            let stale = self.stale_entity_ids.read().await;
+            // Filter to entities still present in the graph
+            stale
+                .iter()
+                .filter(|id| graph.entities.contains_key(*id))
+                .count()
+        };
+
         let mut out = format!(
             "=== RPG Lifting Status ===\n\
              {}\n\
@@ -501,6 +632,12 @@ impl RpgServer {
              hierarchy: {}\n",
             graph_line, lifted, total, coverage_pct, hierarchy_type,
         );
+        if stale_features_count > 0 {
+            out.push_str(&format!(
+                "stale_features: {} entities modified since last lift (features outdated)\n",
+                stale_features_count,
+            ));
+        }
 
         // Per-area coverage
         let area_cov = graph.area_coverage();
@@ -575,18 +712,99 @@ impl RpgServer {
             }
         }
 
-        // NEXT STEP — state machine guidance, staleness takes priority
+        // NEXT STEP — state machine guidance, staleness takes priority.
+        // `LARGE_SCOPE_ENTITIES` is the threshold above which direct
+        // foreground lifting is not recommended. See also the matching
+        // check in `get_entities_for_lifting` which expresses the same
+        // heuristic in terms of batches.
+        //
+        // The state machine considers two kinds of "work remaining":
+        //   - `remaining` — entities that have never been lifted
+        //   - `stale_features_count` — entities with outdated features after
+        //     a source modification (tracked across syncs via
+        //     `stale_entity_ids`). These look "lifted" in coverage but their
+        //     features no longer reflect the source.
+        // Their sum is what actually needs LLM work.
         out.push('\n');
+        let remaining = total.saturating_sub(lifted);
+        let work_remaining = remaining + stale_features_count;
+
         if stale_detail.is_some() {
-            out.push_str("NEXT STEP: Graph is stale. Call update_rpg to sync with code changes, then lift any new entities.\n");
+            out.push_str("NEXT STEP: Graph is stale. Call update_rpg to sync with code changes, then lift any new or modified entities.\n");
+        } else if work_remaining >= crate::LARGE_SCOPE_ENTITIES {
+            // Large repo — recommend delegating the mechanical loop so the
+            // caller doesn't exhaust its own context. Give the dispatch
+            // pattern *directly* here rather than bouncing the caller through
+            // get_entities_for_lifting first (which would burn batch-0's
+            // source payload in the caller's context, the exact thing we're
+            // trying to avoid).
+            //
+            // Note: `remaining` is the raw unlifted count *before* auto-lift
+            // runs (which happens inside get_entities_for_lifting). Auto-lift
+            // shrinks the LLM-needed set considerably for repos with many
+            // trivial entities (getters, setters, constructors). The agent
+            // can skip the dispatch if, once the worker calls
+            // get_entities_for_lifting batch 0, no delegation NOTE appears —
+            // in that case the queue is small enough to lift in one context.
+            let batch_tokens = self.config.read().await.encoding.max_batch_tokens;
+            let workload_desc = if remaining > 0 && stale_features_count > 0 {
+                format!(
+                    "{} unlifted + {} stale = {} entities",
+                    remaining, stale_features_count, work_remaining,
+                )
+            } else if stale_features_count > 0 {
+                format!(
+                    "{} stale entities to re-lift (modified since last lift)",
+                    stale_features_count,
+                )
+            } else {
+                format!("{} entities unlifted", remaining)
+            };
+            out.push_str(&format!(
+                "NEXT STEP: Likely-large lifting workload — {} (auto-lift may reduce this). Dispatch a sub-agent to run the LOOP below; do not run it in this context — each batch is ~{}K tokens of source and will exhaust caller context over many iterations.\n",
+                workload_desc,
+                batch_tokens.div_ceil(1000),
+            ));
+            out.push_str(
+                "\nLOOP (sub-agent runs this in its own context):\n  \
+                 get_entities_for_lifting(scope=\"*\") -> analyze batch -> submit_lift_results -> repeat until DONE -> finalize_lifting\n\
+                 \nDISPATCH:\n  \
+                 Use whatever sub-agent / cheaper-model mechanism your runtime provides. The MCP graph persists to disk after every submit, so the worker's writes survive. **After the worker returns, call `reload_rpg`** — some runtimes give sub-agents an isolated MCP session, in which case the caller's in-memory graph is stale until reloaded. (No-op if your runtime shares the MCP session.)\n\
+                 \nFALLBACK (no sub-agent mechanism, no API key):\n  \
+                 Scope the lift to one subtree at a time — e.g. get_entities_for_lifting(scope=\"src/auth/**\") — and submit features per batch within that scope. Each scoped batch fits in foreground context. Call finalize_lifting ONCE at the very end after all scopes are complete; calling it mid-flow auto-routes pending entities against incomplete signals and locks the hierarchy in early.\n",
+            );
+            // The CLI fallback only helps when there is unlifted work.
+            // `rpg-encoder lift` resolves `scope="*"` to entities with no
+            // features, so a stale-only backlog (features present, sources
+            // modified) is a no-op for the CLI — surfacing it there would
+            // be a dead-end recipe.
+            if remaining > 0 {
+                out.push_str(
+                    "\nFALLBACK (no sub-agent mechanism, API key available, unlifted entities only):\n  \
+                     Run `rpg-encoder lift --provider anthropic` (or `openai`) from the terminal — the CLI drives an external LLM directly with no agent involvement. After the CLI finishes, call `reload_rpg` in this session so the server picks up the updated graph from disk. Note: the CLI lifts entities with no features; stale entities (features present but outdated) must be re-lifted via the MCP loop above.\n",
+                );
+            }
         } else if lifted == 0 {
             out.push_str(
                 "NEXT STEP: Call get_entities_for_lifting(scope=\"*\") to start lifting.\n",
             );
-        } else if lifted < total {
+        } else if remaining > 0 && stale_features_count > 0 {
+            out.push_str(&format!(
+                "NEXT STEP: {} unlifted + {} stale = {} entities need LLM work. Call get_entities_for_lifting(scope=\"*\") — it returns both unlifted entities and stale ones that need re-lifting in the same batches.\n",
+                remaining, stale_features_count, work_remaining,
+            ));
+        } else if remaining > 0 {
             out.push_str(&format!(
                 "NEXT STEP: {} entities remaining. Call get_entities_for_lifting(scope=\"*\") to continue lifting.\n",
-                total - lifted,
+                remaining,
+            ));
+        } else if stale_features_count > 0 {
+            // All entities have features, but some features are outdated.
+            // The post-sync delta is what matters here — we track modified
+            // entities in stale_entity_ids so agents know to re-lift them.
+            out.push_str(&format!(
+                "NEXT STEP: Coverage is 100% but {} entities have stale features (source modified after lift). Call get_entities_for_lifting(scope=\"*\") to re-lift just those — it surfaces stale entities as if they were unlifted.\n",
+                stale_features_count,
             ));
         } else if !graph.metadata.semantic_hierarchy {
             out.push_str(
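The auto-sync notice arithmetic in the `server.rs` changes above (per-sync delta versus pre-existing backlog) is easier to check as a pure function. The sketch below restates that logic under assumptions: `SyncCounts`, `drift_suffix`, and `wants_dispatch_guidance` are hypothetical names, the threshold is passed in because the value of `LARGE_SCOPE_ENTITIES` is not shown in this diff, and the trailing recommendation text is reduced to a boolean.

```rust
/// Hypothetical stand-in for the counts the notice is built from.
struct SyncCounts {
    /// Entities that currently have features.
    lifted: usize,
    /// All liftable entities.
    total: usize,
    /// Entities added by this sync (they arrive unlifted).
    added_now: usize,
    /// Entities whose source changed after their features were lifted.
    stale_now: usize,
}

/// Unlifted entities plus stale ones: everything that still needs LLM work.
fn aggregate_drift(c: &SyncCounts) -> usize {
    c.total.saturating_sub(c.lifted) + c.stale_now
}

/// Builds the drift part of the notice. Returns "" when nothing needs lifting.
fn drift_suffix(c: &SyncCounts) -> String {
    if aggregate_drift(c) == 0 {
        return String::new();
    }
    let total_unlifted = c.total.saturating_sub(c.lifted);
    let mut out = String::new();

    // Per-update delta: what this sync changed.
    let mut parts: Vec<String> = Vec::new();
    if c.added_now > 0 {
        parts.push(format!("+{} added unlifted", c.added_now));
    }
    if c.stale_now > 0 {
        parts.push(format!("~{} stale features", c.stale_now));
    }
    if !parts.is_empty() {
        out.push_str("; ");
        out.push_str(&parts.join(", "));
    }

    // Backlog that was already unlifted before this sync.
    let pre_existing = total_unlifted.saturating_sub(c.added_now);
    if pre_existing > 0 {
        if parts.is_empty() {
            out.push_str(&format!("; {} unlifted total", pre_existing));
        } else {
            out.push_str(&format!(" (+{} pre-existing)", pre_existing));
        }
    }
    out
}

/// True when the notice should point at dispatch guidance rather than a
/// plain refresh. `large_scope` stands in for `LARGE_SCOPE_ENTITIES`.
fn wants_dispatch_guidance(c: &SyncCounts, large_scope: usize) -> bool {
    aggregate_drift(c) >= large_scope
}
```

Splitting the delta from the backlog this way keeps the two numbers from being conflated: a sync that adds 4 entities to a graph with 6 already unlifted reports both figures instead of a single "10 need lifting".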
diff --git a/crates/rpg-mcp/src/tools.rs b/crates/rpg-mcp/src/tools.rs
index 84e9d7e..74d8190 100644
--- a/crates/rpg-mcp/src/tools.rs
+++ b/crates/rpg-mcp/src/tools.rs
@@ -3,6 +3,8 @@
 //! The `#[tool_router]` proc macro requires every `#[tool]` method to live in one
 //! `impl` block, so this file cannot be split further without upstream changes.
 
+use std::collections::HashSet;
+
 use rmcp::{handler::server::wrapper::Parameters, tool, tool_router};
 use rpg_core::graph::{RPGraph, normalize_path};
 use rpg_core::storage;
@@ -13,7 +15,7 @@ use crate::types::*;
 #[tool_router]
 impl RpgServer {
     #[tool(
-        description = "Search for code entities by intent or keywords. Returns entities with file paths, line numbers, and relevance scores. Use mode='features' for semantic intent search (use behavioral/functional phrases as query), 'snippets' for name/path matching (use file paths, qualified entities, or keywords as query), 'auto' (default) tries both.",
+        description = "PREFER THIS OVER grep/rg FOR ANY QUESTION ABOUT CODE BEHAVIOR OR NAMES. Search for code entities by intent or keywords. Returns entities with file paths, line numbers, and relevance scores. Use mode='features' for semantic intent search (e.g., 'validate user input') — finds code by what it DOES even when names don't match. Use mode='snippets' for name/path matching (e.g., 'FilterGroupManager' or 'src/auth/'). Use mode='auto' (default) to try both. This replaces grep/rg for every structural query.",
         annotations(read_only_hint = true, open_world_hint = false)
     )]
     async fn search_node(
@@ -156,7 +158,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Fetch detailed metadata and source code for a known entity. Returns the entity's semantic features, dependencies (what it calls, what calls it), hierarchy position, and full source code.",
+        description = "PREFER THIS OVER cat OR WHOLE-FILE READS FOR A SINGLE ENTITY. Fetch detailed metadata and source code for a known entity by ID. Returns the entity's semantic features (what it does), dependencies (what it calls, what calls it), hierarchy position, and full source code. Use this instead of reading the whole file when you only need one function/class/method.",
         annotations(read_only_hint = true, open_world_hint = false)
     )]
     async fn fetch_node(
@@ -194,7 +196,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Explore the dependency graph starting from an entity. Traverses import, invocation, inheritance, composition, render, state-read/state-write, and dispatch edges. Use direction='downstream' to see what the entity calls, 'upstream' to see what calls it, 'both' for full picture.",
+        description = "PREFER THIS OVER CHAINED GREPS FOR DEPENDENCY QUESTIONS. Explore the dependency graph starting from an entity. Traverses import, invocation, inheritance, composition, render, state-read/state-write, and dispatch edges. Use direction='downstream' to see what the entity calls, 'upstream' to see what calls it, 'both' for full picture. Replaces the manual \"grep for X, then grep each result, then grep those\" loop with one graph walk.",
         annotations(read_only_hint = true, open_world_hint = false)
     )]
     async fn explore_rpg(
@@ -279,7 +281,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Get RPG statistics: entity count, file count, functional areas, dependency edges, containment edges, and hierarchy overview. Use this first to understand the codebase structure before searching.",
+        description = "PREFER THIS OVER wc/find/tree FOR CODEBASE OVERVIEW. RPG statistics: entity count, file count, functional areas, dependency edges, containment edges, inter-area connectivity, hierarchy overview. Call this first on any new codebase to orient yourself before searching.",
         annotations(read_only_hint = true, open_world_hint = false)
     )]
     async fn rpg_info(&self) -> Result<String, String> {
@@ -324,7 +326,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Generate a compact, token-efficient snapshot of the entire repository's semantic understanding. Designed for context window injection at session start. Includes: full hierarchy with aggregate features, all entities grouped by functional area with semantic features, condensed dependency skeleton, and coverage stats. Target: ~25-30K tokens for a 1000-entity codebase. Call this FIRST in any session to gain whole-repo awareness before using other tools.",
+        description = "PREFER THIS OVER READING MANY FILES FOR WHOLE-REPO CONTEXT. Compact, token-efficient snapshot of the entire repository's semantic understanding: full hierarchy with aggregate features, all entities grouped by functional area with semantic features, condensed dependency skeleton, and coverage stats. Target: ~25-30K tokens for a 1000-entity codebase. Call this at session start to gain whole-repo awareness in a single tool call — then use search_node/fetch_node for drill-down.",
         annotations(read_only_hint = true, open_world_hint = false)
     )]
     async fn semantic_snapshot(
@@ -405,7 +407,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Switch the active project root for this session. Use this when you started the session from one directory but want RPG to operate on a different project — for example, launched Claude Code from your home directory but want to work on ~/myproject. The server loads the graph from the new directory's .rpg/graph.json if present, resets all session state (lifting sessions, auto-sync markers, pending routing), and points every subsequent tool call at the new root. Path is tilde-expanded and canonicalized.",
+        description = "Switch the active project root for this session. Use this when the server was started from one directory but you want RPG to operate on a different project — e.g. the MCP server launched in your home directory but you want to work on ~/myproject. The server loads the graph from the new directory's .rpg/graph.json if present, resets all session state (lifting sessions, auto-sync markers, pending routing), and points every subsequent tool call at the new root. Path is tilde-expanded and canonicalized.",
         annotations(
             destructive_hint = false,
             idempotent_hint = true,
@@ -443,6 +445,28 @@ impl RpgServer {
         // Swap the root
         *self.project_root_cell.write().await = canonical.clone();
 
+        // Load the NEW project's config. Unlike `reload_rpg` (same-project
+        // reload, where keeping the previous in-memory config on parse
+        // error is the right call because the old config was valid for
+        // that project), a project switch must *not* inherit the old
+        // project's config: that would silently cross-contaminate
+        // encoding/batch settings across unrelated codebases. So on parse
+        // failure we fall back to `RpgConfig::default()` and emit a
+        // warning, and on "file absent" we likewise use defaults.
+        {
+            let new_config = match rpg_core::config::RpgConfig::load(&canonical) {
+                Ok(cfg) => cfg,
+                Err(e) => {
+                    eprintln!(
+                        "rpg: failed to parse .rpg/config.toml in new project ({}); falling back to defaults for this project",
+                        e
+                    );
+                    rpg_core::config::RpgConfig::default()
+                }
+            };
+            *self.config.write().await = new_config;
+        }
+
         // Reset all session + sync state — everything is project-scoped
         *self.lifting_session.write().await = None;
         *self.hierarchy_session.write().await = None;
@@ -452,6 +476,7 @@ impl RpgServer {
         *self.last_auto_sync_head.write().await = None;
         *self.last_auto_sync_changeset.write().await = None;
         *self.last_auto_sync_workdir_paths.write().await = std::collections::HashSet::new();
+        *self.stale_entity_ids.write().await = std::collections::HashSet::new();
         #[cfg(feature = "embeddings")]
         {
             *self.embedding_index.write().await = None;
@@ -691,6 +716,14 @@ impl RpgServer {
         storage::save(project_root, &graph).map_err(|e| format!("Failed to save RPG: {}", e))?;
         let _ = storage::ensure_gitignore(project_root);
 
+        // Capture lifting coverage BEFORE the graph moves into `self.graph`.
+        // `lifting_coverage()` excludes `Module` entities (they get features
+        // via file-level synthesis, not direct lifting), which matches the
+        // semantics of `get_entities_for_lifting`. `meta.total_entities`
+        // includes modules, so it would inflate the "unlifted" count and
+        // trip the delegation threshold too early.
+        let (lifted_non_module, total_non_module) = graph.lifting_coverage();
+
         // Update in-memory state
         let meta = graph.metadata.clone();
         *self.graph.write().await = Some(graph);
@@ -720,6 +753,19 @@ impl RpgServer {
         self.pending_routing.write().await.clear();
         clear_pending_routing(&self.project_root().await);
 
+        // Prune the drift-tracking set against the new graph. build_rpg
+        // preserves features for entities whose IDs survive the rebuild,
+        // so a stale entity that still exists is correctly still stale.
+        // But IDs removed in the rebuild should drop out of the set so it
+        // doesn't accumulate dead references over many rebuilds.
+        {
+            let graph_guard = self.graph.read().await;
+            if let Some(ref g) = *graph_guard {
+                let mut stale = self.stale_entity_ids.write().await;
+                stale.retain(|id| g.entities.contains_key(id));
+            }
+        }
+
         let lang_display = if languages.len() == 1 {
             languages[0].name().to_string()
         } else {
@@ -734,6 +780,11 @@ impl RpgServer {
         } else {
             "structural"
         };
+        // `liftable_entities: X/Y` uses non-module counts (matches `lifting_status`
+        // and `get_entities_for_lifting`) so the numbers agents see here line
+        // up with the numbers they see elsewhere. The `entities: N` line
+        // above is the raw total including modules, which get features
+        // via file-level synthesis, not direct lifting.
         let mut result = format!(
             "RPG built successfully.\n\
              languages: {}\n\
@@ -742,7 +793,7 @@ impl RpgServer {
              functional_areas: {}\n\
              dependency_edges: {}\n\
              containment_edges: {}\n\
-             lifted: {}/{}\n\
+             liftable_entities: {}/{} (modules are aggregated from files, not lifted directly)\n\
              hierarchy: {}",
             lang_display,
             meta.total_entities,
@@ -750,8 +801,8 @@ impl RpgServer {
             meta.functional_areas,
             meta.dependency_edges,
             meta.containment_edges,
-            meta.lifted_entities,
-            meta.total_entities,
+            lifted_non_module,
+            total_non_module,
             hierarchy_label,
         );
 
@@ -778,15 +829,35 @@ impl RpgServer {
                     stats.orphaned,
                     stats.new_entities,
                 ));
-            } else {
-                result.push_str(
-                    "\nTip: use get_entities_for_lifting + submit_lift_results to add semantic features.",
-                );
             }
-        } else {
+        }
+
+        // NEXT STEP — action-oriented, scale-aware. A build_rpg response is
+        // usually the first RPG tool call in a session, so the agent reads
+        // this before asking the user what to do next. A weak "tip" gets
+        // ignored; a directive with sizing guidance gets followed. Semantic
+        // tools (search_node, context_pack, plan_change) are lossy on an
+        // unlifted graph — users notice this as "search doesn't find the
+        // thing I know is there" — so the default is "lift now".
+        let unlifted = total_non_module.saturating_sub(lifted_non_module);
+        if unlifted == 0 {
             result.push_str(
-                "\nTip: use get_entities_for_lifting + submit_lift_results to add semantic features.",
+                "\n\nNEXT STEP: Graph is fully lifted. Semantic tools (search_node, context_pack, plan_change, explore_rpg) are ready — prefer them over grep/cat/find for any structural question.",
             );
+        } else if unlifted >= crate::LARGE_SCOPE_ENTITIES {
+            let batch_tokens = self.config.read().await.encoding.max_batch_tokens;
+            result.push_str(&format!(
+                "\n\nNEXT STEP: {} entities unlifted (of {}). Dispatch a sub-agent now to run the lift loop — don't wait for the user to ask. Each batch is ~{}K tokens of source, so running the loop here would exhaust caller context before any real work begins.\n\
+                 \nLOOP (sub-agent runs this in its own context):\n  \
+                 get_entities_for_lifting(scope=\"*\") -> analyze batch -> submit_lift_results -> repeat until DONE -> finalize_lifting\n\
+                 \nAfter the worker returns, call reload_rpg — some runtimes give sub-agents an isolated MCP session, in which case the caller's in-memory graph is stale until reloaded. Call lifting_status for per-state recommendations at any time.",
+                unlifted, total_non_module, batch_tokens.div_ceil(1000),
+            ));
+        } else {
+            result.push_str(&format!(
+                "\n\nNEXT STEP: {} entities unlifted (of {}). Lift now — don't wait for the user to ask; semantic search/fetch won't find unlifted entities by intent. Call get_entities_for_lifting(scope=\"*\"), analyze the batch, submit via submit_lift_results, repeat until DONE, then finalize_lifting.",
+                unlifted, total_non_module,
+            ));
         }
         Ok(result)
     }
@@ -887,8 +958,27 @@ impl RpgServer {
             let project_root = self.project_root().await.clone();
             let scope_owned = scope.to_string();
 
-            // Run the blocking pipeline on the current thread (tells tokio we're blocking).
-            // This is safe because MCP stdio is serial — no concurrent requests.
+            // Compute the in-scope entity IDs up front so we can drain
+            // them from `stale_entity_ids` after the pipeline runs. For a
+            // non-`*` scope the pipeline freshens features for every
+            // in-scope entity (the `*` scope auto-filters to feature-empty
+            // entities, but explicit scopes don't), so any stale entity
+            // in scope is no longer stale once the pipeline returns. We
+            // drain *unconditionally* by ID rather than diffing features
+            // before/after, because a deterministic re-lift can produce
+            // identical features for a cosmetic source change — the
+            // entity is still freshly lifted, just to the same value.
+            let in_scope_ids: HashSet<String> = {
+                let lift_scope = rpg_encoder::lift::resolve_scope(graph, scope);
+                lift_scope.entity_ids.into_iter().collect()
+            };
+
+            // Run the blocking pipeline on the current thread (tells tokio
+            // we're blocking). Safe because (1) the `lift_in_progress`
+            // atomic above rejects concurrent `auto_lift` calls, and
+            // (2) the graph write lock we hold below serializes against
+            // every other tool that touches the graph for the pipeline's
+            // duration.
             let report = tokio::task::block_in_place(|| {
                 let config = rpg_lift::LiftConfig {
                     provider: provider.as_ref(),
@@ -906,6 +996,14 @@ impl RpgServer {
 
             drop(guard);
 
+            // Drain stale tracking for every in-scope ID. After the
+            // pipeline, those entities have authoritative features (LLM
+            // or auto-lift), regardless of whether the features changed.
+            if !in_scope_ids.is_empty() {
+                let mut stale = self.stale_entity_ids.write().await;
+                stale.retain(|id| !in_scope_ids.contains(id));
+            }
+
             // Clear sessions — entity list changed
             *self.lifting_session.write().await = None;
             *self.hierarchy_session.write().await = None;
@@ -1005,12 +1103,49 @@ impl RpgServer {
             };
 
             if needs_rebuild {
+                // Snapshot stale entity IDs *before* taking graph/session locks so
+                // we don't deadlock on nested writes. Stale entities are ones
+                // whose source changed after they were lifted — their features
+                // still exist but are outdated, so they should be treated as
+                // "needs LLM work" alongside unlifted entities.
+                let stale_snapshot: HashSet<String> = {
+                    let stale = self.stale_entity_ids.read().await;
+                    stale.iter().cloned().collect()
+                };
+
                 // Lock order: graph first, then session (consistent with lifting_status)
                 let mut guard = self.graph.write().await;
                 let mut session = self.lifting_session.write().await;
                 let graph = guard.as_mut().ok_or("No RPG loaded")?;
 
-                let scope = rpg_encoder::lift::resolve_scope(graph, &params.scope);
+                let mut scope = rpg_encoder::lift::resolve_scope(graph, &params.scope);
+
+                // For the "*"/"all" scope, `resolve_scope` filters to entities
+                // with *no* features — which correctly captures unlifted
+                // entities but excludes stale ones (they still have their old
+                // features). Augment the scope with any tracked stale
+                // entities so a single get_entities_for_lifting(scope="*")
+                // call covers both "never lifted" and "lifted-but-outdated".
+                // For other scope kinds (glob, hierarchy path, id list),
+                // `resolve_scope` doesn't filter by lifted state, so any
+                // stale entity matching the scope is already present.
+                let params_scope_trimmed = params.scope.trim();
+                if params_scope_trimmed == "*" || params_scope_trimmed.eq_ignore_ascii_case("all") {
+                    let already: HashSet<&String> = scope.entity_ids.iter().collect();
+                    let to_add: Vec<String> = stale_snapshot
+                        .iter()
+                        .filter(|id| !already.contains(id))
+                        .filter(|id| {
+                            graph
+                                .entities
+                                .get(*id)
+                                .is_some_and(|e| e.kind != rpg_core::graph::EntityKind::Module)
+                        })
+                        .cloned()
+                        .collect();
+                    scope.entity_ids.extend(to_add);
+                }
+
                 if scope.entity_ids.is_empty() {
                     *session = None;
                     return Ok(format!(
@@ -1041,33 +1176,51 @@ impl RpgServer {
                 let mut auto_lifted = 0usize;
                 let mut needs_llm = Vec::new();
                 let mut review_candidates: Vec<(String, Vec<String>)> = Vec::new();
+                // Track which stale entities the auto-lift loop below re-lifts so
+                // we can drain them from `stale_entity_ids` once they're written.
+                // Auto-lift persists its features directly inside this function,
+                // so it must drain the set itself; stale entities routed into
+                // needs_llm are drained in `submit_lift_results` after the
+                // caller submits features.
+                let mut auto_relifted_stale: Vec<String> = Vec::new();
                 for raw in all_raw_entities {
-                    // Skip entities that already have curated features
-                    let already_lifted = graph
-                        .entities
-                        .get(&raw.id())
-                        .is_some_and(|e| !e.semantic_features.is_empty());
+                    let raw_id = raw.id();
+                    // Stale entities get re-lifted regardless of existing
+                    // features — their features are known-outdated because
+                    // the source was modified after the previous lift.
+                    let is_stale = stale_snapshot.contains(&raw_id);
+                    // Otherwise, skip entities that already have curated features.
+                    let already_lifted = !is_stale
+                        && graph
+                            .entities
+                            .get(&raw_id)
+                            .is_some_and(|e| !e.semantic_features.is_empty());
                     if already_lifted {
                         continue;
                     }
                     match engine.try_lift_with_confidence(&raw) {
                         Some((features, rpg_encoder::lift::LiftConfidence::Accept)) => {
                             // High confidence — apply features directly
-                            if let Some(entity) = graph.entities.get_mut(&raw.id()) {
+                            if let Some(entity) = graph.entities.get_mut(&raw_id) {
                                 entity.semantic_features = features;
                                 entity.feature_source = Some("auto".to_string());
                                 auto_lifted += 1;
+                                if is_stale {
+                                    auto_relifted_stale.push(raw_id.clone());
+                                }
                             }
                         }
                         Some((features, rpg_encoder::lift::LiftConfidence::Review)) => {
                             // Medium confidence — apply features but flag for review
-                            let eid = raw.id();
-                            if let Some(entity) = graph.entities.get_mut(&eid) {
+                            if let Some(entity) = graph.entities.get_mut(&raw_id) {
                                 entity.semantic_features = features.clone();
                                 entity.feature_source = Some("auto".to_string());
                                 auto_lifted += 1;
+                                if is_stale {
+                                    auto_relifted_stale.push(raw_id.clone());
+                                }
                             }
-                            review_candidates.push((eid, features));
+                            review_candidates.push((raw_id, features));
                         }
                         Some((_, rpg_encoder::lift::LiftConfidence::Reject)) | None => {
                             needs_llm.push(raw);
@@ -1083,6 +1236,17 @@ impl RpgServer {
                     }
                 }
 
+                // Drain stale tracking for entities the auto-lifter just wrote
+                // fresh features for. Without this, lifting_status would keep
+                // counting them as stale forever because the auto-lift path
+                // skips submit_lift_results entirely.
+                if !auto_relifted_stale.is_empty() {
+                    let mut stale = self.stale_entity_ids.write().await;
+                    for id in &auto_relifted_stale {
+                        stale.remove(id);
+                    }
+                }
+
                 if needs_llm.is_empty() {
                     *session = None;
                     let (lifted, total) = graph.lifting_coverage();
@@ -1156,6 +1320,19 @@ impl RpgServer {
 
         // Only include repo context and full instructions on batch 0 to save context space
         if batch_index == 0 {
+            // Size-aware dispatch hint — if the queue is large, point the
+            // caller at `lifting_status` where the full dispatch guidance
+            // lives (kept there to avoid duplicating detail in the per-batch
+            // response, which ships with every batch's source payload).
+            if total_batches >= crate::LARGE_SCOPE_BATCHES {
+                let batch_tokens = self.config.read().await.encoding.max_batch_tokens;
+                let approx_total_k = (total_batches * batch_tokens).div_ceil(1000);
+                output.push_str(&format!(
+                    "\nNOTE: {} batches queued (~{}K tokens of source total). If your runtime supports sub-agent dispatch or a cheaper model, stop here — do not request further batches in this context — and invoke `lifting_status` for the delegation pattern. Continue the sequential loop only if no dispatch is available.\n\n",
+                    total_batches, approx_total_k,
+                ));
+            }
+
             if auto_lifted_count > 0 {
                 output.push_str(&format!(
                     "AUTO-LIFTED: {} trivial entities (getters/setters/constructors). Override by re-submitting features.\n\n",
@@ -1416,6 +1593,15 @@ impl RpgServer {
 
         graph.refresh_metadata();
 
+        // Re-lifted entities are no longer stale — drain them from the set
+        // tracked by auto-sync so lifting_status reports accurate drift.
+        if !resolved_features.is_empty() {
+            let mut stale = self.stale_entity_ids.write().await;
+            for id in resolved_features.keys() {
+                stale.remove(id);
+            }
+        }
+
         storage::save(&self.project_root().await, graph)
             .map_err(|e| format!("Failed to save RPG: {}", e))?;
 
@@ -1525,11 +1711,39 @@ impl RpgServer {
             }
         }
 
-        // NEXT action
-        if lifted < total {
-            result.push_str("\nNEXT: continue with get_entities_for_lifting, then call finalize_lifting when done.");
-        } else {
+        // NEXT action — scale-aware so the caller doesn't burn its context
+        // grinding through batches when delegation would cost zero of its
+        // tokens. Mirrors the threshold used in lifting_status.
+        //
+        // Work is "unlifted + stale" — stale entities still show as lifted
+        // in coverage but need re-lifting. Emitting DONE on `lifted == total`
+        // alone would tell a stale-only re-lift loop to stop after batch 1
+        // while later batches are still queued.
+        let stale_remaining = {
+            let stale = self.stale_entity_ids.read().await;
+            stale
+                .iter()
+                .filter(|id| graph.entities.contains_key(*id))
+                .count()
+        };
+        let unlifted = total.saturating_sub(lifted);
+        let work_remaining = unlifted + stale_remaining;
+        if work_remaining == 0 {
             result.push_str("\nDONE: all entities lifted. Call finalize_lifting to build the semantic hierarchy.");
+        } else if work_remaining >= crate::LARGE_SCOPE_ENTITIES {
+            let breakdown = if unlifted > 0 && stale_remaining > 0 {
+                format!("{} unlifted + {} stale", unlifted, stale_remaining)
+            } else if stale_remaining > 0 {
+                format!("{} stale", stale_remaining)
+            } else {
+                format!("{} unlifted", unlifted)
+            };
+            result.push_str(&format!(
+                "\nNEXT: {} entities still need LLM work ({}) — call lifting_status for the recommended re-lift dispatch (likely a sub-agent / cheaper model in your runtime). Continue here only if no dispatch mechanism is available.",
+                work_remaining, breakdown,
+            ));
+        } else {
+            result.push_str("\nNEXT: continue with get_entities_for_lifting, then call finalize_lifting when done.");
         }
         Ok(result)
     }
@@ -1568,9 +1782,14 @@ impl RpgServer {
 
         let batch = &pending[start..end];
 
+        // Header kept free of the revision hash so the response prefix stays
+        // stable across graph updates — the LLM prompt cache can then retain
+        // the instructions + entity list even as the revision changes. The
+        // revision itself is emitted below the data, near the NEXT_ACTION
+        // block where the agent actually needs to read it.
         let mut result = format!(
-            "## ROUTING CANDIDATES (batch {} of {}, revision: {})\n\n",
-            batch_index, total_batches, revision,
+            "## ROUTING CANDIDATES (batch {} of {})\n\n",
+            batch_index, total_batches,
         );
 
         // Include routing instructions on batch 0
@@ -1893,6 +2112,18 @@ impl RpgServer {
             rpg_encoder::evolution::get_head_sha(&self.project_root().await).ok();
         *self.last_auto_sync_changeset.write().await = None;
 
+        // Track modified entities so lifting_status and
+        // get_entities_for_lifting(scope="*") surface them as re-lift work.
+        // Without this, the "needs_relift: N" value we report below would
+        // point the caller at a path that returns zero entities.
+        {
+            let mut stale = self.stale_entity_ids.write().await;
+            for id in &summary.modified_entity_ids {
+                stale.insert(id.clone());
+            }
+            stale.retain(|id| g.entities.contains_key(id));
+        }
+
         // Clear sessions — entity list changed
         *self.lifting_session.write().await = None;
         *self.hierarchy_session.write().await = None;
@@ -2004,12 +2235,32 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Reload the RPG graph from disk. Use after external changes to .rpg/graph.json."
+        description = "Reload the RPG graph and config from disk. Use after external changes to .rpg/graph.json or .rpg/config.toml — for example, after the CLI ran `rpg-encoder lift` or after editing batch-size settings."
     )]
     async fn reload_rpg(&self) -> Result<String, String> {
-        match storage::load(&self.project_root().await) {
+        let project_root = self.project_root().await;
+        // Refresh config from disk — if the user edited .rpg/config.toml
+        // or the lifter wrote new settings, pick them up here. Logs a
+        // warning if the file exists but fails to parse, then keeps the
+        // existing config (don't clobber a working in-memory config with
+        // a temporarily broken edit).
+        Self::reload_config_with_warning(&self.config, &project_root).await;
+        match storage::load(&project_root) {
             Ok(g) => {
                 let entities = g.metadata.total_entities;
+                // Prune (don't clear) the drift-tracking set against the
+                // newly-loaded graph. Wholesale clearing was wrong because
+                // the documented CLI / isolated-subagent flow only re-lifts
+                // entities with no features — stale entities (features
+                // present but outdated) survive that flow, so clearing
+                // would let lifting_status report "100% coverage" even
+                // though re-lift work remains. Pruning by entity-existence
+                // keeps that backlog visible while dropping IDs that were
+                // removed in the new graph.
+                {
+                    let mut stale = self.stale_entity_ids.write().await;
+                    stale.retain(|id| g.entities.contains_key(id));
+                }
                 *self.graph.write().await = Some(g);
                 // Sync embedding index incrementally
                 #[cfg(feature = "embeddings")]
@@ -2470,60 +2721,98 @@ impl RpgServer {
 
         // Handle sharded workflow
         if needs_sharding {
-            let mut session_guard = self.hierarchy_session.write().await;
-
-            // Initialize session if it doesn't exist
-            if session_guard.is_none() {
-                let graph_guard = self.graph.read().await;
-                let graph = graph_guard.as_ref().unwrap();
-
-                let clusters = rpg_encoder::hierarchy::cluster_files_for_hierarchy(graph, 70);
-                let total_clusters = clusters.len();
-
-                *session_guard = Some(HierarchySession {
-                    clusters,
-                    functional_areas: None,
-                    assignments: std::collections::HashMap::new(),
-                    batches_completed: 0,
-                });
-
-                drop(graph_guard);
-                drop(session_guard);
-
-                // Return batch 0 (domain discovery)
-                return self.build_batch_0_domain_discovery(total_clusters).await;
+            // Lock order invariant (see RpgServer doc): graph before
+            // hierarchy_session. A concurrent build_rpg/update_rpg/
+            // reload_rpg/set_project_root can clear the session at any
+            // moment before we hold its write lock, so decide whether to
+            // initialize only while holding the write lock — never by
+            // re-trusting an earlier peek. We take graph.read() FIRST
+            // (ordering) so that if we need to initialize, we can compute
+            // clusters from a stable graph and install under the
+            // session.write() that's about to follow.
+            enum Action {
+                EmitBatch0(Vec<rpg_encoder::hierarchy::FileCluster>),
+                EmitBatchN {
+                    batch_idx: usize,
+                    cluster: rpg_encoder::hierarchy::FileCluster,
+                    functional_areas: Vec<String>,
+                    total_batches: usize,
+                },
+                AllDone {
+                    total_batches: usize,
+                },
             }
 
-            // Session exists - continue with next batch
-            let session = session_guard.as_mut().unwrap();
-            let total_batches = session.clusters.len() + 1; // +1 for domain discovery
-            let clusters_len = session.clusters.len();
-
-            if session.batches_completed == 0 {
-                // Still on batch 0 - waiting for functional areas
-                drop(session_guard);
-                return self.build_batch_0_domain_discovery(clusters_len).await;
-            }
+            let graph_guard = self.graph.read().await;
+            let graph = graph_guard.as_ref().unwrap();
+
+            let action = {
+                let mut session_guard = self.hierarchy_session.write().await;
+
+                // Initialize if absent — fresh or cleared-out-from-under-us.
+                if session_guard.is_none() {
+                    let new_clusters =
+                        rpg_encoder::hierarchy::cluster_files_for_hierarchy(graph, 70);
+                    let snapshot = new_clusters.clone();
+                    *session_guard = Some(HierarchySession {
+                        clusters: new_clusters,
+                        functional_areas: None,
+                        assignments: std::collections::HashMap::new(),
+                        batches_completed: 0,
+                    });
+                    Action::EmitBatch0(snapshot)
+                } else {
+                    let session = session_guard.as_mut().unwrap();
+                    let total_batches = session.clusters.len() + 1;
+
+                    if session.batches_completed == 0 {
+                        Action::EmitBatch0(session.clusters.clone())
+                    } else if session.batches_completed > session.clusters.len() {
+                        *session_guard = None;
+                        Action::AllDone { total_batches }
+                    } else {
+                        let batch_idx = session.batches_completed - 1;
+                        Action::EmitBatchN {
+                            batch_idx,
+                            cluster: session.clusters[batch_idx].clone(),
+                            functional_areas: session.functional_areas.clone().unwrap_or_default(),
+                            total_batches,
+                        }
+                    }
+                }
+            };
 
-            if session.batches_completed > session.clusters.len() {
-                // All batches complete
-                *session_guard = None;
-                return Ok(format!(
-                    "All {} batches complete. Hierarchy has been applied.",
-                    total_batches
-                ));
+            // Keep `graph_guard` held across rendering. Both helpers now
+            // take `&RPGraph` so they don't re-read `self.graph`, which
+            // would otherwise expose us to a concurrent `set_project_root`
+            // that could swap the graph to `None` mid-render.
+            match action {
+                Action::EmitBatch0(clusters) => {
+                    return self.build_batch_0_domain_discovery(graph, &clusters).await;
+                }
+                Action::AllDone { total_batches } => {
+                    return Ok(format!(
+                        "All {} batches complete. Hierarchy has been applied.",
+                        total_batches
+                    ));
+                }
+                Action::EmitBatchN {
+                    batch_idx,
+                    cluster,
+                    functional_areas,
+                    total_batches,
+                } => {
+                    return self
+                        .build_cluster_batch(
+                            graph,
+                            batch_idx + 1,
+                            total_batches,
+                            &cluster,
+                            &functional_areas,
+                        )
+                        .await;
+                }
             }
-
-            // Return next file assignment batch
-            let batch_idx = session.batches_completed - 1; // -1 because batch 0 is domain discovery
-            let cluster = session.clusters[batch_idx].clone();
-            let functional_areas = session.functional_areas.clone().unwrap_or_default();
-
-            drop(session_guard);
-
-            return self
-                .build_cluster_batch(batch_idx + 1, total_batches, &cluster, &functional_areas)
-                .await;
         }
 
         // Non-sharded workflow (≤100 files) - original single-shot behavior
@@ -2583,7 +2872,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Build a focused context pack in a single call. Searches for entities matching your query, fetches their details and source code, expands neighbors to the specified depth (default 1), and trims to a token budget. Replaces the typical search→fetch→explore multi-step workflow. Returns primary entities with source + features + deps, plus neighborhood entities for broader context.",
+        description = "PREFER THIS OVER MANUAL search → fetch → explore CHAINS. Single-call context pack: searches for entities matching your query, fetches their details and source code, expands neighbors to the specified depth (default 1), and trims to a token budget. Returns primary entities with source + features + deps, plus neighborhood entities for broader context. Replaces 3-5 chained tool calls with 1.",
         annotations(read_only_hint = true, open_world_hint = false)
     )]
     async fn context_pack(
@@ -2636,7 +2925,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Compute the impact radius of an entity: find all entities reachable via dependency edges with edge paths. Use direction='upstream' to answer 'what depends on this?', 'downstream' for 'what does this depend on?'. Returns a flat list with depth, edge paths, and features — ideal for change impact analysis."
+        description = "PREFER THIS OVER RECURSIVE GREP FOR \"WHAT BREAKS IF I CHANGE X\". Computes the impact radius of an entity: all entities reachable via dependency edges with edge paths and depth. Use direction='upstream' for 'what depends on this?', 'downstream' for 'what does this depend on?'. Returns a flat list with depth, edge paths, and features — one call replaces a dependency trace you'd otherwise grep manually."
     )]
     async fn impact_radius(
         &self,
@@ -2681,7 +2970,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Find multiple dependency paths between two entities (returns up to max_paths results of equal shortest length). Returns paths with entity IDs and edge kinds."
+        description = "PREFER THIS OVER MANUALLY TRACING CALLS. Finds shortest dependency paths between two entities (returns up to max_paths results of equal shortest length). Answers 'how does A reach B?' or 'is there any call chain from module X to module Y?' with entity IDs and edge kinds. Replaces the grep-follow-grep chain you'd otherwise walk by hand."
     )]
     async fn find_paths(
         &self,
@@ -2760,7 +3049,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Extract minimal connecting subgraph between a set of entities. Returns entities and edges on shortest paths connecting the specified entities."
+        description = "PREFER THIS FOR MULTI-ENTITY SLICE ANALYSIS. Extracts the minimal connecting subgraph between a set of entities — returns entities and edges on shortest paths connecting them. Useful for 'show me just the code that connects A, B, and C' without dragging in the whole graph."
     )]
     async fn slice_between(
         &self,
@@ -2819,7 +3108,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Plan code changes: find relevant entities, compute modification order, assess impact radius. Returns dependency-ordered entity list with blast radius analysis.",
+        description = "PREFER THIS BEFORE ANY REFACTOR OR CROSS-FILE EDIT. Plans code changes: finds relevant entities by intent, computes dependency-safe modification order, assesses impact radius per entity. Returns an ordered list of entities to touch with blast radius analysis so you know the minimal safe change set before you start editing.",
         annotations(read_only_hint = true, open_world_hint = false)
     )]
     async fn plan_change(
@@ -3207,7 +3496,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Analyze code health metrics including coupling, instability, centrality, and potential god objects. Returns entities with architectural issues and recommendations for refactoring. Set include_duplication=true to detect code clones via Rabin-Karp fingerprinting (reads source files, slower). Set include_semantic_duplication=true to detect conceptual duplicates via Jaccard similarity on lifted features (in-memory, fast; requires entities to be lifted).",
+        description = "PREFER THIS OVER EYEBALLING FOR ARCHITECTURAL SMELLS. Analyzes code health: coupling, instability, centrality, god object detection, optional clone detection. Returns entities with architectural issues and refactoring recommendations. Use `include_duplication=true` for token-level Rabin-Karp clones (reads source, slower). Use `include_semantic_duplication=true` for Jaccard-similarity conceptual duplicates on lifted features (in-memory, fast; requires entities to be lifted). Replaces manual review of cross-file patterns.",
         annotations(read_only_hint = true, open_world_hint = false)
     )]
     async fn analyze_health(
@@ -3242,7 +3531,7 @@ impl RpgServer {
     }
 
     #[tool(
-        description = "Detect circular dependencies (cycles) in the codebase. Cycles are architectural smells where A depends on B, B on C, and C back on A. Returns all detected cycles with their entity chains. First call returns summary + recommendations. Use parameters to filter results.",
+        description = "PREFER THIS OVER MANUAL CYCLE HUNTING. Detects circular dependencies: A→B→C→A chains anywhere in the graph. Returns cycles with entity chains, file counts, and cross-area filtering. First call returns summary + area breakdown; pass filters (area, min_cycle_length, cross_file_only) to get specific cycles. One call replaces hours of import-chain reading.",
         annotations(read_only_hint = true, open_world_hint = false)
     )]
     async fn detect_cycles(
diff --git a/npm/package.json b/npm/package.json
index 1b03f79..46d92cf 100644
--- a/npm/package.json
+++ b/npm/package.json
@@ -1,6 +1,6 @@
 {
   "name": "rpg-encoder",
-  "version": "0.8.2",
+  "version": "0.8.3",
   "mcpName": "io.github.userFRM/rpg-encoder",
   "description": "RPG-Encoder — semantic code graph for AI-assisted code understanding",
   "license": "MIT",
diff --git a/server.json b/server.json
index 0b35c82..ce7ae53 100644
--- a/server.json
+++ b/server.json
@@ -6,12 +6,12 @@
     "url": "https://github.com/userFRM/rpg-encoder",
     "source": "github"
   },
-  "version": "0.8.2",
+  "version": "0.8.3",
   "packages": [
     {
       "registryType": "npm",
       "identifier": "rpg-encoder",
-      "version": "0.8.2",
+      "version": "0.8.3",
       "transport": {
         "type": "stdio"
       }