Merged

16 commits
51efd49
fix: repair broken README doc links and convert docs index to MDX
SimplyLiz Apr 13, 2026
cca6e8f
feat: v2.1.0 — stream_context (token-budgeted RAG context streaming)
SimplyLiz Apr 15, 2026
6126b9b
Merge pull request #8 from nyxCore-Systems/feat/2.1.0-stream-context
SimplyLiz Apr 15, 2026
ef7cc55
docs: mention v2.1 stream_context in README status
SimplyLiz Apr 15, 2026
c1c485f
feat: embed_text — unary text-to-vector embedding
SimplyLiz Apr 15, 2026
e7d43cc
Merge pull request #9 from nyxCore-Systems/feat/embed-text
SimplyLiz Apr 15, 2026
b43e63f
feat(2.1): capability discovery + graceful unknown-variant handling
SimplyLiz Apr 15, 2026
0a443c8
Merge pull request #10 from nyxCore-Systems/feat/capability-discovery
SimplyLiz Apr 15, 2026
cb7cf28
feat(2.1): structured error codes on all Error responses (#11)
SimplyLiz Apr 15, 2026
b3b6721
fix(query_expansion): honor model pin, restrict ranking to matching m…
SimplyLiz Apr 15, 2026
ef453c6
feat(2.1): split ErrorCode conflations, drift-guard supported_message…
SimplyLiz Apr 15, 2026
7139526
feat(2.1): Tier 3 provenance via RegisterTier3Source (#14)
SimplyLiz Apr 15, 2026
f9be039
feat(2.1): lip import --no-provenance opt-out flag (#15)
SimplyLiz Apr 15, 2026
8351cfa
fix(2.1): wire ErrorCode::UnknownModel; pin QueryExpansion contract (…
SimplyLiz Apr 15, 2026
b24a0da
Merge branch 'develop' into release/v2.1.0
SimplyLiz Apr 15, 2026
9b1732b
test(watcher): poll instead of fixed sleep to reduce CI flakes
SimplyLiz Apr 15, 2026
41 changes: 41 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,47 @@ All notable changes to this project are documented here.

---

## [2.1.0] — 2026-04-15

### Added

**v2.1 — Streaming context + forward-compat primitives**

**Streaming**

- **`StreamContext { file_uri, cursor_position, max_tokens, model? }`** — new streaming wire message. Daemon ranks symbols relevant to the cursor and emits one `SymbolInfo { symbol_info, relevance_score, token_cost }` frame at a time, terminating with exactly one `EndStream { reason, emitted, total_candidates, error? }` frame. Reasons: `budget_reached`, `exhausted`, `error`. Replaces the broken "fetch top-k, locally truncate to prompt budget" pattern with stream-until-full. Spec §9.2.
- **Relevance ordering** (spec §2.3): direct symbol at cursor → callers (from blast-radius CPG walk) → callees / references → related types. Conservative token-cost estimate `ceil((len(signature) + len(documentation)) / 4) + 8` per symbol. Daemon does not buffer ahead of the socket; `BrokenPipe` from a closing client aborts the ranking walk cleanly. `StreamContext` is rejected from `Batch` / `BatchQuery`.
- **`protocol_version` bumped from `1` → `2`** in `HandshakeResult`. Clients can detect streaming support via handshake.
- **`lip stream-context <file_uri> <line:col> --max-tokens N [--model M]`** — new CLI subcommand prints frames as JSON for manual testing.
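
The per-symbol cost estimate above can be sketched as follows — a minimal illustration of the formula, not the daemon's actual types:

```rust
/// Conservative per-symbol token cost:
/// ceil((len(signature) + len(documentation)) / 4) + 8.
/// The ~4 chars/token ratio is a rough heuristic; the +8 covers
/// framing overhead (name, location, separators).
fn estimate_token_cost(signature: &str, documentation: &str) -> u32 {
    let chars = signature.len() + documentation.len();
    // Integer ceil-division by 4, plus the fixed overhead.
    (chars as u32 + 3) / 4 + 8
}
```

A client reading the stream subtracts each frame's `token_cost` from its remaining budget and closes the socket once the budget is spent; the daemon's `EndStream { reason: budget_reached, .. }` covers the case where the daemon hits the limit first.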

**New primitives**

- **`EmbedText { text, model? }`** — embed an arbitrary text string and return the raw vector. Closes the gap left by `EmbeddingBatch` (URI-only) and `QueryNearestByText` (embeds internally but discards the vector). Callers re-ranking with their own scoring (centroid arithmetic, federated nearest-neighbour, lexical-then-semantic re-rank) get the embedding directly instead of building a centroid out of nearest-neighbour seeds. Returns `EmbedTextResult { vector: Vec<f32>, embedding_model: String }`. Not permitted inside `BatchQuery` (requires async HTTP).
- **`RegisterTier3Source { source: Tier3Source }`** + **`IndexStatusResult.tier3_sources`** — expose provenance for Tier 3 ingestion batches (SCIP imports). `Tier3Source { source_id, tool_name, tool_version, project_root, imported_at_ms }` records *what* producer generated the symbols and *when* the daemon accepted them. Re-registering the same `source_id` overwrites in place, refreshing `imported_at_ms`. The daemon deliberately does no staleness detection: stale Tier 3 symbols remain in the graph at their original confidence until the caller re-imports. Surfacing provenance lets clients decide when to warn a user that imported data has aged (e.g. `scip-rust imported 3 days ago`). `lip import --push-to-daemon` now sends this before streaming SCIP deltas, with `source_id = sha256(tool_name + ":" + project_root)`. `IndexStatusResult.tier3_sources` is `#[serde(default)]`; older daemons yield an empty vector. Ack'd with `DeltaAck`. Not permitted inside `BatchQuery` (mutation).
- **`lip import --no-provenance`** — opt out of Tier 3 provenance registration for ephemeral or test imports that should not pollute a long-lived daemon's `tier3_sources` list. No effect on the default EventStream-JSON output path.
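
The overwrite-in-place semantics of re-registration can be sketched with a plain map keyed by `source_id` — the struct mirrors the changelog's `Tier3Source`, but this is an illustration, not the daemon's actual code:

```rust
use std::collections::HashMap;

// Illustrative shape of a Tier 3 provenance record.
struct Tier3Source {
    source_id: String,
    tool_name: String,
    tool_version: String,
    project_root: String,
    imported_at_ms: u64,
}

/// Registering the same `source_id` replaces the previous entry
/// wholesale, refreshing `imported_at_ms` — no staleness detection.
fn register(sources: &mut HashMap<String, Tier3Source>, src: Tier3Source) {
    sources.insert(src.source_id.clone(), src);
}
```

Clients that want to warn about aged imports compare `imported_at_ms` against the current time themselves; the daemon only surfaces the record.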

**Forward-compat & capability discovery**

- **`HandshakeResult.supported_messages: Vec<String>`** — handshake response now lists every `ClientMessage` `type` tag this daemon understands. Lets clients probe for an individual message (e.g. `stream_context`, `embed_text`) without writing "handshake then pray" code or comparing `protocol_version` integers. Field is `#[serde(default)]`; older daemons yield an empty vector, which clients should treat as "fall back to `protocol_version`."
- **`ServerMessage::UnknownMessage { message_type, supported }`** — when a client sends a well-formed JSON envelope whose `type` tag is unknown, the daemon now replies with `UnknownMessage` (carrying the tag plus the same supported list as handshake) *and keeps the socket open*, instead of closing after a generic parse `Error`. Lets forward-compatible clients downgrade gracefully to a supported call instead of reconnecting.
- **`ServerMessage::Error { message, code }`** — `code: ErrorCode` is a stable, machine-readable category. Clients branch on this instead of string-matching `message`. `#[serde(default)]`; older daemons deserialize as `ErrorCode::Internal`.
- **`ErrorCode`** enum — small, stable set: `unknown_message_type`, `unknown_model`, `embedding_not_configured`, `no_embedding`, `cursor_out_of_range`, `index_locked`, `invalid_request`, `internal` (default). Adding a code is non-breaking; renaming or removing one is breaking.
- `embedding_not_configured` — daemon has no embedding service (`LIP_EMBEDDING_URL` unset).
- `no_embedding` — URI has no cached embedding yet; call `EmbeddingBatch` first.
- `unknown_model` — the embedding endpoint rejected the requested model. Emitted by the daemon when the HTTP backend returns 404 or a 4xx body matching `model_not_found` / `"unknown model"` / `"model … not found/invalid/unsupported"`. Transport, rate-limit, and auth errors stay on `internal` — retrying with the same model only makes sense after a real config change. Classification lives in `daemon/embedding.rs::classify_http_error`.
- `invalid_request` — request was well-formed on the wire but used incorrectly (e.g. nested `Batch`, or `StreamContext` inside a `Batch`). Distinct from `internal` so clients can avoid retry loops on caller-side mistakes.
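
A client-side capability probe against `supported_messages` might look like this sketch; the fallback rule for older daemons is an assumption based on the `protocol_version` bump described above, not daemon code:

```rust
/// Returns true when the daemon advertises support for `tag`.
/// An empty `supported_messages` means an older daemon that predates
/// the field, so fall back to comparing `protocol_version`.
fn supports(supported_messages: &[String], protocol_version: u32, tag: &str) -> bool {
    if supported_messages.is_empty() {
        // Older daemon: stream_context shipped with protocol_version 2
        // (per this changelog); other tags cannot be inferred, so be
        // conservative and report them unsupported.
        return tag == "stream_context" && protocol_version >= 2;
    }
    supported_messages.iter().any(|m| m == tag)
}
```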

**Drift guard**

- **`ClientMessage::variant_tag`** + `supported_messages_covers_all_variants` test — exhaustive-match helper plus paired test that fails compilation when a new `ClientMessage` variant is added without being advertised in `supported_messages()`. Prevents capability-list drift from silently shrinking the handshake surface.
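
The pattern can be illustrated with an abbreviated enum (the real `ClientMessage` has many more variants; names below are a subset for illustration):

```rust
// Abbreviated stand-in for the wire enum.
enum ClientMessage {
    Handshake,
    StreamContext,
    EmbedText,
}

impl ClientMessage {
    fn variant_tag(&self) -> &'static str {
        // No wildcard arm: adding a variant without a tag here is a
        // compile error, which is what makes the guard exhaustive.
        match self {
            ClientMessage::Handshake => "handshake",
            ClientMessage::StreamContext => "stream_context",
            ClientMessage::EmbedText => "embed_text",
        }
    }
}

// The handshake's advertised list; the paired test asserts it covers
// every variant_tag, so the two cannot drift apart silently.
fn supported_messages() -> Vec<&'static str> {
    vec!["handshake", "stream_context", "embed_text"]
}
```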

### Fixed

- **`QueryExpansion` handler contract pinned by a db-level test.** The post-embedding ranking is now encapsulated in `LipDatabase::query_expansion_terms(query_vec, actual_model, top_k)`, which the handler calls in one line. A regression that drops the model filter would cause `query_expansion_terms_rejects_cross_model_scoring` (db.rs) to fail, closing the earlier gap where the fix shipped without a paired assertion.
- **`QueryExpansion` now honors the caller's model pin.** Previously the handler embedded the query with the requested model but then ranked candidates across *all* stored symbol embeddings regardless of which model produced them — cross-model cosine scores are not meaningful, so the returned "expansion terms" were effectively noise whenever the index held mixed-model vectors. Handler now captures the actual model returned by `embed_texts` and passes it through a new `model_filter: Option<&str>` parameter on `LipDatabase::nearest_symbol_by_vector`, restricting candidates to symbols embedded with the same model. `SimilarSymbols` (which resolves from a URI's own cached embedding) keeps the old unfiltered behavior by passing `None`.
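
The restriction can be sketched as a pre-filter applied before cosine scoring — `Candidate` and `filter_by_model` are illustrative names, not `LipDatabase`'s actual API:

```rust
// Illustrative candidate record: a symbol plus the model that
// produced its stored embedding.
struct Candidate {
    symbol: String,
    embedding_model: String,
}

/// `Some(model)` restricts ranking to same-model embeddings
/// (QueryExpansion path); `None` keeps the unfiltered behavior
/// (SimilarSymbols path).
fn filter_by_model<'a>(
    candidates: &'a [Candidate],
    model_filter: Option<&str>,
) -> Vec<&'a Candidate> {
    candidates
        .iter()
        .filter(|c| model_filter.map_or(true, |m| c.embedding_model == m))
        .collect()
}
```

Scoring then runs only over the surviving candidates, so cross-model cosine distances never enter the ranking.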

---

## [2.0.1] — 2026-04-13

### Changed
6 changes: 3 additions & 3 deletions Cargo.lock

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -7,7 +7,7 @@ members = [
]

[workspace.package]
-version = "2.0.1"
+version = "2.1.0"
edition = "2021"
rust-version = "1.78"
authors = ["Lisa Welsch <lisa@nyxcore.cloud>"]
2 changes: 1 addition & 1 deletion README.md
@@ -379,7 +379,7 @@ Requires Rust 1.78+. No system `protoc` required.

## Status

-v2.0 — `ExplainMatch` (chunk-level explanation: which lines in a result file drove the match), model provenance (`FileStatus` exposes the embedding model per file; `IndexStatus` warns when the index contains mixed-model vectors). v1.9: `filter` glob + `min_score` on all NN calls, `GetCentroid`, `QueryStaleEmbeddings`. v1.8: `FindBoundaries`, `SemanticDiff`, `QueryNearestInStore` (cross-repo federation), `QueryNoveltyScore`, `ExtractTerminology`, `PruneDeleted`. v1.7: 6 semantic retrieval primitives. v1.6: `ReindexFiles`, `Similarity`, `QueryExpansion`, `Cluster`, `ExportEmbeddings`. Wire format is JSON.
+v2.1 — `StreamContext` (token-budgeted RAG context streaming): callers stream symbols ranked by relevance to a cursor and stop reading when the prompt budget is full instead of fetching top-k and locally truncating; `protocol_version` bumped to `2`. v2.0 — `ExplainMatch` (chunk-level explanation: which lines in a result file drove the match), model provenance (`FileStatus` exposes the embedding model per file; `IndexStatus` warns when the index contains mixed-model vectors). v1.9: `filter` glob + `min_score` on all NN calls, `GetCentroid`, `QueryStaleEmbeddings`. v1.8: `FindBoundaries`, `SemanticDiff`, `QueryNearestInStore` (cross-repo federation), `QueryNoveltyScore`, `ExtractTerminology`, `PruneDeleted`. v1.7: 6 semantic retrieval primitives. v1.6: `ReindexFiles`, `Similarity`, `QueryExpansion`, `Cluster`, `ExportEmbeddings`. Wire format is JSON.

---

3 changes: 2 additions & 1 deletion bindings/rust/benches/framing.rs
@@ -7,7 +7,7 @@ use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criteri
use tokio::runtime::Runtime;

use lip_core::daemon::session::{read_message, write_message};
-use lip_core::query_graph::ServerMessage;
+use lip_core::query_graph::{ErrorCode, ServerMessage};

fn make_rt() -> Runtime {
tokio::runtime::Builder::new_current_thread()
@@ -19,6 +19,7 @@ fn make_rt() -> Runtime {
fn make_message(payload_bytes: usize) -> ServerMessage {
ServerMessage::Error {
message: "x".repeat(payload_bytes),
code: ErrorCode::Internal,
}
}

154 changes: 142 additions & 12 deletions bindings/rust/src/daemon/embedding.rs
@@ -10,9 +10,79 @@
//! When `LIP_EMBEDDING_URL` is unset, [`EmbeddingClient::from_env`] returns `None`
//! and all embedding requests return a sensible error to the caller.

use anyhow::Context;
use serde::{Deserialize, Serialize};

/// Classified failure from the embedding HTTP endpoint.
///
/// The variants map directly to [`crate::query_graph::ErrorCode`]
/// categories so the daemon can propagate a precise classification to
/// clients instead of collapsing every endpoint failure into `Internal`.
/// Callers that only need a display string should use the `Display` impl.
#[derive(Debug)]
pub enum EmbedError {
/// The endpoint rejected the requested model name — either 404, or
/// a 4xx whose body names the model. Maps to `ErrorCode::UnknownModel`.
/// Retrying with the same model is pointless.
UnknownModel(String),
/// HTTP transport failure, timeout, or TLS error. Maps to
/// `ErrorCode::Internal`. Retry is often safe.
Transport(String),
/// The endpoint returned a response we could not parse, or the
/// vector count did not match the input count. Maps to
/// `ErrorCode::Internal`. Indicates a backend misconfiguration.
Protocol(String),
/// Non-2xx status that does not clearly match any of the above.
/// Maps to `ErrorCode::Internal`.
Http(String),
}

impl std::fmt::Display for EmbedError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
EmbedError::UnknownModel(m)
| EmbedError::Transport(m)
| EmbedError::Protocol(m)
| EmbedError::Http(m) => f.write_str(m),
}
}
}

impl std::error::Error for EmbedError {}

/// Classify an embedding endpoint's non-2xx response into the narrowest
/// applicable [`EmbedError`] variant.
///
/// Heuristic: 404 is always an unknown-model signal (OpenAI, Ollama, and
/// most compatible backends 404 on an unrecognised model). Other 4xx are
/// classified as `UnknownModel` only when the body mentions the model —
/// OpenAI-compatible errors typically carry `"code":"model_not_found"`
/// or a message containing `"model"` for this case. Everything else
/// (5xx, 4xx without model keyword) falls through to `Http`.
fn classify_http_error(status: reqwest::StatusCode, body: &str) -> EmbedError {
let msg = format!("embedding endpoint returned {status}: {body}");
if status == reqwest::StatusCode::NOT_FOUND {
return EmbedError::UnknownModel(msg);
}
if status.is_client_error() {
let lower = body.to_ascii_lowercase();
if lower.contains("model_not_found") || lower.contains("unknown model") {
return EmbedError::UnknownModel(msg);
}
// Conservative: generic 4xx with "model" mention, treat as model issue
// only when combined with a "not found" / "invalid" / "unsupported" hint,
// to avoid misclassifying auth / rate-limit errors.
let looks_model_shaped = lower.contains("model")
&& (lower.contains("not found")
|| lower.contains("invalid")
|| lower.contains("unsupported")
|| lower.contains("does not exist"));
if looks_model_shaped {
return EmbedError::UnknownModel(msg);
}
}
EmbedError::Http(msg)
}

/// Thin client around a single OpenAI-compatible embedding endpoint.
pub struct EmbeddingClient {
url: String,
@@ -73,12 +143,14 @@ impl EmbeddingClient {
///
/// # Errors
///
-    /// Propagates HTTP, serialisation, and API errors.
+    /// Returns an [`EmbedError`] classified so the daemon can map directly
+    /// to a [`crate::query_graph::ErrorCode`] without inspecting the
+    /// message string.
pub async fn embed_texts(
&self,
texts: &[String],
model_override: Option<&str>,
-    ) -> anyhow::Result<(Vec<Vec<f32>>, String)> {
+    ) -> Result<(Vec<Vec<f32>>, String), EmbedError> {
if texts.is_empty() {
return Ok((vec![], self.default_model.clone()));
}
@@ -93,31 +165,89 @@
.json(&body)
.send()
.await
-            .context("embedding HTTP request failed")?;
+            .map_err(|e| EmbedError::Transport(format!("embedding HTTP request failed: {e}")))?;

if !resp.status().is_success() {
let status = resp.status();
let text = resp.text().await.unwrap_or_default();
-            anyhow::bail!("embedding endpoint returned {status}: {text}");
+            return Err(classify_http_error(status, &text));
}

let parsed: EmbedResponse = resp
.json()
.await
-            .context("failed to parse embedding response")?;
+            .map_err(|e| EmbedError::Protocol(format!("failed to parse embedding response: {e}")))?;

// Re-order by index field to match the input order.
let mut data = parsed.data;
data.sort_by_key(|d| d.index);

-        anyhow::ensure!(
-            data.len() == texts.len(),
-            "embedding endpoint returned {} vectors for {} inputs",
-            data.len(),
-            texts.len()
-        );
+        if data.len() != texts.len() {
+            return Err(EmbedError::Protocol(format!(
+                "embedding endpoint returned {} vectors for {} inputs",
+                data.len(),
+                texts.len()
+            )));
+        }

let vectors = data.into_iter().map(|d| d.embedding).collect();
Ok((vectors, parsed.model))
}
}

#[cfg(test)]
mod tests {
use super::*;
use reqwest::StatusCode;

#[test]
fn classify_404_is_unknown_model() {
let e = classify_http_error(StatusCode::NOT_FOUND, "model not found");
assert!(matches!(e, EmbedError::UnknownModel(_)));
}

#[test]
fn classify_openai_model_not_found_code() {
// OpenAI API shape.
let body = r#"{"error":{"code":"model_not_found","message":"The model 'foo' does not exist"}}"#;
let e = classify_http_error(StatusCode::BAD_REQUEST, body);
assert!(matches!(e, EmbedError::UnknownModel(_)));
}

#[test]
fn classify_ollama_model_unknown() {
let body = r#"{"error":"model 'nomic-embed-text' not found, try pulling it first"}"#;
let e = classify_http_error(StatusCode::NOT_FOUND, body);
assert!(matches!(e, EmbedError::UnknownModel(_)));
}

#[test]
fn classify_auth_error_stays_http() {
// 401 unauthorized must not be misclassified as UnknownModel just
// because a token payload might mention "model".
let body = "Unauthorized";
let e = classify_http_error(StatusCode::UNAUTHORIZED, body);
assert!(matches!(e, EmbedError::Http(_)));
}

#[test]
fn classify_rate_limit_stays_http() {
let e = classify_http_error(StatusCode::TOO_MANY_REQUESTS, "rate limit");
assert!(matches!(e, EmbedError::Http(_)));
}

#[test]
fn classify_5xx_stays_http() {
let e = classify_http_error(StatusCode::INTERNAL_SERVER_ERROR, "backend died");
assert!(matches!(e, EmbedError::Http(_)));
}

#[test]
fn classify_4xx_mentioning_model_without_not_found_keyword_stays_http() {
// "model temperature too high" would mention "model" but is not
// an unknown-model signal. Conservative classifier keeps it Http.
let body = "model temperature parameter rejected";
let e = classify_http_error(StatusCode::BAD_REQUEST, body);
assert!(matches!(e, EmbedError::Http(_)));
}
}