Merged

Dev #52

51 changes: 35 additions & 16 deletions rust/examples/flow.rs
@@ -20,17 +20,35 @@ use vectorless::client::{IndexContext, IndexOptions, QueryContext};

/// Sample markdown content for demonstration.
const SAMPLE_MARKDOWN: &str = r#"
# Project Documentation

This document describes the architecture and usage of the vectorless library.
# Vectorless Architecture Guide

## Overview

Vectorless is a document indexing and retrieval library that uses tree-based navigation instead of vector embeddings.
Vectorless is a reasoning-native document intelligence engine that transforms documents into hierarchical semantic trees. Unlike traditional RAG systems that rely on vector embeddings and similarity search, Vectorless uses LLM-powered tree navigation to retrieve relevant content through deep contextual understanding.

The core idea is simple: structured documents already have inherent semantic relationships encoded in their headings, sections, and paragraphs. By preserving this structure as a navigable tree, an LLM can efficiently locate relevant information by following the document's own logical organization.

## Architecture

The system consists of three main components: an indexing pipeline, a storage layer, and a retrieval engine. The indexing pipeline parses documents into tree structures and generates summaries. The storage layer persists indexed documents to disk. The retrieval engine navigates the tree at query time using search algorithms guided by LLM decisions.

### Indexing Pipeline

The indexing pipeline processes documents through multiple stages: parsing, tree building, enhancement (LLM summary generation), and reasoning index construction. Each stage is independently configurable and can be enabled or disabled based on requirements. The pipeline supports incremental re-indexing with content fingerprinting to avoid redundant work when documents haven't changed.

### Retrieval Engine

The retrieval engine supports multiple search strategies including greedy depth-first search, beam search, and MCTS. A Pilot component provides LLM-guided navigation at key decision points during tree traversal. The engine is budget-aware, tracking token usage and making cost-conscious decisions about when to invoke the LLM versus using cheaper heuristic scoring.

## Performance

Under typical workloads, indexing a 50-page document takes approximately 10-30 seconds depending on LLM response latency and the complexity of the document structure. Query latency ranges from 200ms for simple keyword-matched queries to 3-5 seconds for complex multi-hop reasoning queries that require multiple LLM calls during tree navigation.

The system is designed for accuracy over speed. By leveraging document structure and LLM reasoning, it achieves higher retrieval quality than vector-based approaches on structured documents like technical reports, legal contracts, and research papers.
"#;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
async fn main() -> vectorless::Result<()> {
// Initialize tracing for debug output (set RUST_LOG=debug to see more)
tracing_subscriber::fmt::init();

@@ -39,13 +57,14 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Step 1: Create a Vectorless client
println!("Step 1: Creating Vectorless client...");

let client = EngineBuilder::new()
.with_workspace("./workspace")
.with_key("sk-...")
let engine = EngineBuilder::new()
.with_workspace("./workspace_flow_example")
.with_key("sk...")
.with_model("gpt-4o")
.with_endpoint("https://api")
.build()
.await
.map_err(|e: vectorless::BuildError| vectorless::Error::Config(e.to_string()))?;
.map_err(|e| vectorless::Error::Config(e.to_string()))?;

println!(" - Client created successfully");
println!();
@@ -57,7 +76,7 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
let md_path = temp_dir.path().join("sample.md");
tokio::fs::write(&md_path, SAMPLE_MARKDOWN).await?;

let index_result = client
let index_result = engine
.index(IndexContext::from_path(&md_path).with_options(IndexOptions::new().with_summaries()))
.await?;
let doc_id = index_result.doc_id().unwrap().to_string();
@@ -68,20 +87,20 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {

// Step 3: List indexed documents
println!("Step 3: Indexed documents:");
for doc in client.list().await? {
for doc in engine.list().await? {
println!(" - {} ({})", doc.name, doc.id);
}
println!();

// Step 4: Query the document
println!("Step 4: Querying the document...");

let queries = vec!["What is this project about?"];
let queries = vec!["How many seconds does a complex multi-hop reasoning query take?"];

for query in queries {
println!(" Query: \"{}\"", query);

match client
match engine
.query(QueryContext::new(query).with_doc_id(&doc_id))
.await
{
@@ -92,7 +111,7 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
} else {
println!(" - Found relevant content:");
let preview = if item.content.len() > 200 {
format!("{}...", &item.content[..200])
format!("{}...", &item.content[..200])
} else {
item.content.clone()
};
@@ -114,8 +133,8 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Step 5: Cleanup
println!("Step 5: Cleanup...");

client.remove(&doc_id).await?;
println!(" - Document removed");
// engine.remove(&doc_id).await?;
// println!(" - Document removed");

println!("\n=== Example Complete ===");
Ok(())
19 changes: 10 additions & 9 deletions rust/src/client/engine.rs
@@ -407,16 +407,16 @@ impl Engine {
let mut failed = Vec::new();

for doc_id in doc_ids {
let tree = match self.get_structure(&doc_id).await {
Ok(t) => t,
let (tree, reasoning_index) = match self.get_structure(&doc_id).await {
Ok((t, ri)) => (t, ri),
Err(e) => {
tracing::warn!("Skipping document {}: {}", doc_id, e);
failed.push(FailedItem::new(&doc_id, e.to_string()));
continue;
}
};

match self.retriever.query(&tree, &ctx.query, &options).await {
match self.retriever.query_with_reasoning_index(&tree, &ctx.query, &options, reasoning_index).await {
Ok(mut result) => {
result.doc_id = doc_id;
items.push(result);
@@ -431,8 +431,9 @@
// If everything failed, return error
if items.is_empty() && !failed.is_empty() {
return Err(Error::Config(format!(
"Query failed for all {} document(s)",
failed.len()
"Query failed for all {} document(s): {}",
failed.len(),
failed.iter().map(|f| format!("{} ({})", f.source, f.error)).collect::<Vec<_>>().join("; ")
)));
}

@@ -455,7 +456,7 @@
}
};

let tree = self.get_structure(&doc_id).await?;
let (tree, _reasoning_index) = self.get_structure(&doc_id).await?;
let options = ctx.to_retrieve_options(&self.config);

let rx = self
@@ -529,8 +530,8 @@
// Internal
// ============================================================

/// Get document structure (tree). Internal use only.
pub(crate) async fn get_structure(&self, doc_id: &str) -> Result<DocumentTree> {
/// Get document structure (tree) and optional reasoning index. Internal use only.
pub(crate) async fn get_structure(&self, doc_id: &str) -> Result<(DocumentTree, Option<crate::document::ReasoningIndex>)> {
let workspace = self
.workspace
.as_ref()
@@ -541,7 +542,7 @@
.await?
.ok_or_else(|| Error::DocumentNotFound(format!("Document not found: {}", doc_id)))?;

Ok(doc.tree)
Ok((doc.tree, doc.reasoning_index))
}

/// Resolve QueryScope into a list of document IDs.
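The aggregated error message added in engine.rs can be checked in isolation. The sketch below reproduces the formatting logic with a stand-in `FailedItem` struct (the field names `source` and `error` follow the diff; everything else is illustrative, not the crate's API):

```rust
// Stand-in for the crate's FailedItem (assumed shape: source + error).
struct FailedItem {
    source: String,
    error: String,
}

// Mirrors the format! expression from the diff: count plus a
// semicolon-joined list of "source (error)" pairs.
fn aggregate_failures(failed: &[FailedItem]) -> String {
    format!(
        "Query failed for all {} document(s): {}",
        failed.len(),
        failed
            .iter()
            .map(|f| format!("{} ({})", f.source, f.error))
            .collect::<Vec<_>>()
            .join("; ")
    )
}
```

Including the per-document reasons makes an all-failed query debuggable without re-running with tracing enabled.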
23 changes: 19 additions & 4 deletions rust/src/client/retriever.rs
@@ -24,11 +24,11 @@ use tracing::info;
use super::events::{EventEmitter, QueryEvent};
use super::types::QueryResultItem;
use crate::config::Config;
use crate::document::{DocumentTree, NodeId};
use crate::document::{DocumentTree, NodeId, ReasoningIndex};
use crate::error::{Error, Result};
use crate::retrieval::content::ContentAggregatorConfig;
use crate::retrieval::stream::RetrieveEventReceiver;
use crate::retrieval::{RetrievalResult, RetrieveOptions, RetrieveResponse, Retriever};
use crate::retrieval::{RetrievalResult, RetrieveOptions, RetrieveResponse};

/// Document retrieval client.
///
@@ -124,17 +124,32 @@ impl RetrieverClient {
tree: &DocumentTree,
question: &str,
options: &RetrieveOptions,
) -> Result<QueryResultItem> {
self.query_with_reasoning_index(tree, question, options, None).await
}

/// Query a document tree with optional reasoning index for fast-path lookup.
///
/// # Errors
///
/// Returns an error if the retrieval pipeline fails.
pub async fn query_with_reasoning_index(
&self,
tree: &DocumentTree,
question: &str,
options: &RetrieveOptions,
reasoning_index: Option<ReasoningIndex>,
) -> Result<QueryResultItem> {
self.events.emit_query(QueryEvent::Started {
query: question.to_string(),
});

info!("Querying: {:?}", question);

// Execute retrieval
// Execute retrieval with reasoning index
let response = self
.retriever
.retrieve(tree, question, options)
.retrieve_with_reasoning_index(tree, question, options, reasoning_index)
.await
.map_err(|e| Error::Retrieval(e.to_string()))?;

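The retriever change follows a common backward-compatibility pattern: the existing `query` method now delegates to the extended `query_with_reasoning_index` with `None`, so current callers keep compiling unchanged. A minimal sketch of the pattern, with illustrative names rather than the crate's real types:

```rust
// Illustrative stand-ins; not the crate's actual types.
struct ReasoningIndex;

struct Client;

impl Client {
    // Old entry point: forwards with None, preserving the original behavior.
    fn query(&self, question: &str) -> String {
        self.query_with_reasoning_index(question, None)
    }

    // Extended entry point: takes an optional index for a fast-path lookup.
    fn query_with_reasoning_index(
        &self,
        question: &str,
        reasoning_index: Option<ReasoningIndex>,
    ) -> String {
        match reasoning_index {
            Some(_) => format!("fast-path: {}", question),
            None => format!("tree-walk: {}", question),
        }
    }
}
```

Threading the optional index as an `Option` parameter keeps one code path and avoids duplicating the retrieval logic across two methods.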
2 changes: 1 addition & 1 deletion rust/src/retrieval/mod.rs
@@ -67,7 +67,7 @@ pub mod sufficiency;

pub use context::{PruningStrategy, TokenEstimation};
pub use pipeline_retriever::PipelineRetriever;
pub use retriever::{RetrievalContext, Retriever};
pub use retriever::RetrievalContext;
pub use types::*;

// Sufficiency exports
24 changes: 18 additions & 6 deletions rust/src/retrieval/pilot/llm_pilot.rs
@@ -10,7 +10,7 @@ use async_trait::async_trait;
use std::sync::Arc;
use tracing::{debug, info, warn};

use crate::document::DocumentTree;
use crate::document::{DocumentTree, NodeId};
use crate::llm::{LlmClient, LlmExecutor};
use crate::memo::{MemoKey, MemoStore, MemoValue};
use crate::utils::fingerprint::Fingerprint;
@@ -631,8 +631,16 @@ impl Pilot for LlmPilot {
decision
}

async fn guide_start(&self, tree: &DocumentTree, query: &str) -> Option<PilotDecision> {
println!("[DEBUG] LlmPilot::guide_start() called, query='{}'", query);
async fn guide_start(
&self,
tree: &DocumentTree,
query: &str,
start_node: NodeId,
) -> Option<PilotDecision> {
println!(
"[DEBUG] LlmPilot::guide_start() called, query='{}', start_node={:?}",
query, start_node
);

// Check if guide_at_start is enabled
if !self.config.guide_at_start {
Expand All @@ -650,10 +658,14 @@ impl Pilot for LlmPilot {
// Build start context
let context = self.context_builder.build_start_context(tree, query);

// Get root's children as candidates
let node_ids = tree.children(tree.root());
// Get start_node's children as candidates (NOT root's children)
let node_ids = tree.children(start_node);
if node_ids.is_empty() {
debug!("Start node has no children, no guidance needed");
return None;
}
println!(
"[DEBUG] LlmPilot::guide_start() - {} root children candidates",
"[DEBUG] LlmPilot::guide_start() - {} children candidates from start_node",
node_ids.len()
);

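The substantive fix in `guide_start` is that candidates are now drawn from `start_node`'s children rather than always from the root's. A toy tree makes the difference concrete (the structures below are illustrative, not the crate's `DocumentTree` API):

```rust
use std::collections::HashMap;

// Toy tree: node id -> child ids.
struct Tree {
    children: HashMap<u32, Vec<u32>>,
}

impl Tree {
    fn children(&self, node: u32) -> Vec<u32> {
        self.children.get(&node).cloned().unwrap_or_default()
    }
}

// Before the fix this effectively always read the root's children;
// after the fix it reads the children of wherever search begins.
fn start_candidates(tree: &Tree, start_node: u32) -> Vec<u32> {
    tree.children(start_node)
}
```

When a search resumes from a subtree (for example after backtracking), evaluating the root's children again would send the pilot back to the top of the document instead of guiding it from its actual position.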
11 changes: 8 additions & 3 deletions rust/src/retrieval/pilot/noop.rs
@@ -9,7 +9,7 @@

use async_trait::async_trait;

use crate::document::DocumentTree;
use crate::document::{DocumentTree, NodeId};

use super::{InterventionPoint, Pilot, PilotConfig, PilotDecision, SearchState};

@@ -69,7 +69,12 @@ impl Pilot for NoopPilot {
}
}

async fn guide_start(&self, _tree: &DocumentTree, _query: &str) -> Option<PilotDecision> {
async fn guide_start(
&self,
_tree: &DocumentTree,
_query: &str,
_start_node: NodeId,
) -> Option<PilotDecision> {
// No guidance at start
None
}
@@ -138,7 +143,7 @@ mod tests {
let pilot = NoopPilot::new();
let tree = DocumentTree::new("test", "test content");

let guidance = pilot.guide_start(&tree, "test").await;
let guidance = pilot.guide_start(&tree, "test", tree.root()).await;
assert!(guidance.is_none());
}

10 changes: 9 additions & 1 deletion rust/src/retrieval/pilot/trait.rs
@@ -167,8 +167,16 @@ pub trait Pilot: Send + Sync {
/// Called once at the beginning of search to help determine
/// the starting point and initial direction.
///
/// `start_node` is the node from which the search begins. The pilot
/// should evaluate that node's children (not root's children) as candidates.
///
/// Returns `None` if no guidance is available or needed.
async fn guide_start(&self, tree: &DocumentTree, query: &str) -> Option<PilotDecision>;
async fn guide_start(
&self,
tree: &DocumentTree,
query: &str,
start_node: NodeId,
) -> Option<PilotDecision>;

/// Provide guidance during backtracking.
///
23 changes: 23 additions & 0 deletions rust/src/retrieval/pipeline/context.rs
@@ -256,6 +256,9 @@ pub struct PipelineContext {
pub accumulated_content: String,
/// Estimated token count.
pub token_count: usize,
/// Fingerprint of candidate node IDs from previous evaluate call.
/// Used to detect stagnant loops (same candidates → same evaluation).
pub prev_candidate_fingerprint: Option<u64>,

// ============ Final Result ============
/// Final retrieval response.
@@ -307,6 +310,7 @@ impl PipelineContext {
sufficiency: SufficiencyLevel::default(),
accumulated_content: String::new(),
token_count: 0,
prev_candidate_fingerprint: None,
result: None,
stage_results: HashMap::new(),
metrics: RetrievalMetrics::default(),
@@ -402,6 +406,25 @@ impl PipelineContext {
self.metrics.backtracks += 1;
}

/// Compute a fingerprint of the current candidate node IDs.
fn candidate_fingerprint(&self) -> u64 {
use std::hash::{Hash, Hasher};
let mut hasher = std::collections::hash_map::DefaultHasher::new();
for c in &self.candidates {
format!("{:?}", c.node_id).hash(&mut hasher);
}
hasher.finish()
}

/// Check if candidates changed since the last call, and update the stored fingerprint.
/// Returns `true` if candidates are the same as before (stagnant loop detected).
pub fn check_candidates_stagnant(&mut self) -> bool {
let fp = self.candidate_fingerprint();
let stagnant = self.prev_candidate_fingerprint == Some(fp);
self.prev_candidate_fingerprint = Some(fp);
stagnant
}

/// Check if token limit is reached.
pub fn is_token_limit_reached(&self) -> bool {
self.token_count >= self.options.max_tokens
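The stagnation check added to `PipelineContext` can be sketched standalone. This version hashes plain `u32` node ids with `DefaultHasher` instead of the crate's `NodeId` type (an assumption for the sake of a self-contained example), but the compare-and-update logic matches the diff:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Tracks the fingerprint of the previous candidate set so the pipeline
// can detect "stagnant" loops: same candidates -> same evaluation.
struct StagnationGuard {
    prev_fingerprint: Option<u64>,
}

impl StagnationGuard {
    fn new() -> Self {
        Self { prev_fingerprint: None }
    }

    fn fingerprint(candidates: &[u32]) -> u64 {
        let mut hasher = DefaultHasher::new();
        for id in candidates {
            id.hash(&mut hasher);
        }
        hasher.finish()
    }

    /// Returns true if the candidate set is unchanged since the last call,
    /// then stores the new fingerprint for the next comparison.
    fn check_stagnant(&mut self, candidates: &[u32]) -> bool {
        let fp = Self::fingerprint(candidates);
        let stagnant = self.prev_fingerprint == Some(fp);
        self.prev_fingerprint = Some(fp);
        stagnant
    }
}
```

Comparing a single `u64` per iteration is much cheaper than comparing the candidate lists themselves, at the cost of a theoretical hash collision, which is acceptable for a loop-breaking heuristic.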