Dev #69

Merged
zTgx merged 21 commits into main from dev
Apr 15, 2026
zTgx commented Apr 15, 2026

No description provided.

zTgx added 21 commits April 15, 2026 10:23
… docs

Remove unnecessary clippy allow attributes that were making the codebase
too permissive with linting rules. Clean up redundant module documentation
while preserving the main crate description.

- Remove is_path, is_content, and is_bytes methods from IndexSource enum
- Remove line_count field from IndexedDocument struct
- Remove with_line_count method from IndexedDocument implementation
- Remove add_page and is_loaded methods from IndexedDocument implementation
- Clean up unused functionality in client module
- Remove IndexerConfig struct and related methods that were no longer used
- Add validation checks using new validation utilities for file, content and bytes
- Replace direct file existence checks with comprehensive validation
- Update tests to reflect configuration removal

feat(utils): add validation module for source validation

- Create new validation module with SourceValidation struct
- Implement validate_file, validate_content, and validate_bytes functions
- Add proper error handling and warning messages for various validation scenarios
- Include PDF magic number checking and file size warnings
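
As a rough illustration of the byte-level checks described above, here is a minimal Python sketch. It is not the crate's code: the field names, the `expect_pdf` flag, and the 50 MiB warning threshold are assumptions made for the example. Only the `%PDF-` prefix is a known fact about the PDF format.

```python
from dataclasses import dataclass, field

PDF_MAGIC = b"%PDF-"  # PDF files begin with this byte sequence
LARGE_SOURCE_BYTES = 50 * 1024 * 1024  # assumed warning threshold, not taken from the PR


@dataclass
class SourceValidation:
    """Outcome of validating one source: hard errors plus non-fatal warnings."""
    errors: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return not self.errors


def validate_bytes(data: bytes, expect_pdf: bool = False) -> SourceValidation:
    """Hypothetical counterpart of validate_bytes: reject empty or mislabeled input."""
    result = SourceValidation()
    if not data:
        result.errors.append("source is empty")
        return result
    if expect_pdf and not data.startswith(PDF_MAGIC):
        result.errors.append("missing %PDF- magic number")
    if len(data) > LARGE_SOURCE_BYTES:
        # Large inputs are allowed but flagged, mirroring the "file size warnings" bullet.
        result.warnings.append(f"large source: {len(data)} bytes")
    return result
```

Keeping warnings separate from errors lets a caller log oversized inputs without refusing to index them.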

refactor(client): simplify workspace client by removing unused features

- Remove WorkspaceClientConfig struct and related methods
- Delete unused batch_remove functionality
- Remove stats, len, and is_empty methods that were not being used
- Clean up tests to match simplified interface

refactor(client): streamline retriever client by removing deprecated methods

- Remove RetrieverClientConfig struct and associated configuration methods
- Delete find_similar and get_node_context methods that were unused
- Remove query method that was replaced by newer implementation
- Update clone implementation to match simplified structure

refactor(index_context): add directory existence check with warning

- Add validation to check if directory exists before processing
- Log warning when directory is not found instead of panicking
- Maintain backward compatibility for existing functionality

BREAKING CHANGE: The workspace parameter has been removed from the
Engine constructor across all examples and implementations. The engine
now uses an automatic workspace path based on the current working
directory.

The workspace is now automatically determined as:
- Linux/macOS: ~/.vectorless/workspaces/{cwd_hash}/
- Windows: %APPDATA%\vectorless\workspaces\{cwd_hash}\

This change affects both Python and Rust implementations, removing
the need for explicit workspace configuration while maintaining
isolated workspaces for different projects.
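
A sketch of how such a per-project path could be derived, for readers migrating away from an explicit workspace. The PR does not say how `cwd_hash` is computed, so the truncated SHA-256 below is purely an assumption, as are the function names.

```python
import hashlib
import os
import sys


def cwd_hash(cwd: str) -> str:
    # Assumption: SHA-256 of the directory path, truncated to 16 hex characters.
    return hashlib.sha256(cwd.encode("utf-8")).hexdigest()[:16]


def default_workspace(cwd: str, platform: str = sys.platform) -> str:
    """Build the workspace path layout listed above for the given working directory."""
    digest = cwd_hash(cwd)
    if platform.startswith("win"):
        # Falls back to an empty prefix if APPDATA is unset (e.g. when run off Windows).
        base = os.environ.get("APPDATA", "")
        return f"{base}\\vectorless\\workspaces\\{digest}\\"
    home = os.path.expanduser("~")
    return f"{home}/.vectorless/workspaces/{digest}/"
```

Because the path depends only on the working directory, two projects get separate workspaces while repeated runs inside one project reuse the same one.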
…rint annotation

Remove the format.rs and timing.rs utility modules as they are no longer
used in the codebase. The format module contained text formatting
utilities like truncate, number formatting, and byte formatting
functions. The timing module provided Timer utilities for performance
measurement.

Also add dead_code attribute to allow unused code in fingerprint module
for future use.

- Replace individual LLM client configurations with centralized LlmPoolConfig
- Update EngineBuilder to apply overrides to LlmPoolConfig instead of
  legacy config sections
- Introduce LlmConfigs conversion from LlmPoolConfig for backward
  compatibility
- Remove deprecated IndexerClient::new method and associated tests
- Update validation logic to resolve API keys from multiple sources
  including new LlmPoolConfig
- Replace direct LLM client instantiation with pool-based approach
  using pool.index() and pool.retrieval()
- Remove redundant test cases that are no longer applicable
- Add dead code allowance attribute to library root
- Add llm_client field to PipelineOrchestrator to store shared client
- Implement with_llm_client method to set the shared client
- Inject the LLM client into IndexContext during pipeline execution
- Update tracing info to reflect additional context injection

Remove the intermediate events module re-exports and update all
imports to directly reference crate::events::EventEmitter instead
of using relative paths like super::events::EventEmitter.

This change simplifies the module structure by eliminating the
unnecessary events submodule in the client directory and ensures
consistent import paths across all client modules.
…single function

BREAKING CHANGE: The IndexContext::from_dir_recursive method has been removed
and replaced with a boolean parameter in the from_dir method.
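
The shape of the unified call can be sketched as follows. This is an illustrative stand-in that returns plain file paths, not the real IndexContext API; the default value of `recursive` is an assumption. The missing-directory warning follows the index_context change described earlier in this PR.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def from_dir(path: str, recursive: bool = False) -> list[Path]:
    """Collect files under `path`, descending into subdirectories only if `recursive`."""
    root = Path(path)
    if not root.is_dir():
        # Warn and return nothing instead of raising.
        logger.warning("directory not found: %s", root)
        return []
    pattern = "**/*" if recursive else "*"
    return sorted(p for p in root.glob(pattern) if p.is_file())
```

Callers that used from_dir_recursive(path) would switch to from_dir(path, True) under this scheme.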

- Replace separate from_dir and from_dir_recursive methods with unified
  from_dir(path, recursive) method
- Update Python binding to use the new unified method signature
- Update documentation examples to reflect the new API
- Rename test function to match new naming convention
- Add source_path field to IndexItem and DocumentInfo structs
- Replace the with_doc_id method with a with_doc_ids method that accepts
  a vector of document IDs
- Update documentation and examples to reflect the new API
- Modify query_stream to work with the new Documents scope type
- Add getter methods for source_path in Python bindings
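
To show what the migration looks like from the caller's side, here is a small sketch. The `QueryContext` class below is a hypothetical stand-in written for this example, not the SDK class; only the method name `with_doc_ids` comes from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class QueryContext:
    """Hypothetical stand-in for the SDK's query context; only scoping is modeled."""
    question: str
    doc_ids: list[str] = field(default_factory=list)

    def with_doc_ids(self, doc_ids: list[str]) -> "QueryContext":
        # Copy the list so later changes by the caller do not alter the scope.
        self.doc_ids = list(doc_ids)
        return self  # returning self allows method chaining


# Before: .with_doc_id("doc-1")
single = QueryContext("What changed?").with_doc_ids(["doc-1"])
# New capability: scope one query to several documents.
multi = QueryContext("What changed?").with_doc_ids(["doc-1", "doc-2"])
```

A single-document query wraps its one ID in a list, so one method covers both cases.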
…tead of with_doc_id

BREAKING CHANGE: Replace all instances of .with_doc_id() with .with_doc_ids() method
in documentation examples and code samples to support multiple document IDs.

- Updated README.md example
- Updated blog post example
- Updated quick-query documentation
- Updated PDF support documentation
- Updated getting started guide
- Updated intro documentation
- Updated search algorithms documentation
- Updated strategies documentation
- Updated Python SDK documentation
- Updated Rust SDK documentation
- Updated example files (document_management, error_handling, indexing, pdf_indexing)
- Removed deprecated retrieval type exports from rust/src/lib.rs
- Remove unused StrategyPreference import from lib.rs
- Delete entire PyStrategyPreference implementation including all constants
 (AUTO, KEYWORD, LLM, HYBRID, CROSS_DOCUMENT, PAGE_RANGE)
- Remove with_strategy method from PyQueryContext
- Remove StrategyPreference class registration in _vectorless module
- Update Python bindings to exclude StrategyPreference from exports
- Clean up related documentation and type references

Add a TODO comment to consider parallelizing queries across documents
when multiple document IDs are provided, with a concurrency limit.
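
One common way to do what the TODO describes is a semaphore-bounded fan-out. The sketch below uses Python's asyncio to show the idea. The function names and the default limit of 4 are assumptions, and the engine itself may end up using a different mechanism.

```python
import asyncio


async def query_documents(doc_ids, query_one, max_concurrency=4):
    """Run `query_one(doc_id)` for every ID with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(doc_id):
        # Each task waits here until one of the `max_concurrency` slots is free.
        async with semaphore:
            return await query_one(doc_id)

    # gather returns results in the same order as the input IDs.
    return await asyncio.gather(*(guarded(d) for d in doc_ids))
```

The limit matters mostly because each per-document query is likely to call an LLM backend, where unbounded parallelism can hit rate limits.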
…xamples

- Remove workspace="./data" parameter from Python Engine constructor in README.md
- Remove workspace="./data" parameter from Rust EngineBuilder in README.md
- Remove workspace="./data" parameter from Python example in python/README.md
- Remove workspace="./data" parameter from quick start example in python/__init__.py

Removed the EventEmitter re-export from client module as it was no
longer needed and caused unnecessary coupling to events module.

- Remove deprecated `.with_workspace()` parameter from Engine initialization
 in all examples and documentation
- Remove deprecated `StrategyPreference` enum usage from query contexts
- Update all code examples to use default auto-selection behavior instead
  of explicit strategy selection
- Clean up import statements by removing unused StrategyPreference imports
- Update retrieval strategies documentation to reflect automatic strategy
  selection as default behavior

BREAKING CHANGE: The workspace parameter and explicit strategy preferences
have been removed; the workspace path and retrieval strategy are now
selected automatically.
…xing

- Add workspace_dir field to Engine struct to store workspace root directory
- Introduce source_hash in IndexContext to track content changes for checkpoint validation
- Implement checkpoint loading and saving mechanism in PipelineOrchestrator
- Add checkpoint validation logic using SHA-256 hash of source content
- Enable pipeline resumption from saved checkpoints when available
- Skip already completed stages during resumed execution
- Clear checkpoints upon successful pipeline completion
- Support both file path and content hashing for checkpoint validation
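
The resume logic in these bullets can be sketched roughly as below. The JSON checkpoint layout, the function names, and the per-stage save are assumptions made for illustration; only the use of a SHA-256 hash of the source to validate a checkpoint comes from the commit message.

```python
import hashlib
import json
from pathlib import Path


def source_hash(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def run_pipeline(content: bytes, stages, checkpoint: Path) -> list[str]:
    """Run (name, callable) stages in order, resuming when the checkpoint hash matches.

    Returns the names of the stages executed in this call.
    """
    digest = source_hash(content)
    done: list[str] = []
    if checkpoint.exists():
        saved = json.loads(checkpoint.read_text())
        if saved.get("source_hash") == digest:
            done = saved["completed"]  # same source: resume where we stopped
        # Otherwise the source changed, so the stale checkpoint is ignored.
    executed = []
    for name, stage in stages:
        if name in done:
            continue  # already completed in an earlier run
        stage(content)
        executed.append(name)
        done.append(name)
        # Save after every stage so a later failure loses at most one stage of work.
        checkpoint.write_text(json.dumps({"source_hash": digest, "completed": done}))
    checkpoint.unlink(missing_ok=True)  # clear the checkpoint on success
    return executed
```

If a stage raises, the checkpoint written after the previous stage stays on disk, so the next run with unchanged content skips the stages already listed in it.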
…features

- Change description in pyproject.toml to reflect reasoning-native approach
- Rename Python bindings to Vectorless Python SDK
- Add QueryContext class for structured querying with method chaining
- Update Engine.query() to accept QueryContext instead of raw parameters
- Rename IndexContext.from_file() to from_path() and from_files() to from_paths()
- Add from_dir() method with recursive parameter option
- Add IndexItem class with document metadata properties
- Add source_path property to DocumentInfo class
- Update Engine initialization to remove workspace parameter
- Revise documentation comments to match new API structure
- Update quick start example to use new QueryContext pattern
- Change description from "Hierarchical, reasoning-native document
  intelligence engine" to "Reasoning-native document intelligence
  engine for AI"
- Remove the hierarchical aspect and add clearer AI context
- Update workspace package version from 0.1.27 to 0.1.28 in Cargo.toml
- Update vectorless package version from 0.1.6 to 0.1.7 in pyproject.toml

Add release workflow that triggers on version tags and handles:
- Publishing Rust crate to crates.io
- Publishing Python package to PyPI using maturin
- Creating GitHub Releases with auto-generated notes
- Using trusted publishers for PyPI authentication
vercel Bot commented Apr 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| vectorless | Ready | Preview, Comment | Apr 15, 2026 7:53am |

zTgx merged commit f5b1f3f into main on Apr 15, 2026
2 checks passed