With Upstream's Move to v1.0.0, Recoco will need a complete Rust API
Recoco is building the first and only pure-Rust API for the incremental data processing engine that powers CocoIndex. Because upstream has committed fully to Python for its v1 user-facing API, Recoco must provide a native Rust experience: typed operations, proc macros, explicit context management, and zero Python dependencies.
If you're looking for a Rust API for CocoIndex, this is it.
Upstream context
The CocoIndex community has expressed clear demand for a Rust API:
- [FEATURE] Rust API (cocoindex-io/cocoindex#1372): requested by @gkgoat1. A founder responded positively, but no work has been planned or milestoned.
- [FEATURE] Ergonomic Rust SDK (cocoindex-io/cocoindex#1667): proposed by @tomz-alt with a detailed, well-received design. A team member provided constructive feedback that improved the proposal. The issue remains open with no milestone or assignee, and it clearly does not align with the planned direction for v1, which moves all integration to Python.
Upstream's v1 milestone is exclusively Python-focused. Neither Rust API issue is milestoned. The v1 branch has rebuilt all API and integration layers in Python, with the Rust engine serving only as an internal runtime.
Recoco's v1.0.0 branch has full parity with upstream's Rust code -- sharing the same high-performance Rust engine (LMDB-backed state, component memoization, target reconciliation, incremental processing). But with upstream's divergence, we need to plan for a proper all-Rust API while keeping in sync with, and benefiting from, upstream's engine development. I would have preferred more alignment, but it is what it is -- their motivations and needs don't align with mine. It makes sense for them, but doesn't give me the foundation I need for Thread.
This issue will serve as the tracking and planning issue for Recoco's v1 design and API.
What we're building
Goals:
- Continue to match upstream's engine in `recoco-core`, optimizing for Rust where it makes sense, and contributing improvements upstream where directions/needs align. (I've admittedly been bad about this; I need to do better.)
- Continue our obsessiveness about feature gating and performance optimization.
- Build the Rust API for upstream's engine. This means extending the idioms and API surfaces of upstream's Python API to Rust, to the extent practical. There will necessarily be large divergences -- Rust is a very different language and Rustaceans expect different things from their APIs.
- Aim to implement parity with upstream's integrations in Rust (what were functions/sources/targets in pre-v1). Granularly feature gated without compromise. I'm hoping we pick up some community interest to help in this department.

```rust
#[recoco::function(memo)]
async fn process_file(ctx: &Ctx, file: FileEntry, table: &TableTarget) -> Result<()> {
    let text = file.read_text().await?;
    let chunks = RecursiveSplitter::default().split(&text, SplitOptions::default());
    let embedder = ctx.use_resource(&EMBEDDER);
    for chunk in &chunks {
        let embedding = embedder.embed(&chunk.text).await?;
        table.declare_row(/* ... */)?;
    }
    Ok(())
}
```

Programming model: persistent-state-driven. Transformations are plain Rust `async fn` functions. The engine handles incrementality, memoization, lineage, and fault tolerance transparently. No DSL, no graph builder, no string-based dispatch.
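
To make the memoization idea concrete, here is a minimal, std-only sketch of a cache keyed by a fingerprint of the function's code plus a fingerprint of its input. Everything in it (`MemoKey`, `MemoStore`, `fingerprint`, `get_or_compute`) is hypothetical and is not Recoco's API. It uses `DefaultHasher` as a stand-in for blake3 and an in-memory map as a stand-in for LMDB, so the hashes are not stable across Rust versions and nothing persists between runs.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical cache key: code fingerprint plus input fingerprint.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct MemoKey {
    code_hash: u64,
    input_hash: u64,
}

/// Stand-in fingerprint. A real engine would use a stable cryptographic hash.
fn fingerprint<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// In-memory stand-in for a persistent memo store.
struct MemoStore {
    entries: HashMap<MemoKey, usize>,
    /// Number of times the wrapped function actually ran.
    misses: usize,
}

impl MemoStore {
    fn new() -> Self {
        Self { entries: HashMap::new(), misses: 0 }
    }

    /// Returns the cached result when both the code and the input are
    /// unchanged; otherwise runs `f` and stores its result.
    fn get_or_compute(
        &mut self,
        code: &str,
        input: &str,
        f: impl FnOnce(&str) -> usize,
    ) -> usize {
        let key = MemoKey {
            code_hash: fingerprint(code),
            input_hash: fingerprint(input),
        };
        if let Some(&cached) = self.entries.get(&key) {
            return cached;
        }
        self.misses += 1;
        let output = f(input);
        self.entries.insert(key, output);
        output
    }
}

/// Toy transformation standing in for a user function.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

fn main() {
    let mut store = MemoStore::new();
    let first = store.get_or_compute("count_words v1", "alpha beta gamma", count_words);
    let second = store.get_or_compute("count_words v1", "alpha beta gamma", count_words);
    // A changed code string yields a different key, so the function runs again.
    let third = store.get_or_compute("count_words v2", "alpha beta gamma", count_words);
    println!("results: {first} {second} {third}, computations: {}", store.misses);
}
```

The point of including the code fingerprint in the key is that editing a function invalidates its cached results without touching results for unrelated functions.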
Key features:
- `#[recoco::function]` proc macro with `memo`, `batching`, and code-hash-based cache invalidation (our blake3 implementation instead of upstream's blake2)
- `Environment` + `App` for LMDB-backed persistent state
- `Ctx` with typed `ContextKey<T>` for explicit resource management
- `mount_each()` for parallel, incremental component processing
- Sources as iterators, functions as direct method calls, targets as declarative mounts
- Feature-gated everything — only compile what you use
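
As an illustration of what a typed `ContextKey<T>` could buy over type-erased lookup, here is a minimal std-only sketch. The design is not settled, so every signature here is an assumption: `provide` is a hypothetical registration method, `use_resource` returns an `Option<Arc<T>>` purely for the sake of the sketch, and `Embedder` is a toy type.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::marker::PhantomData;
use std::sync::Arc;

/// Hypothetical typed key: `name` identifies the slot, `T` fixes its type.
struct ContextKey<T> {
    name: &'static str,
    _marker: PhantomData<T>,
}

impl<T> ContextKey<T> {
    const fn new(name: &'static str) -> Self {
        Self { name, _marker: PhantomData }
    }
}

/// Hypothetical context: storage is type-erased internally, but callers
/// only ever see the type carried by the key.
struct Ctx {
    resources: HashMap<&'static str, Arc<dyn Any + Send + Sync>>,
}

impl Ctx {
    fn new() -> Self {
        Self { resources: HashMap::new() }
    }

    /// Registers a resource under a typed key (assumed API).
    fn provide<T: Any + Send + Sync>(&mut self, key: &ContextKey<T>, value: T) {
        self.resources.insert(key.name, Arc::new(value));
    }

    /// Looks up a resource. Returns `None` if the key is unregistered or
    /// the stored value has a different type than the key claims.
    fn use_resource<T: Any + Send + Sync>(&self, key: &ContextKey<T>) -> Option<Arc<T>> {
        let erased = self.resources.get(key.name)?.clone();
        erased.downcast::<T>().ok()
    }
}

/// Toy resource standing in for an embedding model client.
struct Embedder {
    dims: usize,
}

impl Embedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        // Placeholder "embedding": the text length repeated `dims` times.
        vec![text.len() as f32; self.dims]
    }
}

static EMBEDDER: ContextKey<Embedder> = ContextKey::new("embedder");

fn main() {
    let mut ctx = Ctx::new();
    ctx.provide(&EMBEDDER, Embedder { dims: 4 });

    // No downcast or string lookup at the call site: the key carries the type.
    let embedder = ctx.use_resource(&EMBEDDER).expect("embedder was registered");
    println!("{:?}", embedder.embed("hello"));
}
```

With this shape, a mismatch between the key's type and the stored value shows up as a missing resource instead of a panic, and the caller never names the concrete type at the lookup site.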
Design ancestry
Our API design draws directly from @tomz-alt's proposal in cocoindex-io/cocoindex#1667, refined with the feedback from upstream maintainers and adapted for Recoco's pure-Rust context. We diverge where Rust idioms demand it (e.g., ContextKey<T> over type-erased lookup, Environment/App separation, richer target abstractions).
@tomz-alt — if you're interested in contributing to a Rust implementation of the ideas you proposed, we'd welcome you. Your design work and understanding of the engine are exactly what this project needs. You mentioned having a partial implementation; we'd love to build on that foundation.
Current state
| Component | Status | Location |
|---|---|---|
| Engine (LMDB, components, memoization) | Done | v1.0.0 branch, crates/core/ |
| Operations (sources, functions, targets) | Done (old arch) | main branch, crates/recoco-core/src/ops/ |
| Concrete `RecocoProfile` | Not started | — |
| `Ctx` / `ContextKey` / `Environment` / `App` API | Not started | — |
| `#[recoco::function]` proc macro | Not started | — |
| Operation port (old arch → new API) | Not started | — |
The engine is ready. The operations exist but are wired for the pre-v1 FlowBuilder architecture. This epic tracks bridging them with a Rust-native API layer.
Implementation plan
Phase 1: Foundation — Concrete RecocoProfile, Ctx, ContextKey<T>, Environment, App
Phase 2: Proc macros — recoco-macros crate, #[recoco::function(memo, batching)]
Phase 3: Port operations — Sources, functions, targets as standalone Rust types
Phase 4: Transient mode — One-shot execution without LMDB persistence
Phase 5: Query handlers — Search endpoints, graph DB mappings, reusable transforms
Areas requiring deep design
Each of these will get its own issue before implementation begins:
- `RecocoProfile` concrete types: 8 associated types that define the entire API surface
- Proc macro design: `#[recoco::function]` code generation, error quality, compile-fail tests
- `Ctx` lifetime and ownership: thread safety, child contexts, reference ergonomics
- Target state reconciliation: serialization format, row identity, diffing, backend mapping
- Incremental source change detection: engine integration, CDC, file watching
- Crate structure: workspace organization, feature flag design, public API paths
- Error handling: `thiserror` types for public API, propagation through the engine
- Operation porting strategy: priority order, standard patterns, config migration
- Docs: update v1 docs to reflect the new surface and API
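
To ground the target state reconciliation item, here is a rough sketch of the diffing step alone, under heavy assumptions: rows are identified by a string key, row contents are plain strings, and the output is a list of backend-agnostic actions. `Action`, `rows`, and `reconcile` are hypothetical names, and the real design still has to settle serialization format and backend mapping.

```rust
use std::collections::BTreeMap;

/// Hypothetical backend-agnostic action produced by reconciliation.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    Upsert { key: String, value: String },
    Delete { key: String },
}

/// Builds a row map from string pairs (helper for the example only).
fn rows(pairs: &[(&str, &str)]) -> BTreeMap<String, String> {
    pairs
        .iter()
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}

/// Compares the previously recorded target state with the newly declared
/// state and returns only the changes needed to converge.
fn reconcile(
    previous: &BTreeMap<String, String>,
    desired: &BTreeMap<String, String>,
) -> Vec<Action> {
    let mut actions = Vec::new();

    // New or changed rows become upserts; unchanged rows produce nothing.
    for (key, value) in desired {
        if previous.get(key) != Some(value) {
            actions.push(Action::Upsert { key: key.clone(), value: value.clone() });
        }
    }

    // Rows that were declared before but are absent now become deletes.
    for key in previous.keys() {
        if !desired.contains_key(key) {
            actions.push(Action::Delete { key: key.clone() });
        }
    }

    actions
}

fn main() {
    let previous = rows(&[("a", "1"), ("b", "2")]);
    let desired = rows(&[("b", "3"), ("c", "4")]);
    for action in reconcile(&previous, &desired) {
        println!("{action:?}");
    }
}
```

Keeping the diff independent of any backend is one possible way to let each target translate `Upsert` and `Delete` into its own write operations.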
Related links
Upstream Rust API demand:
- [FEATURE] Rust API (cocoindex-io/cocoindex#1372)
- [FEATURE] Ergonomic Rust SDK (cocoindex-io/cocoindex#1667), design proposal
Upstream v1 direction (Python-only):
- v1 milestone — all Python-focused issues
- v1 branch — Python API, PyO3 bindings
Recoco design documents:
- `docs/v1-way-ahead.md`: current analysis and recommendations