Affected Component
Python/C/other API (and underlying C++ layer)
Current Behavior
The fetch() method currently retrieves all fields of a record when queried by primary key(s). Unlike query(), which supports the output_fields parameter to specify which fields to return, fetch() lacks this capability. As a result, users must download full records—including potentially large fields like document text or embedding vectors—even when only lightweight metadata (e.g., status, source_file, chunk_index) is needed. This leads to unnecessary network I/O, memory usage, and processing overhead.
Desired Improvement
Extend the fetch() method to accept an optional output_fields parameter, consistent with the existing query() interface. When provided, only the specified fields should be returned.
This requires:
- Adding the output_fields parameter to the Python/C API
- Propagating the field selection down to the storage layer (may require C++ changes)
- Ensuring backward compatibility: when output_fields is not specified, fetch() should continue to return all fields as before
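To make the requested semantics concrete, here is a minimal sketch using an in-memory stand-in rather than the real collection class. The class name InMemoryCollection, the insert() helper, and the choices to always include the primary key and to silently skip unknown field names are all assumptions made for illustration; the actual implementation may decide differently.

```python
from typing import Any, Dict, Iterable, List, Optional


class InMemoryCollection:
    """Hypothetical toy collection; it only illustrates the proposed fetch() contract."""

    def __init__(self, primary_key: str = "id") -> None:
        self._pk = primary_key
        self._rows: Dict[Any, Dict[str, Any]] = {}

    def insert(self, record: Dict[str, Any]) -> None:
        # Store a copy so later mutation by the caller does not affect the store.
        self._rows[record[self._pk]] = dict(record)

    def fetch(
        self,
        ids: Iterable[Any],
        output_fields: Optional[List[str]] = None,
    ) -> List[Dict[str, Any]]:
        results: List[Dict[str, Any]] = []
        for pk in ids:
            row = self._rows.get(pk)
            if row is None:
                continue  # assumption: missing keys are skipped, not an error
            if output_fields is None:
                # Backward-compatible path: return every field, as fetch() does today.
                results.append(dict(row))
            else:
                # Assumption: the primary key is always returned alongside the selection.
                wanted = [self._pk, *output_fields]
                results.append({name: row[name] for name in wanted if name in row})
        return results


collection = InMemoryCollection()
collection.insert({
    "id": 1,
    "status": "processed",
    "source_file": "report.pdf",
    "chunk_index": 3,
    "text": "full document text ...",
    "embedding": [0.1, 0.2, 0.3],
})

# Only lightweight metadata comes back; text and embedding are left out.
metadata_only = collection.fetch([1], output_fields=["status", "chunk_index"])

# Omitting output_fields keeps the current behavior and returns all fields.
full_record = collection.fetch([1])
```

In a real implementation the projection would need to happen in the storage layer, not after the full record has been loaded, otherwise the I/O savings described above would not materialize.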
Impact
- Reduced resource consumption: Avoid transferring and deserializing unused data (e.g., skip large embedding vectors or full document text when only metadata is needed).
- Improved performance: Faster response times for lightweight lookups by primary key.
- Better API consistency: Unified field-selection behavior between query() and fetch(), reducing cognitive load for developers.
- Enables efficient real-world workflows, such as:
  - Checking whether a file has been processed: only metadata fields are needed
  - Fetching neighboring chunks: only source_file and chunk_index are required
  - Tracking tunnel connections: only a small set of metadata fields is used for preview
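As a rough illustration of the first two workflows, the helpers below take any callable that behaves like the proposed fetch(ids, output_fields=...). The helper names, the "processed" status value, and the idea that neighbors can be addressed as (source_file, chunk_index) pairs are assumptions for this sketch, not part of the existing API.

```python
from typing import Any, Callable, Dict, List, Tuple

# Any callable with the proposed shape: fetch(ids, output_fields=[...]) -> list of dicts.
FetchFn = Callable[..., List[Dict[str, Any]]]


def is_processed(fetch: FetchFn, record_id: Any) -> bool:
    """Check processing state by requesting only the status field."""
    rows = fetch([record_id], output_fields=["status"])
    # "processed" is a hypothetical status value used for illustration.
    return bool(rows) and rows[0].get("status") == "processed"


def neighbor_positions(
    fetch: FetchFn, record_id: Any, window: int = 1
) -> List[Tuple[str, int]]:
    """Return (source_file, chunk_index) pairs for chunks adjacent to record_id."""
    rows = fetch([record_id], output_fields=["source_file", "chunk_index"])
    if not rows:
        return []
    source = rows[0]["source_file"]
    index = rows[0]["chunk_index"]
    first = max(0, index - window)  # chunk indices are assumed to start at 0
    return [(source, i) for i in range(first, index + window + 1) if i != index]
```

Neither helper ever touches the document text or the embedding vector, which is the saving this request is after.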