Conversation
Signed-off-by: Xuchun Shang <xuchun.shang@linux.alibaba.com>
📝 WalkthroughWalkthroughThis change modifies tensor validation logic in Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ❌ 3❌ Failed checks (3 warnings)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
mooncake-integration/store/store_py.cpp (2)
18-25: Restore validity gating to prevent invalid tensors from being stored.
extract_tensor_inforeturns a zeroed struct on errors (unsupported dtype, invalid dims, exceptions). Withvalid()hardcoded totrue,put_tensor_impland batch paths will store invalid metadata and a null data pointer instead of failing fast.🔧 Suggested fix
struct PyTensorInfo { uintptr_t data_ptr; size_t tensor_size; TensorMetadata metadata; + bool ok{false}; // Check validity - bool valid() const { return true; } + bool valid() const { return ok; } }; PyTensorInfo extract_tensor_info(const py::object &tensor, const std::string &key_name = "") { PyTensorInfo info = { 0, 0, {}, }; @@ if (dtype_enum == TensorDtype::UNKNOWN) { LOG(ERROR) << "Unsupported tensor dtype" << (key_name.empty() ? "" : " for " + key_name); - return {0, 0, {}}; + return info; } @@ if (ndim > 4) { LOG(ERROR) << "Tensor has more than 4 dimensions: " << ndim; - return {0, 0, {}}; + return info; } @@ } catch (const std::exception &e) { LOG(ERROR) << "Error extracting tensor info: " << e.what(); - return {0, 0, {}}; + return info; } - - return info; + info.ok = true; + return info; }
94-116: Guard metadata-only buffers to prevent shape/data mismatches and silent failures.When
total_length == sizeof(TensorMetadata), the resultingtensor_sizebecomes 0. If the metadata declares a non-empty shape (e.g.,shape = [10, 20]), the subsequent reshape at line 163 will fail silently—the exception is caught and only logged, causing the caller to receivepybind11::none()without clear indication that data is missing.This should only be accepted for truly empty tensors:
ndim == 0or at least one shape dimension is 0.Suggested validation
TensorDtype dtype_enum = static_cast<TensorDtype>(metadata.dtype); size_t tensor_size = total_length - sizeof(TensorMetadata); if (dtype_enum == TensorDtype::UNKNOWN) { if (take_ownership) { delete[] exported_data; } LOG(ERROR) << "Unknown tensor dtype"; return pybind11::none(); } + + // Reject metadata-only buffers unless the tensor is truly empty + if (tensor_size == 0 && metadata.ndim > 0) { + bool empty_expected = false; + for (int i = 0; i < metadata.ndim; ++i) { + if (metadata.shape[i] == 0) { + empty_expected = true; + break; + } + } + if (!empty_expected) { + if (take_ownership) { + delete[] exported_data; + } + LOG(ERROR) + << "Invalid tensor metadata: zero data for non-empty tensor"; + return pybind11::none(); + } + }Also applies to: 129-138
Description
Type of Change
How Has This Been Tested?
Checklist
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.