Sparse test #7

Open
XucSh wants to merge 1 commit into main from sparse

Conversation

@XucSh
Owner

@XucSh XucSh commented Jan 17, 2026

Description

Type of Change

  • Types
    • Bug fix
    • New feature
      • Transfer Engine
      • Mooncake Store
      • Mooncake EP
      • Integration
      • P2P Store
      • Python Wheel
    • Breaking change
    • CI/CD
    • Documentation update
    • Other

How Has This Been Tested?

Checklist

  • I have performed a self-review of my own code.
  • I have updated the documentation.
  • I have added tests to prove my changes are effective.

Summary by CodeRabbit

  • Bug Fixes
    • Updated tensor validation logic and boundary condition checking to improve tensor data processing robustness in edge cases.

Signed-off-by: Xuchun Shang <xuchun.shang@linux.alibaba.com>
@coderabbitai

coderabbitai bot commented Jan 17, 2026

📝 Walkthrough

This change modifies tensor validation logic in store_py.cpp by relaxing boundary condition checks from <= to < for metadata size comparisons and making PyTensorInfo::valid() unconditionally return true instead of checking tensor size. This alters error handling and control flow around tensor data boundary validation.
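
To make the effect of the `<=` to `<` change concrete, here is a minimal sketch of the two boundary checks side by side. The `TensorMetadata` layout and the helper names `rejects_before` and `rejects_after` are assumptions for illustration only; they are not taken from store_py.cpp.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the real TensorMetadata header (layout assumed).
struct TensorMetadata {
    std::int32_t dtype;
    std::int32_t ndim;
    std::uint64_t shape[4];
};

// Check before the change: a buffer holding only the header is rejected.
bool rejects_before(std::size_t total_length) {
    return total_length <= sizeof(TensorMetadata);
}

// Check after the change: only buffers too short for the header are rejected,
// so a header-only buffer (zero payload bytes) is now accepted.
bool rejects_after(std::size_t total_length) {
    return total_length < sizeof(TensorMetadata);
}
```

The only input on which the two checks disagree is `total_length == sizeof(TensorMetadata)`, which is the metadata-only case discussed in the review comments.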

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Tensor Validation Logic: mooncake-integration/store/store_py.cpp | Modified PyTensorInfo::valid() to always return true; changed boundary condition checks from <= to < for sizeof(TensorMetadata) comparisons in buffer_to_tensor(), relaxing validation thresholds; removed tensor size as validity gate while preserving UNKNOWN dtype error handling. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A tensor dances through the gate,
Where boundaries now are loose, not strait!
Valid() claims truth absolute and clear,
Metadata checks relax their sternest fear—
Size and shape in harmony appear! 🐇

🚥 Pre-merge checks | ❌ 3
❌ Failed checks (3 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The title 'Sparse test' does not match the actual changes, which are modifications to tensor validity checks and boundary conditions in store_py.cpp, not testing-related code. | Update the title to reflect the actual changes, such as 'Fix tensor boundary checks and validity logic in buffer_to_tensor'. |
| Description check | ⚠️ Warning | The PR description is entirely empty: all template sections lack content, including the description, type of change selection, testing approach, and checklist completion. | Provide a clear description of changes, select the appropriate change type, explain how changes were tested, and complete the developer checklist items. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
mooncake-integration/store/store_py.cpp (2)

18-25: Restore validity gating to prevent invalid tensors from being stored.

extract_tensor_info returns a zeroed struct on errors (unsupported dtype, invalid dims, exceptions). With valid() hardcoded to true, put_tensor_impl and batch paths will store invalid metadata and a null data pointer instead of failing fast.

🔧 Suggested fix
 struct PyTensorInfo {
     uintptr_t data_ptr;
     size_t tensor_size;
     TensorMetadata metadata;
+    bool ok{false};

     // Check validity
-    bool valid() const { return true; }
+    bool valid() const { return ok; }
 };

 PyTensorInfo extract_tensor_info(const py::object &tensor,
                                  const std::string &key_name = "") {
     PyTensorInfo info = {
         0,
         0,
         {},
     };
@@
         if (dtype_enum == TensorDtype::UNKNOWN) {
             LOG(ERROR) << "Unsupported tensor dtype"
                        << (key_name.empty() ? "" : " for " + key_name);
-            return {0, 0, {}};
+            return info;
         }
@@
         if (ndim > 4) {
             LOG(ERROR) << "Tensor has more than 4 dimensions: " << ndim;
-            return {0, 0, {}};
+            return info;
         }
@@
     } catch (const std::exception &e) {
         LOG(ERROR) << "Error extracting tensor info: " << e.what();
-        return {0, 0, {}};
+        return info;
     }
-
-    return info;
+    info.ok = true;
+    return info;
 }
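
As a self-contained illustration of the explicit-flag idea in the suggested fix, the sketch below shows how an `ok` member lets a zero-byte tensor count as valid while every error return stays invalid. The simplified `TensorMetadata`, the function `extract_info_sketch`, and its parameters are hypothetical stand-ins; the real `extract_tensor_info` takes a `py::object` and is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified metadata header (layout assumed).
struct TensorMetadata {
    std::int32_t dtype{0};
    std::int32_t ndim{0};
    std::uint64_t shape[4]{};
};

struct PyTensorInfo {
    std::uintptr_t data_ptr{0};
    std::size_t tensor_size{0};
    TensorMetadata metadata{};
    bool ok{false};  // set only on the success path

    // Validity no longer depends on tensor_size, so empty tensors can pass.
    bool valid() const { return ok; }
};

// Hypothetical extractor: a negative dtype models an unsupported dtype.
PyTensorInfo extract_info_sketch(const void *data, std::size_t nbytes,
                                 const std::vector<std::uint64_t> &dims,
                                 std::int32_t dtype) {
    PyTensorInfo info;  // starts invalid (ok == false)
    if (dtype < 0) {
        return info;  // unsupported dtype
    }
    if (dims.size() > 4) {
        return info;  // more than 4 dimensions
    }
    info.data_ptr = reinterpret_cast<std::uintptr_t>(data);
    info.tensor_size = nbytes;
    info.metadata.dtype = dtype;
    info.metadata.ndim = static_cast<std::int32_t>(dims.size());
    for (std::size_t i = 0; i < dims.size(); ++i) {
        info.metadata.shape[i] = dims[i];
    }
    info.ok = true;
    return info;
}
```

Callers such as the put paths would keep their existing `if (!info.valid())` guard; only the meaning of "valid" moves from "non-zero size" to "extraction succeeded".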

94-116: Guard metadata-only buffers to prevent shape/data mismatches and silent failures.

When total_length == sizeof(TensorMetadata), the resulting tensor_size becomes 0. If the metadata declares a non-empty shape (e.g., shape = [10, 20]), the subsequent reshape at line 163 will throw. The exception is caught and only logged, so the caller receives pybind11::none() with no clear indication that the data is missing.

This should only be accepted for truly empty tensors: ndim == 0 or at least one shape dimension is 0.

Suggested validation
    TensorDtype dtype_enum = static_cast<TensorDtype>(metadata.dtype);
    size_t tensor_size = total_length - sizeof(TensorMetadata);

    if (dtype_enum == TensorDtype::UNKNOWN) {
        if (take_ownership) {
            delete[] exported_data;
        }
        LOG(ERROR) << "Unknown tensor dtype";
        return pybind11::none();
    }
+
+    // Reject metadata-only buffers unless the tensor is truly empty
+    if (tensor_size == 0 && metadata.ndim > 0) {
+        bool empty_expected = false;
+        for (int i = 0; i < metadata.ndim; ++i) {
+            if (metadata.shape[i] == 0) {
+                empty_expected = true;
+                break;
+            }
+        }
+        if (!empty_expected) {
+            if (take_ownership) {
+                delete[] exported_data;
+            }
+            LOG(ERROR)
+                << "Invalid tensor metadata: zero data for non-empty tensor";
+            return pybind11::none();
+        }
+    }

Also applies to: 129-138
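
The same rule can be factored into a small predicate so that both call sites share one check. This is only a sketch: the `TensorMetadata` layout and the name `zero_payload_allowed` are assumptions, and it mirrors the suggestion above by accepting `ndim == 0`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the real TensorMetadata header (layout assumed).
struct TensorMetadata {
    std::int32_t dtype;
    std::int32_t ndim;
    std::uint64_t shape[4];
};

// Returns true when a zero-byte payload is consistent with the declared
// shape: either no dimensions are declared, or some dimension is 0.
bool zero_payload_allowed(const TensorMetadata &m) {
    if (m.ndim <= 0) {
        return true;  // mirrors the suggested check, which skips ndim == 0
    }
    for (int i = 0; i < m.ndim && i < 4; ++i) {
        if (m.shape[i] == 0) {
            return true;  // an empty dimension means zero elements
        }
    }
    return false;  // non-empty shape but no data: reject
}
```

In `buffer_to_tensor()` the guard would then read `if (tensor_size == 0 && !zero_payload_allowed(metadata))`, followed by the same cleanup and `pybind11::none()` return as in the suggestion.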
