[Ascend] Implement Concat and Slice operators #629
chenghuaWang merged 3 commits into UbiquitousLearning:main from
Conversation
📝 Walkthrough
Walkthrough: Adds Ascend backend implementations and factory registrations for Concat and Slice ops (AscendConcatOp, AscendSliceOp), plus tests validating FP16 concat/slice behavior. Implementations use ATB (Ascend Tensor Boost) and the Ascend memory manager for execution and workspace handling.
Changes
Sequence Diagram(s)
```mermaid
sequenceDiagram
    participant Test as Test Harness
    participant Op as AscendConcatOp / AscendSliceOp
    participant Mem as AscendMemoryManager
    participant ATB as ATB Runtime
    participant Device as Ascend Device / Stream
    Test->>Op: call setup(inputs)
    Op->>Op: validate shapes / prepare outputs
    Test->>Op: call forward(inputs)
    Op->>Mem: request workspace (optional)
    Mem-->>Op: workspace ptr
    Op->>ATB: create & setup op with tensors and workspace
    ATB-->>Op: configuration OK
    Op->>Device: enqueue ATB execute on stream
    Device-->>Op: execution complete (sync)
    Op->>Mem: free workspace (if allocated)
    Op-->>Test: return outputs
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@mllm/backends/ascend/ops/AscendConcatOp.cpp`:
- Around line 46-124: The code does not validate concat_dim or input shapes
before indexing current_shape and calling run_concat; add checks after computing
concat_dim (from options().dim and inputs[0].rank()) to ensure 0 <= concat_dim <
inputs[0].rank(), and before the loop verify all inputs have the same rank as
inputs[0] and that all non-concat dimensions match (use Tensor::shape() and
inputs[i].shape()); if any check fails, return or MLLM_ERROR_EXIT with a clear
message referencing concat_dim, inputs, current_shape, and the run_concat
operation to fail fast and avoid out-of-bounds access.
- Around line 6-92: The code casts workspaceSize (uint64_t) to uint32_t when
calling getAscendMemoryManager().allocateBlock, which can truncate sizes >4GiB;
add `#include` <limits> and before allocateBlock check that workspaceSize <=
std::numeric_limits<uint32_t>::max(), and if it exceeds that limit call
MLLM_ERROR_EXIT (or other existing error path) to fail safely; then cast to
uint32_t only after the check and proceed with allocateBlock(workspace_size_32,
workspace_block_id) and getBlockPtr.
In `@mllm/backends/ascend/ops/AscendConcatOp.hpp`:
- Around line 12-24: Add Doxygen-style comments for the new public API: document
the AscendConcatOp class (purpose and that it implements aops::ConcatOp for
Ascend backend), the explicit constructor AscendConcatOp(const
aops::ConcatOpOptions& options) (describe options parameter), the overridden
methods setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs)
and forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs)
(describe purpose, inputs/outputs expectations and any runtime errors), and the
AscendConcatOpFactory class and its createOpImpl(const aops::ConcatOpOptions&
options) method (describe factory behavior and return value); place brief
Doxygen comments above each declaration using @class/@brief/@param/@return tags
as appropriate.
In `@mllm/backends/ascend/ops/AscendSliceOp.cpp`:
- Line 76: The forward() implementation pushes slice lengths into param.size
using `end - start` which can be negative if end < start; update the code in
`forward()` to use a non-negative clamp like `std::max(0, end - start)` (same
logic as `reshape()`), ensuring every push_back to `param.size` uses
`std::max(0, end - start)` so negative sizes are never passed to ATB; locate the
size push in the `forward()` function and apply this change for all dimensions.
In `@mllm/backends/ascend/ops/AscendSliceOp.hpp`:
- Around line 12-25: Add Doxygen-style comments for the public classes and their
methods: document class AscendSliceOp (brief purpose), its constructor
AscendSliceOp(const aops::SliceOpOptions& options) (describe options param), and
the public overrides setup, reshape, and forward (describe inputs/outputs and
behavior). Also document AscendSliceOpFactory (purpose) and its createOpImpl
method (describe returned shared_ptr<BaseOp> and the options parameter). Keep
comments brief, use /// or /** */ style above each declaration, and include
param/return tags where applicable.
🧹 Nitpick comments (2)
mllm/backends/ascend/ops/AscendSliceOp.cpp (2)
26-53: Extract duplicated index normalization logic to a helper function.
The index normalization logic (handling kAll, negative indices, clamping bounds) is duplicated between reshape() and forward(). This duplication led to the inconsistency where reshape() uses std::max(0, end - start) but forward() does not. Consider extracting this into a private helper method that returns normalized (start, size) pairs for each dimension.
♻️ Suggested helper approach
```cpp
// Add as private helper method in AscendSliceOp class
std::vector<std::pair<int32_t, int32_t>> normalizeSliceIndices(
    const std::vector<int>& shape, const std::vector<aops::SliceIndex>& indices) {
  std::vector<std::pair<int32_t, int32_t>> result;
  for (size_t i = 0; i < shape.size(); ++i) {
    int32_t start = indices[i].start_;
    int32_t end = indices[i].end_;
    int32_t dim_size = shape[i];
    if (start == kAll) start = 0;
    if (end == kAll) end = dim_size;
    if (start < 0) start += dim_size;
    if (end < 0) end += dim_size;
    start = std::max(0, std::min(start, dim_size));
    end = std::max(0, std::min(end, dim_size));
    int32_t size = std::max(0, end - start);
    result.emplace_back(start, size);
  }
  return result;
}
```
Then use this helper in both reshape() and forward() to ensure consistent behavior.
Also applies to: 55-77
104-124: Consider RAII or scope guards for ATB operation cleanup on error paths.
If op->Setup() fails at line 106 or op->Execute() fails at line 123, MLLM_ERROR_EXIT is called without first destroying op or freeing workspace. If MLLM_ERROR_EXIT throws an exception rather than terminating, this would leak resources.
♻️ Suggested approach using a scope guard or unique_ptr
```cpp
// Use a custom deleter with unique_ptr for automatic cleanup
struct AtbOpDeleter {
  void operator()(atb::Operation* op) const {
    if (op) atb::DestroyOperation(op);
  }
};
std::unique_ptr<atb::Operation, AtbOpDeleter> op_guard(op);
// Then remove the explicit atb::DestroyOperation(op) at the end
```
This ensures cleanup even if an exception is thrown.
```cpp
#include <iostream>
#include <acl/acl.h>
#include <atb/atb_infer.h>
#include <atb/types.h>
#include <atb/utils.h>
#include <atb/infer_op_params.h>

#include "mllm/utils/Common.hpp"
#include "mllm/core/DataTypes.hpp"
#include "mllm/core/Tensor.hpp"
#include "mllm/backends/ascend/memory/AscendMemoryManager.hpp"
#include "mllm/backends/ascend/AscendCommon.hpp"

namespace mllm::ascend {

AscendConcatOp::AscendConcatOp(const aops::ConcatOpOptions& options) : aops::ConcatOp(options) {}

void AscendConcatOp::setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) {
  BaseOp::setup(inputs, outputs);
}

void AscendConcatOp::forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) {
  MLLM_RT_ASSERT(inputs.size() >= 1);
  MLLM_RT_ASSERT_EQ(outputs.size(), 1);

  if (inputs.size() == 1) {
    const size_t data_size = inputs[0].bytes();
    const void* src_data = inputs[0].ptr<void>();
    void* dst_data = outputs[0].ptr<void>();

    if (src_data != dst_data) {
      auto ret = aclrtMemcpy(dst_data, data_size, src_data, data_size, ACL_MEMCPY_DEVICE_TO_DEVICE);
      if (ret != ACL_SUCCESS) {
        MLLM_ACL_CHECK(ret);
      }
      syncGlobalAtbStream();
    }
    return;
  }

  int32_t concat_dim = options().dim;
  if (concat_dim < 0) {
    concat_dim += static_cast<int32_t>(inputs[0].rank());
  }

  auto run_concat = [&](const Tensor& left, const Tensor& right, Tensor& out) {
    atb::infer::ConcatParam param;
    param.concatDim = concat_dim;

    atb::Operation* op = nullptr;
    auto st = atb::CreateOperation(param, &op);
    if (st != atb::NO_ERROR || op == nullptr) {
      MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB CreateOperation(Concat) failed, status={}", static_cast<int>(st));
    }

    atb::Context* atb_ctx = getGlobalAtbContext();

    atb::SVector<atb::Tensor> inTensors;
    atb::Tensor atb_left;
    atb::Tensor atb_right;
    fillAtbTensor(left, atb_left);
    fillAtbTensor(right, atb_right);
    inTensors.push_back(atb_left);
    inTensors.push_back(atb_right);

    atb::Tensor atb_out;
    fillAtbTensor(out, atb_out);
    atb::SVector<atb::Tensor> outTensors;
    outTensors.push_back(atb_out);

    atb::VariantPack vp;
    vp.inTensors = inTensors;
    vp.outTensors = outTensors;

    uint64_t workspaceSize = 0;
    st = op->Setup(vp, workspaceSize, atb_ctx);
    if (st != atb::NO_ERROR) {
      MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB ConcatOp Setup failed, status={}", static_cast<int>(st));
    }

    void* workspace = nullptr;
    int workspace_block_id = -1;
    if (workspaceSize > 0) {
      auto& mem_mgr = getAscendMemoryManager();
      mem_mgr.allocateBlock(static_cast<uint32_t>(workspaceSize), workspace_block_id);
      mem_mgr.getBlockPtr(workspace_block_id, workspace);
    }
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Find allocateBlock signature in AscendMemoryManager
rg -n "allocateBlock" -g'*.hpp' -g'*.cpp' -A 3
```
Repository: UbiquitousLearning/mllm
Length of output: 5388
Add overflow guard for workspace size cast to uint32_t.
workspaceSize (uint64_t) is cast to uint32_t for allocateBlock(). If ATB returns a workspace > 4 GiB, silent truncation will cause under-allocation and potential memory corruption. Add a bounds check before casting:
```diff
 int workspace_block_id = -1;
 if (workspaceSize > 0) {
+  if (workspaceSize > std::numeric_limits<uint32_t>::max()) {
+    MLLM_ERROR_EXIT(ExitCode::kAscendError,
+                    "ATB ConcatOp workspace size exceeds uint32_t: {}",
+                    workspaceSize);
+  }
   auto& mem_mgr = getAscendMemoryManager();
   mem_mgr.allocateBlock(static_cast<uint32_t>(workspaceSize), workspace_block_id);
   mem_mgr.getBlockPtr(workspace_block_id, workspace);
 }
```
Also include `#include <limits>` at the top.
🤖 Prompt for AI Agents
In `@mllm/backends/ascend/ops/AscendConcatOp.cpp` around lines 6 - 92, The code
casts workspaceSize (uint64_t) to uint32_t when calling
getAscendMemoryManager().allocateBlock, which can truncate sizes >4GiB; add
`#include <limits>` and before allocateBlock check that workspaceSize <=
std::numeric_limits<uint32_t>::max(), and if it exceeds that limit call
MLLM_ERROR_EXIT (or other existing error path) to fail safely; then cast to
uint32_t only after the check and proceed with allocateBlock(workspace_size_32,
workspace_block_id) and getBlockPtr.
```cpp
int32_t concat_dim = options().dim;
if (concat_dim < 0) {
  concat_dim += static_cast<int32_t>(inputs[0].rank());
}

auto run_concat = [&](const Tensor& left, const Tensor& right, Tensor& out) {
  atb::infer::ConcatParam param;
  param.concatDim = concat_dim;

  atb::Operation* op = nullptr;
  auto st = atb::CreateOperation(param, &op);
  if (st != atb::NO_ERROR || op == nullptr) {
    MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB CreateOperation(Concat) failed, status={}", static_cast<int>(st));
  }

  atb::Context* atb_ctx = getGlobalAtbContext();

  atb::SVector<atb::Tensor> inTensors;
  atb::Tensor atb_left;
  atb::Tensor atb_right;
  fillAtbTensor(left, atb_left);
  fillAtbTensor(right, atb_right);
  inTensors.push_back(atb_left);
  inTensors.push_back(atb_right);

  atb::Tensor atb_out;
  fillAtbTensor(out, atb_out);
  atb::SVector<atb::Tensor> outTensors;
  outTensors.push_back(atb_out);

  atb::VariantPack vp;
  vp.inTensors = inTensors;
  vp.outTensors = outTensors;

  uint64_t workspaceSize = 0;
  st = op->Setup(vp, workspaceSize, atb_ctx);
  if (st != atb::NO_ERROR) {
    MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB ConcatOp Setup failed, status={}", static_cast<int>(st));
  }

  void* workspace = nullptr;
  int workspace_block_id = -1;
  if (workspaceSize > 0) {
    auto& mem_mgr = getAscendMemoryManager();
    mem_mgr.allocateBlock(static_cast<uint32_t>(workspaceSize), workspace_block_id);
    mem_mgr.getBlockPtr(workspace_block_id, workspace);
  }

  {
    ASCEND_TIME_SCOPE("AscendConcatOp::forward");
    st = op->Execute(vp, reinterpret_cast<uint8_t*>(workspace), workspaceSize, atb_ctx);
  }

  if (st != atb::NO_ERROR) {
    MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB ConcatOp Execute failed, status={}", static_cast<int>(st));
  }

  syncGlobalAtbStream();

  if (workspace_block_id != -1) {
    auto& mem_mgr = getAscendMemoryManager();
    mem_mgr.freeBlock(workspace_block_id);
  }

  atb::DestroyOperation(op);
};

std::vector<int32_t> current_shape = inputs[0].shape();
Tensor current = inputs[0];

for (size_t i = 1; i < inputs.size(); ++i) {
  current_shape[concat_dim] += inputs[i].shape()[concat_dim];

  if (i == inputs.size() - 1) {
    run_concat(current, inputs[i], outputs[0]);
  } else {
    Tensor temp = Tensor::empty(current_shape, outputs[0].dtype(), outputs[0].device()).alloc();
    run_concat(current, inputs[i], temp);
    current = temp;
```
Validate concat_dim and non-concat shapes before indexing.
concat_dim can still be out of range after adjustment, and mismatched ranks/dims will lead to out-of-bounds indexing on current_shape/shape(). Add bounds and shape checks to fail fast.
✅ Suggested bounds + shape validation
```diff
-  int32_t concat_dim = options().dim;
-  if (concat_dim < 0) {
-    concat_dim += static_cast<int32_t>(inputs[0].rank());
-  }
+  int32_t concat_dim = options().dim;
+  const int32_t rank = static_cast<int32_t>(inputs[0].rank());
+  if (concat_dim < 0) {
+    concat_dim += rank;
+  }
+  MLLM_RT_ASSERT(concat_dim >= 0 && concat_dim < rank);
-  std::vector<int32_t> current_shape = inputs[0].shape();
+  const auto base_shape = inputs[0].shape();
+  std::vector<int32_t> current_shape = base_shape;
   Tensor current = inputs[0];
   for (size_t i = 1; i < inputs.size(); ++i) {
-    current_shape[concat_dim] += inputs[i].shape()[concat_dim];
+    MLLM_RT_ASSERT_EQ(static_cast<int32_t>(inputs[i].rank()), rank);
+    for (int32_t d = 0; d < rank; ++d) {
+      if (d == concat_dim) continue;
+      MLLM_RT_ASSERT_EQ(inputs[i].shape()[d], base_shape[d]);
+    }
+    current_shape[concat_dim] += inputs[i].shape()[concat_dim];
```
🤖 Prompt for AI Agents
+ current_shape[concat_dim] += inputs[i].shape()[concat_dim];🤖 Prompt for AI Agents
In `@mllm/backends/ascend/ops/AscendConcatOp.cpp` around lines 46 - 124, The code
does not validate concat_dim or input shapes before indexing current_shape and
calling run_concat; add checks after computing concat_dim (from options().dim
and inputs[0].rank()) to ensure 0 <= concat_dim < inputs[0].rank(), and before
the loop verify all inputs have the same rank as inputs[0] and that all
non-concat dimensions match (use Tensor::shape() and inputs[i].shape()); if any
check fails, return or MLLM_ERROR_EXIT with a clear message referencing
concat_dim, inputs, current_shape, and the run_concat operation to fail fast and
avoid out-of-bounds access.
```cpp
class AscendConcatOp final : public aops::ConcatOp {
 public:
  explicit AscendConcatOp(const aops::ConcatOpOptions& options);

  void setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
  void forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
};

class AscendConcatOpFactory final : public TypedOpFactory<OpTypes::kConcat, aops::ConcatOpOptions> {
 public:
  std::shared_ptr<BaseOp> createOpImpl(const aops::ConcatOpOptions& options) override {
    return std::make_shared<AscendConcatOp>(options);
  }
```
Add brief API docs for AscendConcatOp and its factory.
These new public classes/methods are missing doc comments; please add short Doxygen-style descriptions for purpose and parameters.
📝 Suggested doc comments
```diff
-class AscendConcatOp final : public aops::ConcatOp {
+/// Ascend backend implementation of Concat.
+class AscendConcatOp final : public aops::ConcatOp {
  public:
-  explicit AscendConcatOp(const aops::ConcatOpOptions& options);
+  /// Constructs the op with Concat options.
+  explicit AscendConcatOp(const aops::ConcatOpOptions& options);
-  void setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
-  void forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
+  /// Validates inputs/outputs and prepares resources.
+  void setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
+  /// Executes the concat on Ascend.
+  void forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
 };
-class AscendConcatOpFactory final : public TypedOpFactory<OpTypes::kConcat, aops::ConcatOpOptions> {
+/// Factory for creating AscendConcatOp instances.
+class AscendConcatOpFactory final : public TypedOpFactory<OpTypes::kConcat, aops::ConcatOpOptions> {
  public:
+  /// Creates an AscendConcatOp for the given options.
   std::shared_ptr<BaseOp> createOpImpl(const aops::ConcatOpOptions& options) override {
     return std::make_shared<AscendConcatOp>(options);
   }
 };
```
As per coding guidelines: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
/// Ascend backend implementation of Concat.
class AscendConcatOp final : public aops::ConcatOp {
 public:
  /// Constructs the op with Concat options.
  explicit AscendConcatOp(const aops::ConcatOpOptions& options);
  /// Validates inputs/outputs and prepares resources.
  void setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
  /// Executes the concat on Ascend.
  void forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
};

/// Factory for creating AscendConcatOp instances.
class AscendConcatOpFactory final : public TypedOpFactory<OpTypes::kConcat, aops::ConcatOpOptions> {
 public:
  /// Creates an AscendConcatOp for the given options.
  std::shared_ptr<BaseOp> createOpImpl(const aops::ConcatOpOptions& options) override {
    return std::make_shared<AscendConcatOp>(options);
  }
};
```
🤖 Prompt for AI Agents
In `@mllm/backends/ascend/ops/AscendConcatOp.hpp` around lines 12 - 24, Add
Doxygen-style comments for the new public API: document the AscendConcatOp class
(purpose and that it implements aops::ConcatOp for Ascend backend), the explicit
constructor AscendConcatOp(const aops::ConcatOpOptions& options) (describe
options parameter), the overridden methods setup(const std::vector<Tensor>&
inputs, std::vector<Tensor>& outputs) and forward(const std::vector<Tensor>&
inputs, std::vector<Tensor>& outputs) (describe purpose, inputs/outputs
expectations and any runtime errors), and the AscendConcatOpFactory class and
its createOpImpl(const aops::ConcatOpOptions& options) method (describe factory
behavior and return value); place brief Doxygen comments above each declaration
using @class/@brief/@param/@return tags as appropriate.
```cpp
class AscendSliceOp final : public aops::SliceOp {
 public:
  explicit AscendSliceOp(const aops::SliceOpOptions& options);

  void setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
  void reshape(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
  void forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
};

class AscendSliceOpFactory final : public TypedOpFactory<OpTypes::kSlice, aops::SliceOpOptions> {
 public:
  std::shared_ptr<BaseOp> createOpImpl(const aops::SliceOpOptions& options) override {
    return std::make_shared<AscendSliceOp>(options);
  }
```
Add brief API docs for AscendSliceOp and its factory.
These new public classes/methods are missing doc comments; please add short Doxygen-style descriptions for purpose and parameters.
📝 Suggested doc comments
```diff
-class AscendSliceOp final : public aops::SliceOp {
+/// Ascend backend implementation of Slice.
+class AscendSliceOp final : public aops::SliceOp {
  public:
-  explicit AscendSliceOp(const aops::SliceOpOptions& options);
+  /// Constructs the op with Slice options.
+  explicit AscendSliceOp(const aops::SliceOpOptions& options);
-  void setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
-  void reshape(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
-  void forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
+  /// Validates inputs/outputs and prepares resources.
+  void setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
+  /// Infers output shapes from inputs.
+  void reshape(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
+  /// Executes the slice on Ascend.
+  void forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
 };
-class AscendSliceOpFactory final : public TypedOpFactory<OpTypes::kSlice, aops::SliceOpOptions> {
+/// Factory for creating AscendSliceOp instances.
+class AscendSliceOpFactory final : public TypedOpFactory<OpTypes::kSlice, aops::SliceOpOptions> {
  public:
+  /// Creates an AscendSliceOp for the given options.
   std::shared_ptr<BaseOp> createOpImpl(const aops::SliceOpOptions& options) override {
     return std::make_shared<AscendSliceOp>(options);
   }
 };
```
As per coding guidelines: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
/// Ascend backend implementation of Slice.
class AscendSliceOp final : public aops::SliceOp {
 public:
  /// Constructs the op with Slice options.
  explicit AscendSliceOp(const aops::SliceOpOptions& options);
  /// Validates inputs/outputs and prepares resources.
  void setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
  /// Infers output shapes from inputs.
  void reshape(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
  /// Executes the slice on Ascend.
  void forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) override;
};

/// Factory for creating AscendSliceOp instances.
class AscendSliceOpFactory final : public TypedOpFactory<OpTypes::kSlice, aops::SliceOpOptions> {
 public:
  /// Creates an AscendSliceOp for the given options.
  std::shared_ptr<BaseOp> createOpImpl(const aops::SliceOpOptions& options) override {
    return std::make_shared<AscendSliceOp>(options);
  }
};
```
🤖 Prompt for AI Agents
In `@mllm/backends/ascend/ops/AscendSliceOp.hpp` around lines 12 - 25, Add
Doxygen-style comments for the public classes and their methods: document class
AscendSliceOp (brief purpose), its constructor AscendSliceOp(const
aops::SliceOpOptions& options) (describe options param), and the public
overrides setup, reshape, and forward (describe inputs/outputs and behavior).
Also document AscendSliceOpFactory (purpose) and its createOpImpl method
(describe returned shared_ptr<BaseOp> and the options parameter). Keep comments
brief, use /// or /** */ style above each declaration, and include param/return
tags where applicable.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@mllm/backends/ascend/ops/AscendSliceOp.cpp`:
- Around line 55-77: AscendSliceOp::forward is reading options().indices_
per-dimension without checking its size, which can cause out-of-bounds access;
before the loop add a hard validation that options().indices_.size() ==
input.shape().size() (or else throw/LOG(FATAL)/assert) so the method fails fast
on rank mismatch; ensure this check occurs right after obtaining shape and
slice_index and use the same validation style as the reshape assert used
elsewhere in the class.
- Around line 103-115: The code casts workspaceSize (uint64_t) to uint32_t
before calling mem_mgr.allocateBlock, which can truncate sizes > UINT32_MAX;
update the Allocate path in the AscendSliceOp flow by checking if workspaceSize
> std::numeric_limits<uint32_t>::max() after op->Setup (or if workspaceSize ==
0) and handle it deterministically (log and return/exit with an error similar to
MLLM_ERROR_EXIT or use an alternative allocation path) instead of blindly
casting; reference workspaceSize, op->Setup, getAscendMemoryManager(),
mem_mgr.allocateBlock(...) and mem_mgr.getBlockPtr(...) when making the change.
```cpp
void AscendSliceOp::forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) {
  atb::infer::SliceParam param;
  auto& input = inputs[0];
  auto shape = input.shape();
  auto slice_index = options().indices_;

  for (size_t i = 0; i < shape.size(); ++i) {
    int32_t start = slice_index[i].start_;
    int32_t end = slice_index[i].end_;
    int32_t dim_size = shape[i];

    if (start == kAll) start = 0;
    if (end == kAll) end = dim_size;

    if (start < 0) start += dim_size;
    if (end < 0) end += dim_size;

    start = std::max(0, std::min(start, dim_size));
    end = std::max(0, std::min(end, dim_size));

    param.offsets.push_back(start);
    param.size.push_back(std::max(0, end - start));
  }
```
Add slice index size validation in forward() to prevent OOB.
forward() assumes options().indices_ matches rank; if it doesn’t, slice_index[i] can read past the vector. Mirror the reshape assert here to hard-fail early.
🛠️ Proposed fix
```diff
 void AscendSliceOp::forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) {
   atb::infer::SliceParam param;
   auto& input = inputs[0];
   auto shape = input.shape();
   auto slice_index = options().indices_;
+  MLLM_RT_ASSERT_EQ(slice_index.size(), shape.size());
   for(size_t i=0; i<shape.size(); ++i) {
```
🤖 Prompt for AI Agents
In `@mllm/backends/ascend/ops/AscendSliceOp.cpp` around lines 55 - 77,
AscendSliceOp::forward is reading options().indices_ per-dimension without
checking its size, which can cause out-of-bounds access; before the loop add a
hard validation that options().indices_.size() == input.shape().size() (or else
throw/LOG(FATAL)/assert) so the method fails fast on rank mismatch; ensure this
check occurs right after obtaining shape and slice_index and use the same
validation style as the reshape assert used elsewhere in the class.
```cpp
uint64_t workspaceSize = 0;
st = op->Setup(vp, workspaceSize, atb_ctx);
if (st != atb::NO_ERROR) {
  MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB SliceOp Setup failed, status={}", static_cast<int>(st));
}

void* workspace = nullptr;
int workspace_block_id = -1;
if (workspaceSize > 0) {
  auto& mem_mgr = getAscendMemoryManager();
  mem_mgr.allocateBlock(static_cast<uint32_t>(workspaceSize), workspace_block_id);
  mem_mgr.getBlockPtr(workspace_block_id, workspace);
}
```
Guard against workspaceSize overflow before casting to uint32_t.
If ATB returns a workspace larger than 4GB, the cast will truncate and can under-allocate, risking memory corruption.
🛠️ Proposed fix
```diff
+#include <limits>
@@
 uint64_t workspaceSize = 0;
 st = op->Setup(vp, workspaceSize, atb_ctx);
@@
 void* workspace = nullptr;
 int workspace_block_id = -1;
 if (workspaceSize > 0) {
+  if (workspaceSize > std::numeric_limits<uint32_t>::max()) {
+    MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB SliceOp workspace too large, size={}", workspaceSize);
+  }
   auto& mem_mgr = getAscendMemoryManager();
   mem_mgr.allocateBlock(static_cast<uint32_t>(workspaceSize), workspace_block_id);
   mem_mgr.getBlockPtr(workspace_block_id, workspace);
 }
```
🤖 Prompt for AI Agents
In `@mllm/backends/ascend/ops/AscendSliceOp.cpp` around lines 103 - 115, The code
casts workspaceSize (uint64_t) to uint32_t before calling mem_mgr.allocateBlock,
which can truncate sizes > UINT32_MAX; update the Allocate path in the
AscendSliceOp flow by checking if workspaceSize >
std::numeric_limits<uint32_t>::max() after op->Setup (or if workspaceSize == 0)
and handle it deterministically (log and return/exit with an error similar to
MLLM_ERROR_EXIT or use an alternative allocation path) instead of blindly
casting; reference workspaceSize, op->Setup, getAscendMemoryManager(),
mem_mgr.allocateBlock(...) and mem_mgr.getBlockPtr(...) when making the change.
Summary
Implemented Concat and Slice operators for the Ascend backend using Huawei ATB.
Key Changes
- Added AscendConcatOp and AscendSliceOp.
- Registered the new ops in AscendBackend.cpp.
Testing
Summary by CodeRabbit
New Features
Tests