feat(ascend): initial Ascend backend and add elementwise add op #564
chenghuaWang merged 16 commits into UbiquitousLearning:main
Conversation
Walkthrough

Adds a complete Ascend backend: device enumeration, a global ATB/ACL context and stream, per-device memory pools and a manager, a pool-backed allocator, a dispatcher, ATB-based ops (Add, X2X), tests, CMake build integration, and documentation.

Changes
Sequence Diagram(s)

sequenceDiagram
participant Client
participant Dispatcher as AscendDispatcher
participant Backend as AscendBackend
participant MemMgr as AscendMemoryManager
participant MemPool as AscendMemoryPool
participant Alloc as AscendAllocator
participant ATB as ATB/ACL
Client->>Dispatcher: submit Task (kExecuteOp / kExecuteModule)
activate Dispatcher
Dispatcher->>Dispatcher: route & optional trace
Dispatcher->>Backend: forward(inputs, outputs)
activate Backend
Backend->>MemMgr: allocateBlock(workspace_size)
activate MemMgr
MemMgr->>MemPool: allocateBlock(size)
activate MemPool
MemPool-->>MemMgr: block_id + ptr
deactivate MemPool
MemMgr-->>Backend: workspace ptr + id
deactivate MemMgr
Backend->>ATB: create op, fill tensor descs, setup
ATB-->>Backend: op ready
Backend->>ATB: execute op
ATB-->>Backend: done
Backend->>MemMgr: freeBlock(block_id)
Backend-->>Dispatcher: complete
deactivate Backend
Dispatcher-->>Client: result
deactivate Dispatcher
sequenceDiagram
participant Init
participant Register as initAscendBackend()
participant MemMgr as AscendMemoryManager
participant MemPool as AscendMemoryPool
participant Alloc as AscendAllocator
participant Dispatcher as AscendDispatcher
participant Context as GlobalContext
Init->>Register: call initAscendBackend()
activate Register
Register->>MemMgr: createMemoryPool(100MB) per device
activate MemMgr
loop per device
MemMgr->>MemPool: construct pool (aclrtMalloc base region)
MemPool-->>MemMgr: pool ready
end
deactivate MemMgr
Register->>Context: register AscendBackend
Register->>MemMgr: register allocator to memory manager
Register->>Dispatcher: create & register AscendDispatcher
Register-->>Init: done
deactivate Register
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers
Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Actionable comments posted: 13
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
mllm/backends/ascend/AscendAllocator.hpp (1)
15-44: Delete copy/move for allocator (mutex member + resource tracking) to satisfy Rule of Five

 class AscendAllocator final : public Allocator {
  public:
   AscendAllocator();
-  ~AscendAllocator();
+  ~AscendAllocator() override;
+
+  AscendAllocator(const AscendAllocator&) = delete;
+  AscendAllocator& operator=(const AscendAllocator&) = delete;
+  AscendAllocator(AscendAllocator&&) = delete;
+  AscendAllocator& operator=(AscendAllocator&&) = delete;
🟠 Major comments (18)
examples/CMakeLists.txt (1)
18-18: Add conditional compilation guard for Ascend demo.

The Ascend demo is added unconditionally, which will cause build failures on systems without the Ascend SDK. Following the pattern of other backends (e.g., QNN with MLLM_BUILD_QNN_BACKEND), this should be conditional.

Apply this diff to add a conditional guard:

+if(MLLM_BUILD_ASCEND_BACKEND)
 add_subdirectory(ascend_add_demo)
+endif()

mllm/backends/ascend/ops/AscnedElewiseOps.cpp (1)
1-4: Fix filename typo (Ascned→Ascend) to avoid build/grep confusion.
Even if CMake references it correctly today, this will cause ongoing maintenance issues (and is easy to miss in future edits).

mllm/backends/ascend/memory/AscendMemoryPool.cpp (1)
73-104: Harden free/get semantics (and fix clang-tidy iterator naming).
getBlockPtr() should set addr = nullptr on failure; otherwise callers can accidentally reuse stale pointers. Also clang-tidy is flagging the `it` naming length.

 void AscendMemoryPool::freeBlock(int block_id) {
   std::unique_lock<std::mutex> lock(block_mutex_);
@@
-  auto it = used_blocks_.find(block_id);
-  if (it != used_blocks_.end()) {
-    free_blocks_.insert(*it);
-    used_blocks_.erase(it);
+  auto used_it = used_blocks_.find(block_id);
+  if (used_it != used_blocks_.end()) {
+    free_blocks_.insert(*used_it);
+    used_blocks_.erase(used_it);
   } else {
     MLLM_ERROR("Double free block id {}", block_id);
   }
 }

 void AscendMemoryPool::getBlockPtr(int block_id, void *&addr) {
   std::unique_lock<std::mutex> lock(block_mutex_);
+  addr = nullptr;
@@
-  auto it = used_blocks_.find(block_id);
-  if (it != used_blocks_.end()) {
-    addr = it->second.address_;
+  auto used_it = used_blocks_.find(block_id);
+  if (used_it != used_blocks_.end()) {
+    addr = used_it->second.address_;
   } else {
     MLLM_ERROR("Get block address error, block id {}", block_id);
   }
 }

mllm/backends/ascend/AscendAllocator.cpp (1)
23-47: Prevent size truncation when allocating blocks (size_t → uint32_t).
AscendMemoryManager::allocateBlock takes uint32_t; storage->size_ is typically size_t. Please validate/cast explicitly and fail cleanly if size_ > UINT32_MAX.

 bool AscendAllocator::alloc(Storage* storage) {
   auto& mem_manager = getAscendMemoryManager();
   int block_id = -1;
-  mem_manager.allocateBlock(storage->size_, block_id);
+  if (storage->size_ > std::numeric_limits<uint32_t>::max()) {
+    MLLM_ERROR("Storage size too large for Ascend memory pool: {} bytes", storage->size_);
+    return false;
+  }
+  mem_manager.allocateBlock(static_cast<uint32_t>(storage->size_), block_id);

mllm/backends/ascend/ops/AscendElewiseOps.hpp (1)
6-25: Make header self-contained (+ add brief API docs).
This file uses std::shared_ptr/std::make_shared but doesn't include <memory> (and similarly relies on transitive <vector>). Also, public op/factory declarations should have a short comment per guidelines.

 #pragma once

+#include <memory>
+#include <vector>
+
 #include "mllm/core/BaseOp.hpp"
 #include "mllm/core/aops/ElewiseOps.hpp"
 #include "mllm/core/OpTypes.hpp"

 namespace mllm::ascend {

+// Ascend implementation of elementwise add (currently no-broadcast, same-dtype).
 class AscendAddOp final : public aops::AddOp {
  public:
   explicit AscendAddOp(const aops::AddOpOptions& options);
@@
 class AscendAddOpFactory final : public TypedOpFactory<OpTypes::kAdd, aops::AddOpOptions> {
  public:
   std::shared_ptr<BaseOp> createOpImpl(const aops::AddOpOptions& options) override {
     return std::make_shared<AscendAddOp>(options);
   }
 };

Committable suggestion skipped: line range outside the PR's diff.
mllm/backends/ascend/memory/AscendMemoryManager.hpp (1)
6-35: Tighten API surface (includes + docs + avoid returning mutable ref).
Add <cstddef> for size_t, add short comments for public APIs, and prefer returning std::shared_ptr<AscendMemoryPool> (by value) or const std::shared_ptr<...>& to avoid external mutation of memory_pools_.

 #pragma once

+#include <cstddef>
 #include <memory>
 #include <vector>
 #include <cstdint>
 #include "AscendMemoryPool.hpp"
@@
 class AscendMemoryManager {
  public:
   AscendMemoryManager();

+  // Create a pool for the current device (or specified device if supported by impl).
   void createMemoryPool(size_t pool_size);
@@
-  std::shared_ptr<AscendMemoryPool> &getMemoryPool();
+  std::shared_ptr<AscendMemoryPool> getMemoryPool();

Committable suggestion skipped: line range outside the PR's diff.
mllm/backends/ascend/AscendAllocator.hpp (1)
40-43: Fix map key/comment mismatch (void* vs "Storage ptr")

Either change the key to Storage* (if that's what you mean), or update the comment to match what the pointer actually represents (e.g., storage->data() pointer / device ptr).

-  std::unordered_map<void*, int> storage_to_block_id_;  // Storage ptr -> block ID
+  std::unordered_map<void*, int> storage_to_block_id_;  // device ptr (e.g., storage->data()) -> block ID

mllm/backends/ascend/memory/AscendMemoryManager.cpp (1)
21-34: createMemoryPool: check aclrtSetDevice, restore previous device, and prevent double-init leaks

Today it changes global device context and doesn't restore it; also repeated calls will keep pushing pools.
 void AscendMemoryManager::createMemoryPool(size_t pool_size) {
-  uint32_t device_count = 0;
+  uint32_t device_count{};
   auto ret = aclrtGetDeviceCount(&device_count);
   MLLM_ACL_CHECK(ret);
+
+  int32_t prev_device_id{};
+  ret = aclrtGetDevice(&prev_device_id);
+  MLLM_ACL_CHECK(ret);
+
+  if (!memory_pools_.empty()) {
+    MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "createMemoryPool called more than once");
+  }
+
   for (size_t i = 0; i < device_count; i++) {
-
-    aclrtSetDevice(i);
+    ret = aclrtSetDevice(static_cast<int32_t>(i));
+    MLLM_ACL_CHECK(ret);
     std::shared_ptr<AscendMemoryPool> memory_pool = std::make_shared<AscendMemoryPool>(pool_size);
     memory_pools_.push_back(memory_pool);
     MLLM_INFO("create mempool for device {} success", i);
   }
+
+  ret = aclrtSetDevice(prev_device_id);
+  MLLM_ACL_CHECK(ret);
 }

mllm/backends/ascend/AscendDispatcher.cpp (1)
34-44: Remove dead MLLM_EMPTY_SCOPE statement in switch block, and consider guarding this capture with shared_from_this() for async safety

The MLLM_EMPTY_SCOPE; statement in the switch case is a no-op and can be removed. More importantly, the lambda [this, task] captures a raw pointer to the dispatcher. While the design assumes DispatcherManager keeps the dispatcher alive via shared_ptr, using shared_from_this() would provide defensive safety: if the dispatcher is ever removed from the manager while async work is pending, the raw this pointer would become invalid. Consider using [self = shared_from_this(), task] instead, which requires adding enable_shared_from_this<AscendDispatcher> to the class hierarchy.

mllm/backends/ascend/AscendCommon.cpp (1)
124-185: Revisit aclInit/aclFinalize inside the AscendDeviceMetaInfo ctor

Calling aclInit()/aclFinalize() inside a singleton constructor (Line 127, 174) can conflict with other runtime init (including getGlobalAtbContext(), which does ACL calls but doesn't init). Prefer a single, well-defined ACL lifetime owner (or make enumeration reuse the existing initialized runtime and avoid finalizing here).

mllm/backends/ascend/AscendCommon.hpp (1)
44-50: Default-initialize AscendDeviceInfo fields to satisfy member-init checks

 struct AscendDeviceInfo {
-  std::string name;
-  unsigned int id;
-  size_t total_memory;  // bytes
-  size_t free_memory;   // bytes
-  std::string soc_version;
+  std::string name{};
+  unsigned int id{0};
+  size_t total_memory{0};  // bytes
+  size_t free_memory{0};   // bytes
+  std::string soc_version{};
 };

mllm/backends/ascend/AscendCommon.cpp (1)
14-41: Guard against null Tensor internals and potential shared-storage corruption in release()

tensor_.impl()->storage()->ptr_ = nullptr; (Line 39) assumes impl/storage are non-null and that you "own" that storage uniquely. If Tensor storage is shared, nulling the pointer can break other tensors.

At minimum, add null checks; ideally, avoid writing Tensor internals directly (provide a Tensor API / storage owner API).
 void AscendTensorHandle::release() {
   if (block_id_ >= 0) {
     auto& mem_mgr = getAscendMemoryManager();
     mem_mgr.freeBlock(block_id_);
     block_id_ = -1;
-    tensor_.impl()->storage()->ptr_ = nullptr;
+    if (auto* impl = tensor_.impl(); impl && impl->storage()) {
+      impl->storage()->ptr_ = nullptr;
+    }
   }
 }

mllm/backends/ascend/AscendCommon.hpp (1)
69-89: Initialize tensor_ and add [[nodiscard]] to query/accessors

Addresses clang-tidy findings and avoids accidentally ignoring ownership state.

 struct AscendTensorHandle {
   AscendTensorHandle() = default;
@@
-  bool valid() const { return block_id_ >= 0; }  // Check if handle owns a valid memory block
+  [[nodiscard]] bool valid() const { return block_id_ >= 0; }

-  Tensor& tensor() { return tensor_; }              // Access tensor
-  const Tensor& tensor() const { return tensor_; }  // Access tensor (const)
-  int blockId() const { return block_id_; }         // Get memory block ID
+  [[nodiscard]] Tensor& tensor() { return tensor_; }
+  [[nodiscard]] const Tensor& tensor() const { return tensor_; }
+  [[nodiscard]] int blockId() const { return block_id_; }

  private:
-  Tensor tensor_;
+  Tensor tensor_{};
   int block_id_{-1};
 };

mllm/backends/ascend/CMakeLists.txt (1)
1-2: Add conditional guards for Ascend library linking based on RUN_MODE

RUN_MODE is defined (Line 1) but only used to conditionally apply the -g flag (Line 59); the Ascend libraries (ascendcl, opapi, nnopbase, atb; Lines 45–51) are linked unconditionally regardless of RUN_MODE. If cpu and sim modes should build without NPU-specific libraries, wrap the target_link_libraries() call (Lines 45–51) and related include/link directory setup (Lines 35–42) with if(RUN_MODE STREQUAL "npu") ... endif(). Additionally, the unconditional -O2 flag (Line 60) should align with the build type; consider applying it conditionally or only for Release builds.

mllm/backends/ascend/AscendCommon.hpp (1)
53-66: Make special member intent explicit for the singleton (AscendDeviceMetaInfo)

Clang-tidy will keep complaining unless you explicitly delete moves / default the dtor.

 class AscendDeviceMetaInfo {
  public:
   AscendDeviceMetaInfo();
+  ~AscendDeviceMetaInfo() = default;
@@
   AscendDeviceMetaInfo(const AscendDeviceMetaInfo&) = delete;
   AscendDeviceMetaInfo& operator=(const AscendDeviceMetaInfo&) = delete;
+  AscendDeviceMetaInfo(AscendDeviceMetaInfo&&) = delete;
+  AscendDeviceMetaInfo& operator=(AscendDeviceMetaInfo&&) = delete;

mllm/backends/ascend/AscendCommon.cpp (1)
43-77: Avoid uint32_t sizing / add allocation failure handling / reduce "swappable params" risk

- bytes is uint32_t (Line 58): can overflow for large tensors; prefer size_t.
- allocateBlock(bytes, block_id) (Line 60): no visible error handling if allocation fails/throws.
- Signature (..., int batch, int size) (Lines 43–45) is easy to swap.

-  const uint32_t bytes = static_cast<uint32_t>(expected_elements * sizeof(half_float::half));
+  const size_t bytes = expected_elements * sizeof(half_float::half);
   mem_mgr.allocateBlock(bytes, block_id);
+  MLLM_RT_ASSERT(block_id >= 0);

Consider changing the API to something like prepareAscendTensor(const std::vector<float>& host_data, std::array<int, 2> shape) or prepareAscendTensor(..., int batch, int features).

mllm/backends/ascend/AscendCommon.cpp (1)
113-122: Add bounds validation and use tensor dtype mapping

The function unconditionally writes to desc.shape.dims[i] without checking the array capacity; add a bounds check to ensure shape.size() doesn't exceed ATB's dimension limit (assert or validate the rank).

The hardcoded ACL_FLOAT16 dtype ignores t.dtype(), which is available on the Tensor class. Implement dtype mapping from DataTypes to ACL dtype constants (e.g., kFloat32 → ACL_FLOAT, kFloat16 → ACL_FLOAT16, kInt8 → ACL_INT8).

mllm/backends/ascend/AscendCommon.hpp (1)
17-28: Replace function-like error-check macros with inline functions

Both macros can trigger clang-tidy warnings (macro-usage) and reduce type safety. Replace with inline functions:

-#define MLLM_ACL_CHECK(err) \
-  if (err != ACL_SUCCESS) { \
-    MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "ACL error code {}: {}", int(err), aclGetRecentErrMsg()); \
-  }
-
-#define MLLM_ATB_CHECK(err) \
-  if (err != atb::NO_ERROR) { \
-    MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "ATB error code {}", int(err)); \
-  }
+namespace mllm::ascend {
+inline void aclCheck(aclError err) {
+  if (err != ACL_SUCCESS) {
+    MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "ACL error code {}: {}", int(err), aclGetRecentErrMsg());
+  }
+}
+inline void atbCheck(atb::ErrorCode err) {
+  if (err != atb::NO_ERROR) {
+    MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "ATB error code {}", int(err));
+  }
+}
+}  // namespace mllm::ascend

Update call sites: MLLM_ACL_CHECK(ret) → mllm::ascend::aclCheck(ret) (same for ATB).
🟡 Minor comments (4)
examples/ascend_add_demo/main.cpp (1)
117-117: Replace Chinese comment with English.

Comments in source files should be in English per the coding guidelines.
Apply this diff:
-  // 清理内存池中的块
+  // Release memory pool blocks
   x_handle.release();

examples/ascend_add_demo/main.cpp (1)
52-52: Remove debug print statement.

This appears to be a leftover debug statement from development.
Apply this diff:
   auto& ctx = Context::instance();
-  std::cout << "context over" << std::endl;
   auto z_ascend = ctx.buildOpAndSubmitTask(

examples/ascend_add_demo/CMakeLists.txt (1)
11-11: Replace Chinese comment with English.

Comments in source files should be in English per the coding guidelines.
Apply this diff:
-  ascendcl  # 添加 ACL 库,因为 main.cpp 中直接使用了 aclrtMemcpy
+  ascendcl  # ACL library required for aclrtMemcpy used in main.cpp

mllm/backends/ascend/AscendCommon.cpp (1)
1-187: Remove trailing whitespace at line 81 of AscendCommon.cpp

Verification found trailing whitespace at line 81 that violates the repository's encoding and formatting rules.
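Trailing whitespace like this can be found and stripped locally with standard GNU grep/sed; a sketch using a throwaway demo file (paths and file names here are illustrative, not from the PR):

```shell
# Demo file with one offending line ("foo" followed by a trailing space)
printf 'foo \nbar\n' > demo.cpp

# Find lines ending in a space or tab (GNU grep -P; prints line:content)
grep -nP '[ \t]$' demo.cpp

# Strip trailing whitespace in place (GNU sed), then verify none remain
sed -i 's/[ \t]*$//' demo.cpp
! grep -qP '[ \t]$' demo.cpp && echo "clean"
```

The same two commands work on the real file once pointed at mllm/backends/ascend/AscendCommon.cpp.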
🧹 Nitpick comments (11)
tasks/build_arm_ascend.yaml (2)
9-9: Android API level 28 is outdated; document the rationale or consider updating.

Android API level 28 (Android 9.0, released in 2018) is quite old. Modern Android projects typically target API 31+. If API 28 is intentional for broad compatibility, document this decision. If not, consider updating to a more recent level.
12-12: Hard-coded install prefix may limit portability.

The install prefix /root/mllm-install-android-arm64-v8a assumes /root/ is accessible and writable. While the current CI/CD workflow runs with elevated permissions, this will be problematic if the build configuration is used in non-root environments or in containers with restricted permissions.

Consider using a relative path for better portability:
-  - "-DCMAKE_INSTALL_PREFIX=/root/mllm-install-android-arm64-v8a"
+  - "-DCMAKE_INSTALL_PREFIX=./mllm-install-android-arm64-v8a"

This aligns with the pattern used in other build configurations like tasks/build_sdk_android.yaml and allows the install directory to be placed relative to the build directory regardless of the execution environment.

mllm/backends/ascend/Register.cpp (2)
19-19: Consider making memory pool size configurable.

The 100MB pool size is hardcoded. While the comment mentions "can be adjusted as needed," exposing this as a configuration parameter (environment variable or function argument) would provide better flexibility for different deployment scenarios.
Consider this approach:
// Add parameter with default
void initAscendBackend(size_t pool_size = 100 * 1024 * 1024) {
  auto& ctx = Context::instance();
  ascend::getAscendMemoryManager().createMemoryPool(pool_size);
  MLLM_INFO("Ascend memory pool initialized with {} MB", pool_size / (1024 * 1024));
  // ... rest of function
}
36-38: Remove or document commented code scaffolding.

The commented custom ops registration is placeholder code. Either remove it or add a TODO comment explaining when/how it should be used.
If keeping for documentation purposes:
// TODO: Example of registering custom ops:
// ctx.registerCustomizedOp(kAscend, "CustomOpName",
//                          std::make_shared<CustomOpFactory>());

mllm/backends/ascend/AscendDispatcher.hpp (1)
17-38: Add brief API comments for new public types (required by repo guidelines)

Public header introduces new API surface (options, dispatcher, factory) but has no purpose/behavior comments (params/returns/errors).
mllm/backends/ascend/memory/AscendMemoryPool.hpp (2)
20-24: Prefer size_t for allocation sizes to avoid truncation

uint32_t size can truncate large allocations on 64-bit builds; consider size_t.
15-25: Add brief comments for public methods (purpose / params / errors)

This is a new public header API and should describe allocation policy, failure behavior, and thread-safety expectations.
mllm/backends/ascend/CMakeLists.txt (2)
17-29: Avoid bare file(GLOB ...) for sources (or add CONFIGURE_DEPENDS)

Without CONFIGURE_DEPENDS, adding/removing files won't reliably trigger a CMake reconfigure. (Note CONFIGURE_DEPENDS goes after the variable name in file(GLOB).)

-file(GLOB MLLM_ASCEND_CORE_FILES ${CMAKE_CURRENT_SOURCE_DIR}/Ascend*.cpp)
-file(GLOB MLLM_ASCEND_MEMORY_FILES ${CMAKE_CURRENT_SOURCE_DIR}/memory/*.cpp)
-file(GLOB MLLM_ASCEND_OPS_FILES ${CMAKE_CURRENT_SOURCE_DIR}/ops/*.cpp)
+file(GLOB MLLM_ASCEND_CORE_FILES CONFIGURE_DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/Ascend*.cpp")
+file(GLOB MLLM_ASCEND_MEMORY_FILES CONFIGURE_DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/memory/*.cpp")
+file(GLOB MLLM_ASCEND_OPS_FILES CONFIGURE_DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/ops/*.cpp")
30-42: Quote env-derived paths; consider avoiding PUBLIC exposure of machine-local include dirs
Unquoted $ENV{...} breaks if the path contains spaces. Also, PUBLIC include dirs propagate into dependents (and can leak build-machine paths).

-    target_include_directories(MllmAscendBackend PUBLIC $ENV{ASCEND_HOME_PATH}/include)
-    target_link_directories(MllmAscendBackend PRIVATE $ENV{ASCEND_HOME_PATH}/lib64)
+    target_include_directories(MllmAscendBackend PUBLIC "$ENV{ASCEND_HOME_PATH}/include")
+    target_link_directories(MllmAscendBackend PRIVATE "$ENV{ASCEND_HOME_PATH}/lib64")
@@
-    target_include_directories(MllmAscendBackend PUBLIC $ENV{ATB_HOME_PATH}/include)
-    target_link_directories(MllmAscendBackend PRIVATE $ENV{ATB_HOME_PATH}/lib)
+    target_include_directories(MllmAscendBackend PUBLIC "$ENV{ATB_HOME_PATH}/include")
+    target_link_directories(MllmAscendBackend PRIVATE "$ENV{ATB_HOME_PATH}/lib")

mllm/backends/ascend/AscendCommon.hpp (2)
41-42: Rename t → tensor (identifier-length rule)

fillAtbTensorDesc(const Tensor& t, ...) (Line 41) is flagged; rename for readability + clang-tidy compliance.
92-95: Reduce "easily swappable parameters" for prepareAscendTensor

(host_data, int batch, int size) is easy to call incorrectly. Prefer prepareAscendTensor(host_data, /*shape=*/{batch, size}) or rename size → features (or similar).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
examples/CMakeLists.txt (1 hunks)
examples/ascend_add_demo/CMakeLists.txt (1 hunks)
examples/ascend_add_demo/main.cpp (1 hunks)
mllm/backends/ascend/AscendAllocator.cpp (1 hunks)
mllm/backends/ascend/AscendAllocator.hpp (2 hunks)
mllm/backends/ascend/AscendBackend.cpp (1 hunks)
mllm/backends/ascend/AscendCommon.cpp (1 hunks)
mllm/backends/ascend/AscendCommon.hpp (1 hunks)
mllm/backends/ascend/AscendDispatcher.cpp (1 hunks)
mllm/backends/ascend/AscendDispatcher.hpp (1 hunks)
mllm/backends/ascend/CMakeLists.txt (2 hunks)
mllm/backends/ascend/Register.cpp (1 hunks)
mllm/backends/ascend/memory/AscendMemoryManager.cpp (1 hunks)
mllm/backends/ascend/memory/AscendMemoryManager.hpp (1 hunks)
mllm/backends/ascend/memory/AscendMemoryPool.cpp (1 hunks)
mllm/backends/ascend/memory/AscendMemoryPool.hpp (1 hunks)
mllm/backends/ascend/memory/MemoryBlock.hpp (1 hunks)
mllm/backends/ascend/ops/AscendElewiseOps.hpp (1 hunks)
mllm/backends/ascend/ops/AscnedElewiseOps.cpp (1 hunks)
mllm/core/DeviceTypes.hpp (2 hunks)
mllm/engine/Dispatcher.hpp (1 hunks)
mllm/utils/Common.hpp (1 hunks)
tasks/build_arm_ascend.yaml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
mllm/engine/Dispatcher.hpp
mllm/backends/ascend/ops/AscendElewiseOps.hpp
mllm/backends/ascend/memory/AscendMemoryManager.hpp
mllm/backends/ascend/AscendBackend.cpp
mllm/backends/ascend/memory/AscendMemoryPool.hpp
mllm/backends/ascend/AscendCommon.hpp
mllm/backends/ascend/ops/AscnedElewiseOps.cpp
mllm/backends/ascend/AscendDispatcher.cpp
mllm/backends/ascend/AscendDispatcher.hpp
mllm/backends/ascend/memory/AscendMemoryPool.cpp
mllm/backends/ascend/memory/MemoryBlock.hpp
mllm/backends/ascend/memory/AscendMemoryManager.cpp
mllm/utils/Common.hpp
mllm/core/DeviceTypes.hpp
mllm/backends/ascend/Register.cpp
mllm/backends/ascend/AscendAllocator.cpp
mllm/backends/ascend/AscendCommon.cpp
mllm/backends/ascend/AscendAllocator.hpp
mllm/backends/ascend/CMakeLists.txt
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
mllm/engine/Dispatcher.hpp
mllm/backends/ascend/ops/AscendElewiseOps.hpp
mllm/backends/ascend/memory/AscendMemoryManager.hpp
mllm/backends/ascend/AscendBackend.cpp
mllm/backends/ascend/memory/AscendMemoryPool.hpp
mllm/backends/ascend/AscendCommon.hpp
mllm/backends/ascend/ops/AscnedElewiseOps.cpp
mllm/backends/ascend/AscendDispatcher.cpp
mllm/backends/ascend/AscendDispatcher.hpp
mllm/backends/ascend/memory/AscendMemoryPool.cpp
mllm/backends/ascend/memory/MemoryBlock.hpp
mllm/backends/ascend/memory/AscendMemoryManager.cpp
mllm/utils/Common.hpp
mllm/core/DeviceTypes.hpp
mllm/backends/ascend/Register.cpp
mllm/backends/ascend/AscendAllocator.cpp
mllm/backends/ascend/AscendCommon.cpp
mllm/backends/ascend/AscendAllocator.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
mllm/engine/Dispatcher.hpp
mllm/backends/ascend/ops/AscendElewiseOps.hpp
mllm/backends/ascend/memory/AscendMemoryManager.hpp
mllm/backends/ascend/AscendBackend.cpp
mllm/backends/ascend/memory/AscendMemoryPool.hpp
mllm/backends/ascend/AscendCommon.hpp
mllm/backends/ascend/ops/AscnedElewiseOps.cpp
mllm/backends/ascend/AscendDispatcher.cpp
mllm/backends/ascend/AscendDispatcher.hpp
mllm/backends/ascend/memory/AscendMemoryPool.cpp
mllm/backends/ascend/memory/MemoryBlock.hpp
mllm/backends/ascend/memory/AscendMemoryManager.cpp
mllm/utils/Common.hpp
mllm/core/DeviceTypes.hpp
mllm/backends/ascend/Register.cpp
mllm/backends/ascend/AscendAllocator.cpp
mllm/backends/ascend/AscendCommon.cpp
mllm/backends/ascend/AscendAllocator.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
mllm/backends/ascend/AscendBackend.cpp
mllm/backends/ascend/ops/AscnedElewiseOps.cpp
mllm/backends/ascend/AscendDispatcher.cpp
mllm/backends/ascend/memory/AscendMemoryPool.cpp
mllm/backends/ascend/memory/AscendMemoryManager.cpp
mllm/backends/ascend/Register.cpp
mllm/backends/ascend/AscendAllocator.cpp
mllm/backends/ascend/AscendCommon.cpp
{mllm,mllm-cli,pymllm}/**/*.{toml,json,txt,md}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
When suggesting new dependencies, evaluate their maturity, licensing, and maintenance status. Prefer well-established and actively maintained libraries.
Files:
mllm/backends/ascend/CMakeLists.txt
🧬 Code graph analysis (7)
mllm/backends/ascend/memory/AscendMemoryManager.hpp (1)
mllm/backends/ascend/memory/AscendMemoryPool.hpp (3)
size (20-20)
block_id (22-22)
block_id (24-24)
mllm/backends/ascend/AscendBackend.cpp (4)
mllm/backends/ascend/AscendBackend.hpp (1)
AscendBackend (13-13)
mllm/backends/ascend/AscendAllocator.cpp (2)
createAscendAllocator (104-104)
createAscendAllocator (104-104)
mllm/backends/ascend/AscendAllocator.hpp (1)
createAscendAllocator (46-46)
mllm/backends/ascend/AscendCommon.hpp (1)
instance (57-60)
mllm/backends/ascend/memory/AscendMemoryPool.hpp (1)
mllm/backends/ascend/memory/AscendMemoryPool.cpp (2)
AscendMemoryPool (14-26)
AscendMemoryPool (28-36)
mllm/backends/ascend/AscendCommon.hpp (1)
mllm/backends/ascend/AscendCommon.cpp (5)
AscendTensorHandle (14-15)
AscendTensorHandle (17-17)
AscendTensorHandle (19-22)
other (24-32)
other (24-24)
mllm/backends/ascend/ops/AscnedElewiseOps.cpp (1)
mllm/backends/ascend/ops/AscendElewiseOps.hpp (5)
AscendAddOp (14-14)
options (22-24)
options (22-22)
inputs (16-16)
inputs (17-17)
mllm/backends/ascend/Register.cpp (5)
mllm/backends/ascend/AscendCommon.hpp (1)
instance (57-60)
mllm/backends/ascend/memory/AscendMemoryManager.hpp (1)
pool_size (18-18)
mllm/engine/DispatcherManager.hpp (1)
dispatcher (49-49)
mllm/backends/ascend/AscendDispatcher.cpp (2)
createAscendDispatcher (84-87)
createAscendDispatcher (84-85)
mllm/backends/ascend/AscendDispatcher.hpp (1)
createAscendDispatcher (36-37)
mllm/backends/ascend/AscendAllocator.hpp (1)
mllm/backends/ascend/AscendAllocator.cpp (2)
AscendAllocator (11-13)
AscendAllocator (15-21)
🪛 Clang (14.0.6)
mllm/engine/Dispatcher.hpp
[error] 28-28: invalid case style for global constant 'ascend_dispatcher_id'
(readability-identifier-naming,-warnings-as-errors)
examples/ascend_add_demo/main.cpp
[error] 98-98: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 111-111: if with identical then and else branches
(bugprone-branch-clone,-warnings-as-errors)
mllm/backends/ascend/memory/AscendMemoryPool.hpp
[error] 15-15: class 'AscendMemoryPool' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
mllm/backends/ascend/AscendCommon.hpp
[error] 18-18: function-like macro 'MLLM_ACL_CHECK' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 19-19: macro argument should be enclosed in parentheses
(bugprone-macro-parentheses,-warnings-as-errors)
[error] 24-24: function-like macro 'MLLM_ATB_CHECK' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 25-25: macro argument should be enclosed in parentheses
(bugprone-macro-parentheses,-warnings-as-errors)
[error] 41-41: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 44-44: constructor does not initialize these fields: name, id, total_memory, free_memory, soc_version
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 53-53: class 'AscendDeviceMetaInfo' defines a copy constructor and a copy assignment operator but does not define a destructor, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 69-69: constructor does not initialize these fields: tensor_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 80-80: function 'valid' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 83-83: function 'tensor' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 84-84: function 'blockId' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
mllm/backends/ascend/AscendDispatcher.hpp
[error] 13-13: constructor does not initialize these fields: MLLM_EMPTY_SCOPE
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 21-21: parameter name 'id' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/ascend/memory/AscendMemoryPool.cpp
[error] 14-14: constructor does not initialize these fields: block_mutex_, used_blocks_, free_blocks_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 42-42: method 'allocateBlock' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 45-45: variable 'align_size' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 59-59: variable 'cur_mem_ptr_align' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 73-73: method 'freeBlock' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 81-81: variable name 'it' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 90-90: method 'getBlockPtr' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 98-98: variable name 'it' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/ascend/memory/MemoryBlock.hpp
[error] 11-11: constructor does not initialize these fields: block_id_, block_size_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
mllm/backends/ascend/memory/AscendMemoryManager.cpp
[error] 13-13: variable 'g_ascendMemoryManager' is non-const and globally accessible, consider making it const
(cppcoreguidelines-avoid-non-const-global-variables,-warnings-as-errors)
[error] 13-13: invalid case style for variable 'g_ascendMemoryManager'
(readability-identifier-naming,-warnings-as-errors)
[error] 15-15: use '= default' to define a trivial default constructor
(modernize-use-equals-default,-warnings-as-errors)
[error] 21-21: method 'createMemoryPool' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 23-23: variable 'device_count' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 38-38: variable 'device_id' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 46-46: variable 'device_id' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
mllm/backends/ascend/Register.cpp
[error] 19-19: variable 'pool_size' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
mllm/backends/ascend/AscendCommon.cpp
[error] 12-12: variable 'g_atb_stream' is non-const and globally accessible, consider making it const
(cppcoreguidelines-avoid-non-const-global-variables,-warnings-as-errors)
[error] 43-43: 3 adjacent parameters of 'prepareAscendTensor' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 46-46: variable 'expected_elements' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 58-58: variable 'bytes' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 80-80: variable 'ctx' provides global access to a non-const object; consider making the pointed-to data 'const'
(cppcoreguidelines-avoid-non-const-global-variables,-warnings-as-errors)
[error] 113-113: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/ascend/AscendAllocator.hpp
[error] 15-15: class 'AscendAllocator' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
🔇 Additional comments (9)
mllm/utils/Common.hpp (1)
34-34: LGTM! Well-integrated enum value addition. The `kAscendError` enum value is correctly placed alongside other backend-specific error codes and follows the established naming convention.
mllm/core/DeviceTypes.hpp (1)
36-37: LGTM! Ascend device type integration follows existing patterns. The additions to both conversion functions are consistent with existing device types and properly handle the new Ascend and AscendHost device types.
Also applies to: 52-55
mllm/engine/Dispatcher.hpp (1)
28-28: LGTM! Dispatcher ID follows existing naming conventions. The naming style is consistent with all other dispatcher IDs in the file (cpu_dispatcher_id, cuda_dispatcher_id, etc.). While the static analysis tool flags the naming convention, changing only this identifier would create inconsistency with the established codebase pattern.
mllm/backends/ascend/AscendBackend.cpp (1)
13-21: LGTM! Backend initialization follows established patterns. The constructor properly registers the operation factory and logs device information. The lambda for byte-to-MB conversion is clear, and the logging will help users verify their Ascend hardware configuration.
examples/ascend_add_demo/main.cpp (1)
17-21: LGTM! Excellent error handling and resource management. The demo demonstrates good practices:
- Comprehensive error handling with try-catch
- Proper cleanup on both success and error paths (lines 70-72, 118-119)
- Clear verification logic with tolerance checking
- Informative output for users
Also applies to: 63-73, 111-121, 123-126
mllm/backends/ascend/ops/AscnedElewiseOps.cpp (2)
22-35: The file `mllm/backends/ascend/ops/AscnedElewiseOps.cpp` does not exist in the repository. The review comment references a code location that cannot be found in the codebase. Verify the correct file path and file name before proceeding with this review. Likely an incorrect or invalid review comment.
37-116: The file `mllm/backends/ascend/ops/AscnedElewiseOps.cpp` does not exist in the repository, and the code snippet shown in the review comment cannot be located through any search. This means either:
- The code is being added in a pending change not yet in the repository, or
- The file path is incorrect (note the typo "Ascned" instead of "Ascend")
Without access to the actual file being reviewed, the substantive concerns cannot be definitively verified. However, the review comment raises legitimate C++ safety concerns that are generally valid:
- null-pointer checks for context objects are prudent defensive programming
- RAII guards for resource management (operation handles) align with C++ best practices per the coding guidelines
- Explicit handling of memory allocation failures is required for production-ready code
To properly verify this review comment, please:
- Confirm the correct file path
- Provide context on whether this is code in an uncommitted change or pending PR
- Ensure the file is accessible in the current repository state
mllm/backends/ascend/memory/AscendMemoryPool.cpp (1)
12-26: The file `mllm/backends/ascend/memory/AscendMemoryPool.cpp` mentioned in this review comment could not be located in the current repository state. Manual verification is required to:
- Confirm the file exists at the specified path or locate its actual location
- Verify the header file `AscendMemoryPool.hpp` and whether it declares a default argument for the constructor
- Validate the presence of trailing whitespace on the lines mentioned
- Confirm clang-tidy member-initialization warnings with the `-Werror` flag in the CI configuration
- Review the actual class member declarations to ensure the proposed initializer list in the diff matches all members
The review comment's claims regarding default argument placement, clang-tidy warnings, and whitespace violations cannot be verified without access to the actual source files.
mllm/backends/ascend/AscendAllocator.hpp (1)
20-20: Double-check `ctrlByMemManager()` semantics now that allocations are backed by the memory pool. If this flag is used by the engine to decide who owns lifecycle/free behavior, returning `false` may be wrong after switching to a global memory manager + block IDs.
```cpp
    case TaskTypes::kExecuteModule: {
      auto moduleName = static_cast<nn::Module*>(task->custom_context_ptr)->getModuleName();
#ifdef MLLM_PERFETTO_ENABLE
      MLLM_PERF_TRACE_EVENT("mllm.ascend.execute.", perfetto::DynamicString{moduleName},
                            [&](perfetto::EventContext ctx) {
                              int cnt = 0;
                              for (auto& i : task->inputs) {
                                ctx.AddDebugAnnotation(perfetto::DynamicString{"inputs-" + std::to_string(cnt++)},
                                                       i.shape());
                              }
                            });
#endif
      auto ascendBackend = std::static_pointer_cast<AscendBackend>(Context::instance().getBackend(kAscend));

      task->outputs = ((nn::Module*)(task->custom_context_ptr))->forward(task->inputs, task->args);

      // TODO:
      // ascendBackend->graphExecute(moduleName, task->inputs, task->outputs);
      break;
    }
    default: NYI("AscendDispatcher::process not supported task type");
  }
```
Avoid unchecked custom_context_ptr cast + fix unused ascendBackend (-Werror risk)
`task->custom_context_ptr` is used as `nn::Module*` without null/type validation. `ascendBackend` is currently unused (likely `-Wunused-variable`).
```diff
     case TaskTypes::kExecuteModule: {
-      auto moduleName = static_cast<nn::Module*>(task->custom_context_ptr)->getModuleName();
+      auto* module = static_cast<nn::Module*>(task->custom_context_ptr);
+      if (module == nullptr) {
+        MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Task custom_context_ptr is null (expected nn::Module*)");
+      }
+      auto moduleName = module->getModuleName();
 #ifdef MLLM_PERFETTO_ENABLE
       MLLM_PERF_TRACE_EVENT("mllm.ascend.execute.", perfetto::DynamicString{moduleName},
                             [&](perfetto::EventContext ctx) {
                               int cnt = 0;
                               for (auto& i : task->inputs) {
                                 ctx.AddDebugAnnotation(perfetto::DynamicString{"inputs-" + std::to_string(cnt++)},
                                                        i.shape());
                               }
                             });
 #endif
-      auto ascendBackend = std::static_pointer_cast<AscendBackend>(Context::instance().getBackend(kAscend));
+      [[maybe_unused]] auto ascend_backend =
+          std::static_pointer_cast<AscendBackend>(Context::instance().getBackend(kAscend));
-      task->outputs = ((nn::Module*)(task->custom_context_ptr))->forward(task->inputs, task->args);
+      task->outputs = module->forward(task->inputs, task->args);
-      // TODO:
-      // ascendBackend->graphExecute(moduleName, task->inputs, task->outputs);
+      // TODO: Wire graph execution path (if/when supported on Ascend backend).
+      // ascend_backend->graphExecute(moduleName, task->inputs, task->outputs);
       break;
     }
```
🤖 Prompt for AI Agents
In mllm/backends/ascend/AscendDispatcher.cpp around lines 56 to 77, avoid the
unchecked static cast of task->custom_context_ptr to nn::Module* and resolve the
unused ascendBackend variable: first validate that task->custom_context_ptr is
non-null and is a valid nn::Module instance (use a safe cast such as
dynamic_cast on a properly-typed pointer or check a runtime type field) and
handle the error path (log and bail out) if the pointer is null or the cast
fails; replace the static_cast with that safe cast and use the resulting
nn::Module* to call forward; for ascendBackend either remove the variable if not
needed or mark it used (e.g., call the TODO graphExecute path or explicitly
suppress unused-warning) so there is no -Wunused-variable.
```cpp
std::shared_ptr<AscendMemoryPool> &AscendMemoryManager::getMemoryPool()
{
    size_t device_id = static_cast<size_t>(getDeviceId());
    if (device_id >= memory_pools_.size()) {
        MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Invalid device id {}", device_id);
    }
    return memory_pools_[device_id];
}
```
getMemoryPool: handle negative device id and “not initialized” state explicitly
Casting -1 to size_t creates a huge index and hides the real error.
```diff
 std::shared_ptr<AscendMemoryPool> &AscendMemoryManager::getMemoryPool()
 {
-    size_t device_id = static_cast<size_t>(getDeviceId());
+    if (memory_pools_.empty()) {
+        MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Ascend memory pools not initialized");
+    }
+    const int32_t device_id_i32 = getDeviceId();
+    if (device_id_i32 < 0) {
+        MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Invalid device id {}", device_id_i32);
+    }
+    const size_t device_id = static_cast<size_t>(device_id_i32);
     if (device_id >= memory_pools_.size()) {
         MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Invalid device id {}", device_id);
     }
     return memory_pools_[device_id];
 }
```
🧰 Tools
🪛 Clang (14.0.6)
[error] 46-46: variable 'device_id' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
🤖 Prompt for AI Agents
In mllm/backends/ascend/memory/AscendMemoryManager.cpp around lines 44 to 51,
the current code casts getDeviceId() to size_t which hides negative values
(e.g., -1) by producing a huge unsigned index; change to read the device id as a
signed integer (auto or int), explicitly check for device_id < 0 and call
MLLM_ERROR_EXIT with a clear "device not initialized" or "invalid device id"
message, then check the non-negative device_id against memory_pools_.size()
before returning memory_pools_[device_id]; this ensures negative/uninitialized
ids are handled distinctly from out-of-range indices.
```cpp
void AscendMemoryPool::allocateBlock(uint32_t size, int &block_id) {
  std::unique_lock<std::mutex> lock(block_mutex_);

  size_t align_size = ((size + 31) & ~31) + 32;

  for (auto it = free_blocks_.begin(); it != free_blocks_.end(); it++) {
    if (it->second.block_size_ >= align_size) {
      block_id = it->second.block_id_;
      used_blocks_.insert(*it);
      free_blocks_.erase(it);
      MLLM_INFO("find free block id {} to allocate", block_id);
      return;
    }
  }

  if (remain_size_ > align_size) {
    block_id = generateBlocksId();
    uint64_t cur_mem_ptr_align = (reinterpret_cast<uint64_t>(cur_mem_ptr_) + 63) & ~63;
    remain_size_ -= (cur_mem_ptr_align - reinterpret_cast<uint64_t>(cur_mem_ptr_));
    cur_mem_ptr_ = reinterpret_cast<void *>(cur_mem_ptr_align);

    MemoryBlock block = {block_id, align_size, cur_mem_ptr_};
    used_blocks_.insert({block_id, block});
    remain_size_ -= align_size;
    cur_mem_ptr_ = reinterpret_cast<uint8_t *>(cur_mem_ptr_) + align_size;
    MLLM_INFO("allocate block id {} for size {}", block_id, align_size);
    return;
  }
  MLLM_ERROR("allocate block fail");
}
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# 1. Find AscendMemoryPool and related files
echo "=== Finding Ascend memory-related files ==="
find . -type f -name "*AscendMemory*" -o -name "*AscendAllocator*" 2>/dev/null | head -20

# 2. Look for AscendAllocator::alignSize
echo -e "\n=== Searching for AscendAllocator::alignSize ==="
rg "alignSize" --type cpp --type h -B 2 -A 5 2>/dev/null | head -60

# 3. Check current memory pool implementation
echo -e "\n=== Current AscendMemoryPool.cpp (lines around 42-71) ==="
wc -l mllm/backends/ascend/memory/AscendMemoryPool.cpp 2>/dev/null
```
Repository: UbiquitousLearning/mllm
Length of output: 4294
Fix pool capacity check + alignment math (padding underflow / exact-fit bug).
The allocation check uses `remain_size_ > align_size` but fails to account for pointer-alignment padding, causing exact-fit allocations to be rejected and potentially driving `remain_size_` negative in edge cases. For example, with remain_size_ = 96 and align_size = 96, the check 96 > 96 fails even though the allocation is possible. Conversely, when padding to a 64 B boundary (up to 63 bytes) is needed, a value like remain_size_ = 144 passes the check for a 96-byte block, but subtracting, say, 56 bytes of padding plus the 96-byte block leaves remain_size_ at -8. The alignment formula `((size + 31) & ~31) + 32` also uses an inconsistent granularity compared to the pointer alignment `(ptr + 63) & ~63`, and the `uint32_t` size parameter can overflow in the arithmetic.
Use uintptr_t and unify to 64B alignment with constexpr size_t kAlign = 64. Cast the size parameter to size_t to prevent overflow. Change the capacity check to remain_size_ >= padding + align_size to account for alignment overhead and allow exact-fit allocations. Rename the iterator variable to free_it for clarity. Set block_id = -1 on allocation failure and enhance the error log with allocation parameters for debugging.
```diff
 void AscendMemoryPool::allocateBlock(uint32_t size, int &block_id) {
   std::unique_lock<std::mutex> lock(block_mutex_);
-  size_t align_size = ((size + 31) & ~31) + 32;
+  constexpr size_t kAlign = 64;
+  const size_t req = static_cast<size_t>(size);
+  const size_t align_size = (req + (kAlign - 1)) & ~(kAlign - 1);
-  for (auto it = free_blocks_.begin(); it != free_blocks_.end(); it++) {
-    if (it->second.block_size_ >= align_size) {
-      block_id = it->second.block_id_;
-      used_blocks_.insert(*it);
-      free_blocks_.erase(it);
+  for (auto free_it = free_blocks_.begin(); free_it != free_blocks_.end(); ++free_it) {
+    if (free_it->second.block_size_ >= align_size) {
+      block_id = free_it->second.block_id_;
+      used_blocks_.insert(*free_it);
+      free_blocks_.erase(free_it);
       MLLM_INFO("find free block id {} to allocate", block_id);
       return;
     }
   }
-  if (remain_size_ > align_size) {
+  const uintptr_t cur = reinterpret_cast<uintptr_t>(cur_mem_ptr_);
+  const uintptr_t aligned = (cur + (kAlign - 1)) & ~(static_cast<uintptr_t>(kAlign - 1));
+  const size_t padding = static_cast<size_t>(aligned - cur);
+  if (remain_size_ >= padding + align_size) {
     block_id = generateBlocksId();
-    uint64_t cur_mem_ptr_align = (reinterpret_cast<uint64_t>(cur_mem_ptr_) + 63) & ~63;
-    remain_size_ -= (cur_mem_ptr_align - reinterpret_cast<uint64_t>(cur_mem_ptr_));
-    cur_mem_ptr_ = reinterpret_cast<void *>(cur_mem_ptr_align);
+    remain_size_ -= padding;
+    cur_mem_ptr_ = reinterpret_cast<void *>(aligned);
     MemoryBlock block = {block_id, align_size, cur_mem_ptr_};
     used_blocks_.insert({block_id, block});
     remain_size_ -= align_size;
     cur_mem_ptr_ = reinterpret_cast<uint8_t *>(cur_mem_ptr_) + align_size;
     MLLM_INFO("allocate block id {} for size {}", block_id, align_size);
     return;
   }
-  MLLM_ERROR("allocate block fail");
+  block_id = -1;
+  MLLM_ERROR("allocate block fail: req={}, aligned={}, remain={}", req, align_size, remain_size_);
 }
```
🧰 Tools
🪛 Clang (14.0.6)
[error] 42-42: method 'allocateBlock' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 45-45: variable 'align_size' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 59-59: variable 'cur_mem_ptr_align' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
🤖 Prompt for AI Agents
In mllm/backends/ascend/memory/AscendMemoryPool.cpp around lines 42 to 71, the
allocation logic does incorrect alignment math and capacity checking: change to
use constexpr size_t kAlign = 64 and cast the uint32_t size parameter to size_t
to avoid overflow, compute align_size as a 64-byte-aligned size (use the same
64B alignment rule you use for the pointer), compute padding =
((uintptr_t(cur_mem_ptr_) + (kAlign-1)) & ~(kAlign-1)) -
uintptr_t(cur_mem_ptr_), and then check remain_size_ >= padding + align_size
(allowing exact-fit and preventing underflow); when advancing pointer use
uintptr_t for alignment and update remain_size_ by subtracting padding before
subtracting align_size; rename the free-block iterator to free_it for clarity;
on failure set block_id = -1 and make the error log include current size,
align_size, padding and remain_size_ for debugging.
```cpp
class AscendMemoryPool {
 public:
  explicit AscendMemoryPool(size_t pool_size);
  ~AscendMemoryPool();

  void allocateBlock(uint32_t size, int& block_id);

  void freeBlock(int block_id);

  void getBlockPtr(int block_id, void*& addr);

 private:
  uint64_t generateBlocksId();

  std::atomic<uint64_t> id_ = 0;
  std::mutex block_mutex_;

  void* base_mem_ptr_ = nullptr;
  void* cur_mem_ptr_ = nullptr;
  int64_t remain_size_ = 0;

  std::unordered_map<int, MemoryBlock> used_blocks_;
  std::unordered_map<int, MemoryBlock> free_blocks_;
};
```
Define copy/move semantics (Rule of Five) for pool owning raw device memory
The class owns raw device memory through `base_mem_ptr_` and holds a mutex and block maps, so copying is unsafe. Note the `std::mutex` member only incidentally suppresses the implicitly generated copy/move operations; the ownership intent should be stated explicitly by deleting all four special members so a future member change cannot silently reintroduce copies that double-free the pool.
```diff
 class AscendMemoryPool {
  public:
   explicit AscendMemoryPool(size_t pool_size);
   ~AscendMemoryPool();
+
+  AscendMemoryPool(const AscendMemoryPool&) = delete;
+  AscendMemoryPool& operator=(const AscendMemoryPool&) = delete;
+  AscendMemoryPool(AscendMemoryPool&&) = delete;
+  AscendMemoryPool& operator=(AscendMemoryPool&&) = delete;
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
class AscendMemoryPool {
 public:
  explicit AscendMemoryPool(size_t pool_size);
  ~AscendMemoryPool();

  AscendMemoryPool(const AscendMemoryPool&) = delete;
  AscendMemoryPool& operator=(const AscendMemoryPool&) = delete;
  AscendMemoryPool(AscendMemoryPool&&) = delete;
  AscendMemoryPool& operator=(AscendMemoryPool&&) = delete;

  void allocateBlock(uint32_t size, int& block_id);

  void freeBlock(int block_id);

  void getBlockPtr(int block_id, void*& addr);

 private:
  uint64_t generateBlocksId();

  std::atomic<uint64_t> id_ = 0;
  std::mutex block_mutex_;

  void* base_mem_ptr_ = nullptr;
  void* cur_mem_ptr_ = nullptr;
  int64_t remain_size_ = 0;

  std::unordered_map<int, MemoryBlock> used_blocks_;
  std::unordered_map<int, MemoryBlock> free_blocks_;
};
```
🧰 Tools
🪛 Clang (14.0.6)
[error] 15-15: class 'AscendMemoryPool' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
Actionable comments posted: 3
♻️ Duplicate comments (4)
mllm/backends/ascend/AscendDispatcher.cpp (1)
55-73: Unchecked `custom_context_ptr` cast and unused `ascendBackend` variable. The `task->custom_context_ptr` is cast to `nn::Module*` without null validation (lines 56, 69), which could cause undefined behavior. Additionally, `ascendBackend` on line 67 is unused, which will trigger `-Wunused-variable` warnings. Apply this diff to add validation and suppress the unused warning:

```diff
     case TaskTypes::kExecuteModule: { // TODO: execute module
-      auto moduleName = static_cast<nn::Module*>(task->custom_context_ptr)->getModuleName();
+      auto* module = static_cast<nn::Module*>(task->custom_context_ptr);
+      if (module == nullptr) {
+        MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Task custom_context_ptr is null");
+      }
+      auto moduleName = module->getModuleName();
 #ifdef MLLM_PERFETTO_ENABLE
       MLLM_PERF_TRACE_EVENT("mllm.ascend.execute.", perfetto::DynamicString{moduleName},
                             [&](perfetto::EventContext ctx) {
                               int cnt = 0;
                               for (auto& i : task->inputs) {
                                 ctx.AddDebugAnnotation(perfetto::DynamicString{"inputs-" + std::to_string(cnt++)},
                                                        i.shape());
                               }
                             });
 #endif
-      auto ascendBackend = std::static_pointer_cast<AscendBackend>(Context::instance().getBackend(kAscend));
+      [[maybe_unused]] auto ascendBackend =
+          std::static_pointer_cast<AscendBackend>(Context::instance().getBackend(kAscend));
-      task->outputs = ((nn::Module*)(task->custom_context_ptr))->forward(task->inputs, task->args);
+      task->outputs = module->forward(task->inputs, task->args);
```
case TaskTypes::kExecuteModule: { //TODO: execute module - auto moduleName = static_cast<nn::Module*>(task->custom_context_ptr)->getModuleName(); + auto* module = static_cast<nn::Module*>(task->custom_context_ptr); + if (module == nullptr) { + MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Task custom_context_ptr is null"); + } + auto moduleName = module->getModuleName(); #ifdef MLLM_PERFETTO_ENABLE MLLM_PERF_TRACE_EVENT("mllm.ascend.execute.", perfetto::DynamicString{moduleName}, [&](perfetto::EventContext ctx) { int cnt = 0; for (auto& i : task->inputs) { ctx.AddDebugAnnotation(perfetto::DynamicString{"inputs-" + std::to_string(cnt++)}, i.shape()); } }); #endif - auto ascendBackend = std::static_pointer_cast<AscendBackend>(Context::instance().getBackend(kAscend)); + [[maybe_unused]] auto ascendBackend = + std::static_pointer_cast<AscendBackend>(Context::instance().getBackend(kAscend)); - task->outputs = ((nn::Module*)(task->custom_context_ptr))->forward(task->inputs, task->args); + task->outputs = module->forward(task->inputs, task->args);mllm/backends/ascend/AscendDispatcher.hpp (1)
13-15: `AscendDispatcherOptions` using `MLLM_EMPTY_SCOPE` causes an uninitialized-field warning. Static analysis reports that the implicit constructor doesn't initialize fields introduced by `MLLM_EMPTY_SCOPE`. If the struct is currently empty, prefer a truly empty struct body or a comment placeholder.

```diff
 struct AscendDispatcherOptions {
-  MLLM_EMPTY_SCOPE;
+  // Reserved for future options.
 };
```

mllm/backends/ascend/memory/AscendMemoryPool.hpp (1)
15-38: Define copy/move semantics (Rule of Five) - the class owns raw device memory. The class defines a non-default destructor that frees `base_mem_ptr_` but declares no explicit copy/move policy; copying it would double-free device memory, and the mutex and maps also make copying semantically incorrect.

```diff
 class AscendMemoryPool {
  public:
   explicit AscendMemoryPool(size_t pool_size);
   ~AscendMemoryPool();
+
+  AscendMemoryPool(const AscendMemoryPool&) = delete;
+  AscendMemoryPool& operator=(const AscendMemoryPool&) = delete;
+  AscendMemoryPool(AscendMemoryPool&&) = delete;
+  AscendMemoryPool& operator=(AscendMemoryPool&&) = delete;

   void allocateBlock(uint32_t size, int& block_id);
```

mllm/backends/ascend/memory/AscendMemoryManager.cpp (1)
43-50: CRITICAL: Handle negative device ID before casting to `size_t`. This issue was flagged in a previous review but remains unaddressed. Casting a negative device ID (e.g., -1) to `size_t` produces a very large unsigned value that trips the bounds check with a misleading message instead of reporting the real error. Apply the fix suggested in the previous review:

```diff
 std::shared_ptr<AscendMemoryPool> &AscendMemoryManager::getMemoryPool()
 {
-    size_t device_id = static_cast<size_t>(getDeviceId());
+    if (memory_pools_.empty()) {
+        MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Ascend memory pools not initialized");
+    }
+    const int32_t device_id_i32 = getDeviceId();
+    if (device_id_i32 < 0) {
+        MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Invalid device id {}", device_id_i32);
+    }
+    const size_t device_id = static_cast<size_t>(device_id_i32);
     if (device_id >= memory_pools_.size()) {
         MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Invalid device id {}", device_id);
     }
     return memory_pools_[device_id];
 }
```
🧹 Nitpick comments (6)
examples/ascend_add_demo/main.cpp (3)
22-41: Consider extracting repetitive print logic into a helper function. The loops for printing vectors (lines 29-34 and 36-41) follow the same pattern and could be refactored into a reusable helper:

```cpp
void printVector(const std::string& label, const std::vector<float>& vec) {
  std::cout << "  " << label << ": [";
  for (size_t i = 0; i < vec.size(); ++i) {
    std::cout << vec[i];
    if (i < vec.size() - 1) std::cout << ", ";
  }
  std::cout << "]" << std::endl;
}
```

This would reduce duplication and improve readability.
82-109: Consider extracting verification logic for reusability. The verification loop (lines 100-109) and the print loops (lines 83-88, 90-95) could be extracted into helper functions to reduce duplication and improve testability:

```cpp
bool verifyResults(const std::vector<float>& actual, const std::vector<float>& expected,
                   float tolerance) {
  bool correct = true;
  for (size_t i = 0; i < actual.size(); ++i) {
    float diff = std::abs(actual[i] - expected[i]);
    if (diff > tolerance) {
      correct = false;
      std::cout << "  ✗ Mismatch at index " << i << ": expected " << expected[i]
                << ", got " << actual[i] << " (diff: " << diff << ")" << std::endl;
    }
  }
  return correct;
}
```
98-98: Consider using an uppercase literal suffix for consistency. Static analysis suggests `0.1F` instead of `0.1f` to match the coding standards, though this is a minor stylistic preference.
mllm/backends/ascend/AscendDispatcher.cpp (1)
55-55: Fix TODO comment format. The TODO comment should have a space after `//` per coding guidelines.

```diff
-      case TaskTypes::kExecuteModule: { //TODO: execute module
+      case TaskTypes::kExecuteModule: { // TODO: execute module
```

mllm/backends/ascend/AscendCommon.cpp (1)
50-52: Adjacent parameters of similar type (`batch`, `size`) are easily swapped. Both `batch` and `size` are `int` parameters in adjacent positions, making accidental swaps at call sites likely. Consider using a struct or named parameters for clarity.

```diff
+struct AscendTensorShape {
+  int batch;
+  int size;
+};
+
-AscendTensorHandle prepareAscendTensor(const std::vector<float>& host_data,
-                                       int batch,
-                                       int size) {
+AscendTensorHandle prepareAscendTensor(const std::vector<float>& host_data,
+                                       const AscendTensorShape& shape) {
+  const int batch = shape.batch;
+  const int size = shape.size;
```

Alternatively, reorder to `(int batch, int size, const std::vector<float>& host_data)` to separate the integer params from the vector, reducing swap risk.
mllm/backends/ascend/AscendAllocator.cpp (1)
MLLM_INFO,MLLM_WARN, andMLLM_ERRORfor logging. Usingstd::couthere is inconsistent and bypasses the logging infrastructure.bool AscendAllocator::generalAlloc(void** ptr, size_t cap, size_t align) { - //we don't support generalAlloc , therefore return false - std::cout << "generalAlloc is not supported in AscendAllocator" << std::endl; + // generalAlloc is not supported in AscendAllocator + MLLM_WARN("generalAlloc is not supported in AscendAllocator"); return false; } void AscendAllocator::generalFree(void* ptr) { - //we don't support generalFree , therefore do nothing - std::cout << "generalFree is not supported in AscendAllocator" << std::endl; + // generalFree is not supported in AscendAllocator + MLLM_WARN("generalFree is not supported in AscendAllocator"); }This also allows removing
#include <iostream>from line 6.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
examples/ascend_add_demo/main.cpp (1 hunks)
mllm/backends/ascend/AscendAllocator.cpp (1 hunks)
mllm/backends/ascend/AscendCommon.cpp (1 hunks)
mllm/backends/ascend/AscendDispatcher.cpp (1 hunks)
mllm/backends/ascend/AscendDispatcher.hpp (1 hunks)
mllm/backends/ascend/memory/AscendMemoryManager.cpp (1 hunks)
mllm/backends/ascend/memory/AscendMemoryPool.hpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
- mllm/backends/ascend/AscendDispatcher.cpp
- mllm/backends/ascend/AscendAllocator.cpp
- mllm/backends/ascend/memory/AscendMemoryManager.cpp
- mllm/backends/ascend/AscendCommon.cpp
- mllm/backends/ascend/AscendDispatcher.hpp
- mllm/backends/ascend/memory/AscendMemoryPool.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
- mllm/backends/ascend/AscendDispatcher.cpp
- mllm/backends/ascend/AscendAllocator.cpp
- mllm/backends/ascend/memory/AscendMemoryManager.cpp
- mllm/backends/ascend/AscendCommon.cpp
- mllm/backends/ascend/AscendDispatcher.hpp
- mllm/backends/ascend/memory/AscendMemoryPool.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
- mllm/backends/ascend/AscendDispatcher.cpp
- mllm/backends/ascend/AscendAllocator.cpp
- mllm/backends/ascend/memory/AscendMemoryManager.cpp
- mllm/backends/ascend/AscendCommon.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
- mllm/backends/ascend/AscendDispatcher.cpp
- mllm/backends/ascend/AscendAllocator.cpp
- mllm/backends/ascend/memory/AscendMemoryManager.cpp
- mllm/backends/ascend/AscendCommon.cpp
- mllm/backends/ascend/AscendDispatcher.hpp
- mllm/backends/ascend/memory/AscendMemoryPool.hpp
🧠 Learnings (3)
📚 Learning: 2025-11-25T07:26:06.575Z
Learnt from: CR
Repo: UbiquitousLearning/mllm PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T07:26:06.575Z
Learning: Applies to {mllm,mllm-cli,pymllm}/**/* : No line may end with trailing whitespace.
Applied to files:
- mllm/backends/ascend/AscendDispatcher.cpp
- mllm/backends/ascend/memory/AscendMemoryPool.hpp
📚 Learning: 2025-11-25T07:26:06.575Z
Learnt from: CR
Repo: UbiquitousLearning/mllm PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T07:26:06.575Z
Learning: Applies to {mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh} : TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Applied to files:
mllm/backends/ascend/AscendDispatcher.cpp
📚 Learning: 2025-11-25T07:26:06.575Z
Learnt from: CR
Repo: UbiquitousLearning/mllm PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T07:26:06.575Z
Learning: Applies to {mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi} : Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Applied to files:
mllm/backends/ascend/AscendCommon.cpp
🧬 Code graph analysis (4)
mllm/backends/ascend/AscendDispatcher.cpp (3)
- mllm/backends/ascend/AscendDispatcher.hpp (4): AscendDispatcher (21-22), task (24-24), task (26-26), task (28-28)
- mllm/engine/DispatcherManager.hpp (1): thread_pool_ (37-37)
- mllm/backends/ascend/AscendCommon.hpp (1): instance (57-60)

mllm/backends/ascend/AscendAllocator.cpp (3)
- mllm/backends/ascend/AscendAllocator.hpp (2): AscendAllocator (17-17), AscendAllocator (18-18)
- mllm/backends/ascend/memory/AscendMemoryManager.cpp (2): getAscendMemoryManager (15-18), getAscendMemoryManager (15-15)
- mllm/backends/ascend/memory/AscendMemoryPool.hpp (2): block_id (22-22), block_id (24-24)

mllm/backends/ascend/AscendCommon.cpp (3)
- mllm/backends/ascend/AscendCommon.hpp (10): AscendTensorHandle (70-70), AscendTensorHandle (71-71), AscendTensorHandle (72-72), AscendTensorHandle (74-74), AscendTensorHandle (76-76), other (77-77), tensor_ (82-82), tensor_ (83-83), block_id_ (80-80), block_id_ (84-84)
- mllm/backends/ascend/memory/AscendMemoryPool.hpp (3): block_id (22-22), block_id (24-24), size (20-20)
- mllm/backends/ascend/memory/AscendMemoryManager.cpp (2): getAscendMemoryManager (15-18), getAscendMemoryManager (15-15)

mllm/backends/ascend/memory/AscendMemoryPool.hpp (2)
- mllm/backends/ascend/memory/AscendMemoryManager.hpp (4): pool_size (18-18), size (24-24), block_id (26-26), block_id (28-28)
- mllm/backends/ascend/memory/AscendMemoryPool.cpp (2): AscendMemoryPool (14-26), AscendMemoryPool (28-36)
🪛 Clang (14.0.6)
examples/ascend_add_demo/main.cpp
[error] 98-98: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 111-111: if with identical then and else branches
(bugprone-branch-clone,-warnings-as-errors)
mllm/backends/ascend/memory/AscendMemoryManager.cpp
[error] 20-20: method 'createMemoryPool' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 22-22: variable 'device_count' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 37-37: variable 'device_id' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 45-45: variable 'device_id' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
mllm/backends/ascend/AscendCommon.cpp
[error] 50-50: 3 adjacent parameters of 'prepareAscendTensor' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 53-53: variable 'expected_elements' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 64-64: variable 'device_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 65-65: variable 'bytes' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 80-80: variable 'ctx' provides global access to a non-const object; consider making the pointed-to data 'const'
(cppcoreguidelines-avoid-non-const-global-variables,-warnings-as-errors)
[error] 116-116: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/ascend/AscendDispatcher.hpp
[error] 13-13: constructor does not initialize these fields: MLLM_EMPTY_SCOPE
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
mllm/backends/ascend/memory/AscendMemoryPool.hpp
[error] 15-15: class 'AscendMemoryPool' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
🔇 Additional comments (13)
examples/ascend_add_demo/main.cpp (6)
1-12: LGTM! The includes are appropriate for an Ascend backend demo, covering ACL operations, tensor management, context execution, and elementwise operations.

17-20: LGTM! Ascend backend initialization is properly sequenced and wrapped in exception handling.

43-48: LGTM! Tensor preparation correctly uses the `prepareAscendTensor` helper and obtains tensor references for subsequent operations.

60-80: LGTM with a note on resource management. The result copying and fp16-to-float conversion is correctly implemented. The error path (lines 68-73) releases `x_handle` and `y_handle` but not `z_ascend`, which appears intentional assuming the Tensor class uses RAII for automatic cleanup.

111-115: LGTM! The pass/fail reporting logic is correct. Note that the static analysis hint claiming "identical then and else branches" at line 111 is a false positive: the branches clearly differ (PASSED vs FAILED messages).

117-125: LGTM! Resource cleanup and exception handling are properly implemented. The explicit release of tensor handles before return and the catch block for reporting errors are appropriate.
mllm/backends/ascend/AscendCommon.cpp (2)
19-48: LGTM - AscendTensorHandle RAII implementation. The move semantics are correctly implemented with proper resource cleanup in `release()`. The self-assignment check in move assignment and invalidation of the source handle are good practices.
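The move-only handle pattern praised above can be sketched in isolation. This is a minimal, hypothetical sketch, not the PR's actual `AscendTensorHandle`: `FreeCounter` stands in for the real memory manager, and the free call is replaced by a counter so the once-only release is observable.

```cpp
#include <utility>

// Stand-in for the memory manager: counts how many blocks were freed.
struct FreeCounter { int frees = 0; };

class BlockHandle {
 public:
  BlockHandle(FreeCounter* mgr, int block_id) : mgr_(mgr), block_id_(block_id) {}
  ~BlockHandle() { release(); }

  // Copying a resource-owning handle would double-free; forbid it.
  BlockHandle(const BlockHandle&) = delete;
  BlockHandle& operator=(const BlockHandle&) = delete;

  BlockHandle(BlockHandle&& other) noexcept
      : mgr_(other.mgr_), block_id_(other.block_id_) {
    other.block_id_ = -1;  // invalidate the source so it frees nothing
  }

  BlockHandle& operator=(BlockHandle&& other) noexcept {
    if (this != &other) {  // self-assignment check, as in the review
      release();
      mgr_ = other.mgr_;
      block_id_ = other.block_id_;
      other.block_id_ = -1;
    }
    return *this;
  }

  // Idempotent: a moved-from or already-released handle does nothing.
  void release() {
    if (block_id_ >= 0) {
      ++mgr_->frees;   // real code would return the block to the pool here
      block_id_ = -1;
    }
  }

  bool valid() const { return block_id_ >= 0; }

 private:
  FreeCounter* mgr_;
  int block_id_;
};
```

The key invariant is that exactly one destructor (or explicit `release()`) performs the free, no matter how many times the handle is moved.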
79-101: Thread-safe initialization pattern looks good. The use of `std::once_flag` with `std::call_once` ensures thread-safe one-time initialization of the global ATB context and stream. This addresses the previous review concern about race conditions.

mllm/backends/ascend/AscendAllocator.cpp (1)
23-47: Memory pool allocation implementation looks correct. The allocation flow properly:
- Allocates a block from the memory manager
- Gets the pointer for the block
- Handles failures by freeing allocated blocks
- Tracks the mapping from pointer to block ID
The mutex protection for the map access ensures thread safety.
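The pointer-to-block-id bookkeeping described above can be sketched with a plain `std::map` under a mutex. Names here are illustrative, not the PR's API; the real allocator talks to the device memory manager instead of a bare map.

```cpp
#include <cstddef>
#include <map>
#include <mutex>

// Illustrative pool-backed allocator bookkeeping: map each returned
// pointer to its pool block id so free() can find the right block.
class BlockTracker {
 public:
  // Record that `ptr` came from pool block `block_id`.
  void track(void* ptr, int block_id) {
    std::lock_guard<std::mutex> lock(mu_);
    ptr_to_block_[ptr] = block_id;
  }

  // Look up and erase the mapping; returns -1 for unknown pointers
  // (mirroring the failure handling described above).
  int release(void* ptr) {
    std::lock_guard<std::mutex> lock(mu_);
    auto it = ptr_to_block_.find(ptr);
    if (it == ptr_to_block_.end()) return -1;
    int id = it->second;
    ptr_to_block_.erase(it);
    return id;
  }

  std::size_t trackedCount() {
    std::lock_guard<std::mutex> lock(mu_);
    return ptr_to_block_.size();
  }

 private:
  std::mutex mu_;
  std::map<void*, int> ptr_to_block_;
};
```

Holding the lock for every map touch is what makes concurrent alloc/free pairs safe; lookups are O(log n), which is negligible next to a device allocation.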
mllm/backends/ascend/memory/AscendMemoryPool.hpp (1)
20-24: Memory pool API design looks appropriate. The interface provides the essential block management operations: `allocateBlock`, `freeBlock`, and `getBlockPtr`. Using output parameters for block_id and addr aligns with the C-style ACL APIs being wrapped.

mllm/backends/ascend/memory/AscendMemoryManager.cpp (3)
13-18: LGTM! Singleton pattern correctly implemented. The function-local static singleton pattern is thread-safe per C++11 and addresses the past review feedback about avoiding file-scope globals.
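The function-local static ("Meyers") singleton referenced above can be sketched minimally. `MemoryManager` here is a hypothetical stand-in for the PR's `AscendMemoryManager`:

```cpp
// The C++11 standard guarantees that initialization of a function-local
// static is performed exactly once, even under concurrent first calls.
class MemoryManager {
 public:
  int poolCount() const { return pool_count_; }
  void addPool() { ++pool_count_; }

 private:
  int pool_count_ = 0;
};

MemoryManager& getMemoryManager() {
  static MemoryManager mgr;  // constructed on first call, thread-safely
  return mgr;
}
```

Every caller receives a reference to the same object, so state added through one reference is visible through all others.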
35-41: LGTM! Proper error handling and defensive initialization. The function correctly initializes `device_id` to -1 and validates the ACL call result.
52-65: LGTM! Clean delegation pattern. The block management methods appropriately delegate to the per-device memory pool.
```cpp
  }
}

void fillAtbTensorDesc(const Tensor& t, atb::TensorDesc& desc) {
```
🛠️ Refactor suggestion | 🟠 Major
Rename short parameter `t` to satisfy identifier length requirements.

The parameter name `t` is too short per readability-identifier-length guidelines. Rename it to a descriptive identifier.
```diff
-void fillAtbTensorDesc(const Tensor& t, atb::TensorDesc& desc) {
+void fillAtbTensorDesc(const Tensor& tensor, atb::TensorDesc& desc) {
   desc.dtype = ACL_FLOAT16;  // Currently hardcoded as per demo, can be expanded later
   desc.format = ACL_FORMAT_ND;
-  auto shape = t.shape();
+  auto shape = tensor.shape();
```

Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Clang (14.0.6)
[error] 116-116: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
🤖 Prompt for AI Agents
In mllm/backends/ascend/AscendCommon.cpp around line 116, the function parameter
named `t` is too short; rename it to a descriptive identifier such as `tensor`
(or `inputTensor`) and update all references inside the function body to the new
name; also adjust the function declaration/signature in any corresponding header
or other translation units to match the new parameter name to satisfy the
readability-identifier-length rule.
```cpp
  MLLM_ACL_CHECK(ret);
  for (size_t i = 0; i < device_count; i++) {
    aclrtSetDevice(i);
```
Check the return value of `aclrtSetDevice`.

If `aclrtSetDevice` fails, the memory pool will be created with an incorrect device context, leading to device mismatch errors in subsequent operations.
Apply this diff to check the return value:
```diff
-    aclrtSetDevice(i);
+    ret = aclrtSetDevice(i);
+    MLLM_ACL_CHECK(ret);
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
    ret = aclrtSetDevice(i);
    MLLM_ACL_CHECK(ret);
```
🤖 Prompt for AI Agents
In mllm/backends/ascend/memory/AscendMemoryManager.cpp around line 27, the call
to aclrtSetDevice(i) currently ignores its return value; modify the code to
check the returned aclError (or appropriate status type), handle non-success
results by logging an error and returning or throwing (depending on existing
error handling patterns) before proceeding to create the memory pool, and ensure
the device is only used when aclrtSetDevice reports success to avoid
device-context mismatches.
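The check-and-stop pattern recommended above can be sketched without the ACL SDK. `setDevice` below is a hypothetical stand-in for `aclrtSetDevice` (status return, only devices `[0, max_valid)` exist), not the real API:

```cpp
// Hypothetical status-returning device setter.
enum Status { kOk = 0, kBadDevice = 1 };

int g_current_device = -1;  // last successfully selected device

Status setDevice(int id, int max_valid) {
  if (id < 0 || id >= max_valid) return kBadDevice;
  g_current_device = id;
  return kOk;
}

// Returns how many pools were created; stops at the first failure
// instead of creating a pool against the wrong device context.
int createPools(int device_count, int max_valid) {
  int created = 0;
  for (int i = 0; i < device_count; ++i) {
    if (setDevice(i, max_valid) != kOk) break;  // check, don't ignore
    ++created;  // real code would construct the per-device pool here
  }
  return created;
}
```

Ignoring the status would instead report `device_count` pools while some of them silently landed on whatever device was last selected.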
Actionable comments posted: 4
♻️ Duplicate comments (1)
examples/ascend_add_demo/main.cpp (1)
52-52: Remove debug print statement. This debug print `"context over"` should be removed before merging.
🧹 Nitpick comments (5)
examples/ascend_add_demo/main.cpp (1)
60-88: Fix inconsistent indentation. Lines 60-88 use 2-space indentation while the surrounding code in the same `try` block uses 4-space indentation. Align all code within the `try` block to consistent 4-space indentation.

mllm/backends/ascend/AscendCommon.hpp (4)
19-28: Parenthesize macro arguments for safety. The macro arguments should be enclosed in parentheses to prevent operator precedence issues when complex expressions are passed.

```diff
 #define MLLM_ACL_CHECK(err) \
-  if (err != ACL_SUCCESS) { \
+  if ((err) != ACL_SUCCESS) { \
     MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "ACL error code {}: {}", int(err), aclGetRecentErrMsg()); \
   }

 #define MLLM_ATB_CHECK(err) \
-  if (err != atb::NO_ERROR) { \
+  if ((err) != atb::NO_ERROR) { \
     MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "ATB error code {}", int(err)); \
   }
```
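The hazard can be demonstrated concretely. `STATUS_OK` and the `CHECK_*` macros below are hypothetical stand-ins for `ACL_SUCCESS` / `MLLM_ACL_CHECK`, reduced to expressions so the misparse is observable:

```cpp
#include <cassert>

constexpr int STATUS_OK = 0;

// Unparenthesized argument: `err != STATUS_OK` can re-associate with
// operators inside the argument expression.
#define CHECK_BAD(err) ((err != STATUS_OK) ? 1 : 0)

// Parenthesized argument: the whole expression is compared as a unit.
#define CHECK_GOOD(err) (((err) != STATUS_OK) ? 1 : 0)

// The argument is a bitwise expression. CHECK_BAD expands to
// `a & (b != STATUS_OK)` because `!=` binds tighter than `&`;
// CHECK_GOOD compares `(a & b)` as a unit.
int badCheck(int a, int b) { return CHECK_BAD(a & b); }
int goodCheck(int a, int b) { return CHECK_GOOD(a & b); }
```

With `a = 1, b = 2`, `a & b` is 0 (success), yet the unparenthesized macro reports an error because it evaluated `1 & (2 != 0)` instead.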
45-51: Consider default-initializing struct members. `AscendDeviceInfo` members are uninitialized by default. Adding default values prevents undefined behavior if a field is not explicitly set.

```diff
 struct AscendDeviceInfo {
   std::string name;
-  unsigned int id;
-  size_t total_memory;  // bytes
-  size_t free_memory;   // bytes
+  unsigned int id{0};
+  size_t total_memory{0};  // bytes
+  size_t free_memory{0};   // bytes
   std::string soc_version;
 };
```
81-85: Consider adding `[[nodiscard]]` to accessor functions. Functions like `valid()`, `tensor()`, and `blockId()` return values that should typically not be ignored.

```diff
-  void release();  // Manually release memory block and invalidate handle
-  bool valid() const { return block_id_ >= 0; }  // Check if handle owns a valid memory block
+  void release();
+  [[nodiscard]] bool valid() const { return block_id_ >= 0; }

-  Tensor& tensor() { return tensor_; }              // Access tensor
-  const Tensor& tensor() const { return tensor_; }  // Access tensor (const)
-  int blockId() const { return block_id_; }         // Get memory block ID
+  [[nodiscard]] Tensor& tensor() { return tensor_; }
+  [[nodiscard]] const Tensor& tensor() const { return tensor_; }
+  [[nodiscard]] int blockId() const { return block_id_; }
```
117-127: Delete copy/move operations for `AscendTimer`. Since `AscendTimer` is an RAII timer that manages timing state, it should not be copied or moved. Explicitly delete these operations to prevent misuse.

```diff
 class AscendTimer {
  public:
   explicit AscendTimer(const char* tag, bool sync_before = true, bool sync_after = true);
   ~AscendTimer();

+  AscendTimer(const AscendTimer&) = delete;
+  AscendTimer& operator=(const AscendTimer&) = delete;
+  AscendTimer(AscendTimer&&) = delete;
+  AscendTimer& operator=(AscendTimer&&) = delete;
+
  private:
```
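A minimal sketch of such a non-copyable scope timer, assuming a plain `double*` sink in place of the project's logging macros (names here are illustrative, not the PR's `AscendTimer`):

```cpp
#include <chrono>

class ScopeTimer {
 public:
  explicit ScopeTimer(double* out_ms)
      : out_ms_(out_ms), start_(std::chrono::steady_clock::now()) {}

  ~ScopeTimer() {
    auto end = std::chrono::steady_clock::now();
    *out_ms_ = std::chrono::duration<double, std::milli>(end - start_).count();
  }

  // Deleting all four special members makes an accidental copy a
  // compile error instead of a double report from two destructors.
  ScopeTimer(const ScopeTimer&) = delete;
  ScopeTimer& operator=(const ScopeTimer&) = delete;
  ScopeTimer(ScopeTimer&&) = delete;
  ScopeTimer& operator=(ScopeTimer&&) = delete;

 private:
  double* out_ms_;
  std::chrono::steady_clock::time_point start_;
};
```

The timer reports exactly once, when the owning scope ends.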
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- docs/ascend_backend/core_design.rst (1 hunks)
- examples/ascend_add_demo/main.cpp (1 hunks)
- mllm/backends/ascend/AscendCommon.cpp (1 hunks)
- mllm/backends/ascend/AscendCommon.hpp (1 hunks)
- mllm/backends/ascend/ops/AscnedElewiseOps.cpp (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/ascend_backend/core_design.rst
🧰 Additional context used
📓 Path-based instructions (4)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
- mllm/backends/ascend/ops/AscnedElewiseOps.cpp
- mllm/backends/ascend/AscendCommon.cpp
- mllm/backends/ascend/AscendCommon.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
- mllm/backends/ascend/ops/AscnedElewiseOps.cpp
- mllm/backends/ascend/AscendCommon.cpp
- mllm/backends/ascend/AscendCommon.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
- mllm/backends/ascend/ops/AscnedElewiseOps.cpp
- mllm/backends/ascend/AscendCommon.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
- mllm/backends/ascend/ops/AscnedElewiseOps.cpp
- mllm/backends/ascend/AscendCommon.cpp
- mllm/backends/ascend/AscendCommon.hpp
🧬 Code graph analysis (4)
mllm/backends/ascend/ops/AscnedElewiseOps.cpp (1)
- mllm/backends/ascend/AscendCommon.cpp (6): getGlobalAtbContext (173-195), getGlobalAtbContext (173-173), fillAtbTensorDesc (210-219), fillAtbTensorDesc (210-210), syncGlobalAtbStream (202-208), syncGlobalAtbStream (202-202)

mllm/backends/ascend/AscendCommon.cpp (3)
- mllm/backends/ascend/AscendCommon.hpp (12): AscendTensorHandle (71-71), AscendTensorHandle (72-72), AscendTensorHandle (73-73), AscendTensorHandle (75-75), AscendTensorHandle (77-77), other (78-78), tensor_ (83-83), tensor_ (84-84), block_id_ (81-81), block_id_ (85-85), AscendDeviceMetaInfo (56-56), AscendDeviceMetaInfo (63-63)
- mllm/backends/ascend/memory/AscendMemoryPool.hpp (3): block_id (22-22), block_id (24-24), size (20-20)
- mllm/backends/ascend/memory/AscendMemoryManager.cpp (2): getAscendMemoryManager (15-18), getAscendMemoryManager (15-15)

mllm/backends/ascend/AscendCommon.hpp (1)
- mllm/backends/ascend/AscendCommon.cpp (7): AscendTensorHandle (23-24), AscendTensorHandle (26-26), AscendTensorHandle (28-31), other (33-41), other (33-33), AscendTimer (154-162), AscendTimer (164-171)

examples/ascend_add_demo/main.cpp (3)
- mllm/backends/ascend/AscendCommon.cpp (6): prepareAscendTensor (54-81), prepareAscendTensor (54-56), verifyAscendTensor (103-142), verifyAscendTensor (103-108), verifyAscendTensor (144-152), verifyAscendTensor (144-149)
- mllm/backends/ascend/AscendCommon.hpp (3): prepareAscendTensor (93-95), verifyAscendTensor (101-106), verifyAscendTensor (109-114)
- mllm/backends/qnn/op/QNNElewiseOp.hpp (1): kAdd (36-38)
🪛 Clang (14.0.6)
mllm/backends/ascend/AscendCommon.cpp
[error] 54-54: 3 adjacent parameters of 'prepareAscendTensor' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 57-57: variable 'expected_elements' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 68-68: variable 'device_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 69-69: variable 'bytes' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 87-87: variable 'elem_cnt' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 103-103: 3 adjacent parameters of 'verifyAscendTensor' of convertible types are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 103-103: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 122-122: variable name 'ok' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 144-144: 3 adjacent parameters of 'verifyAscendTensor' of convertible types are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 144-144: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 174-174: variable 'ctx' provides global access to a non-const object; consider making the pointed-to data 'const'
(cppcoreguidelines-avoid-non-const-global-variables,-warnings-as-errors)
[error] 210-210: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/ascend/AscendCommon.hpp
[error] 19-19: function-like macro 'MLLM_ACL_CHECK' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 20-20: macro argument should be enclosed in parentheses
(bugprone-macro-parentheses,-warnings-as-errors)
[error] 25-25: function-like macro 'MLLM_ATB_CHECK' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 26-26: macro argument should be enclosed in parentheses
(bugprone-macro-parentheses,-warnings-as-errors)
[error] 42-42: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 45-45: constructor does not initialize these fields: name, id, total_memory, free_memory, soc_version
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 54-54: class 'AscendDeviceMetaInfo' defines a copy constructor and a copy assignment operator but does not define a destructor, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 70-70: constructor does not initialize these fields: tensor_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 81-81: function 'valid' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 84-84: function 'tensor' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 85-85: function 'blockId' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 101-101: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 103-103: 1e-2f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 103-103: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 104-104: 1e-2f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 104-104: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 109-109: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 111-111: 1e-2f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 111-111: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 112-112: 1e-2f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 112-112: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 117-117: class 'AscendTimer' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 130-130: function-like macro 'ASCEND_TIME_SCOPE' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 131-131: function-like macro 'ASCEND_TIME_SCOPE_NOSYNC' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
examples/ascend_add_demo/main.cpp
[error] 62-62: variable 'correct' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 84-84: if with identical then and else branches
(bugprone-branch-clone,-warnings-as-errors)
🔇 Additional comments (6)
examples/ascend_add_demo/main.cpp (1)
14-99: Demo structure looks good. The demo properly initializes the backend, prepares test data, executes the Add operation, verifies results with tolerances, and handles errors via try/catch. The use of `AscendTensorHandle` RAII for resource management is appropriate.

mllm/backends/ascend/ops/AscnedElewiseOps.cpp (1)
37-118: Forward implementation looks correct. The ATB workflow follows the expected pattern: create operation, setup tensors, configure VariantPack, setup workspace, execute, synchronize, and cleanup. Error handling is comprehensive with appropriate `MLLM_ERROR_EXIT` calls.

mllm/backends/ascend/AscendCommon.cpp (3)
23-52: AscendTensorHandle RAII implementation is correct. Move semantics properly transfer ownership by resetting `block_id_` to -1 in the source. The `release()` method correctly handles both managed blocks (via memory manager) and unmanaged tensors.
54-81: Note: the returned handle has `block_id = -1`. This is intentional since memory is allocated via `Tensor::empty`/`alloc` rather than the memory manager. The `release()` method correctly handles this case by calling `tensor.delete_()`.
173-195: Thread-safe ATB context initialization is correct. Using `std::call_once` ensures the context is initialized exactly once, even under concurrent access. The initialization sequence (set device → create context → create stream → set execute stream) follows ATB requirements.

mllm/backends/ascend/AscendCommon.hpp (1)
1-133: Header structure and API documentation are good. The header provides clear documentation for each function and class. The namespace organization and include structure are appropriate for a backend utility header.
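The `std::call_once` initialization pattern approved above can be sketched minimally. `Context` and the globals below are placeholders for the real ATB context and stream, with a call counter standing in for the device/stream setup so once-only execution is observable:

```cpp
#include <mutex>

struct Context { int device_id; };  // stand-in for the ATB context

namespace {
std::once_flag g_ctx_once;
Context* g_ctx = nullptr;
int g_init_calls = 0;  // counts how many times init actually ran
}  // namespace

Context* getGlobalContext() {
  std::call_once(g_ctx_once, [] {
    ++g_init_calls;         // runs exactly once, even under races
    static Context ctx{0};  // real code: set device, create stream, etc.
    g_ctx = &ctx;
  });
  return g_ctx;
}
```

Any thread that returns from `getGlobalContext()` is guaranteed to observe a fully initialized context; `std::call_once` blocks concurrent callers until the initializer completes.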
```cpp
AscendDeviceMetaInfo::AscendDeviceMetaInfo() {
#ifndef ASCENDC_CPU_DEBUG
  // Initialize ACL to query devices
  auto ret = aclInit(nullptr);
  if (ret != ACL_SUCCESS) {
    MLLM_ERROR("Failed to initialize ACL for device enumeration: {}", ret);
    return;
  }

  // Get device count
  uint32_t device_count = 0;
  ret = aclrtGetDeviceCount(&device_count);
  if (ret != ACL_SUCCESS) {
    MLLM_ERROR("Failed to get Ascend device count: {}", ret);
    aclFinalize();
    return;
  }

  // Collect info for each device
  for (uint32_t i = 0; i < device_count; ++i) {
    AscendDeviceInfo info;
    info.id = i;
    info.name = "Ascend Device " + std::to_string(i);

    // Set device to query its properties
    ret = aclrtSetDevice(i);
    if (ret == ACL_SUCCESS) {
      // Get memory information
      size_t free_mem = 0, total_mem = 0;
      ret = aclrtGetMemInfo(ACL_HBM_MEM, &free_mem, &total_mem);
      if (ret == ACL_SUCCESS) {
        info.total_memory = total_mem;
        info.free_memory = free_mem;
      } else {
        info.total_memory = 0;
        info.free_memory = 0;
      }

      // SOC version - platform specific, set to unknown for now
      info.soc_version = "Unknown";
    } else {
      info.total_memory = 0;
      info.free_memory = 0;
      info.soc_version = "Unknown";
    }

    devices.push_back(info);
  }

  // Finalize ACL after enumeration
  aclFinalize();
#else
  // In CPU debug mode, add a dummy device
  AscendDeviceInfo info;
  info.id = 0;
  info.name = "Ascend CPU Debug Device";
  info.total_memory = 0;
  info.free_memory = 0;
  info.soc_version = "CPU_DEBUG";
  devices.push_back(info);
#endif
}
```
🧩 Analysis chain

🏁 Script executed:

```shell
# Search for AscendDeviceMetaInfo and getGlobalAtbContext usage
rg -n 'AscendDeviceMetaInfo::instance|getGlobalAtbContext' --type=cpp
```

Repository: UbiquitousLearning/mllm

Length of output: 513

🏁 Script executed:

```shell
#!/bin/bash
# First, locate the AscendCommon files and understand the structure
find . -name "*.cpp" -o -name "*.h" | grep -i ascend | head -20
```

Length of output: 502

🏁 Script executed:

```shell
#!/bin/bash
# Search for usage of both functions
rg -n 'AscendDeviceMetaInfo|getGlobalAtbContext' --type cpp --type hpp
```

Length of output: 94

🏁 Script executed:

```shell
#!/bin/bash
# Look at the file structure and find relevant files
ls -la mllm/backends/ascend/ 2>/dev/null || echo "Path not found, searching..."
find . -path "*/ascend/*" -name "*.cpp" -o -path "*/ascend/*" -name "*.h" | head -20
```

Length of output: 1585

🏁 Script executed:

```shell
#!/bin/bash
# Get the file content to understand the context better
find . -type f \( -name "*.cpp" -o -name "*.h" \) -path "*ascend*" 2>/dev/null | head -5
```

Length of output: 269

🏁 Script executed:

```shell
#!/bin/bash
# Search for the usage pattern
rg 'AscendDeviceMetaInfo::instance' . --type cpp --type hpp -A 3 2>/dev/null | head -40
```

Length of output: 49

🏁 Script executed:

```shell
#!/bin/bash
# Search for getGlobalAtbContext
rg 'getGlobalAtbContext' . --type cpp --type hpp -A 3 2>/dev/null | head -40
```

Length of output: 49

🏁 Script executed:

```shell
#!/bin/bash
# Check where aclFinalize is called
rg 'aclFinalize' . --type cpp --type hpp -B 2 -A 2 2>/dev/null
```

Length of output: 49

🏁 Script executed:

```shell
#!/bin/bash
stat mllm/backends/ascend/AscendCommon.cpp 2>/dev/null && echo "File exists" || echo "File not found"
ls -la mllm/backends/ascend/ 2>/dev/null | head -10
```

Length of output: 667

🏁 Script executed:

```shell
#!/bin/bash
# Read AscendCommon.cpp to see the full context of both functions
cat -n mllm/backends/ascend/AscendCommon.cpp | sed -n '150,290p'
```

Length of output: 4570

🏁 Script executed:

```shell
#!/bin/bash
# Read AscendBackend.cpp to see the usage pattern
cat -n mllm/backends/ascend/AscendBackend.cpp
```

Length of output: 1239

🏁 Script executed:

```shell
#!/bin/bash
# Look at the getGlobalAtbContext implementation more closely
rg -A 30 'atb::Context\* getGlobalAtbContext' mllm/backends/ascend/AscendCommon.cpp
```

Length of output: 783

🏁 Script executed:

```shell
#!/bin/bash
# Also check if aclInit and aclFinalize are called anywhere else
rg 'aclInit|aclFinalize|aclrtSetDevice' mllm/backends/ascend/ --type cpp
```

Length of output: 628
Remove the `aclFinalize()` call from the `AscendDeviceMetaInfo` constructor, or defer it until backend shutdown.

Calling `aclFinalize()` at line 271 after device enumeration terminates the ACL runtime. Since `getGlobalAtbContext()` (used later in operations) requires `aclrtSetDevice()` to function, subsequent ACL operations will fail if invoked after this constructor. Either:
- Remove the `aclFinalize()` call and let backend shutdown handle cleanup
- Ensure `getGlobalAtbContext()` is initialized before device enumeration
🤖 Prompt for AI Agents
In mllm/backends/ascend/AscendCommon.cpp around lines 221 to 282, the
constructor calls aclFinalize() after enumerating devices which shuts down the
ACL runtime and breaks subsequent ACL calls (e.g.,
aclrtSetDevice()/getGlobalAtbContext()); remove the aclFinalize() call from this
constructor (do not finalize ACL here) and instead ensure ACL is finalized once
during backend shutdown/cleanup; update error paths to avoid finalizing ACL
prematurely and document that ACL initialization lifetime spans device
enumeration through backend shutdown.
```cpp
#define ASCEND_TIME_SCOPE(tag) ::mllm::ascend::AscendTimer timer_scope_##__LINE__(tag, true, true)
#define ASCEND_TIME_SCOPE_NOSYNC(tag) ::mllm::ascend::AscendTimer timer_scope_##__LINE__(tag, false, false)
```
Fix `__LINE__` concatenation in macros.

`__LINE__` requires a two-level macro expansion to work correctly in token concatenation. Currently, the variable name will literally be `timer_scope___LINE__` rather than `timer_scope_123`.
```diff
+#define ASCEND_TIME_SCOPE_CONCAT_INNER(a, b) a##b
+#define ASCEND_TIME_SCOPE_CONCAT(a, b) ASCEND_TIME_SCOPE_CONCAT_INNER(a, b)
-#define ASCEND_TIME_SCOPE(tag) ::mllm::ascend::AscendTimer timer_scope_##__LINE__(tag, true, true)
-#define ASCEND_TIME_SCOPE_NOSYNC(tag) ::mllm::ascend::AscendTimer timer_scope_##__LINE__(tag, false, false)
+#define ASCEND_TIME_SCOPE(tag) ::mllm::ascend::AscendTimer ASCEND_TIME_SCOPE_CONCAT(timer_scope_, __LINE__)(tag, true, true)
+#define ASCEND_TIME_SCOPE_NOSYNC(tag) ::mllm::ascend::AscendTimer ASCEND_TIME_SCOPE_CONCAT(timer_scope_, __LINE__)(tag, false, false)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
#define ASCEND_TIME_SCOPE_CONCAT_INNER(a, b) a##b
#define ASCEND_TIME_SCOPE_CONCAT(a, b) ASCEND_TIME_SCOPE_CONCAT_INNER(a, b)
#define ASCEND_TIME_SCOPE(tag) ::mllm::ascend::AscendTimer ASCEND_TIME_SCOPE_CONCAT(timer_scope_, __LINE__)(tag, true, true)
#define ASCEND_TIME_SCOPE_NOSYNC(tag) ::mllm::ascend::AscendTimer ASCEND_TIME_SCOPE_CONCAT(timer_scope_, __LINE__)(tag, false, false)
```
🧰 Tools
🪛 Clang (14.0.6)
[error] 130-130: function-like macro 'ASCEND_TIME_SCOPE' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 131-131: function-like macro 'ASCEND_TIME_SCOPE_NOSYNC' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
🤖 Prompt for AI Agents
In mllm/backends/ascend/AscendCommon.hpp around lines 130-131, the macros use
token concatenation with __LINE__ directly which prevents proper expansion;
change them to use a two-level helper macro (e.g., a CONCAT/CONCAT_INNER pair)
so __LINE__ is expanded before pasting, then use the outer concatenation macro
to build the variable name (e.g., CONCAT(timer_scope_, __LINE__)) and keep the
rest of the macro arguments the same; update both ASCEND_TIME_SCOPE and
ASCEND_TIME_SCOPE_NOSYNC to call the outer helper to ensure the variable becomes
timer_scope_<lineNumber>.
```cpp
// Copyright (c) MLLM Team.
// Licensed under the MIT License.
```
Filename typo: "AscnedElewiseOps.cpp" should be "AscendElewiseOps.cpp".
The filename has a typo - "Ascned" instead of "Ascend". This should be corrected to maintain consistency with the header file name AscendElewiseOps.hpp.
🤖 Prompt for AI Agents
In mllm/backends/ascend/ops/AscnedElewiseOps.cpp lines 1-3, the filename
contains a typo ("AscnedElewiseOps.cpp"); rename the file to
AscendElewiseOps.cpp to match the header AscendElewiseOps.hpp and the project
naming, then update all references (includes, CMakeLists.txt / build files,
source lists, and any #include or forward declarations) to use the corrected
filename; ensure any case-sensitive platforms and CI are accounted for and run a
build to verify there are no remaining broken references.
Actionable comments posted: 1
♻️ Duplicate comments (1)
mllm/backends/ascend/ops/AscnedElewiseOps.cpp (1)
95-108: Workspace memory leak on error path.

If `op->Execute()` fails at line 107, the workspace allocated at line 99 is never freed, because `MLLM_ERROR_EXIT` prevents reaching the cleanup code at lines 113-116. This is the same issue flagged in the previous review. Use RAII or a scope-guard pattern to ensure the workspace is freed even on error:
```cpp
struct WorkspaceGuard {
  AscendMemoryManager& mem_mgr_;
  int block_id_;
  WorkspaceGuard(AscendMemoryManager& mgr, int id) : mem_mgr_(mgr), block_id_(id) {}
  ~WorkspaceGuard() {
    if (block_id_ != -1) mem_mgr_.freeBlock(block_id_);
  }
  WorkspaceGuard(const WorkspaceGuard&) = delete;
  WorkspaceGuard& operator=(const WorkspaceGuard&) = delete;
};
```

Then use:

```cpp
void* workspace = nullptr;
int workspace_block_id = -1;
if (workspaceSize > 0) {
  auto& mem_mgr = getAscendMemoryManager();
  mem_mgr.allocateBlock(static_cast<uint32_t>(workspaceSize), workspace_block_id);
  mem_mgr.getBlockPtr(workspace_block_id, workspace);
}
WorkspaceGuard ws_guard(getAscendMemoryManager(), workspace_block_id);  // Automatic cleanup
```
🧹 Nitpick comments (1)
mllm/backends/ascend/ops/AscnedElewiseOps.cpp (1)
22-36: Remove commented-out code.

The commented-out allocation code (lines 23-34) should be removed entirely rather than left in the source. If this code serves as documentation of a previous approach, consider adding a brief comment explaining why `BaseOp::setup()` is used instead, or move the explanation to documentation.

Apply this diff to clean up:

```diff
 void AscendAddOp::setup(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) {
-  // for (auto& t : outputs) {
-  //   if (!t.isNil()) {
-  //     auto& mem_mgr = getAscendMemoryManager();
-  //     int block_id = -1;
-  //     void* device_ptr = nullptr;
-
-  //     mem_mgr.allocateBlock(static_cast<uint32_t>(t.bytes()), block_id);
-  //     mem_mgr.getBlockPtr(block_id, device_ptr);
-
-  //     t.impl()->storage()->ptr_ = device_ptr;
-  //   }
-  // }
   BaseOp::setup(inputs, outputs);
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
mllm/backends/ascend/ops/AscnedElewiseOps.cpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
mllm/backends/ascend/ops/AscnedElewiseOps.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
mllm/backends/ascend/ops/AscnedElewiseOps.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
mllm/backends/ascend/ops/AscnedElewiseOps.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
mllm/backends/ascend/ops/AscnedElewiseOps.cpp
🧬 Code graph analysis (1)
mllm/backends/ascend/ops/AscnedElewiseOps.cpp (2)
mllm/backends/ascend/AscendCommon.hpp (3)
getGlobalAtbContext (33-33)
fillAtbTensorDesc (42-42)
syncGlobalAtbStream (39-39)

mllm/backends/ascend/AscendCommon.cpp (6)

getGlobalAtbContext (173-195)
getGlobalAtbContext (173-173)
fillAtbTensorDesc (210-219)
fillAtbTensorDesc (210-210)
syncGlobalAtbStream (202-208)
syncGlobalAtbStream (202-202)
🔇 Additional comments (4)
mllm/backends/ascend/ops/AscnedElewiseOps.cpp (4)
1-16: LGTM: Copyright header and includes are appropriate.

The MIT license header follows project conventions, and the included headers (ATB/ACL libraries, memory manager, common utilities) are appropriate for implementing the Ascend Add operation.
18-21: LGTM: Constructor correctly delegates to base class.

The constructor properly forwards the options to the base `aops::AddOp` class.
38-51: LGTM: Input validation is thorough and appropriate.

The validation checks ensure:
- Correct number of inputs/outputs
- Dtype consistency across tensors
- No-broadcast constraint for this initial implementation
The use of `NYI` for unsupported features clearly communicates current limitations.
110-119: Cleanup logic is correct but vulnerable to error paths.

The cleanup sequence (sync stream, free workspace, destroy operation) is correct. However, as flagged in previous comments, this cleanup is bypassed if `MLLM_ERROR_EXIT` is called earlier. Once RAII guards are added for workspace and operation, this code can remain as the happy-path cleanup, with the guards handling error paths.
```cpp
atb::infer::ElewiseParam addParam;
addParam.elewiseType = atb::infer::ElewiseParam::ELEWISE_ADD;

atb::Operation* op = nullptr;
auto st = atb::CreateOperation(addParam, &op);
if (st != atb::NO_ERROR || op == nullptr) {
  MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB CreateOperation(ELEWISE_ADD) failed, status={}", static_cast<int>(st));
}

atb::Context* atb_ctx = getGlobalAtbContext();

atb::Tensor atb_x;
atb::Tensor atb_y;
atb::Tensor atb_z;

fillAtbTensorDesc(x, atb_x.desc);
fillAtbTensorDesc(y, atb_y.desc);
fillAtbTensorDesc(z, atb_z.desc);

atb_x.deviceData = reinterpret_cast<uint8_t*>(x.ptr<void>());
atb_x.dataSize = x.bytes();
atb_y.deviceData = reinterpret_cast<uint8_t*>(y.ptr<void>());
atb_y.dataSize = y.bytes();
atb_z.deviceData = reinterpret_cast<uint8_t*>(z.ptr<void>());
atb_z.dataSize = z.bytes();

atb::SVector<atb::Tensor> inTensors;
atb::SVector<atb::Tensor> outTensors;
inTensors.push_back(atb_x);
inTensors.push_back(atb_y);
outTensors.push_back(atb_z);

atb::VariantPack vp;
vp.inTensors = inTensors;
vp.outTensors = outTensors;

uint64_t workspaceSize = 0;
st = op->Setup(vp, workspaceSize, atb_ctx);
if (st != atb::NO_ERROR) {
  MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB AddOp Setup failed, status={}", static_cast<int>(st));
}
```
Resource leak: ATB operation not destroyed on error.
If `op->Setup()` fails at line 92, `MLLM_ERROR_EXIT` is called before reaching `atb::DestroyOperation(op)` at line 118, so the ATB operation is leaked.
Consider using RAII or a scope guard to ensure cleanup:
```cpp
// Add at the top of forward() or create a helper class
struct AtbOperationGuard {
  atb::Operation* op_;
  explicit AtbOperationGuard(atb::Operation* op) : op_(op) {}
  ~AtbOperationGuard() { if (op_) atb::DestroyOperation(op_); }
  AtbOperationGuard(const AtbOperationGuard&) = delete;
  AtbOperationGuard& operator=(const AtbOperationGuard&) = delete;
};
```

Then use:
```cpp
auto st = atb::CreateOperation(addParam, &op);
if (st != atb::NO_ERROR || op == nullptr) {
  MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB CreateOperation(ELEWISE_ADD) failed, status={}", static_cast<int>(st));
}
AtbOperationGuard op_guard(op);  // Automatic cleanup
```

🤖 Prompt for AI Agents
In mllm/backends/ascend/ops/AscnedElewiseOps.cpp around lines 53-93, the ATB
Operation created by atb::CreateOperation is not cleaned up if op->Setup(...)
fails, causing a resource leak; after successful CreateOperation (and null
check) instantiate a small RAII guard or scope-guard object that holds the
atb::Operation* and calls atb::DestroyOperation in its destructor (delete
copy/move as appropriate), and place this guard immediately after the success
path so any early returns or error exits (like MLLM_ERROR_EXIT) will still
destroy the operation; ensure the guard only runs when op is non-null and that
the normal end-of-function cleanup does not double-destroy (e.g., release or
reset the guard before the existing explicit DestroyOperation call).
Actionable comments posted: 0
♻️ Duplicate comments (1)
mllm/backends/ascend/AscendCommon.cpp (1)
220-281: Critical: `aclFinalize()` call breaks subsequent ACL operations.

Calling `aclFinalize()` at line 270 terminates the ACL runtime immediately after device enumeration. Since `getGlobalAtbContext()` (used by operators) requires a valid ACL runtime and calls `aclrtSetDevice()`, any subsequent ACL operations will fail after this constructor completes.

This issue was previously flagged in past review comments and remains unresolved.

Recommended solution: remove the `aclFinalize()` call from this constructor and defer ACL finalization to backend shutdown or application exit. ACL should remain initialized for the lifetime of the backend.

🔎 Proposed fix

```diff
     devices.push_back(info);
   }

-  // Finalize ACL after enumeration
-  aclFinalize();
+  // Note: ACL should remain initialized for the lifetime of the backend.
+  // Finalization will be handled during backend shutdown.
 #else
   // In CPU debug mode, add a dummy device
```
🧹 Nitpick comments (7)
docs/ascend_backend/index.rst (1)
8-9: Consider removing the extra trailing empty line.

The file ends with two empty lines. Per coding guidelines, it's recommended to have only a single trailing newline.
🔎 Proposed fix
```diff
    core_design
-
-
+
```

tests/ascend/KernelTestHelper.hpp (1)
7-17: Consider removing empty SetUp/TearDown methods if unused.

The `SetUp()` and `TearDown()` methods are currently empty. If no per-test setup or cleanup is needed, these methods can be removed entirely, as GoogleTest provides default implementations.

🔎 Proposed simplification
SetUp()andTearDown()methods are currently empty. If no per-test setup or cleanup is needed, these methods can be removed entirely, as GoogleTest provides default implementations.🔎 Proposed simplification
class KernelTest : public testing::Test { public: KernelTest() = default; ~KernelTest() override = default; - - // If the constructor and destructor are not enough for setting up - // and cleaning up each test, you can define the following methods: - void SetUp() override {} - - void TearDown() override {} };Note: The static analysis warning about special member functions is a false positive for test fixtures and can be safely ignored.
tests/CMakeLists.txt (1)
14-16: Consider consistency with other backend patterns.

The Ascend backend uses an environment variable check (`ENV{ASCEND_HOME_PATH}`), while the CUDA backend uses a CMake option (`MLLM_BUILD_CUDA_BACKEND`). This may be intentional due to SDK location requirements, but consider whether a CMake option with a separate `ASCEND_HOME_PATH` variable would provide more consistent build configuration.

mllm/backends/ascend/ops/AscendX2XOp.cpp (2)
34-44: Simplify error handling by removing the redundant check.

The `if (ret != ACL_SUCCESS)` check before `MLLM_ACL_CHECK(ret)` is redundant. The macro should handle error checking internally, so you can call it directly after the memcpy operation.

🔎 Proposed simplification
```diff
   auto ret = aclrtMemcpy(
       dst_data, data_size,
       src_data, data_size,
       ACL_MEMCPY_HOST_TO_DEVICE);
-
-  if (ret != ACL_SUCCESS) {
-    MLLM_ACL_CHECK(ret);
-  }
+  MLLM_ACL_CHECK(ret);

   syncGlobalAtbStream();
```

Apply the same simplification to lines 54-61 and 74-81 for consistency.
73-85: Skip synchronization when source and destination pointers are identical.

When `src_data == dst_data`, the memcpy is correctly skipped, but the function still calls `syncGlobalAtbStream()` at line 83 (outside the if-block). For consistency and to avoid unnecessary synchronization overhead, move the sync call inside the if-block.

🔎 Proposed fix
```diff
   if (src_data != dst_data) {
     auto ret = aclrtMemcpy(
         dst_data, data_size,
         src_data, data_size,
         ACL_MEMCPY_DEVICE_TO_DEVICE);
-
-    if (ret != ACL_SUCCESS) {
-      MLLM_ACL_CHECK(ret);
-    }
-
+    MLLM_ACL_CHECK(ret);
     syncGlobalAtbStream();
   }
   return;
```

mllm/backends/ascend/AscendCommon.cpp (2)
209-218: Rename short parameter name for better readability.

The parameter name `t` at line 209 is too short per coding guidelines. Rename it to a more descriptive identifier such as `tensor`. Based on coding guidelines, parameter names should be at least 3 characters long for clarity.
🔎 Proposed fix
```diff
-void fillAtbTensorDesc(const Tensor& t, atb::TensorDesc& desc) {
+void fillAtbTensorDesc(const Tensor& tensor, atb::TensorDesc& desc) {
   desc.dtype = ACL_FLOAT16;  // Currently hardcoded as per demo, can be expanded later
   desc.format = ACL_FORMAT_ND;
-  auto shape = t.shape();
+  auto shape = tensor.shape();
   desc.shape.dimNum = static_cast<uint64_t>(shape.size());
   for (uint64_t i = 0; i < desc.shape.dimNum; ++i) {
     desc.shape.dims[i] = static_cast<int64_t>(shape[i]);
   }
 }
```

Note: This change should also be reflected in the corresponding header file declaration.
121-135: Rename short variable name for better readability.

The variable name `ok` at line 121 is too short per coding guidelines. Consider renaming it to a more descriptive identifier such as `is_valid` or `all_match`.

🔎 Proposed fix
```diff
-  bool ok = true;
+  bool is_valid = true;
   for (size_t i = 0; i < actual.size(); ++i) {
     const float diff = std::abs(actual[i] - expected[i]);
     const float thr = atol + rtol * std::abs(expected[i]);
     if (diff > thr) {
-      ok = false;
+      is_valid = false;
       if (verbose) {
         std::cout << "[AscendVerify] idx " << i << " expected " << expected[i] << " got " << actual[i]
                   << " diff " << diff << " thr " << thr << "\n";
       }
     }
   }
   if (verbose) {
-    std::cout << (ok ? "[AscendVerify] OK" : "[AscendVerify] FAIL") << "\n";
+    std::cout << (is_valid ? "[AscendVerify] OK" : "[AscendVerify] FAIL") << "\n";
   }
-  return ok;
+  return is_valid;
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
docs/ascend_backend/core_design.rst (1 hunks)
docs/ascend_backend/index.rst (1 hunks)
docs/index.rst (1 hunks)
examples/CMakeLists.txt (1 hunks)
mllm/backends/ascend/AscendBackend.cpp (1 hunks)
mllm/backends/ascend/AscendCommon.cpp (1 hunks)
mllm/backends/ascend/ops/AscendX2XOp.cpp (1 hunks)
mllm/backends/ascend/ops/AscendX2XOp.hpp (1 hunks)
tests/CMakeLists.txt (1 hunks)
tests/ascend/AscendKernelTest.hpp (1 hunks)
tests/ascend/CMakeLists.txt (1 hunks)
tests/ascend/KernelTest.cpp (1 hunks)
tests/ascend/KernelTestHelper.hpp (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- examples/CMakeLists.txt
🧰 Additional context used
📓 Path-based instructions (4)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
mllm/backends/ascend/ops/AscendX2XOp.hpp
mllm/backends/ascend/ops/AscendX2XOp.cpp
mllm/backends/ascend/AscendBackend.cpp
mllm/backends/ascend/AscendCommon.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
mllm/backends/ascend/ops/AscendX2XOp.hpp
mllm/backends/ascend/ops/AscendX2XOp.cpp
mllm/backends/ascend/AscendBackend.cpp
mllm/backends/ascend/AscendCommon.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
mllm/backends/ascend/ops/AscendX2XOp.hpp
mllm/backends/ascend/ops/AscendX2XOp.cpp
mllm/backends/ascend/AscendBackend.cpp
mllm/backends/ascend/AscendCommon.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
mllm/backends/ascend/ops/AscendX2XOp.cpp
mllm/backends/ascend/AscendBackend.cpp
mllm/backends/ascend/AscendCommon.cpp
🧬 Code graph analysis (1)
mllm/backends/ascend/ops/AscendX2XOp.cpp (1)
mllm/backends/ascend/ops/AscendX2XOp.hpp (4)
AscendX2XOp (14-14)
options (21-23)
options (21-21)
inputs (16-16)
🪛 Clang (14.0.6)
tests/ascend/AscendKernelTest.hpp
[error] 12-12: class 'AscendKernelTest' defines a default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 18-18: method 'AddFloat16Test' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
tests/ascend/KernelTestHelper.hpp
[error] 7-7: class 'KernelTest' defines a default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
mllm/backends/ascend/AscendCommon.cpp
[error] 54-54: 3 adjacent parameters of 'prepareAscendTensor' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 57-57: variable 'expected_elements' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 68-68: variable 'device_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 69-69: variable 'bytes' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 92-92: variable 'elem_cnt' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 95-95: variable 'src' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 102-102: 3 adjacent parameters of 'verifyAscendTensor' of convertible types are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 102-102: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 121-121: variable name 'ok' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 143-143: 3 adjacent parameters of 'verifyAscendTensor' of convertible types are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 143-143: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 173-173: variable 'ctx' provides global access to a non-const object; consider making the pointed-to data 'const'
(cppcoreguidelines-avoid-non-const-global-variables,-warnings-as-errors)
[error] 209-209: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
tests/ascend/KernelTest.cpp
[error] 16-16: all parameters should be named in a function
(readability-named-parameter,-warnings-as-errors)
[error] 21-21: 8 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 21-21: 8 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 22-22: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 22-22: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 23-23: 32 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 23-23: 32 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
🔇 Additional comments (8)
docs/index.rst (1)
354-358: LGTM! The Ascend backend documentation entry is properly integrated into the main documentation index with consistent formatting.
tests/ascend/KernelTest.cpp (2)
16-26: LGTM! The test systematically validates the Add operation across various tensor shapes, from small ({2,3}) to larger ({32,32}) dimensions. The static analysis warnings about magic numbers are false positives—these values represent test tensor dimensions and are self-documenting in this context.
28-44: LGTM! The main function properly orchestrates the test lifecycle: initializes GoogleTest, sets up the Ascend backend and runtime context, executes tests, reports memory usage for leak detection, and performs cleanup. The sequence is correct and demonstrates good test hygiene.
tests/ascend/CMakeLists.txt (1)
1-26: LGTM! The CMake configuration properly sets up the Ascend kernel test target with:
- Conditional SDK path inclusion when `ASCEND_HOME_PATH` is defined
- Correct library dependencies (GoogleTest, runtime, backends, Ascend CL)
- Appropriate C++20 standard requirement
- GoogleTest integration via `include(GoogleTest)`

The configuration follows CMake best practices and aligns with the project's build structure.
tests/ascend/AscendKernelTest.hpp (2)
18-50: Test implementation is sound with a minor style consideration. The test properly validates the Add operation by:
- Generating random FP16 inputs on CPU
- Computing a reference result with element-wise FP16 addition
- Executing the operation on Ascend hardware
- Validating results with `allClose` (tolerance 1e-2f, appropriate for FP16 precision)

Minor note: Line 19 uses `using namespace mllm` within the method. While marked with `NOLINT`, consider using explicit `mllm::` prefixes to maintain namespace hygiene (purely stylistic for tests).

The static analysis suggestions can be safely ignored:
- The "special member functions" warning is a false positive for test fixtures
- Making the method static would prevent it from accessing inherited test fixture state if needed in the future
28-34: FP16 arithmetic is correctly supported. The `mllm_fp16_t` type is defined as either `float16_t` (native ARM floating-point type) or `half_float::half` (established library type with full operator support). Both implementations properly support the `+` operator for floating-point addition. The same pattern is used consistently in CPU ElementwiseKernelTest and is known to work correctly.
mllm/backends/ascend/AscendBackend.cpp (1)
14-23: Verify device enumeration success in constructor. The constructor properly registers operation factories and logs device information during initialization. However, `AscendDeviceMetaInfo::instance().devices` is not guaranteed to be populated; device enumeration can fail silently if `aclInit()` or `aclrtGetDeviceCount()` fails. While errors are logged in the singleton constructor, the Backend constructor does not validate that devices were successfully enumerated. If initialization fails, the `devices` vector will be empty and the logging loop will silently not execute, leaving users unaware of the failure.

Consider adding a check after accessing `devices` to ensure enumeration succeeded, or log a warning if no devices were found.
mllm/backends/ascend/AscendCommon.cpp (1)
102-107: Verify the parameter order to prevent accidental swaps. Static analysis warns that parameters `atol`, `rtol`, and `verbose` at line 102 have similar or convertible types and could be easily swapped. While the current usage appears correct, consider using strongly-typed parameters or adding runtime validation to catch misuse.

You can add a comment documenting the parameter order, or group the tolerances into an options struct that callers fill with C++20 designated initializers:

    // Parameters: atol (absolute tolerance), rtol (relative tolerance), verbose (logging flag)
    bool verifyAscendTensor(const Tensor& tensor, const std::vector<float>& expected,
                            float atol, float rtol, bool verbose,
                            std::vector<float>* actual_out = nullptr);
tests/ascend/CMakeLists.txt (outdated)

      MllmRT
      MllmCPUBackend
      MllmAscendBackend
      ascendcl
ascendcl should be linked into MllmAscendBackend
tests/ascend/KernelTest.cpp (outdated)

    testing::InitGoogleTest(&argc, argv);

    // Initialize Ascend backend
    mllm::initAscendBackend();
init mllm context before init ascend backend
tests/CMakeLists.txt (outdated)

    if(MLLM_BUILD_CUDA_BACKEND)
      add_subdirectory(cuda)
    endif()
    if(DEFINED ENV{ASCEND_HOME_PATH})
if (MLLM_BUILD_ASCEND_BACKEND)
mllm/backends/ascend/CMakeLists.txt (outdated)

    set(ASCEND_CANN_PACKAGE_PATH "/usr/local/Ascend/ascend-toolkit/latest"
      CACHE STRING "ASCEND CANN package installation directory"
    )
    set(SOC_VERSION "Ascend310B1" CACHE STRING "system on chip type")
Make this configurable in mllm's top cmake files
mllm/backends/ascend/CMakeLists.txt (outdated)

    set(SOC_VERSION "Ascend310B1" CACHE STRING "system on chip type")

    if(NOT CMAKE_BUILD_TYPE)
      set(CMAKE_BUILD_TYPE "Debug" CACHE STRING "Build type Release/Debug (default Debug)" FORCE)
Indentation should use 2 spaces
mllm/backends/ascend/CMakeLists.txt (outdated)

    if(DEFINED ENV{ASCEND_HOME_PATH})
      target_include_directories(MllmAscendBackend PUBLIC $ENV{ASCEND_HOME_PATH}/include)
      target_link_directories(MllmAscendBackend PRIVATE $ENV{ASCEND_HOME_PATH}/lib64)
mllm/backends/ascend/CMakeLists.txt (outdated)

    if(DEFINED ENV{ATB_HOME_PATH})
      target_include_directories(MllmAscendBackend PUBLIC $ENV{ATB_HOME_PATH}/include)
      target_link_directories(MllmAscendBackend PRIVATE $ENV{ATB_HOME_PATH}/lib)
mllm/backends/ascend/CMakeLists.txt (outdated)

    endif()

    target_link_libraries(MllmAscendBackend PRIVATE
@@ -0,0 +1,121 @@
    // Copyright (c) MLLM Team.
The file name has a typo: Ascned -> Ascend
Actionable comments posted: 4
♻️ Duplicate comments (1)
mllm/backends/ascend/ops/AscendElewiseOps.cpp (1)
44-48: Resource leak: ATB operation and workspace not destroyed on error paths. If `op->Setup()` (line 79) or `op->Execute()` (line 95) fails, `MLLM_ERROR_EXIT` is called before `atb::DestroyOperation(op)` at line 106, leaking the ATB operation. Additionally, if `Execute()` fails after workspace allocation, the workspace is also leaked.

Consider using RAII for cleanup:
🔎 Proposed fix using scope guards
+#include <utility>  // for std::exchange if needed
+
+namespace {
+struct AtbOpGuard {
+  atb::Operation* op_ = nullptr;
+  explicit AtbOpGuard(atb::Operation* op) : op_(op) {}
+  ~AtbOpGuard() { if (op_) atb::DestroyOperation(op_); }
+  void release() { op_ = nullptr; }
+  AtbOpGuard(const AtbOpGuard&) = delete;
+  AtbOpGuard& operator=(const AtbOpGuard&) = delete;
+};
+}  // namespace
+
 void AscendAddOp::forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) {
   // ... validation code ...
   atb::Operation* op = nullptr;
   auto st = atb::CreateOperation(addParam, &op);
   if (st != atb::NO_ERROR || op == nullptr) {
     MLLM_ERROR_EXIT(ExitCode::kAscendError, "ATB CreateOperation(ELEWISE_ADD) failed, status={}",
                     static_cast<int>(st));
   }
+  AtbOpGuard op_guard(op);
   // ... rest of forward() ...
   syncGlobalAtbStream();
   if (workspace_block_id != -1) {
     auto& mem_mgr = getAscendMemoryManager();
     mem_mgr.freeBlock(workspace_block_id);
   }
-  atb::DestroyOperation(op);
+  op_guard.release();
+  atb::DestroyOperation(op);
 }

Also applies to: 77-81, 94-96
🧹 Nitpick comments (2)
CMakeLists.txt (1)
48-51: Consider adding validation for allowed values. The new cache variables are well-documented. For improved user experience, consider constraining allowed values using `set_property`:

    set(MLLM_RUN_MODE "npu" CACHE STRING "Run mode for mLLM backends: cpu/sim/npu")
    set_property(CACHE MLLM_RUN_MODE PROPERTY STRINGS "cpu" "sim" "npu")
    set(MLLM_SOC_VERSION "Ascend310B1" CACHE STRING "SOC version for Ascend backend")

This provides dropdown selection in CMake GUIs and catches invalid values early.
tests/ascend/CMakeLists.txt (1)
24-24: Unused `include(GoogleTest)` without test discovery. The `GoogleTest` module is included but `gtest_discover_tests()` is never called, so the test won't be registered with CTest. Either add discovery or remove the include:

🔎 Option A: Add test discovery

     include(GoogleTest)
    +gtest_discover_tests(Mllm-Test-AscendKernel)

🔎 Option B: Remove unused include

    -include(GoogleTest)
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- CMakeLists.txt
- mllm/backends/ascend/CMakeLists.txt
- mllm/backends/ascend/ops/AscendElewiseOps.cpp
- mllm/backends/cpu/kernels/arm/rmsnorm.cpp
- mllm/backends/cpu/kernels/arm/softmax.cpp
- tests/CMakeLists.txt
- tests/ascend/CMakeLists.txt
- tests/ascend/KernelTest.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
- mllm/backends/ascend/CMakeLists.txt
🧰 Additional context used
📓 Path-based instructions (4)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
- mllm/backends/ascend/ops/AscendElewiseOps.cpp
- mllm/backends/cpu/kernels/arm/rmsnorm.cpp
- mllm/backends/cpu/kernels/arm/softmax.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
- mllm/backends/ascend/ops/AscendElewiseOps.cpp
- mllm/backends/cpu/kernels/arm/rmsnorm.cpp
- mllm/backends/cpu/kernels/arm/softmax.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
- mllm/backends/ascend/ops/AscendElewiseOps.cpp
- mllm/backends/cpu/kernels/arm/rmsnorm.cpp
- mllm/backends/cpu/kernels/arm/softmax.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
- mllm/backends/ascend/ops/AscendElewiseOps.cpp
- mllm/backends/cpu/kernels/arm/rmsnorm.cpp
- mllm/backends/cpu/kernels/arm/softmax.cpp
🧬 Code graph analysis (2)
mllm/backends/cpu/kernels/arm/rmsnorm.cpp (1)
mllm/backends/cpu/kernels/arm/math.hpp (2)
vsquare_mean_fp32 (270-284), vsquare_mean_fp32 (270-270)
mllm/backends/cpu/kernels/arm/softmax.cpp (2)
mllm/nn/Functional.cpp (2)
exp (81-81), exp (81-81)
mllm/nn/Functional.hpp (1)
exp (115-115)
🪛 Clang (14.0.6)
tests/ascend/KernelTest.cpp
[error] 16-16: all parameters should be named in a function
(readability-named-parameter,-warnings-as-errors)
[error] 21-21: 8 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 21-21: 8 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 22-22: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 22-22: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 23-23: 32 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 23-23: 32 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
🔇 Additional comments (8)
mllm/backends/cpu/kernels/arm/rmsnorm.cpp (2)
20-20: LGTM! Improved consistency. Replacing `std::sqrtf` with `std::sqrt` aligns this function with the rest of the file (`rmsnorm_fp32_inplace` at line 71 and `rmsnorm_fp16_inplace` at line 160 already use `std::sqrt`). The generic `std::sqrt` is more idiomatic in modern C++ and produces identical results for float arguments.
109-109: LGTM! Improved consistency. Replacing `std::sqrtf` with `std::sqrt` aligns this function with `rmsnorm_fp16_inplace` (line 160). This change ensures all four RMS normalization functions in this file consistently use the generic `std::sqrt`, which is the more idiomatic C++ approach.
mllm/backends/cpu/kernels/arm/softmax.cpp (1)
28-28: Clarify the rationale for this change. This modification to the CPU ARM softmax kernel appears unrelated to the PR's stated objective of adding an Ascend backend. Could you explain why these changes were made? If this is intended as a consistency improvement, please note that it introduces a performance regression (see previous comments).
Also applies to: 115-115, 166-166, 232-232
tests/ascend/KernelTest.cpp (2)
16-26: Test case structure looks good. The magic numbers flagged by static analysis (8, 16, 32) are test input dimensions representing a range of sizes for FP16 add validation. In test code, explicit literal dimensions are appropriate since they serve as test case data rather than configuration values that need naming.
The test correctly validates multiple dimension pairs for the AddFloat16 operation.
28-44: Initialization and cleanup sequence is correct. The `main()` function properly initializes the MLLM context before the Ascend backend, runs tests, reports memory usage, and shuts down the context. This is the correct order of operations.
tests/CMakeLists.txt (1)
14-16: Consistent pattern with existing backends. The conditional block follows the same pattern as the CUDA backend inclusion, correctly guarding Ascend test compilation behind the `MLLM_BUILD_ASCEND_BACKEND` option.
mllm/backends/ascend/ops/AscendElewiseOps.cpp (2)
22-24: Setup delegates correctly to base class. The `setup()` method properly delegates to `BaseOp::setup()` for tensor shape inference.
26-39: Good validation and clear NYI markers. The input validation correctly asserts tensor counts, dtype consistency, and element count matching. The `NYI` markers clearly document the current limitations (same dtype, no broadcasting).
    if (workspaceSize > 0) {
      auto& mem_mgr = getAscendMemoryManager();
      mem_mgr.allocateBlock(static_cast<uint32_t>(workspaceSize), workspace_block_id);
      mem_mgr.getBlockPtr(workspace_block_id, workspace);
Narrowing conversion: uint64_t to uint32_t may truncate large workspace sizes.
workspaceSize is uint64_t but allocateBlock takes uint32_t. For large workspaces exceeding 4GB, this truncates silently. Consider either:
- Checking if `workspaceSize` exceeds `UINT32_MAX` and erroring
- Updating `allocateBlock` to accept `uint64_t`
🔎 Proposed bounds check
if (workspaceSize > 0) {
+ if (workspaceSize > static_cast<uint64_t>(UINT32_MAX)) {
+ MLLM_ERROR_EXIT(ExitCode::kAscendError, "Workspace size {} exceeds 32-bit limit", workspaceSize);
+ }
auto& mem_mgr = getAscendMemoryManager();
mem_mgr.allocateBlock(static_cast<uint32_t>(workspaceSize), workspace_block_id);
mem_mgr.getBlockPtr(workspace_block_id, workspace);
    }

🤖 Prompt for AI Agents
In mllm/backends/ascend/ops/AscendElewiseOps.cpp around lines 85 to 88,
workspaceSize is a uint64_t but allocateBlock accepts uint32_t which can
truncate values > UINT32_MAX; add a guard that checks if workspaceSize >
UINT32_MAX and handle it (log an error and return/fail or throw) before calling
allocateBlock, or alternatively update the memory manager API to accept uint64_t
and propagate the type change through allocateBlock and its callers; if keeping
allocateBlock as uint32_t, cast only after the bounds check and ensure
workspace_block_id and workspace are handled consistently on the error path.
     float sum = 0.f;
     for (int i = 0; i < len; ++i) {
-      auto tmp = std::expf(X[i * stride] - max_value);
+      auto tmp = std::exp(X[i * stride] - max_value);
Performance regression: prefer std::expf for single-precision computations.
Replacing std::expf with std::exp introduces unnecessary double-precision computation in the scalar fallback paths. std::exp(float_arg) promotes the float to double, computes the exponential in double precision, then demotes back to float—significantly slower than std::expf which computes directly in single precision.
These scalar paths handle stride≠1, small tensors (len≤16), and tail elements. While the vectorized paths (which use vexpq_fast_f32) are unaffected, this change degrades performance for non-contiguous memory layouts and small inputs with no apparent benefit.
🔎 Proposed fix
-      auto tmp = std::exp(X[i * stride] - max_value);
+      auto tmp = std::expf(X[i * stride] - max_value);
+ auto tmp = std::expf(X[i * stride] - max_value);And for the tail loop:
- float tmp = std::exp(X[i] - max_value);
+ float tmp = std::expf(X[i] - max_value);Also applies to: 115-115
🤖 Prompt for AI Agents
In mllm/backends/cpu/kernels/arm/softmax.cpp around lines 28 and 115, the scalar
fallback and tail loops use std::exp on float values which promotes to double
and slows single-precision paths; change these calls to std::expf (and any other
scalar float exp usages in those nearby lines) so the exponential is computed in
single precision, keeping vectorized vexpq_fast_f32 unaffected; ensure the
argument types remain float and that <cmath> is available for expf.
     float sum = 0.f;
     for (int i = 0; i < len; ++i) {
-      auto tmp = std::expf(X[i * stride] - max_value);
+      auto tmp = std::exp(X[i * stride] - max_value);
Performance regression: prefer std::expf for half-precision inputs.
Replacing std::expf with std::exp introduces an unnecessarily expensive promotion path for half-precision (fp16) inputs. With fp16 arguments, std::exp promotes fp16→double (8 bytes), computes in double precision, then the result is cast to fp16. Using std::expf would promote fp16→float (4 bytes), compute in single precision—both faster promotion and faster computation.
This affects the same scalar fallback paths as the fp32 version (stride≠1, len≤16, tail elements), and is inconsistent with the vectorized paths that use vexpq_fast_f16.
🔎 Proposed fix
- auto tmp = std::exp(X[i * stride] - max_value);
+ auto tmp = std::expf(X[i * stride] - max_value);And for the tail loop:
- float tmp = std::exp(X[i] - max_value);
+ float tmp = std::expf(X[i] - max_value);Also applies to: 232-232
🤖 Prompt for AI Agents
In mllm/backends/cpu/kernels/arm/softmax.cpp around lines 166 and 232, the
scalar fallback uses std::exp which forces fp16 inputs to promote to double and
back; change these calls to std::expf so fp16 promotes to float and computes in
single precision for better performance and consistency with vexpq_fast_f16,
ensuring the surrounding expressions still use float temporaries (or cast to
float) before calling std::expf and then cast the result back to the fp16
storage type as needed.
    target_link_libraries(Mllm-Test-AscendKernel PRIVATE
      gtest_main
      MllmRT
      MllmCPUBackend
      MllmAscendBackend
    )
Linker conflict: gtest_main provides main() but test has custom main().
KernelTest.cpp defines its own main() function for Ascend context initialization, but linking against gtest_main will also provide a main(). This causes a linker error (duplicate symbol).
Replace gtest_main with gtest:
🔎 Proposed fix
target_link_libraries(Mllm-Test-AscendKernel PRIVATE
- gtest_main
+ gtest
MllmRT
MllmCPUBackend
MllmAscendBackend
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:

    target_link_libraries(Mllm-Test-AscendKernel PRIVATE
      gtest_main
      MllmRT
      MllmCPUBackend
      MllmAscendBackend
    )

Suggested:

    target_link_libraries(Mllm-Test-AscendKernel PRIVATE
      gtest
      MllmRT
      MllmCPUBackend
      MllmAscendBackend
    )
🤖 Prompt for AI Agents
In tests/ascend/CMakeLists.txt around lines 8 to 13, the target currently links
against gtest_main which provides its own main(), causing a duplicate-symbol
linker error because KernelTest.cpp defines a custom main(); replace gtest_main
with gtest in the target_link_libraries invocation so the test uses the gtest
library without bringing its main(), then reconfigure/rebuild to verify the
duplicate main() linker error is resolved.
Force-pushed 04e56be to 2fa8eed (compare)
Actionable comments posted: 2
♻️ Duplicate comments (12)
mllm/backends/ascend/AscendCommon.hpp (1)
130-131: Fix `__LINE__` concatenation in macros. The `__LINE__` macro requires a two-level expansion to work correctly in token concatenation. As flagged in a previous review, without the helper macros, the variable name will literally be `timer_scope___LINE__`.
tests/ascend/CMakeLists.txt (1)
8-13: Linker conflict: `gtest_main` provides `main()` but test has custom `main()`.
`KernelTest.cpp` defines its own `main()` function for Ascend context initialization, but linking against `gtest_main` will also provide a `main()`. This causes a duplicate symbol linker error.

🔎 Proposed fix

 target_link_libraries(Mllm-Test-AscendKernel PRIVATE
-  gtest_main
+  gtest
   MllmRT
   MllmCPUBackend
   MllmAscendBackend
 )
mllm/backends/ascend/memory/MemoryBlock.hpp (1)
11-15: Initialize all struct members to prevent undefined behavior. The struct members `block_id_` and `block_size_` are uninitialized. If a `MemoryBlock` is default-constructed, these fields will contain garbage values, leading to undefined behavior when accessed.

🔎 Proposed fix

 struct MemoryBlock {
-  int64_t block_id_;
-  size_t block_size_;
+  int64_t block_id_ = -1;
+  size_t block_size_ = 0;
   void* address_ = nullptr;
 };
mllm/backends/ascend/memory/AscendMemoryPool.cpp (1)
42-71: Alignment and capacity check issues previously flagged. The issues with alignment math, capacity checking (`>` vs `>=`), and missing `block_id = -1` on failure were already identified in a previous review comment. Please address those concerns.
mllm/backends/ascend/ops/AscendElewiseOps.cpp (2)
44-48: ATB operation leak on error paths previously flagged. This issue was identified in a previous review. The ATB operation created at line 45 is not destroyed if `Setup()` (lines 78-81) or `Execute()` (lines 94-96) fails via `MLLM_ERROR_EXIT`. Consider using RAII as suggested in the prior review.
85-89: Truncation risk on workspace size previously flagged. The cast from `uint64_t workspaceSize` to `uint32_t` was identified in a previous review as a potential truncation issue for workspaces exceeding 4GB.
mllm/backends/ascend/AscendDispatcher.hpp (1)
13-15: `MLLM_EMPTY_SCOPE` causes uninitialized field warning. Static analysis reports that the implicit constructor doesn't initialize the field introduced by `MLLM_EMPTY_SCOPE`. If the options struct is truly empty, consider removing the macro or adding a default constructor.

🔎 Proposed fix

 struct AscendDispatcherOptions {
-  MLLM_EMPTY_SCOPE;
+  // Reserved for future options.
 };
mllm/backends/ascend/memory/AscendMemoryPool.hpp (1)
15-38: Missing copy/move semantics (Rule of Five) for class owning raw device memory.
`AscendMemoryPool` defines a destructor that frees `base_mem_ptr_` but lacks copy/move constructors and assignment operators. Copying this class would cause a double-free. Delete or explicitly define these operations.

🔎 Proposed fix

 class AscendMemoryPool {
  public:
   explicit AscendMemoryPool(size_t pool_size);
   ~AscendMemoryPool();
+  // Non-copyable, non-movable (owns device memory)
+  AscendMemoryPool(const AscendMemoryPool&) = delete;
+  AscendMemoryPool& operator=(const AscendMemoryPool&) = delete;
+  AscendMemoryPool(AscendMemoryPool&&) = delete;
+  AscendMemoryPool& operator=(AscendMemoryPool&&) = delete;
+
   void allocateBlock(uint32_t size, int& block_id);
mllm/backends/ascend/memory/AscendMemoryManager.cpp (2)
27-27: Check the return value of `aclrtSetDevice`. If `aclrtSetDevice` fails, the memory pool will be created with an incorrect device context. This could cause device mismatch errors in subsequent operations.

🔎 Proposed fix

-  aclrtSetDevice(i);
+  auto set_ret = aclrtSetDevice(i);
+  MLLM_ACL_CHECK(set_ret);
43-50: Handle negative device ID before casting to `size_t`. If `getDeviceId()` returns `-1` (device not set), casting to `size_t` produces a huge value, so the lookup fails with a confusing out-of-range error message instead of a clear diagnosis of the unset device. Check for negative values explicitly.

🔎 Proposed fix

 std::shared_ptr<AscendMemoryPool>& AscendMemoryManager::getMemoryPool() {
-  size_t device_id = static_cast<size_t>(getDeviceId());
+  if (memory_pools_.empty()) {
+    MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Ascend memory pools not initialized");
+  }
+  const int32_t device_id_signed = getDeviceId();
+  if (device_id_signed < 0) {
+    MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Invalid device id {}", device_id_signed);
+  }
+  const size_t device_id = static_cast<size_t>(device_id_signed);
   if (device_id >= memory_pools_.size()) {
     MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "Invalid device id {}", device_id);
   }
   return memory_pools_[device_id];
 }
269-270: `aclFinalize()` in constructor terminates ACL runtime prematurely. Calling `aclFinalize()` after device enumeration will shut down the ACL runtime. Subsequent ACL operations (e.g., in `getGlobalAtbContext()`) will fail. Remove this call and let backend shutdown handle ACL cleanup.

🔎 Proposed fix

     devices.push_back(info);
   }
-  // Finalize ACL after enumeration
-  aclFinalize();
+  // Note: ACL should remain initialized for subsequent operations.
+  // aclFinalize() should only be called during backend shutdown.
 #else
209-218: Rename parameter `t` to a more descriptive name. The parameter name `t` is too short per coding guidelines. Rename to `tensor` for clarity.

🔎 Proposed fix

-void fillAtbTensorDesc(const Tensor& t, atb::TensorDesc& desc) {
+void fillAtbTensorDesc(const Tensor& tensor, atb::TensorDesc& desc) {
   desc.dtype = ACL_FLOAT16;  // Currently hardcoded as per demo, can be expanded later
   desc.format = ACL_FORMAT_ND;
-  auto shape = t.shape();
+  auto shape = tensor.shape();
   desc.shape.dimNum = static_cast<uint64_t>(shape.size());
🧹 Nitpick comments (15)
mllm/backends/ascend/ops/AscendElewiseOps.hpp (1)
12-25: LGTM! The `AscendAddOp` and `AscendAddOpFactory` classes follow the established factory pattern used by other backends. The class hierarchy and method signatures are appropriate.

Consider adding brief docstrings for the `setup` and `forward` methods to document their expected behavior and any Ascend-specific semantics. As per coding guidelines, public APIs should have clear comments explaining purpose, parameters, and behavior.
mllm/backends/ascend/AscendCommon.hpp (2)
19-28: Parenthesize macro arguments to prevent operator precedence issues. The `err` argument in both macros should be wrapped in parentheses to avoid unexpected behavior if a complex expression is passed.

🔎 Proposed fix

 #define MLLM_ACL_CHECK(err) \
-  if (err != ACL_SUCCESS) { \
+  if ((err) != ACL_SUCCESS) { \
     MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "ACL error code {}: {}", int(err), aclGetRecentErrMsg()); \
   }
 #define MLLM_ATB_CHECK(err) \
-  if (err != atb::NO_ERROR) { \
+  if ((err) != atb::NO_ERROR) { \
     MLLM_ERROR_EXIT(::mllm::ExitCode::kAscendError, "ATB error code {}", int(err)); \
   }
54-67: Consider adding move semantics to `AscendDeviceMetaInfo`. The class deletes copy operations but doesn't explicitly handle move operations or define a destructor. Since it's a singleton with a `std::vector` member, the compiler-generated defaults are likely fine, but handling them explicitly would silence the static analysis warning and clarify intent.

🔎 Proposed fix

 class AscendDeviceMetaInfo {
  public:
   AscendDeviceMetaInfo();
+  ~AscendDeviceMetaInfo() = default;
   static AscendDeviceMetaInfo& instance() {
     static AscendDeviceMetaInfo instance;
     return instance;
   }
   AscendDeviceMetaInfo(const AscendDeviceMetaInfo&) = delete;
   AscendDeviceMetaInfo& operator=(const AscendDeviceMetaInfo&) = delete;
+  AscendDeviceMetaInfo(AscendDeviceMetaInfo&&) = delete;
+  AscendDeviceMetaInfo& operator=(AscendDeviceMetaInfo&&) = delete;
   std::vector<AscendDeviceInfo> devices;
 };
24-24: Missinggtest_discover_testscall.The
include(GoogleTest)is present but there's nogtest_discover_tests()call to register the tests with CTest. This meansctestwon't discover and run these tests.🔎 Proposed fix
include(GoogleTest) +gtest_discover_tests(Mllm-Test-AscendKernel)mllm/backends/ascend/AscendBackend.cpp (1)
14-23: Move lambda outside the loop for efficiency. The `bytes_to_mb` lambda is defined inside the loop body, causing it to be recreated on each iteration. Move it before the loop for cleaner code.

🔎 Proposed fix

 AscendBackend::AscendBackend() : Backend(kAscend, createAscendAllocator()) {
   regOpFactory<AscendAddOpFactory>();
   regOpFactory<AscendX2XOpFactory>();
+  const auto bytes_to_mb = [](size_t bytes) { return bytes / (1024.0 * 1024.0); };
   auto& devices = AscendDeviceMetaInfo::instance().devices;
   for (const auto& device : devices) {
-    const auto bytes_to_mb = [](size_t bytes) { return bytes / (1024.0 * 1024.0); };
     MLLM_INFO("Found Ascend device {} (ID: {}, SOC: {}, Memory: {:.2f} MB free / {:.2f} MB total)",
               device.name, device.id, device.soc_version,
               bytes_to_mb(device.free_memory), bytes_to_mb(device.total_memory));
   }
 }
15-18: Consider explicitly deleting copy/move operations.

The class contains a `std::mutex` which is non-copyable, but explicitly deleting copy/move constructors and assignment operators makes the intent clear and satisfies the Rule of Five.

🔎 Proposed fix

```diff
 class AscendAllocator final : public Allocator {
  public:
   AscendAllocator();
   ~AscendAllocator();
+
+  AscendAllocator(const AscendAllocator&) = delete;
+  AscendAllocator& operator=(const AscendAllocator&) = delete;
+  AscendAllocator(AscendAllocator&&) = delete;
+  AscendAllocator& operator=(AscendAllocator&&) = delete;

   inline bool ctrlByMemManager() override { return false; }
```

tests/ascend/AscendKernelTest.hpp (1)
18-50: Improve const-correctness and add failure diagnostics.

Two minor issues:

- Line 20: Use `const auto&` since `shapes` is passed by const reference.
- Lines 45-47: Log which shape failed to aid debugging.

🔎 Proposed fix

```diff
 bool AddFloat16Test(const std::vector<mllm::Tensor::shape_t>& shapes) {
   using namespace mllm;  // NOLINT
-  for (auto& shape : shapes) {
+  for (const auto& shape : shapes) {
     // 1. Construct random FP16 inputs on CPU
     Tensor x_cpu = Tensor::random(shape, -3, 3, kFloat16, kCPU);
     Tensor y_cpu = Tensor::random(shape, -3, 3, kFloat16, kCPU);
     // ... existing code ...
     auto result = mllm::test::allClose(z_cpu, ref_cpu, 1e-2f, 1e-2f);
     if (!result.is_close) {
+      // Log failure details for debugging
+      MLLM_ERROR("AddFloat16Test failed for shape, max_diff={}", result.max_diff);
       return false;
     }
   }
   return true;
 }
```

mllm/backends/ascend/Register.cpp (1)
19-19: Consider making pool size configurable.

The hardcoded 100MB pool size may not be optimal for all use cases. Consider making this configurable via environment variable or initialization options.

🔎 Example

```cpp
// Could read from environment or options struct
size_t pool_size = getenv("MLLM_ASCEND_POOL_SIZE_MB")
                       ? std::stoul(getenv("MLLM_ASCEND_POOL_SIZE_MB")) * 1024 * 1024
                       : 100 * 1024 * 1024;  // Default 100MB
```

mllm/backends/ascend/memory/AscendMemoryManager.hpp (2)
22-22: Returning a reference to `shared_ptr` is unusual.

`getMemoryPool()` returns `std::shared_ptr<AscendMemoryPool>&`. Typically, a `shared_ptr` is returned by value (for shared ownership) or by `const&` (for read-only access). Returning a non-const reference allows callers to reset or reassign the internal pointer, which could cause issues.

🔎 Proposed fix

```diff
-  std::shared_ptr<AscendMemoryPool> &getMemoryPool();
+  std::shared_ptr<AscendMemoryPool> getMemoryPool();
```
24-24: `uint32_t size` limits allocations to 4GB.

Large model tensors or workspaces may exceed 4GB. Consider using `size_t` for future-proofing.

mllm/backends/ascend/memory/AscendMemoryPool.cpp (1)
90-104: Set `addr` to `nullptr` on failure for safer usage.

When `block_id` is invalid or not found, `addr` remains uninitialized/unchanged. Setting it to `nullptr` on error paths prevents use of stale values.

🔎 Proposed fix

```diff
 void AscendMemoryPool::getBlockPtr(int block_id, void *&addr) {
   std::unique_lock<std::mutex> lock(block_mutex_);
   if (block_id < 0) {
     MLLM_INFO("Invalid block id {} to get ptr", block_id);
+    addr = nullptr;
     return;
   }
   auto it = used_blocks_.find(block_id);
   if (it != used_blocks_.end()) {
     addr = it->second.address_;
   } else {
+    addr = nullptr;
     MLLM_ERROR("Get block address error, block id {}", block_id);
   }
 }
```

mllm/backends/ascend/ops/AscendX2XOp.cpp (1)
39-41: Redundant error check before `MLLM_ACL_CHECK`.

The pattern `if (ret != ACL_SUCCESS) { MLLM_ACL_CHECK(ret); }` is redundant: `MLLM_ACL_CHECK` already handles the success case internally. Simplify to just `MLLM_ACL_CHECK(ret);` directly after the `aclrtMemcpy` call.

🔎 Proposed fix

```diff
   auto ret = aclrtMemcpy(dst_data, data_size, src_data, data_size, ACL_MEMCPY_HOST_TO_DEVICE);
-
-  if (ret != ACL_SUCCESS) {
-    MLLM_ACL_CHECK(ret);
-  }
+  MLLM_ACL_CHECK(ret);

   syncGlobalAtbStream();
```

Apply the same pattern to the other two cases (lines 59-61 and 79-81).

Also applies to: 59-61, 79-81
mllm/backends/ascend/AscendAllocator.cpp (1)
82-91: Inconsistent logging: use `MLLM_WARN` instead of `std::cout`.

The rest of this file uses structured logging (`MLLM_INFO`, `MLLM_WARN`, `MLLM_ERROR`), but `generalAlloc` and `generalFree` use raw `std::cout`. For consistency and to enable log filtering, use the logging macros.

🔎 Proposed fix

```diff
 bool AscendAllocator::generalAlloc(void** ptr, size_t cap, size_t align) {
-  // we don't support generalAlloc , therefore return false
-  std::cout << "generalAlloc is not supported in AscendAllocator" << std::endl;
+  // generalAlloc is not supported
+  MLLM_WARN("generalAlloc is not supported in AscendAllocator");
   return false;
 }

 void AscendAllocator::generalFree(void* ptr) {
-  // we don't support generalFree , therefore do nothing
-  std::cout << "generalFree is not supported in AscendAllocator" << std::endl;
+  // generalFree is not supported
+  MLLM_WARN("generalFree is not supported in AscendAllocator");
 }
```

mllm/backends/ascend/AscendCommon.cpp (1)
90-90: `const_cast` on tensor parameter is a code smell.

The `const_cast` to call `.to(kCPU)` suggests the method might have side effects. Consider accepting the tensor by non-const reference if mutation is expected, or document why this is safe.

mllm/backends/ascend/CMakeLists.txt (1)
14-23: Using `file(GLOB ...)` for sources can miss newly added files.

CMake's glob doesn't re-run automatically when files are added. Consider either explicitly listing sources or ensuring developers know to re-run CMake when adding new files. This is acceptable for development but worth documenting.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (33)
- CMakeLists.txt
- docs/ascend_backend/core_design.rst
- docs/ascend_backend/index.rst
- docs/index.rst
- mllm/backends/ascend/AscendAllocator.cpp
- mllm/backends/ascend/AscendAllocator.hpp
- mllm/backends/ascend/AscendBackend.cpp
- mllm/backends/ascend/AscendCommon.cpp
- mllm/backends/ascend/AscendCommon.hpp
- mllm/backends/ascend/AscendDispatcher.cpp
- mllm/backends/ascend/AscendDispatcher.hpp
- mllm/backends/ascend/CMakeLists.txt
- mllm/backends/ascend/Register.cpp
- mllm/backends/ascend/memory/AscendMemoryManager.cpp
- mllm/backends/ascend/memory/AscendMemoryManager.hpp
- mllm/backends/ascend/memory/AscendMemoryPool.cpp
- mllm/backends/ascend/memory/AscendMemoryPool.hpp
- mllm/backends/ascend/memory/MemoryBlock.hpp
- mllm/backends/ascend/ops/AscendElewiseOps.cpp
- mllm/backends/ascend/ops/AscendElewiseOps.hpp
- mllm/backends/ascend/ops/AscendX2XOp.cpp
- mllm/backends/ascend/ops/AscendX2XOp.hpp
- mllm/backends/cpu/kernels/arm/rmsnorm.cpp
- mllm/backends/cpu/kernels/arm/softmax.cpp
- mllm/core/DeviceTypes.hpp
- mllm/engine/Dispatcher.hpp
- mllm/utils/Common.hpp
- tasks/build_arm_ascend.yaml
- tests/CMakeLists.txt
- tests/ascend/AscendKernelTest.hpp
- tests/ascend/CMakeLists.txt
- tests/ascend/KernelTest.cpp
- tests/ascend/KernelTestHelper.hpp
✅ Files skipped from review due to trivial changes (3)
- mllm/backends/cpu/kernels/arm/rmsnorm.cpp
- tasks/build_arm_ascend.yaml
- docs/ascend_backend/core_design.rst
🚧 Files skipped from review as they are similar to previous changes (7)
- docs/index.rst
- mllm/backends/ascend/AscendDispatcher.cpp
- tests/CMakeLists.txt
- docs/ascend_backend/index.rst
- mllm/backends/cpu/kernels/arm/softmax.cpp
- mllm/backends/ascend/ops/AscendX2XOp.hpp
- CMakeLists.txt
🧰 Additional context used
📓 Path-based instructions (5)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
mllm/backends/ascend/AscendBackend.cpp, mllm/core/DeviceTypes.hpp, mllm/backends/ascend/Register.cpp, mllm/backends/ascend/ops/AscendX2XOp.cpp, mllm/backends/ascend/memory/AscendMemoryManager.hpp, mllm/backends/ascend/ops/AscendElewiseOps.cpp, mllm/backends/ascend/AscendDispatcher.hpp, mllm/utils/Common.hpp, mllm/backends/ascend/AscendAllocator.cpp, mllm/backends/ascend/CMakeLists.txt, mllm/backends/ascend/memory/AscendMemoryPool.hpp, mllm/backends/ascend/AscendCommon.cpp, mllm/backends/ascend/ops/AscendElewiseOps.hpp, mllm/backends/ascend/AscendCommon.hpp, mllm/engine/Dispatcher.hpp, mllm/backends/ascend/AscendAllocator.hpp, mllm/backends/ascend/memory/AscendMemoryManager.cpp, mllm/backends/ascend/memory/AscendMemoryPool.cpp, mllm/backends/ascend/memory/MemoryBlock.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
mllm/backends/ascend/AscendBackend.cpp, mllm/core/DeviceTypes.hpp, mllm/backends/ascend/Register.cpp, mllm/backends/ascend/ops/AscendX2XOp.cpp, mllm/backends/ascend/memory/AscendMemoryManager.hpp, mllm/backends/ascend/ops/AscendElewiseOps.cpp, mllm/backends/ascend/AscendDispatcher.hpp, mllm/utils/Common.hpp, mllm/backends/ascend/AscendAllocator.cpp, mllm/backends/ascend/CMakeLists.txt, mllm/backends/ascend/memory/AscendMemoryPool.hpp, mllm/backends/ascend/AscendCommon.cpp, mllm/backends/ascend/ops/AscendElewiseOps.hpp, mllm/backends/ascend/AscendCommon.hpp, mllm/engine/Dispatcher.hpp, mllm/backends/ascend/AscendAllocator.hpp, mllm/backends/ascend/memory/AscendMemoryManager.cpp, mllm/backends/ascend/memory/AscendMemoryPool.cpp, mllm/backends/ascend/memory/MemoryBlock.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
mllm/backends/ascend/AscendBackend.cpp, mllm/backends/ascend/Register.cpp, mllm/backends/ascend/ops/AscendX2XOp.cpp, mllm/backends/ascend/ops/AscendElewiseOps.cpp, mllm/backends/ascend/AscendAllocator.cpp, mllm/backends/ascend/AscendCommon.cpp, mllm/backends/ascend/memory/AscendMemoryManager.cpp, mllm/backends/ascend/memory/AscendMemoryPool.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
mllm/backends/ascend/AscendBackend.cpp, mllm/core/DeviceTypes.hpp, mllm/backends/ascend/Register.cpp, mllm/backends/ascend/ops/AscendX2XOp.cpp, mllm/backends/ascend/memory/AscendMemoryManager.hpp, mllm/backends/ascend/ops/AscendElewiseOps.cpp, mllm/backends/ascend/AscendDispatcher.hpp, mllm/utils/Common.hpp, mllm/backends/ascend/AscendAllocator.cpp, mllm/backends/ascend/memory/AscendMemoryPool.hpp, mllm/backends/ascend/AscendCommon.cpp, mllm/backends/ascend/ops/AscendElewiseOps.hpp, mllm/backends/ascend/AscendCommon.hpp, mllm/engine/Dispatcher.hpp, mllm/backends/ascend/AscendAllocator.hpp, mllm/backends/ascend/memory/AscendMemoryManager.cpp, mllm/backends/ascend/memory/AscendMemoryPool.cpp, mllm/backends/ascend/memory/MemoryBlock.hpp
{mllm,mllm-cli,pymllm}/**/*.{toml,json,txt,md}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
When suggesting new dependencies, evaluate their maturity, licensing, and maintenance status. Prefer well-established and actively maintained libraries.
Files:
mllm/backends/ascend/CMakeLists.txt
🧠 Learnings (5)
📚 Learning: 2025-11-25T07:26:06.575Z
Learnt from: CR
Repo: UbiquitousLearning/mllm PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T07:26:06.575Z
Learning: Applies to {mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi} : Use named constants instead of magic numbers.
Applied to files:
mllm/backends/ascend/CMakeLists.txt
📚 Learning: 2025-11-25T07:26:06.575Z
Learnt from: CR
Repo: UbiquitousLearning/mllm PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T07:26:06.575Z
Learning: Applies to {mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh} : Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Applied to files:
mllm/backends/ascend/CMakeLists.txt
📚 Learning: 2025-11-25T07:26:06.575Z
Learnt from: CR
Repo: UbiquitousLearning/mllm PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T07:26:06.575Z
Learning: Applies to {mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh} : Encourage consistent coding style and patterns with the existing codebase.
Applied to files:
mllm/backends/ascend/CMakeLists.txt
📚 Learning: 2025-11-25T07:26:06.575Z
Learnt from: CR
Repo: UbiquitousLearning/mllm PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T07:26:06.575Z
Learning: Applies to {mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi} : Avoid unnecessary object creation in loops or hot paths.
Applied to files:
mllm/backends/ascend/CMakeLists.txt
📚 Learning: 2025-11-25T07:26:06.575Z
Learnt from: CR
Repo: UbiquitousLearning/mllm PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T07:26:06.575Z
Learning: Applies to {mllm,mllm-cli,pymllm}/**/* : No line may end with trailing whitespace.
Applied to files:
mllm/backends/ascend/memory/AscendMemoryPool.hpp
🧬 Code graph analysis (10)

mllm/backends/ascend/AscendBackend.cpp (4)
- mllm/backends/ascend/AscendBackend.hpp (1): AscendBackend (13-13)
- mllm/backends/ascend/AscendAllocator.cpp (2): createAscendAllocator (107-107), createAscendAllocator (107-107)
- mllm/backends/ascend/AscendAllocator.hpp (1): createAscendAllocator (46-46)
- mllm/backends/ascend/AscendCommon.hpp (1): instance (58-61)

mllm/backends/ascend/Register.cpp (5)
- mllm/backends/ascend/AscendCommon.hpp (1): instance (58-61)
- mllm/backends/ascend/memory/AscendMemoryManager.hpp (1): pool_size (18-18)
- mllm/engine/DispatcherManager.hpp (1): dispatcher (49-49)
- mllm/backends/ascend/AscendDispatcher.cpp (2): createAscendDispatcher (83-86), createAscendDispatcher (83-84)
- mllm/backends/ascend/AscendDispatcher.hpp (1): createAscendDispatcher (36-37)

mllm/backends/ascend/ops/AscendX2XOp.cpp (1)
- mllm/backends/ascend/ops/AscendX2XOp.hpp (4): AscendX2XOp (14-14), options (21-23), options (21-21), inputs (16-16)

mllm/backends/ascend/memory/AscendMemoryManager.hpp (1)
- mllm/backends/ascend/memory/AscendMemoryPool.hpp (3): size (20-20), block_id (22-22), block_id (24-24)

mllm/backends/ascend/AscendAllocator.cpp (2)
- mllm/backends/ascend/memory/AscendMemoryManager.cpp (2): getAscendMemoryManager (15-18), getAscendMemoryManager (15-15)
- mllm/backends/ascend/memory/AscendMemoryPool.hpp (2): block_id (22-22), block_id (24-24)

mllm/backends/ascend/memory/AscendMemoryPool.hpp (2)
- mllm/backends/ascend/memory/AscendMemoryManager.hpp (4): pool_size (18-18), size (24-24), block_id (26-26), block_id (28-28)
- mllm/backends/ascend/memory/AscendMemoryPool.cpp (2): AscendMemoryPool (14-26), AscendMemoryPool (28-36)

mllm/backends/ascend/AscendCommon.cpp (3)
- mllm/backends/ascend/AscendCommon.hpp (6): AscendTensorHandle (71-71), AscendTensorHandle (72-72), AscendTensorHandle (73-73), AscendTensorHandle (75-75), AscendTensorHandle (77-77), other (78-78)
- mllm/backends/ascend/memory/AscendMemoryPool.hpp (3): block_id (22-22), block_id (24-24), size (20-20)
- mllm/backends/ascend/memory/AscendMemoryManager.cpp (2): getAscendMemoryManager (15-18), getAscendMemoryManager (15-15)

mllm/backends/ascend/AscendCommon.hpp (1)
- mllm/backends/ascend/AscendCommon.cpp (7): AscendTensorHandle (23-24), AscendTensorHandle (26-26), AscendTensorHandle (28-31), other (33-41), other (33-33), AscendTimer (153-161), AscendTimer (163-170)

mllm/backends/ascend/AscendAllocator.hpp (1)
- mllm/backends/ascend/AscendAllocator.cpp (2): AscendAllocator (11-13), AscendAllocator (15-21)

mllm/backends/ascend/memory/AscendMemoryManager.cpp (3)
- mllm/backends/ascend/memory/AscendMemoryManager.hpp (5): AscendMemoryManager (16-16), pool_size (18-18), size (24-24), block_id (26-26), block_id (28-28)
- mllm/backends/ascend/memory/AscendMemoryPool.cpp (6): allocateBlock (42-71), allocateBlock (42-42), freeBlock (73-88), freeBlock (73-73), getBlockPtr (90-104), getBlockPtr (90-90)
- mllm/backends/ascend/memory/AscendMemoryPool.hpp (3)
🪛 Clang (14.0.6)
mllm/backends/ascend/Register.cpp
[error] 19-19: variable 'pool_size' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
tests/ascend/KernelTestHelper.hpp
[error] 7-7: class 'KernelTest' defines a default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
mllm/backends/ascend/AscendDispatcher.hpp
[error] 13-13: constructor does not initialize these fields: MLLM_EMPTY_SCOPE
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
tests/ascend/AscendKernelTest.hpp
[error] 12-12: class 'AscendKernelTest' defines a default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 18-18: method 'AddFloat16Test' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
mllm/backends/ascend/memory/AscendMemoryPool.hpp
[error] 15-15: class 'AscendMemoryPool' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
mllm/backends/ascend/AscendCommon.cpp
[error] 54-54: 3 adjacent parameters of 'prepareAscendTensor' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 57-57: variable 'expected_elements' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 68-68: variable 'device_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 69-69: variable 'bytes' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 92-92: variable 'elem_cnt' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 95-95: variable 'src' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 102-102: 3 adjacent parameters of 'verifyAscendTensor' of convertible types are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 102-102: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 121-121: variable name 'ok' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 143-143: 3 adjacent parameters of 'verifyAscendTensor' of convertible types are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 143-143: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 173-173: variable 'ctx' provides global access to a non-const object; consider making the pointed-to data 'const'
(cppcoreguidelines-avoid-non-const-global-variables,-warnings-as-errors)
[error] 209-209: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
tests/ascend/KernelTest.cpp
[error] 16-16: all parameters should be named in a function
(readability-named-parameter,-warnings-as-errors)
[error] 21-21: 8 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 21-21: 8 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 22-22: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 22-22: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 23-23: 32 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 23-23: 32 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
mllm/backends/ascend/AscendCommon.hpp
[error] 19-19: function-like macro 'MLLM_ACL_CHECK' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 20-20: macro argument should be enclosed in parentheses
(bugprone-macro-parentheses,-warnings-as-errors)
[error] 25-25: function-like macro 'MLLM_ATB_CHECK' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 26-26: macro argument should be enclosed in parentheses
(bugprone-macro-parentheses,-warnings-as-errors)
[error] 42-42: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 45-45: constructor does not initialize these fields: name, id, total_memory, free_memory, soc_version
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 54-54: class 'AscendDeviceMetaInfo' defines a copy constructor and a copy assignment operator but does not define a destructor, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 70-70: constructor does not initialize these fields: tensor_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 81-81: function 'valid' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 84-84: function 'tensor' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 85-85: function 'blockId' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 101-101: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 103-103: 1e-2f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 103-103: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 104-104: 1e-2f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 104-104: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 109-109: parameter name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 111-111: 1e-2f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 111-111: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 112-112: 1e-2f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 112-112: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 117-117: class 'AscendTimer' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 130-130: function-like macro 'ASCEND_TIME_SCOPE' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 131-131: function-like macro 'ASCEND_TIME_SCOPE_NOSYNC' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
mllm/engine/Dispatcher.hpp
[error] 28-28: invalid case style for global constant 'ascend_dispatcher_id'
(readability-identifier-naming,-warnings-as-errors)
mllm/backends/ascend/AscendAllocator.hpp
[error] 15-15: class 'AscendAllocator' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
mllm/backends/ascend/memory/AscendMemoryManager.cpp
[error] 20-20: method 'createMemoryPool' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 22-22: variable 'device_count' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 37-37: variable 'device_id' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 45-45: variable 'device_id' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
mllm/backends/ascend/memory/AscendMemoryPool.cpp
[error] 14-14: constructor does not initialize these fields: block_mutex_, used_blocks_, free_blocks_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 42-42: method 'allocateBlock' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 45-45: variable 'align_size' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 59-59: variable 'cur_mem_ptr_align' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 73-73: method 'freeBlock' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 81-81: variable name 'it' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 90-90: method 'getBlockPtr' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 98-98: variable name 'it' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/ascend/memory/MemoryBlock.hpp
[error] 11-11: constructor does not initialize these fields: block_id_, block_size_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
🔇 Additional comments (17)
mllm/utils/Common.hpp (1)
34-34: LGTM!

The new `kAscendError` enum value follows the existing pattern for backend-specific error codes and is correctly placed alongside other backend errors.

mllm/core/DeviceTypes.hpp (1)
36-37: LGTM!

The string conversion functions are correctly extended to support `kAscend` and `kAscendHost` device types, following the established pattern for other backends.

Also applies to: 52-55
mllm/engine/Dispatcher.hpp (1)
28-28: LGTM!

The new `ascend_dispatcher_id` constant follows the established naming convention and initialization pattern used by other dispatcher IDs in this class.

tests/ascend/KernelTest.cpp (2)
28-44: LGTM!

The test harness correctly initializes the mllm context before the Ascend backend and properly cleans up resources after test execution. The initialization order addresses the concern raised in the previous review.
16-26: LGTM!

The test covers a good range of tensor sizes for the FP16 Add operation. The inline test data is appropriate for unit tests.

tests/ascend/KernelTestHelper.hpp (1)
tests/ascend/KernelTestHelper.hpp (1)
7-17: LGTM!The test fixture scaffolding is appropriate. The static analysis warning about special member functions is a false positive for GoogleTest fixtures—test instances are managed by the framework and are not copied or moved.
mllm/backends/ascend/memory/AscendMemoryPool.cpp (1)
73-88: LGTM:freeBlockimplementation is correct.The double-free detection and logging are appropriate. The early return for invalid
block_id < 0prevents unnecessary map lookups.mllm/backends/ascend/ops/AscendElewiseOps.cpp (2)
26-40: LGTM: Input validation and constraint checks are appropriate.

The assertions for input/output sizes and the dtype/numel validation provide clear error messages for unsupported cases.
98-106: LGTM: Resource cleanup path is correct for the success case.

Stream synchronization, workspace deallocation, and ATB operation destruction are handled properly in the normal execution path.

mllm/backends/ascend/ops/AscendX2XOp.cpp (1)
mllm/backends/ascend/ops/AscendX2XOp.cpp (1)
17-91: Well-structured device transfer implementation.

The forward method correctly handles all three device transfer cases (CPU→Ascend, Ascend→CPU, Ascend→Ascend) with appropriate ACL memcpy directions and synchronization. The early-return pattern keeps the logic clear.

mllm/backends/ascend/AscendAllocator.cpp (1)
mllm/backends/ascend/AscendAllocator.cpp (1)
23-47: Good memory pool integration with proper error handling.

The allocation flow correctly handles failures at each step: block allocation failure, pointer retrieval failure (with cleanup via `freeBlock`), and successful mapping. Thread-safe access to the block map is ensured.

mllm/backends/ascend/AscendDispatcher.hpp (1)
17-37: Dispatcher class structure looks good.

The `AscendDispatcher` class properly inherits from `Dispatcher`, uses the shared pointer pattern consistently, and the factory function follows the established codebase patterns.

mllm/backends/ascend/memory/AscendMemoryManager.cpp (1)
52-65: Delegation methods are clean and correct.

The `allocateBlock`, `freeBlock`, and `getBlockPtr` methods correctly delegate to the per-device memory pool obtained via `getMemoryPool()`.

mllm/backends/ascend/AscendCommon.cpp (2)
172-194: Thread-safe ATB context initialization is correctly implemented.

The use of `std::once_flag` with `std::call_once` ensures the ATB context and stream are initialized exactly once in a thread-safe manner. This addresses the previous race condition concern.
23-52: Well-implemented RAII handle with proper move semantics.

`AscendTensorHandle` correctly implements move constructor/assignment with proper resource transfer (setting `other.block_id_ = -1`), and the destructor properly releases resources via the `release()` helper.

mllm/backends/ascend/CMakeLists.txt (2)
27-39: Environment-based path handling is appropriate.

The warnings when `ASCEND_HOME_PATH` or `ATB_HOME_PATH` are not set provide good feedback to developers. This follows the established pattern for optional SDK paths.
42-48: PUBLIC linkage is correctly applied.

All dependencies (`ascendcl`, `opapi`, `nnopbase`, `atb`, `MllmRT`) are linked with PUBLIC visibility, allowing consumers of `MllmAscendBackend` to transitively access these libraries. This addresses previous review feedback.
```cpp
constexpr size_t POOL_SIZE = 104857600;  // 100 MiB

AscendMemoryPool::AscendMemoryPool(size_t pool_size = POOL_SIZE) {
```
Default parameter should be in declaration, not definition.
C++ requires default arguments to be specified in the declaration (header), not the definition. This may cause compilation errors if the header declares the constructor without the default.
🔎 Proposed fix
In AscendMemoryPool.hpp, add the default parameter:
```cpp
explicit AscendMemoryPool(size_t pool_size = 104857600);  // 100 MiB
```

In AscendMemoryPool.cpp, remove the default:
```diff
-AscendMemoryPool::AscendMemoryPool(size_t pool_size = POOL_SIZE) {
+AscendMemoryPool::AscendMemoryPool(size_t pool_size) {
```

🧰 Tools
🪛 Clang (14.0.6)
[error] 14-14: constructor does not initialize these fields: block_mutex_, used_blocks_, free_blocks_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
🤖 Prompt for AI Agents
In mllm/backends/ascend/memory/AscendMemoryPool.cpp around line 14, the
constructor definition currently includes a default parameter which must be
placed in the declaration instead; remove the "= POOL_SIZE" from the definition,
add the default to the declaration in AscendMemoryPool.hpp (e.g. explicit
AscendMemoryPool(size_t pool_size = POOL_SIZE); or the numeric value), and
ensure both header and cpp signatures match (no default in .cpp) before
rebuilding.
```cpp
// 1. Create memory pool
size_t pool_size = 100 * 1024 * 1024;  // 100MB, can be adjusted as needed
ascend::getAscendMemoryManager().createMemoryPool(pool_size);
MLLM_INFO("Ascend memory pool initialized");
```
🧩 Analysis chain
🏁 Script executed:
# First, find and examine the createMemoryPool implementation
find . -type f -name "*.cpp" -o -name "*.h" | xargs grep -l "createMemoryPool" | head -20

Repository: UbiquitousLearning/mllm
Length of output: 156
🏁 Script executed:
# Check the memory manager header file
rg -n "createMemoryPool" --type=cpp --type=hpp -A 3 -B 1

Repository: UbiquitousLearning/mllm
Length of output: 94
🏁 Script executed:
# Examine the full Register.cpp file context around lines 18-21
rg -n "initAscendBackend|createMemoryPool" mllm/backends/ascend/Register.cpp -A 10 -B 5

Repository: UbiquitousLearning/mllm
Length of output: 814
🏁 Script executed:
# Find and examine the AscendMemoryManager header and implementation
find . -path "*/ascend/memory/AscendMemoryManager*" -type f

Repository: UbiquitousLearning/mllm
Length of output: 174
🏁 Script executed:
# Look at the createMemoryPool method signature and implementation
rg -n "createMemoryPool" mllm/backends/ascend/memory/ -A 10 -B 2

Repository: UbiquitousLearning/mllm
Length of output: 2196
🏁 Script executed:
# Check if there's any return value or exception handling expected
rg -n "class.*MemoryManager|void createMemoryPool|bool createMemoryPool|int createMemoryPool" mllm/backends/ascend/memory/ -A 5

Repository: UbiquitousLearning/mllm
Length of output: 813
🏁 Script executed:
# Check what MLLM_ACL_CHECK does - it seems to be an error handling macro
rg -n "MLLM_ACL_CHECK" mllm/ -B 2 -A 2 | head -50

Repository: UbiquitousLearning/mllm
Length of output: 3108
🏁 Script executed:
# Find the definition of MLLM_ACL_CHECK
rg -n "#define MLLM_ACL_CHECK" mllm/ -A 5

Repository: UbiquitousLearning/mllm
Length of output: 690
🏁 Script executed:
# Also check the rest of the createMemoryPool implementation
cat -n mllm/backends/ascend/memory/AscendMemoryManager.cpp | sed -n '20,40p'

Repository: UbiquitousLearning/mllm
Length of output: 843
🏁 Script executed:
# Check if aclrtSetDevice has error handling elsewhere or if it's documented
rg -n "aclrtSetDevice" mllm/backends/ascend/ -B 2 -A 2 | head -40

Repository: UbiquitousLearning/mllm
Length of output: 1597
🏁 Script executed:
# Check AscendMemoryPool constructor to see if it can fail
rg -n "class AscendMemoryPool|AscendMemoryPool::" mllm/backends/ascend/memory/AscendMemoryPool -A 10 | head -50

Repository: UbiquitousLearning/mllm
Length of output: 151
🏁 Script executed:
# Find AscendMemoryPool files
find . -name "*AscendMemoryPool*" -type f

Repository: UbiquitousLearning/mllm
Length of output: 168
🏁 Script executed:
# Check the AscendMemoryPool header
rg -n "class AscendMemoryPool" mllm/ -A 15

Repository: UbiquitousLearning/mllm
Length of output: 1253
🏁 Script executed:
# Check the AscendMemoryPool constructor implementation
cat -n mllm/backends/ascend/memory/AscendMemoryPool.cpp | head -60

Repository: UbiquitousLearning/mllm
Length of output: 2483
🏁 Script executed:
# Also check if there are any try-catch or error handling in the constructor
rg -n "AscendMemoryPool::AscendMemoryPool" mllm/backends/ascend/memory/ -A 20

Repository: UbiquitousLearning/mllm
Length of output: 1906
🏁 Script executed:
# Check if there's any pattern of validating memory pool after creation in Register.cpp
rg -n "getMemoryPool\|memory_pool" mllm/backends/ascend/Register.cpp -A 3 -B 3

Repository: UbiquitousLearning/mllm
Length of output: 49
🏁 Script executed:
# Check how other backends handle initialization errors in their register functions
rg -n "void init.*Backend\(\)" mllm/backends/ -A 15 | head -80

Repository: UbiquitousLearning/mllm
Length of output: 4676
Add error handling and validation for memory pool initialization.
The createMemoryPool() call lacks error validation in Register.cpp. Additionally, the implementation has multiple error-handling gaps:
- `aclrtSetDevice(i)` in `AscendMemoryManager::createMemoryPool()` is not checked for errors
- The `AscendMemoryPool` constructor fails silently (logs error but returns with null pointers) instead of throwing, leaving the pool in a broken state
- Register.cpp should validate successful pool creation before proceeding
Consider either adding error checks with MLLM_ACL_CHECK for aclrtSetDevice, or having AscendMemoryPool throw an exception on allocation failure so the initialization chain fails explicitly.
🧰 Tools
🪛 Clang (14.0.6)
[error] 19-19: variable 'pool_size' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
🤖 Prompt for AI Agents
In mllm/backends/ascend/Register.cpp around lines 18-21, the call to
ascend::getAscendMemoryManager().createMemoryPool(pool_size) is unvalidated;
update this block to check the return or thrown exception from createMemoryPool
and abort/init-fail with a clear MLLM_ERROR/MLLM_ACL_CHECK if pool creation
fails. In the backend, ensure AscendMemoryManager::createMemoryPool checks and
handles aclrtSetDevice return codes (use MLLM_ACL_CHECK or equivalent) and
change AscendMemoryPool construction to signal failure explicitly (either throw
an exception on allocation/initialization errors or return a nullable/Status
result) so createMemoryPool can detect failure and propagate it back to
Register.cpp for proper logging and process exit.
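The fail-fast shape the review asks for can be sketched without the real SDK. Everything here is hypothetical: `fakeSetDevice` and `fakeMalloc` stand in for `aclrtSetDevice` and the pool's base-region allocation, and `createMemoryPoolChecked` stands in for a version of `createMemoryPool` that propagates failure instead of logging and continuing with a broken pool.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the ACL calls that can fail during pool setup.
bool fakeSetDevice(int device_id) { return device_id >= 0; }
void* fakeMalloc(std::size_t n) { return n > 0 ? reinterpret_cast<void*>(0x1) : nullptr; }

// Sketch of checked pool creation: every failure throws, so the caller in
// Register.cpp cannot silently proceed with a half-initialized pool.
void* createMemoryPoolChecked(int device_id, std::size_t pool_size) {
  if (!fakeSetDevice(device_id)) {
    throw std::runtime_error("aclrtSetDevice failed for device " + std::to_string(device_id));
  }
  void* base = fakeMalloc(pool_size);
  if (base == nullptr) {
    throw std::runtime_error("pool allocation of " + std::to_string(pool_size) + " bytes failed");
  }
  return base;
}
```

With this shape, `initAscendBackend()` only needs a try/catch (or to let the exception propagate) to turn a broken pool into an explicit initialization failure rather than a latent null-pointer state.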
Please check Guidelines for Contributing.
Summary by CodeRabbit
New Features
Documentation
Build System
Tests
Chores