Closed
3 changes: 0 additions & 3 deletions CMakeLists.txt
@@ -59,9 +59,6 @@ option(MLLM_KERNEL_THREADS_VENDOR_APPLE_GCD "Enable Apple GCD Threads" OFF)
 option(MLLM_PERFETTO_ENABLE "Enable perfetto" OFF)
 option(MLLM_TRACY_ENABLE "Enable Tracy. A more advanced profiler" OFF)

-# NPU AOT things
-option(MLLM_QUALCOMM_QNN_AOT_ON_X86_ENABLE "Enable Qualcomm NPU AOT on X86 devices" OFF)
-
 # Platform Hints
 option(MLLM_ANDROID_BURST_PERFORMANCE_HINTS "If MLLM need use APerformanceHintManager to tell android we need best performance" OFF)

81 changes: 81 additions & 0 deletions examples/ascend_add_demo/README.md
@@ -0,0 +1,81 @@
# Ascend Add Op Demo

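This is a simple demo for testing the Ascend backend's Add operator implementation.

## Features

- Initialize the Ascend backend and memory pool
- Create two input tensors (shape: [2, 3])
- Run the Add operation on the Ascend NPU
- Verify that the computed result is correct

## Build and Run

### Method 1: Use the automation script (recommended)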

```bash
cd /home/HwHiAiUser/mLLM/examples/ascend_add_demo
./build_and_run.sh
```

Comment on lines +16 to +18
Contributor

⚠️ Potential issue | 🟡 Minor

Avoid hard-coded absolute paths in the demo instructions.

Using an environment-specific path makes the steps non-portable. Consider a placeholder such as `<repo_root>/examples/ascend_add_demo`.

✏️ Suggested edit
-cd /home/HwHiAiUser/mLLM/examples/ascend_add_demo
+cd <repo_root>/examples/ascend_add_demo

🤖 Prompt for AI Agents
In `@examples/ascend_add_demo/README.md` around lines 16 - 18, Replace the
hard-coded absolute path in the README code example with a portable placeholder
or relative path so demos are reproducible; specifically update the code block
invoking the example to use either "<repo_root>/examples/ascend_add_demo" or a
relative path like "./examples/ascend_add_demo" and keep the same
./build_and_run.sh invocation (reference: README.md code block showing cd
/home/HwHiAiUser/mLLM/examples/ascend_add_demo and ./build_and_run.sh).

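The script automatically:
- Checks the environment variables
- Configures CMake
- Builds the project
- Runs the demo

### Method 2: Manual build

Make sure the required environment variables are set:
- `ASCEND_HOME_PATH`: path to the Ascend SDK (already set to `/usr/local/Ascend/ascend-toolkit/latest`)
- `ATB_HOME_PATH`: path to the ATB library (already set to `/usr/local/Ascend/nnal/nnal/atb/latest/atb/cxx_abi_0`)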
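
Both variables can be checked up front before building by hand. The following is a minimal sketch under stated assumptions: `require_env` is a hypothetical helper that is not part of the repository, and it only mirrors the kind of check `build_and_run.sh` performs.

```bash
# Hypothetical helper (not in the repository): succeed only if the variable
# named by $1 is set to a non-empty value.
require_env() {
  # eval reads the variable whose name is passed in; pass trusted names only.
  eval "_value=\${$1:-}"
  if [ -z "$_value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
  echo "ok: $1=$_value"
}

# Example usage before configuring CMake:
#   require_env ASCEND_HOME_PATH && require_env ATB_HOME_PATH
```

If either call fails, the usual fix is to source the environment setup script shipped with the Ascend toolkit before retrying.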

From the project root:

```bash
# 1. Create the build directory
mkdir -p build-ascend-demo && cd build-ascend-demo

# 2. Configure CMake
cmake .. \
  -DMLLM_BUILD_ASCEND_BACKEND=ON \
  -DMLLM_ENABLE_EXAMPLE=ON \
  -DCMAKE_BUILD_TYPE=Release

# 3. Build
make ascend_add_demo -j$(nproc)

# 4. Run
./examples/ascend_add_demo/ascend_add_demo
```

## Expected Output

```
=== Ascend Add Op Demo ===
1. Initializing Ascend backend...
✓ Ascend backend initialized

2. Creating input tensors...
Input x shape: [2, 3]
Input y shape: [2, 3]

3. Transferring tensors to Ascend device...
✓ Tensors transferred to Ascend

4. Executing Add operation on Ascend...
✓ Add operation completed

5. Transferring result back to CPU and verifying...
Expected result: [11, 22, 33, 44, 55, 66]
Actual result: [11, 22, 33, 44, 55, 66]

✓ Test PASSED! All values match expected results.
```
Comment on lines +54 to +74
Contributor

⚠️ Potential issue | 🟡 Minor

Specify a language for the expected-output code fence.

markdownlint MD040 requires a language; text is fine for output.

✅ Suggested fix
-```
+```text
 === Ascend Add Op Demo ===
 ...
-```
+```
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)

[warning] 54-54: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@examples/ascend_add_demo/README.md` around lines 54 - 74, The markdown fenced
output block in the README lacks a language identifier causing markdownlint
MD040; update the triple-backtick fence that wraps the demo output (the block
beginning with "=== Ascend Add Op Demo ===") to include a language specifier
(e.g., change the opening fence from ``` to ```text) so the code fence is
```text ... ``` and linting passes; no other content changes needed.
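
For a quick local check without markdownlint, a rough approximation of MD040 could look like the sketch below. `find_untagged_fences` is a hypothetical helper, it only handles backtick fences that start at column 0, and it is not a substitute for the real linter.

```bash
# Hypothetical helper: print "file:line" for every opening ``` fence that
# carries no language tag. Closing fences are skipped by tracking fence state.
find_untagged_fences() {
  awk '
    /^```/ {
      if (!in_fence) {
        in_fence = 1
        # An opening fence with nothing after the backticks has no language.
        if ($0 ~ /^```[ \t]*$/) print FILENAME ":" FNR
      } else if ($0 ~ /^```[ \t]*$/) {
        in_fence = 0
      }
    }
  ' "$1"
}

# Example usage:
#   find_untagged_fences examples/ascend_add_demo/README.md
```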


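## Notes

- The current implementation uses the float16 data type
- An available Ascend NPU device is required
- Make sure the Ascend SDK and the ATB library are installed correctly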

74 changes: 74 additions & 0 deletions examples/ascend_add_demo/build_and_run.sh
@@ -0,0 +1,74 @@
#!/bin/bash

# Build-and-run script for the Ascend Add Demo

set -e # Exit immediately on error

# Colored output
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m' # No Color

echo -e "${GREEN}=== Ascend Add Demo build-and-run script ===${NC}\n"

# Check environment variables
echo -e "${YELLOW}Checking environment variables...${NC}"
if [ -z "$ASCEND_HOME_PATH" ]; then
    echo -e "${RED}Error: ASCEND_HOME_PATH is not set${NC}"
    exit 1
fi
if [ -z "$ATB_HOME_PATH" ]; then
    echo -e "${RED}Error: ATB_HOME_PATH is not set${NC}"
    exit 1
fi
echo -e "${GREEN}✓ ASCEND_HOME_PATH: $ASCEND_HOME_PATH${NC}"
echo -e "${GREEN}✓ ATB_HOME_PATH: $ATB_HOME_PATH${NC}\n"

# Resolve the project root directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
BUILD_DIR="$PROJECT_ROOT/build-ascend-demo"

echo -e "${YELLOW}Project root: $PROJECT_ROOT${NC}"
echo -e "${YELLOW}Build directory: $BUILD_DIR${NC}\n"

# Create the build directory
if [ ! -d "$BUILD_DIR" ]; then
    echo -e "${YELLOW}Creating build directory...${NC}"
    mkdir -p "$BUILD_DIR"
fi

cd "$BUILD_DIR"

# Configure CMake
echo -e "\n${YELLOW}Configuring CMake...${NC}"
cmake "$PROJECT_ROOT" \
    -DMLLM_BUILD_ASCEND_BACKEND=ON \
    -DMLLM_ENABLE_EXAMPLE=ON \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=ON

# Build
echo -e "\n${YELLOW}Building...${NC}"
make ascend_add_demo -j$(nproc)

# Check the build result
if [ $? -eq 0 ]; then
    echo -e "\n${GREEN}✓ Build succeeded!${NC}\n"

    # Run
    echo -e "${YELLOW}Running demo...${NC}\n"
    ./examples/ascend_add_demo/ascend_add_demo

    if [ $? -eq 0 ]; then
        echo -e "\n${GREEN}✓ Demo ran successfully!${NC}"
    else
        echo -e "\n${RED}✗ Demo failed${NC}"
        exit 1
    fi
Comment on lines +52 to +69
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, locate and read the entire script to check for set -e
cat examples/ascend_add_demo/build_and_run.sh

Repository: UbiquitousLearning/mllm

Length of output: 1667


Fix error handling: custom failure messages won't execute with `set -e`.

With `set -e`, standalone commands like `make ascend_add_demo -j$(nproc)` cause the script to exit immediately on failure, before the `if [ $? -eq 0 ]` check runs. The custom error messages in the `else` blocks are therefore unreachable. Use `if cmd; then ... else ... fi` to preserve error handling (also quote `$(nproc)` to avoid word-splitting issues).
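
The difference can be reproduced in isolation. This is a minimal sketch, not code from the PR: `run_unchecked` and `run_checked` are illustrative names, and `false` stands in for a failing `make`.

```bash
# Pattern used by the script under review: with set -e the shell exits at the
# failing command, so the $? check and its else branch never run.
run_unchecked() {
  bash -c '
    set -e
    false
    if [ $? -eq 0 ]; then
      echo "build ok"
    else
      echo "custom failure message"
    fi
  '
}

# Suggested pattern: a command tested directly by `if` is exempt from set -e,
# so the else branch is reachable.
run_checked() {
  bash -c '
    set -e
    if false; then
      echo "build ok"
    else
      echo "custom failure message"
    fi
  '
}
```

Calling `run_unchecked` prints nothing and returns a non-zero status, while `run_checked` prints the custom message.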

🔧 Suggested update
 # Build
 echo -e "\n${YELLOW}Building...${NC}"
-make ascend_add_demo -j$(nproc)
-
-# Check the build result
-if [ $? -eq 0 ]; then
+if make ascend_add_demo -j"$(nproc)"; then
     echo -e "\n${GREEN}✓ Build succeeded!${NC}\n"
     
     # Run
     echo -e "${YELLOW}Running demo...${NC}\n"
-    ./examples/ascend_add_demo/ascend_add_demo
-    
-    if [ $? -eq 0 ]; then
+    if ./examples/ascend_add_demo/ascend_add_demo; then
         echo -e "\n${GREEN}✓ Demo ran successfully!${NC}"
     else
         echo -e "\n${RED}✗ Demo failed${NC}"
         exit 1
     fi
 else
     echo -e "\n${RED}✗ Build failed${NC}"
     exit 1
 fi
🧰 Tools
🪛 Shellcheck (0.11.0)

[warning] 54-54: Quote this to prevent word splitting.

(SC2046)

🤖 Prompt for AI Agents
In `@examples/ascend_add_demo/build_and_run.sh` around lines 52 - 69, The script
currently relies on set -e and then checks $? after running "make
ascend_add_demo -j$(nproc)", which makes the else branch unreachable; change the
build invocation to use the conditional form "if make ascend_add_demo
-j\"$(nproc)\"; then ... else ..." so the success and failure branches run
correctly, and move the demo execution and its own conditional similarly into
the then/else structure (i.e., replace the standalone make line plus the
subsequent "if [ $? -eq 0 ]" block with an if wrapper around the make command
and quote "$(nproc)"); do the same for running
"./examples/ascend_add_demo/ascend_add_demo" so failure prints the custom error
and exits.

else
    echo -e "\n${RED}✗ Build failed${NC}"
    exit 1
fi

1 change: 0 additions & 1 deletion mllm/CMakeLists.txt
@@ -24,7 +24,6 @@ add_library(
   ${MLLM_RT_MODELS_SRC}
   ${MLLM_RT_COMPILE_SRC}
   ${MLLM_RT_AUTO_TUNE_SRC}
-  ${MLLM_QUALCOMM_AOT_SRC}
   ${WENET_AUDIO_SOURCES}
 )

10 changes: 8 additions & 2 deletions mllm/backends/ascend/AscendBackend.cpp
@@ -8,12 +8,18 @@

 #include "mllm/backends/ascend/ops/AscendElewiseOps.hpp"
 #include "mllm/backends/ascend/ops/AscendX2XOp.hpp"
+#include "mllm/backends/ascend/ops/AscendSiLUOp.hpp"
+#include "mllm/backends/ascend/ops/AscendLinearOp.hpp"
+#include "mllm/backends/ascend/ops/AscendRMSNormOp.hpp"
+#include "mllm/backends/ascend/ops/AscendViewOp.hpp"
+#include "mllm/backends/ascend/ops/AscendMatMulOp.hpp"
+#include "mllm/backends/ascend/ops/AscendSoftmaxOp.hpp"

 namespace mllm::ascend {

 AscendBackend::AscendBackend() : Backend(kAscend, createAscendAllocator()) {
-  regOpFactory<AscendAddOpFactory>();
-  regOpFactory<AscendX2XOpFactory>();
+  regOpFactory<AscendAddOpFactory,AscendSubOpFactory,AscendMulOpFactory,AscendX2XOpFactory,AscendSiLUOpFactory,
+               AscendLinearOpFactory,AscendRMSNormOpFactory,AscendViewOpFactory,AscendMatMulOpFactory,AscendSoftmaxOpFactory>();
   auto& devices = AscendDeviceMetaInfo::instance().devices;
   for (const auto& device : devices) {
     const auto bytes_to_mb = [](size_t bytes) { return bytes / (1024.0 * 1024.0); };
18 changes: 14 additions & 4 deletions mllm/backends/ascend/AscendCommon.cpp
@@ -207,6 +207,13 @@ void syncGlobalAtbStream() {
 }

 void fillAtbTensorDesc(const Tensor& t, atb::TensorDesc& desc) {
+  // Validate that the tensor is FP16
+  if (t.dtype() != MLLM_TYPE_F16) {
+    MLLM_ERROR_EXIT(ExitCode::kAscendError,
+                    "fillAtbTensorDesc: Tensor must be FP16, but got dtype={}",
+                    static_cast<int>(t.dtype()));
+  }
+
   desc.dtype = ACL_FLOAT16; // Currently hardcoded as per demo, can be expanded later
   desc.format = ACL_FORMAT_ND;

@@ -217,6 +224,13 @@ void fillAtbTensorDesc(const Tensor& t, atb::TensorDesc& desc) {
   }
 }

+void fillAtbTensor(const Tensor& t, atb::Tensor& atb_tensor) {
+  fillAtbTensorDesc(t, atb_tensor.desc);
+  atb_tensor.deviceData = reinterpret_cast<uint8_t*>(t.ptr<void>());
+  // Use MLLM tensor's actual bytes as dataSize to match allocated memory
+  atb_tensor.dataSize = t.bytes();
+}
+
 AscendDeviceMetaInfo::AscendDeviceMetaInfo() {
 #ifndef ASCENDC_CPU_DEBUG
   // Initialize ACL to query devices
@@ -231,7 +245,6 @@ AscendDeviceMetaInfo::AscendDeviceMetaInfo() {
   ret = aclrtGetDeviceCount(&device_count);
   if (ret != ACL_SUCCESS) {
     MLLM_ERROR("Failed to get Ascend device count: {}", ret);
-    aclFinalize();
     return;
   }

@@ -265,9 +278,6 @@ AscendDeviceMetaInfo::AscendDeviceMetaInfo() {

     devices.push_back(info);
   }
-
-  // Finalize ACL after enumeration
-  aclFinalize();
 #else
   // In CPU debug mode, add a dummy device
   AscendDeviceInfo info;
3 changes: 3 additions & 0 deletions mllm/backends/ascend/AscendCommon.hpp
@@ -41,6 +41,9 @@ void syncGlobalAtbStream();
 // Convert MLLM Tensor metadata to ATB TensorDesc
 void fillAtbTensorDesc(const Tensor& t, atb::TensorDesc& desc);

+// Setup ATB Tensor with correct dataSize calculated by ATB Utils
+void fillAtbTensor(const Tensor& t, atb::Tensor& atb_tensor);
+
 // Ascend device information structure
 struct AscendDeviceInfo {
   std::string name;