
Conversation

@yao-fengchen
Contributor

No description provided.


Copilot AI left a comment


Pull request overview

This PR refactors the Ascend backend implementation, simplifying NPU operations and improving code quality.

  • Simplified attention mask generation and removed unnecessary type conversions
  • Cleaned up call sites by using keyword arguments instead of long runs of positional parameters (see the sketch after this list)
  • Fixed a bug where the graph capturing flag wasn't properly reset
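
As a rough illustration of the first two points, the sketch below shows the general pattern: building the attention mask directly in the dtype the kernel consumes (avoiding a later conversion) and passing keyword arguments at the call site. The names `paged_attention` and `make_causal_mask` are hypothetical stand-ins, not the actual wrappers in torch_npu_ops.py, whose signatures may differ.

```python
# Hypothetical sketch only; the real operators live in
# dlinfer/vendor/ascend/torch_npu_ops.py and are not reproduced here.
import torch

def make_causal_mask(seq_len: int, device=None):
    # Build the mask directly as torch.bool instead of creating a float
    # tensor and converting it afterwards (the unnecessary type conversion
    # this refactor removes).
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool, device=device))

def paged_attention(query, key, value, *, attn_mask=None, scale=1.0):
    # Naive reference computation, used here only to keep the example runnable.
    scores = (query @ key.transpose(-2, -1)) * scale
    if attn_mask is not None:
        scores = scores.masked_fill(~attn_mask, float("-inf"))
    return scores.softmax(dim=-1) @ value

q = k = v = torch.randn(1, 4, 8, 16)   # (batch, heads, seq_len, head_dim)
mask = make_causal_mask(8)
# Keyword arguments make each parameter explicit instead of relying on a
# long run of positional values.
out = paged_attention(q, k, v, attn_mask=mask, scale=0.25)
```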

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.

  • dlinfer/vendor/ascend/torch_npu_ops.py: refactored attention operations to remove unnecessary type conversions, simplified the mask generation logic, and cleaned up API calls by using keyword arguments.
  • dlinfer/framework/lmdeploy_ext/cudagraph/ascend_cudagraph.py: changed the kv_start_indices buffer dtype from int64 to int32 for consistency, and fixed a bug by resetting the capturing flag after graph capture (see the sketch after this list).
  • .github/workflows/main.yml: updated the CI workflow to use the DeepLink-org fork and the refactor_code branch for testing.
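
For the cudagraph change, the sketch below mirrors the described pattern: a kv_start_indices buffer allocated as int32 and a capturing flag that is always cleared once capture finishes. `AscendGraphRunnerSketch` is a hypothetical class for illustration, not the actual implementation in ascend_cudagraph.py.

```python
# Hypothetical sketch, assuming the buffer/flag pattern described in the
# review; the real class in ascend_cudagraph.py differs in detail.
import torch

class AscendGraphRunnerSketch:
    def __init__(self, max_batches: int, device=None):
        # int32 matches the index dtype the kernels consume, so the buffer
        # can be filled each step without an extra cast.
        self.kv_start_indices = torch.zeros(max_batches, dtype=torch.int32, device=device)
        self.is_capturing = False

    def capture(self, run_forward):
        self.is_capturing = True
        try:
            return run_forward()
        finally:
            # The bug fix: clear the flag even if capture raises, so later
            # replays are not mistaken for capture passes.
            self.is_capturing = False
```

Wrapping the reset in try/finally is one way to guarantee the flag is cleared even when the captured forward pass fails, which avoids the stale-flag behavior described above.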
