Conversation

@mkjsym
Contributor

@mkjsym mkjsym commented Jul 2, 2025

Make sure to read the contributing guidelines before submitting a PR

Resolved the latency issue in the llama_decode_eagle function.

@LeeHayun LeeHayun requested a review from Copilot July 2, 2025 12:49

Copilot AI left a comment

Pull Request Overview

This PR adds support for an EAGLE input layer to address latency issues in the llama_decode_eagle function and updates related tensor loading, normalization, and RoPE settings. Key changes include:

  • Introduce LLM_TENSOR_LAYER_INPUT_EAGLE and map it in tensor loading and architecture definitions.
  • Comment out the redundant RMS normalization step in llm_build_eagle to reduce overhead.
  • Adjust RoPE type mapping for the EAGLE architecture and update example integration under examples/speculative-eagle.
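
For context on the second bullet, here is a minimal sketch of what an RMS normalization step computes, so the cost of running it redundantly is concrete. It is a standalone illustration, not the code from llm_build_eagle: the function name `rms_norm`, the default `eps`, and the omission of a learned weight vector are all assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative, unweighted RMS normalization (hypothetical helper, not the
// llama.cpp implementation): y_i = x_i / sqrt(mean(x^2) + eps).
// Each call makes two full passes over the hidden vector, which is the
// per-token work that skipping the step avoids.
std::vector<float> rms_norm(const std::vector<float> & x, float eps = 1e-6f) {
    double sum_sq = 0.0;
    for (float v : x) {
        sum_sq += static_cast<double>(v) * v;   // pass 1: accumulate squares
    }
    const double mean_sq = x.empty() ? 0.0 : sum_sq / static_cast<double>(x.size());
    const float  scale   = 1.0f / std::sqrt(static_cast<float>(mean_sq) + eps);

    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = x[i] * scale;                    // pass 2: rescale
    }
    return y;
}
```

Whether the step is actually redundant depends on the EAGLE checkpoint and graph in question; the review summary states this, and the sketch does not verify it.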

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/llama-model.cpp | Added handling for LLM_TENSOR_LAYER_INPUT_EAGLE and skipped norm. |
| src/llama-arch.h | Defined new enum value LLM_TENSOR_LAYER_INPUT_EAGLE. |
| src/llama-arch.cpp | Updated tensor info for LLM_TENSOR_EMBD_FC to use the new layer. |
| examples/speculative-eagle/* | Added complete example and build integration for EAGLE decoding. |
| examples/CMakeLists.txt | Registered the new speculative-eagle example in the build. |
Comments suppressed due to low confidence (4)

src/llama-arch.h:378

  • The new enum value lacks a descriptive comment; add a brief explanation for LLM_TENSOR_LAYER_INPUT_EAGLE so its purpose is clear to future maintainers.
    LLM_TENSOR_LAYER_INPUT_EAGLE,

src/llama-arch.h:374

  • New tensor layer functionality for EAGLE is not covered by existing tests; add unit tests for load_tensors and the EAGLE decode path to ensure correctness.
enum llm_tensor_layer {

examples/speculative-eagle/speculative-eagle.cpp:1

  • [nitpick] Comments are written in Korean and English; for consistency and to accommodate a global contributor base, translate or unify comments in English.
//Tree-based EAGLE 구현 코드

@LeeHayun LeeHayun merged commit c288e68 into SKKU-ESLAB:master Jul 2, 2025
10 of 14 checks passed

2 participants