
Repeated Sequence Output with llama3.cuda #7

@dewijones92

Description


I'm encountering an issue with the llama3.cuda repository: instead of coherent text, the generated output degenerates into the same token repeated over and over.

Steps to Reproduce:

  1. Cloned the repository and built it using make.
  2. Ran the following command:
./runcuda "i have a dream"

Actual Output:

i have a dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream
Token count: 50, elapsed: 0.047000s, 1021 tokens/s

Expected Output (from README):

$ ./runcuda "I have a dream"
"""
I have a dream. He dreams of a big, beautiful garden full of flowers and trees. He dreams of playing with his friends and eating yummy snacks.
One day, he was walking in the garden when he saw
Token count: 50, elapsed: 0.017000s, 2823 tokens/s
"""

Commit Information:

commit a05278f03b0aa9ae61baeea23c33067230463ca9 (HEAD -> master, origin/master, origin/HEAD)
Author: Sang Park <sang.park@dnotitia.com>
Date:   Tue Jun 4 02:20:05 2024 +0000

    Refactor conditional statements in llama3.cu

    The if/else chain in the llama3.cu file has been refactored to a switch statement. This change makes the code easier to read and understand.

GPU Information:

  • GPU Name: NVIDIA RTX 2000 Ada Generation
  • Driver Version: 552.74
  • CUDA Version: 12.4
  • Power Usage: 16W
  • Total Memory: 8188MiB

Any help in resolving this issue would be greatly appreciated. Thanks!
