Repeated Sequence Output with llama3.cuda
I'm encountering an issue with the llama3.cuda repository: the generated output degenerates into a single repeating token.
Steps to Reproduce:
- Cloned the repository and built it with `make`.
- Ran the following command: `./runcuda "i have a dream"`

Actual Output:
i have a dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream dream
Token count: 50, elapsed: 0.047000s, 1021 tokens/s
Expected Output (from README):
$ ./runcuda "I have a dream"
"""
I have a dream. He dreams of a big, beautiful garden full of flowers and trees. He dreams of playing with his friends and eating yummy snacks.
One day, he was walking in the garden when he saw
Token count: 50, elapsed: 0.017000s, 2823 tokens/s
"""
Commit Information:
commit a05278f03b0aa9ae61baeea23c33067230463ca9 (HEAD -> master, origin/master, origin/HEAD)
Author: Sang Park <sang.park@dnotitia.com>
Date: Tue Jun 4 02:20:05 2024 +0000
Refactor conditional statements in llama3.cu
The if/else chain in the llama3.cu file has been refactored to a switch statement. This change makes the code easier to read and understand.
GPU Information:
- GPU Name: NVIDIA RTX 2000 Ada Generation
- Driver Version: 552.74
- CUDA Version: 12.4
- Power Usage: 16W
- Total Memory: 8188MiB
Any help in resolving this issue would be greatly appreciated. Thanks!