First of all, thank you for providing such a helpful resource.
While simulating Llama3-8b, I noticed a potential issue with the required_accum_size formula in Mapping.cc.
The current code is:
(inner_I * inner_J) * _config.precision
However, the result of a 16-bit x 16-bit matrix multiplication should be stored as 32-bit values. Assuming _config.precision is the 2-byte operand width,
I believe the correct formula should be:
(inner_I * inner_J) * _config.precision * 2
Additionally, when defining max_acc_rows as
(_config.core_config[key.target_core].accum_spad_size KB) / (dim * 4 * 2)
the denominator accounts for core_height (dim), precision (4 bytes), and double buffering (2), which indicates that
the data stored in the accumulator is 32-bit.
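
To make the mismatch concrete, here is a minimal sketch of the two formulas and the max_acc_rows expression. It is not the actual Mapping.cc code: the function names are hypothetical, precision_bytes is assumed to be the operand width in bytes (2 for 16-bit), KB is assumed to mean 1024 bytes, and the tile and scratchpad sizes are made-up example values.

```cpp
#include <cstdint>

// Hypothetical helpers; only the formulas themselves come from the issue text.
// Assumption: precision_bytes is the operand width in bytes (2 for 16-bit).

// Current formula: (inner_I * inner_J) * precision
constexpr std::uint64_t accum_bytes_current(std::uint64_t inner_i,
                                            std::uint64_t inner_j,
                                            std::uint64_t precision_bytes) {
  return inner_i * inner_j * precision_bytes;
}

// Proposed formula: (inner_I * inner_J) * precision * 2
constexpr std::uint64_t accum_bytes_proposed(std::uint64_t inner_i,
                                             std::uint64_t inner_j,
                                             std::uint64_t precision_bytes) {
  return inner_i * inner_j * precision_bytes * 2;
}

// max_acc_rows: (accum_spad_size KB) / (dim * 4 * 2)
// Assumption: KB means 1024 bytes; 4 is bytes per 32-bit element,
// 2 is the double-buffering factor.
constexpr std::uint64_t max_acc_rows(std::uint64_t accum_spad_kb,
                                     std::uint64_t dim) {
  return (accum_spad_kb * 1024) / (dim * 4 * 2);
}

// Example values (made up): a 128 x 128 output tile with 16-bit operands.
static_assert(accum_bytes_current(128, 128, 2) == 32768,
              "current formula counts 2 bytes per output element");
static_assert(accum_bytes_proposed(128, 128, 2) == 128 * 128 * 4,
              "proposed formula counts 4 bytes per output element");
```

With these example values, the current formula reserves half of what a 32-bit accumulator tile needs, while the proposed one matches the 4 bytes per element that max_acc_rows already assumes.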