No Difference in tokens/sec - Ministral3 8B Q5_K_M #48

I used the repo to rebuild llama-cpp from scratch into a separate destination from my original llama-cpp build. I am comparing the performance of the same base model, run with the same command-line parameters via llama-server -m, for turbo3 and turbo4. I am not seeing any improvement in tokens/second with the new build. In fact, generation was faster before than after. I am using a Mac M1 with 32 GB RAM.
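To make the before/after comparison concrete, here is a minimal sketch of how the two builds could be compared once token counts and generation times have been recorded for each run. The function names and the numbers in the example are hypothetical placeholders, not measurements from this machine. It assumes each run is recorded as a (generated tokens, elapsed seconds) pair.

```python
from statistics import mean


def tokens_per_second(n_tokens: int, seconds: float) -> float:
    """Generation throughput for a single run."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / seconds


def compare(baseline, candidate):
    """Compare two sets of runs.

    Each argument is a list of (n_tokens, seconds) pairs, one per run.
    Returns (baseline_tps, candidate_tps, percent_change), where a negative
    percent_change means the candidate build is slower than the baseline.
    """
    base = mean(tokens_per_second(n, s) for n, s in baseline)
    cand = mean(tokens_per_second(n, s) for n, s in candidate)
    return base, cand, (cand - base) / base * 100.0


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute real measurements.
    original_build = [(256, 10.0), (256, 10.0)]
    rebuilt_turbo = [(256, 12.8), (256, 12.8)]
    base, cand, pct = compare(original_build, rebuilt_turbo)
    print(f"baseline: {base:.2f} tok/s, candidate: {cand:.2f} tok/s, change: {pct:+.1f}%")
```

Averaging several runs per build, with the same prompt and generation length, helps separate a real regression from run-to-run noise.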
