Releases · toberyan/llama.cpp

05 Mar 09:12

3ccbfe5

b4826 Latest

ci : remove xframework upload (#12190)

* ci : remove xframework upload

This commit removes the upload of the xcframework zip file as an
artifact.

The motivation for this change is that the xcframework zip file is
currently uploaded as part of the strategy, so the upload is attempted
multiple times and fails the build.

The upload should be moved elsewhere in the build to avoid this.

* ci : add xcframework upload to macos-latest job

Assets 26

cudart-llama-bin-win-cu11.7-x64.zip      303 MB    2025-03-05T09:12:48Z
cudart-llama-bin-win-cu12.4-x64.zip      373 MB    2025-03-05T09:12:57Z
llama-b1-xcframework.zip                 68.7 MB   2025-03-05T09:13:08Z
llama-b4826-bin-macos-arm64.zip          23.4 MB   2025-03-05T09:13:10Z
llama-b4826-bin-macos-x64.zip            25 MB     2025-03-05T09:13:12Z
llama-b4826-bin-ubuntu-arm64.zip         25.5 MB   2025-03-05T09:13:13Z
llama-b4826-bin-ubuntu-vulkan-x64.zip    30.9 MB   2025-03-05T09:13:14Z
llama-b4826-bin-ubuntu-x64.zip           27 MB     2025-03-05T09:13:16Z
llama-b4826-bin-win-avx-x64.zip          16.5 MB   2025-03-05T09:13:17Z
llama-b4826-bin-win-avx2-x64.zip         16.5 MB   2025-03-05T09:13:18Z
Source code (zip)                                  2025-03-05T07:34:02Z
Source code (tar.gz)                               2025-03-05T07:34:02Z

14 Feb 03:32

github-actions

b4712

a7b8ce2

llama-bench : fix unexpected global variable initialize sequence issu…

Assets 23

13 Feb 11:51

github-actions

b4706

c7f460a

`server`: fix tool-call of DeepSeek R1 Qwen, return reasoning_content…

Assets 22

13 Feb 09:11

github-actions

b4705

27e8a23

sampling: add Top-nσ sampler (#11223)

* initial sampling changes:

* completed top nsigma sampler implementation

* apply parameter to only llama-cli

* updated readme

* added tests and fixed nsigma impl

* cleaned up pr

* format

* format

* format

* removed commented tests

* cleanup pr and remove explicit floats

* added top-k sampler to improve performance

* changed sigma to float

* fixed string format to float

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update common/sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* added llama_sampler_init

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
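
For readers unfamiliar with the technique named in this release, the following is a minimal sketch of the Top-nσ idea: keep only tokens whose logit lies within n standard deviations of the maximum logit, then renormalize. It is an illustration in Python, not the code from src/llama-sampling.cpp. The function names `top_n_sigma_filter` and `softmax` are hypothetical, and the use of the population standard deviation is an assumption.

```python
import math


def top_n_sigma_filter(logits, n=1.0):
    """Mask logits that fall below max(logits) - n * sigma.

    Hypothetical helper for illustration; sigma is the population
    standard deviation of the raw logits (an assumption).
    """
    if not logits:
        return []
    mean = sum(logits) / len(logits)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in logits) / len(logits))
    threshold = max(logits) - n * sigma
    # Masked tokens get -inf so they receive zero probability after softmax.
    return [x if x >= threshold else float("-inf") for x in logits]


def softmax(logits):
    """Numerically stable softmax; -inf entries map to probability 0."""
    m = max(logits)  # the maximum logit is always kept, so m is finite
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: two near-tied top tokens survive, the low-logit tail is dropped.
example_logits = [5.0, 4.9, 1.0, 0.0]
filtered = top_n_sigma_filter(example_logits, n=1.0)
probs = softmax(filtered)
```

A smaller n gives a stricter cut, and a larger n admits more of the tail. The "added top-k sampler to improve performance" entry above suggests the real implementation narrows the candidate set first, which this sketch does not attempt.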

Assets 23

13 Nov 06:21

github-actions

b4067

54ef9cf

vulkan: Throttle the number of shader compiles during the build step.…

Assets 22
