@@ -98,7 +98,6 @@ Ensure you have the following installed and configured:
 
 - **Java 21**: Required for Vector API support & TornadoVM.
 - [TornadoVM](https://github.com/beehive-lab/TornadoVM) with OpenCL or PTX backends.
-- [Maven](https://maven.apache.org/): For building the Java project.
 
 ### Install, Build, and Run
 
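The Java 21 requirement in the list above can be checked mechanically. A minimal sketch, assuming only a POSIX shell — `VERSION_LINE` here is a hypothetical stand-in for the first line of your real `java -version` output:

```shell
# Hypothetical sample of `java -version` output; in practice capture it with:
#   VERSION_LINE=$(java -version 2>&1 | head -n 1)
VERSION_LINE='openjdk version "21.0.2" 2024-01-16'

# Extract the major version number and check it meets the Java 21 requirement.
MAJOR=$(printf '%s\n' "$VERSION_LINE" | sed -n 's/.*version "\([0-9][0-9]*\)[."].*/\1/p')
if [ "$MAJOR" -ge 21 ]; then
  echo "Java $MAJOR OK"
else
  echo "Java 21+ required, found $MAJOR" >&2
fi
```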
@@ -239,83 +238,19 @@ llama-tornado --gpu --model beehive-llama-3.2-1b-instruct-fp16.gguf --prompt "te
 The above model can be swapped with one of the other models, such as `beehive-llama-3.2-3b-instruct-fp16.gguf` or `beehive-llama-3.2-8b-instruct-fp16.gguf`, depending on your needs.
 Check the models below.
 
-## Download Model Files
+## Collection of Tested Models
 
-Download `FP16` quantized `Llama-3` .gguf files from:
-- https://huggingface.co/beehive-lab/Llama-3.2-1B-Instruct-GGUF-FP16
-- https://huggingface.co/beehive-lab/Llama-3.2-3B-Instruct-GGUF-FP16
-- https://huggingface.co/beehive-lab/Llama-3.2-8B-Instruct-GGUF-FP16
+### Llama3.2 Collection
+https://huggingface.co/collections/beehive-lab/llama3-gpullama3java
 
-Download `FP16` quantized `Mistral` .gguf files from:
-- https://huggingface.co/collections/beehive-lab/mistral-gpullama3java-684afabb206136d2e9cd47e0
+### Mistral Collection
+https://huggingface.co/collections/beehive-lab/mistral-gpullama3java-684afabb206136d2e9cd47e0
 
-Download `FP16` quantized `Qwen3` .gguf files from:
-- https://huggingface.co/ggml-org/Qwen3-0.6B-GGUF
-- https://huggingface.co/ggml-org/Qwen3-1.7B-GGUF
-- https://huggingface.co/ggml-org/Qwen3-4B-GGUF
-- https://huggingface.co/ggml-org/Qwen3-8B-GGUF
 
-Download `FP16` quantized `Qwen2.5` .gguf files from:
-- https://huggingface.co/bartowski/Qwen2.5-0.5B-Instruct-GGUF
-- https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF
-
-Download `FP16` quantized `DeepSeek-R1-Distill-Qwen` .gguf files from:
-- https://huggingface.co/hdnh2006/DeepSeek-R1-Distill-Qwen-1.5B-GGUF
-
-Please be gentle with [huggingface.co](https://huggingface.co) servers:
-
-**Note:** FP16 models are first-class citizens in the current version.
-```
-# Llama 3.2 (1B) - FP16
-wget https://huggingface.co/beehive-lab/Llama-3.2-1B-Instruct-GGUF-FP16/resolve/main/beehive-llama-3.2-1b-instruct-fp16.gguf
-
-# Llama 3.2 (3B) - FP16
-wget https://huggingface.co/beehive-lab/Llama-3.2-3B-Instruct-GGUF-FP16/resolve/main/beehive-llama-3.2-3b-instruct-fp16.gguf
-
-# Llama 3.2 (8B) - FP16
-wget https://huggingface.co/beehive-lab/Llama-3.2-8B-Instruct-GGUF-FP16/resolve/main/beehive-llama-3.2-8b-instruct-fp16.gguf
-
-# Mistral (7B) - FP16
-wget https://huggingface.co/MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF/resolve/main/Mistral-7B-Instruct-v0.3.fp16.gguf
-
-# Qwen3 (0.6B) - FP16
-wget https://huggingface.co/ggml-org/Qwen3-0.6B-GGUF/resolve/main/Qwen3-0.6B-f16.gguf
-
-# Qwen3 (1.7B) - FP16
-wget https://huggingface.co/ggml-org/Qwen3-1.7B-GGUF/resolve/main/Qwen3-1.7B-f16.gguf
-
-# Qwen3 (4B) - FP16
-wget https://huggingface.co/ggml-org/Qwen3-4B-GGUF/resolve/main/Qwen3-4B-f16.gguf
-
-# Qwen3 (8B) - FP16
-wget https://huggingface.co/ggml-org/Qwen3-8B-GGUF/resolve/main/Qwen3-8B-f16.gguf
-
-# Phi-3-mini-4k - FP16
-wget https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-fp16.gguf
-
-# Qwen2.5 (0.5B) - FP16
-wget https://huggingface.co/bartowski/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/Qwen2.5-0.5B-Instruct-f16.gguf
-
-# Qwen2.5 (1.5B) - FP16
-wget https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF/resolve/main/qwen2.5-1.5b-instruct-fp16.gguf
-
-# DeepSeek-R1-Distill-Qwen (1.5B) - FP16
-wget https://huggingface.co/hdnh2006/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/resolve/main/DeepSeek-R1-Distill-Qwen-1.5B-F16.gguf
-```
-
-**[Experimental]** You can also download the Q8_0 and Q4_0 models used in the original implementation of Llama3.java, but for now they are dequantized to FP16 for TornadoVM support:
-```
-# Llama 3.2 (1B) - Q4_0
-curl -L -O https://huggingface.co/mukel/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q4_0.gguf
-# Llama 3.2 (3B) - Q4_0
-curl -L -O https://huggingface.co/mukel/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_0.gguf
-# Llama 3 (8B) - Q4_0
-curl -L -O https://huggingface.co/mukel/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q4_0.gguf
-# Llama 3.2 (1B) - Q8_0
-curl -L -O https://huggingface.co/mukel/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q8_0.gguf
-# Llama 3.1 (8B) - Q4_0
-curl -L -O https://huggingface.co/mukel/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_0.gguf
-```
 
 -----------
 
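Tying the two hunks together, the run command shown earlier can be parameterised over any of the tested models. A minimal sketch that only assembles the command line (it assumes `llama-tornado` is on your `PATH` and that the chosen `.gguf` file has already been downloaded; the joke prompt is the one from the example above):

```shell
# Any model file from the tested collections above; the 3B variant is one example.
MODEL="beehive-llama-3.2-3b-instruct-fp16.gguf"

# Assemble the same invocation as the README example, with the model swapped in.
CMD="llama-tornado --gpu --model $MODEL --prompt \"tell me a joke\""
echo "$CMD"   # run it directly once the model file is in place
```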