Hi, thanks so much for your documentation. I'm using an AMD 8845HS with a 780M GPU to run deepseek-r1:1.5b with Ollama, following your document, but I get a GPU hang error after several rounds of conversation:
HW Exception by GPU node-1 (Agent handle: 0x7e6eb7d0bb40) reason :GPU Hang
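For context, this is a minimal sketch of how I drive the model once the container below is up; the container name matches the compose file, and /api/chat is the standard Ollama endpoint:

# pull the model inside the container
docker exec ollama ollama pull deepseek-r1:1.5b
# a few rounds of chat via the HTTP API are usually enough to trigger the hang
curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-r1:1.5b",
  "messages": [{"role": "user", "content": "hello"}]
}'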
- Hardware:
- CPU: AMD 8845HS
- GPU: 780M with 16GB VRAM
- Memory: DDR5 5600 MHz, 48 GB
- OS: LXC Container in PVE 8.3
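For completeness, the GPU is passed through from the PVE host into the LXC container. This is only a sketch of the relevant lines in the container config; the /dev/kfd major number varies by kernel, so verify the numbers with ls -l /dev/kfd /dev/dri on the host:

# /etc/pve/lxc/<id>.conf (sketch; verify device major numbers on your host)
lxc.cgroup2.devices.allow: c 226:* rwm   # /dev/dri (DRM) devices
lxc.cgroup2.devices.allow: c 238:* rwm   # /dev/kfd (major varies by kernel)
lxc.mount.entry: /dev/kfd dev/kfd none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir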
- docker-compose:
services:
  ollama:
    image: ollama/ollama:rocm
    container_name: ollama
    restart: unless-stopped
    devices:
      - "/dev/kfd"
      - "/dev/dri"
    volumes:
      - ./data:/root/.ollama
    environment:
      - OLLAMA_ORIGINS='chrome-extension://*,moz-extension://*'
      - HSA_OVERRIDE_GFX_VERSION=11.0.0
      - HCC_AMDGPU_TARGETS=gfx1103
      - OLLAMA_LLM_LIBRARY=rocm_v60002
      - OLLAMA_DEBUG=1
    ports:
- "11434:11434" - error message:
- error message:
ollama | time=2025-02-24T09:45:48.152Z level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/root/.ollama/models/blobs/sha256-aabd4debf0c8f08881923f2c25fc0fdeed24435271c2b3e92c4af36704040dbc
ollama | time=2025-02-24T09:45:48.152Z level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/root/.ollama/models/blobs/sha256-aabd4debf0c8f08881923f2c25fc0fdeed24435271c2b3e92c4af36704040dbc
ollama | llama_model_loader: loaded meta data with 26 key-value pairs and 339 tensors from /root/.ollama/models/blobs/sha256-aabd4debf0c8f08881923f2c25fc0fdeed24435271c2b3e92c4af36704040dbc (version GGUF V3 (latest))
ollama | llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
ollama | llama_model_loader: - kv 0: general.architecture str = qwen2
ollama | llama_model_loader: - kv 1: general.type str = model
ollama | llama_model_loader: - kv 2: general.name str = DeepSeek R1 Distill Qwen 1.5B
ollama | llama_model_loader: - kv 3: general.basename str = DeepSeek-R1-Distill-Qwen
ollama | llama_model_loader: - kv 4: general.size_label str = 1.5B
ollama | llama_model_loader: - kv 5: qwen2.block_count u32 = 28
ollama | llama_model_loader: - kv 6: qwen2.context_length u32 = 131072
ollama | llama_model_loader: - kv 7: qwen2.embedding_length u32 = 1536
ollama | llama_model_loader: - kv 8: qwen2.feed_forward_length u32 = 8960
ollama | llama_model_loader: - kv 9: qwen2.attention.head_count u32 = 12
ollama | llama_model_loader: - kv 10: qwen2.attention.head_count_kv u32 = 2
ollama | llama_model_loader: - kv 11: qwen2.rope.freq_base f32 = 10000.000000
ollama | llama_model_loader: - kv 12: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000001
ollama | llama_model_loader: - kv 13: general.file_type u32 = 15
ollama | llama_model_loader: - kv 14: tokenizer.ggml.model str = gpt2
ollama | llama_model_loader: - kv 15: tokenizer.ggml.pre str = qwen2
ollama | llama_model_loader: - kv 16: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
ollama | llama_model_loader: - kv 17: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
ollama | llama_model_loader: - kv 18: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
ollama | llama_model_loader: - kv 19: tokenizer.ggml.bos_token_id u32 = 151646
ollama | llama_model_loader: - kv 20: tokenizer.ggml.eos_token_id u32 = 151643
ollama | llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 151643
ollama | llama_model_loader: - kv 22: tokenizer.ggml.add_bos_token bool = true
ollama | llama_model_loader: - kv 23: tokenizer.ggml.add_eos_token bool = false
ollama | llama_model_loader: - kv 24: tokenizer.chat_template str = {% if not add_generation_prompt is de...
ollama | llama_model_loader: - kv 25: general.quantization_version u32 = 2
ollama | llama_model_loader: - type f32: 141 tensors
ollama | llama_model_loader: - type q4_K: 169 tensors
ollama | llama_model_loader: - type q6_K: 29 tensors
ollama | llm_load_vocab: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
ollama | llm_load_vocab: special tokens cache size = 22
ollama | llm_load_vocab: token to piece cache size = 0.9310 MB
ollama | llm_load_print_meta: format = GGUF V3 (latest)
ollama | llm_load_print_meta: arch = qwen2
ollama | llm_load_print_meta: vocab type = BPE
ollama | llm_load_print_meta: n_vocab = 151936
ollama | llm_load_print_meta: n_merges = 151387
ollama | llm_load_print_meta: vocab_only = 1
ollama | llm_load_print_meta: model type = ?B
ollama | llm_load_print_meta: model ftype = all F32
ollama | llm_load_print_meta: model params = 1.78 B
ollama | llm_load_print_meta: model size = 1.04 GiB (5.00 BPW)
ollama | llm_load_print_meta: general.name = DeepSeek R1 Distill Qwen 1.5B
ollama | llm_load_print_meta: BOS token = 151646 '<|begin▁of▁sentence|>'
ollama | llm_load_print_meta: EOS token = 151643 '<|end▁of▁sentence|>'
ollama | llm_load_print_meta: PAD token = 151643 '<|end▁of▁sentence|>'
ollama | llm_load_print_meta: LF token = 148848 'ÄĬ'
ollama | llm_load_print_meta: EOG token = 151643 '<|end▁of▁sentence|>'
ollama | llm_load_print_meta: max token length = 256
ollama | llama_model_load: vocab only - skipping tensors
ollama | time=2025-02-24T09:45:48.455Z level=DEBUG source=routes.go:1457 msg="chat request" images=0 prompt="You are a professional, authentic machine translation engine.\n\nYou are about to translate text from an article. Title: “OLLAMA_ORIGINS=chrome-extension://etc does not work · Issue #1686 · ollama/ollama”, Summary: {{imt_theme}}\n\nThis content may include the following terms {{imt_terms}}. Please handle these terms carefully.<|User|>; 把下一行文本作为纯文本输入,并将其翻译为简体中文,, if the text contains html tags, please consider after translate, where the tags should be in translated result, meanwhile keep the result fluently.仅输出翻译。如果某些内容无需翻译(如专有名词、代码等),则保持原文不变。不要解释,输入文本:\nOLLAMA_ORIGINS=chrome-extension://etc does not work #1686<|Assistant|>"
ollama | time=2025-02-24T09:45:48.457Z level=DEBUG source=routes.go:1457 msg="chat request" images=0 prompt="You are a professional, authentic machine translation engine.\n\nYou are about to translate text from an article. Title: “OLLAMA_ORIGINS=chrome-extension://etc does not work · Issue #1686 · ollama/ollama”, Summary: {{imt_theme}}\n\nThis content may include the following terms {{imt_terms}}. Please handle these terms carefully.<|User|>; 把下一行文本作为纯文本输入,并将其翻译为简体中文,, if the text contains html tags, please consider after translate, where the tags should be in translated result, meanwhile keep the result fluently.仅输出翻译。如果某些内容无需翻译(如专有名词、代码等),则保持原文不变。不要解释,输入文本:\nOLLAMA_ORIGINS=chrome-extension://etc does not work · Issue #1686 · ollama/ollama<|Assistant|>"
ollama | time=2025-02-24T09:45:48.458Z level=DEBUG source=cache.go:99 msg="loading cache slot" id=0 cache=0 prompt=174 used=0 remaining=174
// ... part of chat message
ollama | time=2025-02-24T09:45:48.776Z level=DEBUG source=cache.go:99 msg="loading cache slot" id=1 cache=0 prompt=183 used=0 remaining=183
ollama | time=2025-02-24T09:45:48.776Z level=DEBUG source=cache.go:99 msg="loading cache slot" id=2 cache=0 prompt=190 used=0 remaining=190
ollama | time=2025-02-24T09:45:48.776Z level=DEBUG source=cache.go:99 msg="loading cache slot" id=3 cache=0 prompt=168 used=0 remaining=168
ollama | HW Exception by GPU node-1 (Agent handle: 0x7e6eb7d0bb40) reason :GPU Hang
ollama | time=2025-02-24T09:45:50.472Z level=DEBUG source=sched.go:407 msg="context for request finished"
ollama | time=2025-02-24T09:45:50.472Z level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/root/.ollama/models/blobs/sha256-aabd4debf0c8f08881923f2c25fc0fdeed24435271c2b3e92c4af36704040dbc refCount=14
ollama | time=2025-02-24T09:45:50.472Z level=DEBUG source=sched.go:407 msg="context for request finished"
ollama | time=2025-02-24T09:45:50.472Z level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/root/.ollama/models/blobs/sha256-aabd4debf0c8f08881923f2c25fc0fdeed24435271c2b3e92c4af36704040dbc refCount=13
ollama | time=2025-02-24T09:45:50.472Z level=DEBUG source=sched.go:407 msg="context for request finished"
- rocm
rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.14
Runtime Ext Version: 1.6
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: YES
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 7 8845HS w/ Radeon 780M Graphics
Uuid: CPU-XX
Marketing Name: AMD Ryzen 7 8845HS w/ Radeon 780M Graphics
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 5137
BDFID: 0
Internal Node ID: 0
Compute Unit: 16
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Memory Properties:
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 32638500(0x1f20624) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 32638500(0x1f20624) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 32638500(0x1f20624) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 4
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 32638500(0x1f20624) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx1103
Uuid: GPU-XX
Marketing Name: AMD Radeon Graphics
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 2048(0x800) KB
Chip ID: 6400(0x1900)
ASIC Revision: 12(0xc)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2700
BDFID: 50432
Internal Node ID: 1
Compute Unit: 12
SIMDs per CU: 2
Shader Engines: 1
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties: APU
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 40
SDMA engine uCode:: 21
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16319248(0xf90310) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 16319248(0xf90310) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1103
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done *** - os
uname -a
Linux dev 6.8.12-4-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-4 (2024-11-06T15:04Z) x86_64 x86_64 x86_64 GNU/Linux
Did you meet the same issue, or could you give me some pointers on how to fix it? Thank you so much.
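One more observation, in case it helps: the debug log shows four requests being loaded into parallel cache slots (id=0..3) immediately before the hang, so as an experiment I'm going to force serial processing with Ollama's standard OLLAMA_NUM_PARALLEL variable and see whether the hang still occurs:

    environment:
      # experiment: serialize requests to rule out parallel slot loading
      - OLLAMA_NUM_PARALLEL=1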