Signal 11 crash in grpc container #44

@kirel

I've been getting weird crashes for the past couple of days. I think they started after an update of the container, but I'm not sure.

I'm running Ubuntu 25.04 with an RTX 3090, the NVIDIA 570 server driver, and CUDA 12.8.

Has anyone experienced this error and knows what's wrong?


*** Signal 11: Backtracing from 0x5ceff4c95781... done ***

*** Program crashed: Bad pointer dereference at 0x0000000000000040 ***

Thread 0 "gRPCServerCLI":

0  0x0000728319ab2117 <unknown> in libc.so.6

Thread 1 "NIO-ELT-0-#0":

0  0x0000728319b46e2e <unknown> in libc.so.6

Thread 2:

0  0x0000728319b46e2e <unknown> in libc.so.6

Thread 3:

0  0x0000728319ab2117 <unknown> in libc.so.6

Thread 4 "NIO-ELT-0-#0" crashed:

0      0x00005ceff4c95781 _ccv_nnc_index_select_forw(ccv_nnc_cmd_s, ccv_nnc_hint_t, int, ccv_nnc_tensor_t* const*, int, ccv_nnc_tensor_t* const*, int, ccv_nnc_stream_context_s*) + 49 in gRPCServerCLI
1 [ra] 0x00005ceff4ab1c58 _ccv_nnc_graph_exec_run_task + 1655 in gRPCServerCLI
2 [ra] 0x00005ceff4ab0f65 _ccv_nnc_graph_exec_run_loop + 1428 in gRPCServerCLI
3 [ra] 0x00005ceff4cff7e7 co_schedule + 374 in gRPCServerCLI
4 [ra] 0x00005ceff4ab0839 ccv_nnc_graph_run_with_schedule + 184 in gRPCServerCLI
5 [ra] 0x00005ceff4a62372 _ccv_cnnp_model_exec + 273 in gRPCServerCLI
6 [ra] 0x00005ceff4a53bea ccv_nnc_cmd_exec + 393 in gRPCServerCLI
7 [ra] 0x00005ceff4a5e042 ccv_nnc_dynamic_graph_exec_ret + 2689 in gRPCServerCLI
8 [ra] 0x00005ceff4a61fb9 ccv_nnc_dynamic_graph_evaluate + 1592 in gRPCServerCLI

Thread 5 "NIO-ELT-0-#0":

0  0x0000728319ab2117 <unknown> in libc.so.6

Thread 6 "cuda00004400012":

0      0x0000728319b39bcf <unknown> in libc.so.6
1 [ra] 0x00007282f43cb65f <unknown> in libcuda.so.570.158.01
2 [ra] 0x00007282f430e633 <unknown> in libcuda.so.570.158.01


Registers:

rax 0x00005ceff4c95701  8b 44 24 08 48 8d 3d 64 ff ff ff 4c 8d 4c 24 50  ·D$·H·=dÿÿÿL·L$P
rdx 0x00007282e80dec80  02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ················
rcx 0x00007282e8346a50  00 00 00 00 00 00 00 00 f1 05 00 00 00 00 00 00  ········ñ·······
rbx 0x0000000000000000  0
rsi 0x00007282e80dec80  02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ················
rdi 0x00007282e8346a40  50 7b 38 e8 82 72 00 00 90 79 38 e8 82 72 00 00  P{8è·r···y8è·r··
rbp 0x00007282f2ccae70  10 b0 cc f2 82 72 00 00 58 1c ab f4 ef 5c 00 00  ·°Ìò·r··X·«ôï\··
rsp 0x00007282f2ccacc0  ca 00 00 00 ca 00 00 00 ca 00 00 00 4d 00 00 00  Ê···Ê···Ê···M···
 r8 0x000000000000004d  77
 r9 0x00007282e80dec80  02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ················
r10 0x00007282e8346a40  50 7b 38 e8 82 72 00 00 90 79 38 e8 82 72 00 00  P{8è·r···y8è·r··
r11 0x0000000000000000  0
r12 0x0000000000000000  0
r13 0x00007282e8387b50  00 00 12 30 01 00 00 00 00 00 00 00 00 00 00 00  ···0············
r14 0x00007282e8346a50  00 00 00 00 00 00 00 00 f1 05 00 00 00 00 00 00  ········ñ·······
r15 0x00007282e8346a40  50 7b 38 e8 82 72 00 00 90 79 38 e8 82 72 00 00  P{8è·r···y8è·r··
rip 0x00005ceff4c95781  41 8b 5c 24 40 85 db 74 09 41 83 7c 24 44 00 0f  A·\$@·Ût·A·|$D··

rflags 0x0000000000010206  PF

cs 0x0033  fs 0x0000  gs 0x0000


Images (63 omitted):

0x00005ceff4082000–0x00005ceff5652a71 9c48d9d41ec43e4191e9bba857f4e942         gRPCServerCLI         /usr/local/bin/gRPCServerCLI
0x00007282f4000000–0x00007282f4e95072 95c03cc762af9b66d196d396f95aa42604dfb5cd libcuda.so.570.158.01 /usr/lib/x86_64-linux-gnu/libcuda.so.570.158.01
0x0000728319a21000–0x0000728319bdd341 c289da5071a3399de893d2af81d6a30c62646e1e libc.so.6             /usr/lib/x86_64-linux-gnu/libc.so.6

Backtrace took 0.06s
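
For anyone triaging this, the numbers in the report can be cross-checked with a short script. This is only a sketch, not a diagnosis. It assumes x86-64 and decodes just the one instruction encoding seen at rip; the helper name `decode_mov_base_disp8` is made up for illustration. If the decoding is right, the faulting instruction reads 4 bytes at r12 + 0x40, and since r12 is 0 that matches the reported address 0x40, which would point to a NULL struct pointer inside `_ccv_nnc_index_select_forw`.

```python
# Values copied from the crash report above.
RIP = 0x00005CEFF4C95781         # address of the faulting instruction
R12 = 0x0000000000000000         # r12 at the time of the crash
FAULT_ADDR = 0x0000000000000040  # "Bad pointer dereference at ..."
IMAGE_BASE = 0x00005CEFF4082000  # load address of gRPCServerCLI
IMAGE_END = 0x00005CEFF5652A71   # end of the gRPCServerCLI image
CODE_AT_RIP = bytes.fromhex("41 8b 5c 24 40")  # first five bytes at rip


def decode_mov_base_disp8(code: bytes) -> tuple[int, int, int]:
    """Decode `mov r32, [base + disp8]` encoded with a REX prefix and a SIB byte.

    Hypothetical helper: it handles only this one encoding shape and
    returns (destination register number, base register number, displacement).
    """
    rex, opcode, modrm, sib, disp = code[:5]
    assert rex & 0xF0 == 0x40, "expected a REX prefix"
    assert opcode == 0x8B, "expected MOV r32, r/m32"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 0b01 and rm == 0b100, "expected SIB + disp8 addressing"
    rex_r, rex_x, rex_b = (rex >> 2) & 1, (rex >> 1) & 1, rex & 1
    assert rex_x == 0 and (sib >> 3) & 7 == 0b100, "expected no index register"
    dest = reg | (rex_r << 3)
    base = (sib & 7) | (rex_b << 3)
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return dest, base, disp


if __name__ == "__main__":
    dest, base, disp = decode_mov_base_disp8(CODE_AT_RIP)
    # Register numbers follow the x86-64 encoding: 3 is ebx, 12 is r12.
    print(f"mov reg{dest}, [reg{base} + {disp:#x}]")
    print("effective address matches report:", R12 + disp == FAULT_ADDR)
    # Offset of the crash site inside the gRPCServerCLI image.
    assert IMAGE_BASE <= RIP < IMAGE_END
    print(f"offset in image: {RIP - IMAGE_BASE:#x}")
```

The image-relative offset may help locate the crash site in a disassembly of the same gRPCServerCLI build, but whether it maps directly to an address in the file depends on how the binary is linked, so treat it as a starting point only.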
