Skip to content

Ensure threads are labelled in cardano-node #6086

@jasagredo

Description

@jasagredo

Most of the threads in a running node are unlabelled. They can be seen with the -Ds flag, threads only show their address in the heap.

In order to be able to properly debug things we should try to have those threads named, such that we can then observe who throws exceptions to who or which thread gets blocked.

I built a custom GHC 9.10.1 which auto-labelled threads with a callstack, and did the same in the async package, to end up with this debugging traces:

threads on capability 0:
	thread    3 @ 00007ff4d9dfa000 ["main"] is not blocked (TSO_DIRTY)
threads on capability 1:
other threads:
	thread    2 @ 00007ff4d9dfb088 ["IOManagerThread"] is blocked on an external call
	thread    8 @ 00007ff4d9ed7000 ["AutoUpdate"] is blocked on an MVar @ 00007ff4d9ed5ef8
	thread    9 @ 00007ff4d9ed57f0 ["CallStack (from HasCallStack):
  rawForkIO, called at .\Control\Concurrent\Async\Internal.hs:102:27 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  async, called at src\Katip\Core.hs:788:46 in katip-0.8.8.2-inplace:Katip.Core"] is blocked on an STM operation
	thread   10 @ 00007ff4db506370 ["CallStack (from HasCallStack):
  rawForkIO, called at .\Control\Concurrent\Async\Internal.hs:102:27 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  async, called at src\Cardano\BM\Backend\Switchboard.lhs:237:13 in iohk-monitoring-0.2.0.0-inplace:Cardano.BM.Backend.Switchboard"] is blocked on a black hole 00007ff4db5061d0
	thread   11 @ 00007ff4db0a5500 ["CallStack (from HasCallStack):
  forkIO, called at .\Control\Concurrent\Async\Internal.hs:858:8 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  forkRepeat, called at .\Control\Concurrent\Async\Internal.hs:509:10 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  linkOnly, called at src\Cardano\BM\Backend\Switchboard.lhs:265:5 in iohk-monitoring-0.2.0.0-inplace:Cardano.BM.Backend.Switchboard"] is blocked on an STM operation
	thread   12 @ 00007ff4da0ee488 ["CallStack (from HasCallStack):
  forkIO, called at libraries\base\src\Control\Concurrent.hs:199:5 in base:Control.Concurrent"] is blocked on an MVar @ 00007ff4da0f0090
	thread   13 @ 00007ff4da0ef4d0 ["Date cacher (AutoUpdate)"] is blocked on an MVar @ 00007ff4da0f2bc8
	thread   14 @ 00007ff4d9dfa088 ["CallStack (from HasCallStack):
  forkIO, called at .\Network\Wai\Handler\Warp\Windows.hs:21:12 in warp-3.4.7-inplace:Network.Wai.Handler.Warp.Windows"] is blocked on an external call
	thread   15 @ 00007ff4d9ed30a8 ["CallStack (from HasCallStack):
  rawForkIO, called at .\Control\Concurrent\Async\Internal.hs:102:27 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  async, called at src\Cardano\BM\Backend\EKGView.lhs:347:5 in lobemo-backend-ekg-0.2.0.0-inplace:Cardano.BM.Backend.EKGView"] is blocked on an STM operation
	thread   16 @ 00007ff4d9ed4230 ["CallStack (from HasCallStack):
  forkIO, called at .\Control\Concurrent\Async\Internal.hs:858:8 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  forkRepeat, called at .\Control\Concurrent\Async\Internal.hs:509:10 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  linkOnly, called at .\Control\Concurrent\Async\Internal.hs:492:8 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  link, called at src\Cardano\BM\Backend\EKGView.lhs:250:9 in lobemo-backend-ekg-0.2.0.0-inplace:Cardano.BM.Backend.EKGView"] is blocked on an STM operation
	thread   17 @ 00007ff4d9dfb548 ["CallStack (from HasCallStack):
  rawForkIO, called at .\Control\Concurrent\Async\Internal.hs:102:27 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  async, called at src\Cardano\BM\Backend\Prometheus.lhs:75:5 in lobemo-backend-ekg-0.2.0.0-inplace:Cardano.BM.Backend.Prometheus"] is blocked on an MVar @ 00007ff4da0ee3a8
	thread   18 @ 00007ff4da0ee678 ["CallStack (from HasCallStack):
  forkIO, called at .\Control\Concurrent\Async\Internal.hs:858:8 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  forkRepeat, called at .\Control\Concurrent\Async\Internal.hs:509:10 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  linkOnly, called at .\Control\Concurrent\Async\Internal.hs:492:8 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  link, called at src\Cardano\BM\Backend\EKGView.lhs:258:21 in lobemo-backend-ekg-0.2.0.0-inplace:Cardano.BM.Backend.EKGView"] is blocked on an STM operation
	thread   19 @ 00007ff4d9ed24c8 ["CallStack (from HasCallStack):
  rawForkIO, called at .\Control\Concurrent\Async\Internal.hs:102:27 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  async, called at src\Cardano\BM\Backend\Aggregation.lhs:150:5 in lobemo-backend-aggregation-0.1.0.0-inplace:Cardano.BM.Backend.Aggregation"] is blocked on an STM operation
	thread   20 @ 00007ff4d9ed3908 ["CallStack (from HasCallStack):
  forkIO, called at .\Control\Concurrent\Async\Internal.hs:858:8 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  forkRepeat, called at .\Control\Concurrent\Async\Internal.hs:509:10 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  linkOnly, called at .\Control\Concurrent\Async\Internal.hs:492:8 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  link, called at src\Cardano\BM\Backend\Aggregation.lhs:120:9 in lobemo-backend-aggregation-0.1.0.0-inplace:Cardano.BM.Backend.Aggregation"] is blocked on an STM operation
	thread   21 @ 00007ff4da0f0a38 ["Date cacher (AutoUpdate)"] is blocked on an MVar @ 00007ff4da0ee5d8
	thread   22 @ 00007ff4d9ed23c8 ["CallStack (from HasCallStack):
  rawForkIO, called at .\Control\Concurrent\Async\Internal.hs:102:27 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  async, called at src\Cardano\BM\Backend\Monitoring.lhs:162:5 in lobemo-backend-monitoring-0.1.0.0-inplace:Cardano.BM.Backend.Monitoring"] is blocked on an STM operation
	thread   23 @ 00007ff4d9dfb000 ["CallStack (from HasCallStack):
  forkIO, called at .\Network\Wai\Handler\Warp\Windows.hs:21:12 in warp-3.4.7-inplace:Network.Wai.Handler.Warp.Windows"] is blocked on an external call
	thread   24 @ 00007ff4d9ed4d38 ["CallStack (from HasCallStack):
  forkIO, called at .\Control\Concurrent\Async\Internal.hs:858:8 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  forkRepeat, called at .\Control\Concurrent\Async\Internal.hs:509:10 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  linkOnly, called at .\Control\Concurrent\Async\Internal.hs:492:8 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  link, called at src\Cardano\BM\Backend\Monitoring.lhs:131:9 in lobemo-backend-monitoring-0.1.0.0-inplace:Cardano.BM.Backend.Monitoring"] is blocked on an STM operation
	thread   25 @ 00007ff4da0f3ef0 ["CallStack (from HasCallStack):
  rawForkIO, called at .\Control\Concurrent\Async\Internal.hs:102:27 in async-2.2.5-inplace:Control.Concurrent.Async.Internal
  async, called at src\Cardano\Node\Configuration\Logging.hs:288:13 in cardano-node-10.2-inplace:Cardano.Node.Configuration.Logging"] is blocked on atomic MVar read @ 00007ff4db506e30

Now it is time to get those threads named.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

Status

🚫 Help needed

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions