Hi all,
Great to see you guys are still going strong developing SSR after over a decade. Congrats on the 0.6 release!
Recently, I've increased the number of network messages I send to ssr-brs (for example, 20 sources each receiving updates to some of their attributes at 100 Hz). Unfortunately, this has made SSR much less stable.
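To put a number on that load, here is a minimal sketch of the arithmetic, using only the example figures above. The `Load` struct and both helper functions are hypothetical names made up for illustration; nothing here is part of SSR or APF.

```cpp
#include <cstdint>

// Hypothetical description of the load; these names are illustrative only.
struct Load {
    std::uint32_t sources;             // number of sources receiving updates
    std::uint32_t updates_per_second;  // update rate per source, in Hz
};

// Total number of messages per second arriving at the renderer.
constexpr std::uint64_t messages_per_second(Load l) {
    return static_cast<std::uint64_t>(l.sources) * l.updates_per_second;
}

// Average gap between two consecutive messages, in microseconds.
// Assumes a non-zero load; a zero load would divide by zero.
constexpr double mean_gap_us(Load l) {
    return 1e6 / static_cast<double>(messages_per_second(l));
}
```

With the example figures, `messages_per_second(Load{20, 100})` gives 2000 messages per second, i.e. one message every 500 microseconds on average.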
I am experiencing unexpected crashes after varying amounts of time: sometimes it runs fine for hours, other times only for minutes. At first I thought the older FUDI interface might be at fault (some open issues here describe similar crashes with the older network interface), so I switched to the more recent WebSocket interface. Unfortunately, the crashes persist there as well. The messages I send all seem to contain values within a valid range, so as far as I can tell no particular message crashes ssr-brs.
I attached lldb to the process, but the output means very little to me. Most of the time it is a bad access in the cleanup:
"
Process 45934 stopped
* thread #13, stop reason = EXC_BAD_ACCESS (code=1, address=0xbeadde8ca818)
frame #0: 0x000000010000d2e4 ssr-brs`apf::CommandQueue::push(apf::CommandQueue::Command*) [inlined] apf::CommandQueue::_cleanup(this=0x0000000100110588, cmd=0x000060000021a800) at commandqueue.h:173:12 [opt]
170 void _cleanup(Command* cmd)
171 {
172 assert(cmd != nullptr);
-> 173 cmd->cleanup();
174 delete cmd;
175 }
176
Target 0: (ssr-brs) stopped.
"
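For readers unfamiliar with the pattern in frame #0, the following is a minimal sketch of a command lifecycle of that shape: a command is pushed, executed, then cleaned up and deleted. It assumes `cleanup()` is a virtual member function, which the trace does not confirm. `SketchQueue` and `CountingCommand` are hypothetical names, the queue is deliberately single-threaded, and none of this is APF's actual implementation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the Command type seen in the trace.
struct Command {
    virtual ~Command() = default;
    virtual void execute() = 0;  // the work the command carries
    virtual void cleanup() = 0;  // called once, right before deletion
};

// Test command that only counts how often each hook runs.
struct CountingCommand : Command {
    int& executed;
    int& cleaned;
    CountingCommand(int& e, int& c) : executed(e), cleaned(c) {}
    void execute() override { ++executed; }
    void cleanup() override { ++cleaned; }
};

// Simplified, single-threaded queue that takes ownership of pushed commands.
class SketchQueue {
public:
    void push(Command* cmd) {
        assert(cmd != nullptr);
        pending_.push_back(cmd);
    }

    // Execute every pending command, then clean it up and free it.
    void process() {
        for (Command* cmd : pending_) {
            cmd->execute();
            cleanup_(cmd);
        }
        pending_.clear();
    }

private:
    // Same shape as the _cleanup() listing in the trace.
    void cleanup_(Command* cmd) {
        assert(cmd != nullptr);  // only catches null, not dangling pointers
        cmd->cleanup();          // virtual call: needs a live object behind cmd
        delete cmd;
    }

    std::vector<Command*> pending_;
};
```

In the trace, `cmd` is non-null, so the assertion on line 172 passes and the fault happens at the `cmd->cleanup()` call itself. Under the assumption above, that would be consistent with `cmd` pointing at memory that no longer holds a valid command object, though the trace alone does not show how it got into that state.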
Any thoughts on what this means or how I could prevent it, so that ssr-brs becomes robust again? Do these bad accesses originate somewhere in APF? Are there any other logs that would help? I'm on an M1 Mac.
Many thanks!