I have been encountering intermittent segfaults when running code where I repeatedly (say, many hundreds of times) make calls like
psr = libstempo.tempopulsar(parfile, timfile)
Inspecting the core dumps shows that the segfault originates deep inside tempo2. The root cause is that ne_sw_ifuncN in the pulsar struct ends up holding a garbage value, which appears to be the N=465682051 passed to ifunc in frame #0. The end of the backtrace typically looks like the following:
#0 0x000014fe6096144d in ifunc (mjd=mjd@entry=0x3b257d8, yoffs=yoffs@entry=0x3b27718, t=t@entry=57191.436434533738, N=465682051) at ifunc.C:38
#1 0x000014fe6090c421 in dm_delays (psr=<optimized out>, npsr=<optimized out>, p=<optimized out>, i=<optimized out>, delt=<optimized out>, dt_SSB=<optimized out>) at dm_delays.C:324
#2 0x000014fe608ed430 in calculate_bclt._omp_fn.0(void) () at calculate_bclt.C:143
#3 0x000014fe6060b736 in GOMP_parallel (fn=0x14fe608ed150 <calculate_bclt._omp_fn.0(void)>, data=0x7ffe2e859200, num_threads=1, flags=0) at ../../../libgomp/parallel.c:178
#4 0x000014fe608ed97b in calculate_bclt (psr=0x35c4f10, npsr=1) at calculate_bclt.C:63
#5 0x000014fe6093ac3e in formBatsAll (psr=0x35c4f10, npsr=1) at global.C:148
#6 0x000014fe60c7f2aa in __pyx_pf_9libstempo_9libstempo_11tempopulsar___cinit__ (__pyx_v_obsfreq=<optimized out>, __pyx_v_observatory=<optimized out>, __pyx_v_toaerrs=<optimized out>,
__pyx_v_toas=<optimized out>, __pyx_v_t2cmethod=<optimized out>, __pyx_v_clk=<optimized out>, __pyx_v_ephem=<optimized out>, __pyx_v_units=<optimized out>,
__pyx_v_maxobs=<optimized out>, __pyx_v_dofit=<optimized out>, __pyx_v_fixprefiterrors=<optimized out>, __pyx_v_warnings=<optimized out>, __pyx_v_timfile=0x14fe60e614d0,
__pyx_v_parfile=<optimized out>, __pyx_v_self=0x14fe6112c540) at libstempo/libstempo.cpp:32805
I'm filing this as a libstempo bug because a tentative fix seems to be adding a memset call to tempopulsar's __cinit__ to zero out the allocated memory, as I've done here. However, I'm not experienced enough with C or Cython to know whether this is a good way to handle it, or whether this is really a libstempo bug rather than something going wrong in tempo2's memory management.
This may also be the cause of nanograv/enterprise#339?