Running the command mpirun -n 15 pb_mpi -d ../aligned_RNA_seqs_postprocessed.phylip -cat -gtr -mutsel run02 produces the following error:
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: compute-a-16-46
Local device: mlx4_0
--------------------------------------------------------------------------
model:
stick-breaking Dirichlet process mixture (cat)
read data from file : ../aligned_RNA_seqs_postprocessed.phylip
number of taxa : 1139
number of sites : 711
number of states: 4
chain name : run02
run started
[compute-a-16-46.o2.rc.hms.harvard.edu:11425] 14 more processes have sent help message help-mpi-btl-openib.txt / error in device init
[compute-a-16-46.o2.rc.hms.harvard.edu:11425] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
*** Error in `pb_mpi': corrupted size vs. prev_size: 0x0000000004e03120 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7f7c4)[0x7fca5756c7c4]
/lib64/libc.so.6(+0x82fd4)[0x7fca5756ffd4]
/lib64/libc.so.6(__libc_malloc+0x4c)[0x7fca57572adc]
/n/app/gcc/6.2.0/lib64/libstdc++.so.6(_Znwm+0x18)[0x7fca5807ecd8]
pb_mpi[0x4de838]
pb_mpi[0x49c180]
pb_mpi[0x4e66a5]
pb_mpi[0x488cb9]
pb_mpi[0x404d9b]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fca5750f505]
pb_mpi[0x421927]
followed by a memory map.
The error occurs before the first MCMC iteration but after the 0th iteration has been written to the .trace file. The failure is intermittent: it occurs on most runs of the command, but not all. It also occurs with a variety of values of -n.
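The pb_mpi frames in the backtrace are raw addresses, so they say little on their own. As a minimal sketch, the addresses could be collected and passed to addr2line to recover function names. This assumes binutils addr2line is available and that the pb_mpi binary has not been stripped; file and line information would additionally need a build with debug symbols. The variable names are illustrative, and the snippet only prints the command rather than running it.

```shell
# pb_mpi frames copied from the backtrace above.
backtrace='pb_mpi[0x4de838]
pb_mpi[0x49c180]
pb_mpi[0x4e66a5]
pb_mpi[0x488cb9]
pb_mpi[0x404d9b]
pb_mpi[0x421927]'

# Extract the hexadecimal addresses, one per line.
addrs=$(printf '%s\n' "$backtrace" | sed -n 's/^pb_mpi\[\(0x[0-9a-f]*\)]$/\1/p')

# Build the addr2line command:
#   -f prints function names, -C demangles C++ names, -e selects the executable.
# The unquoted $addrs is intentional: word splitting joins the addresses with spaces.
cmd="addr2line -f -C -e pb_mpi $(echo $addrs)"

# Print the command; run it from the directory that contains the pb_mpi binary.
echo "$cmd"
```

The printed command can then be run against the same pb_mpi build that produced the crash. Because the corruption is detected inside malloc, the resolved frames show where the damaged heap was noticed, which is not necessarily where it was overwritten.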