Don't raise error if ray already initialized. #77
bd-j wants to merge 1 commit into ryanhausen:master from
Hi @bd-j, thanks for the PR; I will take a look. I haven't done a lot of testing on slurm, so there could be an issue with how ray gets initialized. How do you make the call to fitsmap: via sbatch/srun, or manually in an interactive session? Sorry for the delayed reply. I seem to have somehow unsubscribed from notifications on this repo.

Hi @ryanhausen, it was sbatch, but not requesting an entire node (actually only requesting a single cpu). I added the following to the top of the script that called I didn't test replacing

@bd-j thanks. I need to read up on how best to use ray with slurm. I want to make sure I don't implement things in a way that breaks other things. Thanks!

I was having problems running fitsmap via convert.dir_to_map on a certain cluster using slurm. A little digging suggests that on clusters where slurm does not provide exclusive node access, ray still attempts to use all cores on the node, leading to errors.

The errors can be avoided by initializing ray with only a single cpu (I haven't checked if it works using the number of cpus requested via slurm) before calling fitsmap, but only if ray is then re-initialized within fitsmap with ignore_reinit_error=True.

I'm not sure if this is the best way to address the issue, but thought I'd provide the fix in case it's helpful. Happy to close this and just raise an issue or rework this PR if you have suggestions.
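For anyone trying the workaround described above, here is a minimal sketch of what the calling script might do. It is not the code from this PR or from fitsmap. The helper names slurm_cpu_count and init_ray_for_slurm are hypothetical, and reading SLURM_CPUS_PER_TASK is an assumption: Slurm only sets that variable when --cpus-per-task is given, and, as noted above, using the requested cpu count rather than a single cpu has not been checked.

```python
import os


def slurm_cpu_count(env=None, default=1):
    """Hypothetical helper: CPUs Slurm granted per task, or `default` if unknown."""
    env = os.environ if env is None else env
    raw = env.get("SLURM_CPUS_PER_TASK", "")
    try:
        count = int(raw)
    except ValueError:
        # Variable missing or not a number: fall back to the single-cpu case.
        return default
    return count if count > 0 else default


def init_ray_for_slurm(env=None):
    """Hypothetical helper: start ray limited to the Slurm allocation."""
    import ray  # imported here so slurm_cpu_count works without ray installed

    # Capping num_cpus keeps ray from claiming every core on a shared node.
    # A later ray.init() call (for example inside fitsmap) must then pass
    # ignore_reinit_error=True, otherwise ray raises an error because it is
    # already initialized.
    return ray.init(num_cpus=slurm_cpu_count(env))
```

Calling init_ray_for_slurm() at the top of the script, before convert.dir_to_map, would correspond to the single-cpu workaround whenever SLURM_CPUS_PER_TASK is unset.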