-
Notifications
You must be signed in to change notification settings - Fork 6
Issue 87: ALCF Recon Flow #90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
kkamdin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey David, I don't see any big issues so this will probably be fine as-is. I flagged two things in comments that you may want to check out in bl823/alcf.py before merging:
- mixed logging modules
- if/else and ValueError raise behavior in the flow steps
| # Transfer A: Send reconstructed data (tiff) to data832 | ||
| logger.info(f"Transferring {file_name} from {config.alcf832_scratch} " | ||
| f"at ALCF to {config.data832_scratch} at data832") | ||
| data832_tiff_transfer_success = transfer_controller.copy( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does Step 2B (the multi-res flow below) depend on data832_tiff_transfer_success being true? If yes, you could wrap 2B in an if/else statement so it doesn't try to run and crash.
Or do a similar value error raise if that's the right behavior like you do here (if I have this right -- if alcf_multi_res_success is not true you don't try to run the zarr transfer "Transfer B" even further below):
if not alcf_multi_res_success:
logger.error("Tiff to Zarr Failed.")
raise ValueError("Tiff to Zarr at ALCF Failed")
else:
logger.info("Tiff to Zarr Successful.")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The multiresolution step does not rely on the TIFF transfer being successful, just that reconstruction was successful. I switched the transfer order so that users have the option to view the TIFF stack at the beamline before waiting for the Zarr conversion to complete (an extra 10+ minute overhead).
…ng the TIFF reconstruction file transfer to happen before Zarr generation for more immediate results.
… stage is complete
Simple fixes to the ALCF recon flow as outlined in this issue: #87
Update: also adding a timeout to globus compute jobs to address this issue: #45