Targets-based workflows leveraging slurm for dist. computing - workflow 1a and POC #13
Conversation
…it in case i break everything.
… entirely reproducible.
…preparation and analysis. Update .gitignore to exclude workflow run directories. Enhance run_pipeline scripts for better directory management and parameterization. Introduce new utility functions for data handling and workflow execution. slurm workflow not yet functional.
- added simple roxygen docs
- updated pecan settings qstat to work with zero-length strings
- added first draft setup shell script for one-button install
- added workflow functions necessary for 1a
- Added apptainer build image parent workflow
- added apptainer sipnet-carb build workflow
- added dockerfile to tools/ subdirectory
- unlikely first attempt will build.
added line on obtaining current temp container
NOTE THE BUG: the apptainer version must be updated both in the runscript and in the XML.
GHA based workflow successfully builds sipnet-carb docker in the source repo:
General comments:
- I think that integrating targets into the workflow is a good idea. It will be worth reviewing together after Chris has also had a chance to review it.
- Documentation (e.g. a README) would be helpful. Contents could include:
- Overview and usage of targets for workflows - a general approach that can be used across the project
- Rationale behind divergence from more standard targets workflows: the choice not to use crew or a _targets.R file; the use of environment vars + gsub; and the approach to storing functions and args. I know you've explained these in meetings, but they will be helpful to document.
- For this specific example implementing the ensemble workflow, it would be useful to document the workflow components, including a tar_manifest and a diagram of the DAG (output of tar_network() or tar_mermaid())?
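For reference, generating both artifacts is straightforward when the pipeline can be expressed in a _targets.R script; since this PR deliberately avoids a _targets.R file, the following is only a hedged sketch of what that documentation step could look like, not how the current workflow is invoked:

```r
# Hypothetical sketch: producing the documentation artifacts suggested above.
# Assumes the pipeline is (or can be) expressed in a _targets.R script in the
# working directory; the output filename "dag.md" is illustrative.
library(targets)

# Table of target names, commands, and other metadata:
manifest <- tar_manifest()
print(manifest)

# Mermaid source for a DAG diagram, suitable for embedding in a README:
mermaid_lines <- tar_mermaid(targets_only = TRUE)
writeLines(c("```mermaid", mermaid_lines, "```"), "dag.md")
```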
# function authors are encouraged to think carefully about the dependencies of their functions.
# if dependencies are not present, it would be ideal for functions to error informatively rather than fail on imports.
How are dependencies used? What should the authors consider?
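One common way to satisfy the quoted comment is a guard clause at the top of each workflow function; this is only a sketch of that pattern (the function and package names are illustrative, not taken from this PR):

```r
# Sketch of informative dependency checking inside a workflow function.
# The function name and the package it checks for are hypothetical examples.
run_ensemble_step <- function(settings) {
  # Fail early with a clear message instead of an opaque import error later:
  if (!requireNamespace("PEcAn.uncertainty", quietly = TRUE)) {
    stop("run_ensemble_step() requires the 'PEcAn.uncertainty' package; ",
         "please install it before running this workflow step.",
         call. = FALSE)
  }
  # ... actual work using PEcAn.uncertainty:: functions ...
}
```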
return(file.path(local_path, prefix_filename))
}

#' Prepare PEcAn Run Directory
Something similar (create directory if it doesn't exist) is already done by prepare.settings() when it calls check.settings(). prepare.settings does a lot of other things besides, but a standard pattern is:

settings <- PEcAn.settings::read.settings("pecan.xml")
settings <- PEcAn.settings::prepare.settings(settings)

But I don't see it used here in the workflows, so I'll defer to @infotroph to comment on whether not calling prepare.settings was a deliberate choice and whether it would be appropriate to use here.
It was deliberate: specifically because prepare.settings wants a live DB and queries it too many times to change that readily, and more generally because this workflow puts responsibility for constructing and verifying the settings into the xml_build stage.
I'll admit to not fully understanding the db dependency, but prepare.settings is used for this purpose in the db-independent Demo 1: https://github.com/PecanProject/pecan/blob/bff6203e17cf4ff7f6c8e553f0ea16170051018b/documentation/tutorials/Demo_1_Basic_Run/run_pecan.qmd#L137
jobids[task_id] <- PEcAn.remote::qsub_get_jobid(
  out = out[length(out)],
  qsub.jobid = pecan_settings$host$qsub.jobid,
  stop.on.error = stop.on.error)
Where is stop.on.error defined? (lintr flags this as 'no visible binding for global variable'.)
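The usual fix for that lintr warning is to make stop.on.error a formal argument of the enclosing function rather than a free variable; the sketch below assumes a wrapper function whose name and other arguments are illustrative, not taken from this PR:

```r
# Hypothetical fix sketch: pass stop.on.error explicitly as an argument with a
# default, so lintr can resolve the binding. Only PEcAn.remote::qsub_get_jobid
# and its arguments come from the quoted diff; the wrapper is illustrative.
submit_and_get_jobid <- function(out, pecan_settings, stop.on.error = TRUE) {
  PEcAn.remote::qsub_get_jobid(
    out = out[length(out)],
    qsub.jobid = pecan_settings$host$qsub.jobid,
    stop.on.error = stop.on.error
  )
}
```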
@@ -0,0 +1,159 @@
name: build-image
I see this is a lightly modified version of the docker-build-image.yml in PecanProject/pecan, and I have a vague memory of seeing instructions for using workflow files from other repositories. Would it be worth investigating whether we can call this from the PEcAn repo rather than maintain duplicate versions?
I think it's definitely worth investigating. In moving it into this repository, I hesitated to create a new maintenance point for the same method. Investigation is the right word; I have questions, such as: which ghcr/docker repo will the created sipnet-carb image end up in (and which do we want it to end up in)? Can we invoke the method in the pecan repo using secrets in the ccmmf repo?
I'll look into it.
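For context, the mechanism being described is GitHub Actions "reusable workflows": the called workflow declares an `on: workflow_call` trigger, and the caller references it by repository path. The sketch below is hypothetical; the ref, input name, and secret name are assumptions, not the actual interface of PEcAn's docker-build-image.yml:

```yaml
# Hypothetical sketch of calling a workflow maintained in PecanProject/pecan
# from the ccmmf repo. Requires the called workflow to declare
# `on: workflow_call`; the ref, inputs, and secret names are illustrative.
jobs:
  build-sipnet-carb:
    uses: PecanProject/pecan/.github/workflows/docker-build-image.yml@develop
    with:
      image-version: latest
    secrets:
      registry-token: ${{ secrets.GHCR_TOKEN }}
```

Note that secrets are not shared automatically across repositories: they must be passed explicitly (or via `secrets: inherit` within the same organization), which bears directly on the question about using ccmmf secrets with a pecan-hosted workflow.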
Co-authored-by: David LeBauer <dlebauer@gmail.com>
…intended for review at this point
…/workflows into abstracted_workflows
… a distributed workflow documents in hopefully useful state.
updated apptainer build to support develop
…can's version of this yaml. added image-version input parameter at base apptainer sipnet-carb builder
- refactored configs into latest and devel for ease of stack testing
- refactored parameter passing: majority of workflow parameters are passed via orchestration XML
- minimized gsub replacements for clarity
- added script for XML build step
- added single function for XML build step
- leveraged targets "target_raw" methodology to enable function-call-like invocations of multiple targets in re-usable blocks
- enabled parameter passing and parsing for function-like behavior of target blocks
- combined 03 and 04 steps from workflow 2a
- workflow 2a function execution working, data routing incomplete
…g parsing with centralized functions
- added smart functional resolution for either referencing external data, or copying external data into a run
- added argument parsing through as.numeric() to correctly parameterize centralized workflow functions
- obtained successful 2a workflow replication via targets, apptainer and slurm
- updated example workflows for new data referencing
- removed obsolete example 3 variant
- removed some obsolete functions within workflow_functions.R
- added a gha for CI of workflows
- added self hosted runner info to github action
- workflows now trigger on self hosted runner
- to be resolved: CI location output cleanup
Updated description of PR contents:
Now contains a fully functional implementation of workflow 2a, as well as example workflows for downloading data, referencing previous workflow runs, copying data in from prior workflow runs, and executing PEcAn function calls in a distributed fashion via slurm and apptainer.
Also supports local execution in a containerized environment.
Have created 'adapter' R files which leverage @infotroph's command line argument structure and pass it via an XML structure into the workflow_functions.R centralized versions of code for stand-alone execution.
workflow_functions.R versions of workflow steps are - for the most part - copy-pasted from @infotroph's implementations.
workflow 2a can be realized via (from repo root, on the ccmmf test cluster):

conda activate /home/hdpriest/miniconda3/envs/pecan-all-102425
cd ./orchestration/

Note: you will have to edit the 'workflow.base.run.directory' XML parameter in the orchestration XML below to your preferred location.