
Argo demo #14

Draft
barne856 wants to merge 11 commits into main from argo-demo

Conversation

@barne856 barne856 commented Oct 2, 2025

What has been created

A JSON-based workflow abstraction layer that compiles to Argo Workflows

  • Schema Definition (orchestration/argo/schemas/workflow.json)
    • Defines workflow structure with DAG tasks, shared volumes, and task inputs
    • Validates task configurations against HMS and RAS schemas based on image
  • Compiler (orchestration/argo/compile_workflow.py)
    • Python script that compiles JSON workflow definitions to Argo YAML
    • Validates input JSON against workflow schema
    • Outputs to stdout (validation messages to stderr)
  • Docker Container (orchestration/argo/Dockerfile)
    • Based on ffrd_base image
    • Includes workflow schema + HMS/RAS schemas for validation
    • Entrypoint validates then compiles
    • Usage: cat workflow.json | docker run -i workflow-compiler > output.yaml
  • Example Workflows
    • examples/hms-ras-workflow.json:
      • HMS downloads inputs from S3, writes outputs (including excess precip) to shared volume
      • ffrd_base downloads RAS model files to shared volume (same location as HMS outputs)
      • RAS reads all inputs from shared volume, uploads results to S3
  • Documentation (orchestration/argo/README.md): build and usage instructions
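
The compile step above can be sketched as follows. This is a minimal illustration, not the real `orchestration/argo/compile_workflow.py`: the toy schema, field names (`tasks`, `depends_on`), and the shape of the emitted manifest are assumptions, and the actual compiler's output will differ in detail.

```python
# Sketch of the JSON -> Argo YAML compile step: validate against a JSON
# schema, then lower the DAG definition into an Argo Workflow manifest.
import jsonschema  # pip install jsonschema
import yaml        # pip install pyyaml

# Toy stand-in for orchestration/argo/schemas/workflow.json (illustrative only).
WORKFLOW_SCHEMA = {
    "type": "object",
    "required": ["name", "tasks"],
    "properties": {
        "name": {"type": "string"},
        "tasks": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "image"],
                "properties": {
                    "name": {"type": "string"},
                    "image": {"type": "string"},
                    "depends_on": {"type": "array", "items": {"type": "string"}},
                },
            },
        },
    },
}


def compile_workflow(workflow: dict, schema: dict = WORKFLOW_SCHEMA) -> str:
    """Validate a JSON workflow definition, then emit Argo Workflow YAML."""
    jsonschema.validate(instance=workflow, schema=schema)  # raises on invalid input
    dag_tasks = [
        {
            "name": t["name"],
            "template": t["name"],
            # only emit "dependencies" for tasks that declare upstream tasks
            **({"dependencies": t["depends_on"]} if t.get("depends_on") else {}),
        }
        for t in workflow["tasks"]
    ]
    manifest = {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": workflow["name"] + "-"},
        "spec": {
            "entrypoint": "main",
            "templates": [
                {"name": "main", "dag": {"tasks": dag_tasks}},
                *[
                    {"name": t["name"], "container": {"image": t["image"]}}
                    for t in workflow["tasks"]
                ],
            ],
        },
    }
    return yaml.safe_dump(manifest, sort_keys=False)


example = {
    "name": "hms-ras",
    "tasks": [
        {"name": "hms", "image": "hms-golang"},
        {"name": "ras", "image": "ras_v660", "depends_on": ["hms"]},
    ],
}
print(compile_workflow(example))
```

The real entrypoint follows the documented CLI contract instead: JSON on stdin, YAML on stdout, validation messages on stderr.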

Current Issues and Limitations

The images (hms-golang, ras_v660, ffrd_base) are not in a container registry. This is becoming more of a blocker for development: I'm struggling to get Argo to work with local container images, and I'm not sure that's possible or even something Argo is designed for. It would be very helpful to have these images built and pullable from a registry going forward.

There is a limitation in the schemas: we cannot specify where inputs should be downloaded to when we use an s3_store-type store. This is an issue because we'll need to copy (and possibly replace) some RAS model files with results from the HMS container (the excess-precip HDF file), and we need to know where that file goes relative to the model files. For now I've used the ffrd_base image to download to a shared mount and used that mount as the store for the RAS file inputs. It's a bit hacky; if we move forward with it, we should probably develop a reference container and schemas for uploading and downloading files.
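
The workaround looks roughly like the fragment below (expressed as a Python dict for brevity). The field names (`volumes`, `volume_mounts`, `mount_path`) are illustrative and not the real workflow schema; the point is that the staging task and the RAS task mount the same volume at the same path.

```python
# Sketch of the shared-volume workaround: an ffrd_base task stages RAS model
# files onto a shared volume, then the RAS task reads them (plus the HMS
# excess-precip HDF file) from that same mount.
fragment = {
    "volumes": [{"name": "shared", "size": "10Gi"}],
    "tasks": [
        {
            "name": "download-ras-model",
            "image": "ffrd_base",
            # downloads the RAS model files from S3 into the shared mount
            "volume_mounts": [{"name": "shared", "mount_path": "/data"}],
        },
        {
            "name": "run-ras",
            "image": "ras_v660",
            "depends_on": ["download-ras-model"],
            # reads all inputs from the shared mount, uploads results to S3
            "volume_mounts": [{"name": "shared", "mount_path": "/data"}],
        },
    ],
}
```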

Questions for review

  1. Is the JSON -> YAML compilation approach the right abstraction?
  2. Does the workflow schema make sense for orchestrating these models?
  3. Is the ffrd_base download task acceptable? Or should we add "actions" or something else for uploading/downloading to the RAS/HMS reference containers to support this workflow:
    (Download model files -> modify those files -> run the simulation)
    in addition to:
    (Download model files -> run the simulation)
  4. Registry strategy
    a. What container registry should we use?
    b. If we use GHCR, should we make separate repositories and packages for each container?
    c. Is any CI/CD needed or would manual publishing work?
  5. How will environment variables be handled for access to cloud storage buckets? I've defined an env object in the JSON schema to support passing env vars to individual tasks, but hard-coding access keys into these makes them difficult to source-control. In production environments they may not be needed at all if EC2 instance credentials are used correctly and all buckets are accessible from the same account (cross-account access is possible through bucket policies).
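
For concreteness, the per-task env object in question 5 might look like this. The key names are hypothetical; the idea is that anything secret is resolved from the submitting environment (or an instance role) rather than committed in the workflow JSON.

```python
# Illustrative per-task env object (hypothetical keys, not the real schema).
# Secrets are pulled from the environment at compile/submit time rather than
# hard-coded into the source-controlled workflow definition.
import os

task = {
    "name": "hms",
    "image": "hms-golang",
    "env": {
        "AWS_REGION": "us-east-1",
        # resolved at submit time; empty when instance credentials are used
        "AWS_ACCESS_KEY_ID": os.environ.get("AWS_ACCESS_KEY_ID", ""),
    },
}
```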

Base automatically changed from cc-schemas to main October 9, 2025 19:15