
Argo demo #14

Draft
barne856 wants to merge 11 commits into main from argo-demo

Conversation

@barne856 barne856 commented Oct 2, 2025

What has been created

A JSON-based workflow abstraction layer that compiles to Argo Workflows

  • Schema Definition (orchestration/argo/schemas/workflow.json)
    • Defines workflow structure with DAG tasks, shared volumes, and task inputs
    • Validates task configurations against HMS and RAS schemas based on image
  • Compiler (orchestration/argo/compile_workflow.py)
    • Python script that compiles JSON workflow definitions to Argo YAML
    • Validates input JSON against workflow schema
    • Outputs to stdout (validation messages to stderr)
  • Docker Container (orchestration/argo/Dockerfile)
    • Based on ffrd_base image
    • Includes workflow schema + HMS/RAS schemas for validation
    • Entrypoint validates then compiles
    • Usage: cat workflow.json | docker run -i workflow-compiler > output.yaml
  • Example Workflows
    • examples/hms-ras-workflow.json:
      • HMS downloads inputs from S3, writes outputs (including excess precip) to shared volume
      • ffrd_base downloads RAS model files to shared volume (same location as HMS outputs)
      • RAS reads all inputs from shared volume, uploads results to S3
  • Documentation (orchestration/argo/README.md): build and usage instructions
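
The compile step above can be sketched as follows. This is a minimal illustration, not the real `orchestration/argo/compile_workflow.py`: the toy schema, field names (`tasks`, `depends_on`), and the shape of the emitted manifest are assumptions, and the actual compiler's output will differ in detail.

```python
# Sketch of the JSON -> Argo YAML compile step: validate against a JSON
# schema, then lower the DAG definition into an Argo Workflow manifest.
import jsonschema  # pip install jsonschema
import yaml        # pip install pyyaml

# Toy stand-in for orchestration/argo/schemas/workflow.json (illustrative only).
WORKFLOW_SCHEMA = {
    "type": "object",
    "required": ["name", "tasks"],
    "properties": {
        "name": {"type": "string"},
        "tasks": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "image"],
                "properties": {
                    "name": {"type": "string"},
                    "image": {"type": "string"},
                    "depends_on": {"type": "array", "items": {"type": "string"}},
                },
            },
        },
    },
}


def compile_workflow(workflow: dict, schema: dict = WORKFLOW_SCHEMA) -> str:
    """Validate a JSON workflow definition, then emit Argo Workflow YAML."""
    jsonschema.validate(instance=workflow, schema=schema)  # raises on invalid input
    dag_tasks = [
        {
            "name": t["name"],
            "template": t["name"],
            # only emit "dependencies" for tasks that declare upstream tasks
            **({"dependencies": t["depends_on"]} if t.get("depends_on") else {}),
        }
        for t in workflow["tasks"]
    ]
    manifest = {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": workflow["name"] + "-"},
        "spec": {
            "entrypoint": "main",
            "templates": [
                {"name": "main", "dag": {"tasks": dag_tasks}},
                *[
                    {"name": t["name"], "container": {"image": t["image"]}}
                    for t in workflow["tasks"]
                ],
            ],
        },
    }
    return yaml.safe_dump(manifest, sort_keys=False)


example = {
    "name": "hms-ras",
    "tasks": [
        {"name": "hms", "image": "hms-golang"},
        {"name": "ras", "image": "ras_v660", "depends_on": ["hms"]},
    ],
}
print(compile_workflow(example))
```

The real entrypoint follows the documented CLI contract instead: JSON on stdin, YAML on stdout, validation messages on stderr.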

Current Issues and Limitations

The images (hms-golang, ras_v660, ffrd_base) are not in a container registry. This is becoming more of a blocker for development: I'm struggling to get Argo to work with local container images, and I'm not sure that's possible or even something Argo is designed for. It would be very helpful to have these images built and pullable from a registry going forward.

There is a limitation in the schemas: we cannot specify where inputs should be downloaded to when we use an s3_store-type store. This is an issue because we'll need to copy (and possibly replace) some RAS model files with results from the HMS container (the excess-precip HDF file), and we need to know where that file goes relative to the model files. For now I've used the ffrd_base image to download to a shared mount and used that mount as the store for the RAS file inputs. It's a bit hacky; if we move forward with it, we should probably develop a reference container and schemas for uploading and downloading files.
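
The workaround looks roughly like the fragment below (expressed as a Python dict for brevity). The field names (`volumes`, `volume_mounts`, `mount_path`) are illustrative and not the real workflow schema; the point is that the staging task and the RAS task mount the same volume at the same path.

```python
# Sketch of the shared-volume workaround: an ffrd_base task stages RAS model
# files onto a shared volume, then the RAS task reads them (plus the HMS
# excess-precip HDF file) from that same mount.
fragment = {
    "volumes": [{"name": "shared", "size": "10Gi"}],
    "tasks": [
        {
            "name": "download-ras-model",
            "image": "ffrd_base",
            # downloads the RAS model files from S3 into the shared mount
            "volume_mounts": [{"name": "shared", "mount_path": "/data"}],
        },
        {
            "name": "run-ras",
            "image": "ras_v660",
            "depends_on": ["download-ras-model"],
            # reads all inputs from the shared mount, uploads results to S3
            "volume_mounts": [{"name": "shared", "mount_path": "/data"}],
        },
    ],
}
```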

Questions for review

  1. Is the JSON -> YAML compilation approach the right abstraction?
  2. Does the workflow schema make sense for orchestrating these models?
  3. Is the ffrd_base download task acceptable? Or should we add "actions" or something else for uploading/downloading to the RAS/HMS reference containers to support this workflow:
    (Download model files -> modify those files -> run the simulation)
    in addition to:
    (Download model files -> run the simulation)
  4. Registry strategy
    a. What container registry should we use?
    b. If we use GHCR, should we make separate repositories and packages for each container?
    c. Is any CI/CD needed or would manual publishing work?
  5. How will environment variables be handled for access to cloud storage buckets? I've defined an env object in the JSON schema to support passing env vars to individual tasks, but hard-coding access keys into these makes them difficult to source-control. In production environments they may not be needed at all if EC2 instance credentials are used correctly and all buckets are accessible from the same account (cross-account access is possible through bucket policies).
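
For concreteness, the per-task env object in question 5 might look like this. The key names are hypothetical; the idea is that anything secret is resolved from the submitting environment (or an instance role) rather than committed in the workflow JSON.

```python
# Illustrative per-task env object (hypothetical keys, not the real schema).
# Secrets are pulled from the environment at compile/submit time rather than
# hard-coded into the source-controlled workflow definition.
import os

task = {
    "name": "hms",
    "image": "hms-golang",
    "env": {
        "AWS_REGION": "us-east-1",
        # resolved at submit time; empty when instance credentials are used
        "AWS_ACCESS_KEY_ID": os.environ.get("AWS_ACCESS_KEY_ID", ""),
    },
}
```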

Base automatically changed from cc-schemas to main October 9, 2025 19:15