Skip to content

Backfill config gen, and as-of date changes #286

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

natemcintosh
Copy link
Collaborator

@natemcintosh natemcintosh commented May 20, 2025

Nate's Summary

This aims to do issues:

It became clear that we needed to pass the job ids created during config generation to the job creation script. The options I could think of were:

  • Have config generation spit out the job ids, parse them with bash, and then loop over job.py for each job id.
  • Put a modified copy of job.py in the same script as the backfill config generation.

Option 2 felt much safer. Have tested this in Batch, and things run!

Example model output

You can see output from this in the nssp-rt-testing storage container, in the job ids starting with your-backfill-name....

Copilot's Summary

This pull request introduces a new feature for generating and running backfill configurations, along with various supporting changes. The most significant updates include adding a new script for backfill configuration, updating the Makefile to include a command for running this script, and refining documentation and parameter handling in related files.

New Feature: Backfill Configuration and Execution

  • New script for backfill configuration: Added azure/backfill_generate_and_run.py to generate and upload configuration files for the EpiNow2 pipeline, and optionally execute tasks in Azure Batch. This includes detailed parameter handling, validation, and Azure integration.
  • Makefile update: Added a backfill-config-and-run command to the Makefile to streamline running the new backfill script. This provides an easy way to configure and execute backfill tasks.
  • Documentation update: Added an entry in NEWS.md to announce the new commands for generating backfill configurations.

Improvements to Parameter Handling

  • Refinement of as_of_date parameter: Updated the as_of_date parameter description in R/config.R for better clarity and alignment with its usage.
  • generate_configs.py adjustment: Modified the as_of_date calculation to use the report date directly, ensuring consistency and simplifying logic. [1] [2]

Copy link
Contributor

github-actions bot commented May 20, 2025

Thank you for your contribution @micahwiesner67 🚀! Your pkgdown-site is ready for download 👉 here 👈!
(The artifact expires on 2025-06-03T17:14:08Z. You can re-generate it by re-running the workflow here.)

Copy link

codecov bot commented May 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

📢 Thoughts on this report? Let us know!

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Makefile Outdated
@@ -42,6 +42,21 @@ rerun-config:
--job-id=$(JOB) \
--report-date-str=$(REPORT_DATE)

backfill-config:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How possible is it to expose arguments here for them to be input via the target backfill-config?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this is already working using Typer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, passing in different arguments in a nice way is something I haven't entirely figured out yet. My best thought so far is to have the user run the help message to read about the arguments, and then alter them in this command to their needs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the user can pass the arguments in, it's just that they don't have help on figuring out what the arguments are (unless they read the help message)?

If so, interesting problem. Wonder if there is a better solution.

Co-authored-by: Adam Howes <adamthowes@gmail.com>
I decided to do this as it became clear that we needed
to pass the job ids created during config generation
to the job creation script. Either that happens with
bash, or like this with python, which feels much safer.

Have tested this in Batch, and things run!
@natemcintosh natemcintosh marked this pull request as ready for review May 21, 2025 21:14
Copy link
Collaborator

@zsusswein zsusswein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth putting the duplicated code here in a function and importing it? There seems to be quite a bit of logic here shared with existing stuff and it could be annoying to maintain.

@natemcintosh
Copy link
Collaborator Author

Is it worth putting the duplicated code here in a function and importing it? There seems to be quite a bit of logic here shared with existing stuff and it could be annoying to maintain.

I was waiting for someone to say this 😆. At what point do we make the azure/ folder its own little python package? Part of the problem with putting the common functionality is that it requires various azure packages. Right now, everything in the azure/ folder is a uv script, each of which creates its own environment. If we wanted to import functionality that uses azure packages, we'd probably have to create a small package.

Copy link
Collaborator

@kgostic kgostic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving so as not to hold you up, but I have some comments.

Most importantly, could you put your head together with @PatrickTCorbett to talk about how people are supposed to set up a run, and add some documentation?

uv run --env-file .env azure/backfill_generate_and_run.py \
--state=all \
--disease="COVID-19,Influenza" \
--str-report-dates="2025-04-30,2025-05-07,2025-05-14" \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this as an example, but I don't think this is going to be the full range of default dates. And I don't think that people can pass these args easily into the makefile without them being defined as makefile vars?

If, (e.g.) we want to do a backfill run for every weds in the 2024-2025 season, or change any of these other options, what's the recommended workflow? Modify the makefile locally and then call make backfill-config-and-run?

Whatever is recommended, I think it would be good to add an example call (e.g. to the SOP.md)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an official start and end date for the 2024-2025 season. I will add an example showing how to easily generate those dates.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been resolved? Otherwise, looks good to me

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet. Still need to add that little bit more documentation, and then I think it'll be ready to go.

@zsusswein
Copy link
Collaborator

zsusswein commented May 23, 2025

At what point do we make the azure/ folder its own little python package?

Definitely out of scope for this PR, but what about sticking it in the STF package you're using for pool setup? Probably way more trouble than it's worth because it's a different format than what they do...........

I do think this is maybe worth doing though. This feel worth hashing out in an issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants