-
Notifications
You must be signed in to change notification settings - Fork 135
Composable auspice config #1756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0c4004e to
5a162fa
Compare
6f4d481 to
5911881
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1756 +/- ##
=======================================
Coverage 73.23% 73.24%
=======================================
Files 79 81 +2
Lines 8402 8498 +96
Branches 1715 1732 +17
=======================================
+ Hits 6153 6224 +71
- Misses 1960 1981 +21
- Partials 289 293 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
victorlin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only reviewed the first 2 commits so far, sending in comments before lab meeting
I think re-validating the merged config is a good idea — even if it's the equivalent of a "this never happens" |
|
Only commenting from a user perspective, not coding: +1 For this feature, because I'd like to use it to simplify dengue's set of auspice configs
Cool and on-board with this! And I imagine from a user's perspective, a usage case would be similar to |
The behaviour of this function was identified as wanting in review¹ and since we use it twice (within `export_v2.py`) and call `AugurError` twice I opted to move all usages to the latter. Note that this changes the printed message slightly, but it's now more consistent with other code in the project. ¹ <#1756 (comment)>
The previous code used `warnings.warn` as well as `warning`, where the latter was a wrapper around the former¹. We now rename the wrapper as `warn` so that it's consistently used. Note that the only actual usage of the previous `warning` function was removed in an earlier commit in this series. This change was suggested in both the PR description and by others in review². ¹ The underlying warning category is unchanged (`UserWarning`) but we now have a stacklevel of 2 not 1. In the current usage I don't observe any differences. ² <#1756 (comment)>
Re-validate the merged config, as discussed in review¹, and optionally write out the merged JSON to a file. ¹ <#1756 (comment)>
5911881 to
62457b6
Compare
Re-validate the merged config, as discussed in review¹, and optionally write out the merged JSON to a file. ¹ <#1756 (comment)>
62457b6 to
23595ee
Compare
|
Force pushed to address requested changes (see comments above) and rebased onto latest master.
Done, and also added a Read The Docs is failing as it failed to check out the branch, which is unrelated to the changes in this PR. All other CI checks pass, so it's ok to merge from that perspective. |
joverlee521
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general merge behavior makes sense to me! My only blocking comment is to add some documentation to the help message to clarify how multiple Auspice configs are merged.
Re-validate the merged config, as discussed in review¹, and optionally write out the merged JSON to a file. ¹ <#1756 (comment)>
23595ee to
b294cbc
Compare
|
We now have new CI failures after this force-push. I think the salient parts of the CI logs are completely unrelated to this PR, implying that it's a flake8 version bump, so we should see this appearing on other PRs imminently 🫠 I didn't see these locally with flake8 v7.1.2, but upgrading to flake8 7.2.0 (the latest available on pip) I now see $ python3 -m pytest --no-cov tests/test_flake8.py
======================================================================================== test session starts =========================================================================================
platform darwin -- Python 3.11.11, pytest-8.3.5, pluggy-1.5.0 -- /Users/naboo/miniconda3/envs/augur-py11/bin/python3
cachedir: .pytest_cache
rootdir: /Users/naboo/github/nextstrain/augur
configfile: pytest.ini
plugins: cov-6.0.0, anyio-4.9.0, mock-3.14.0
collected 1 item
tests/test_flake8.py::test_flake8 ./augur/filter/include_exclude_rules.py:973:9: F824 `nonlocal replacements` is unused: name is never assigned in scope
FAILED
============================================================================================== FAILURES ==============================================================================================
____________________________________________________________________________________________ test_flake8 _____________________________________________________________________________________________
tests/test_flake8.py:9: in test_flake8
assert result.returncode == 0, "flake8 exited with errors"
E AssertionError: flake8 exited with errors
E assert 1 == 0
E + where 1 = CompletedProcess(args=['flake8'], returncode=1).returncode
result = CompletedProcess(args=['flake8'], returncode=1)
========================================================================================= 1 failed in 1.04s ==========================================================================================
|
The behaviour of this function was identified as wanting in review¹ and since we use it twice (within `export_v2.py`) and call `AugurError` twice I opted to move all usages to the latter. Note that this changes the printed message slightly, but it's now more consistent with other code in the project. ¹ <#1756 (comment)>
The previous code used `warnings.warn` as well as `warning`, where the latter was a wrapper around the former¹. We now rename the wrapper as `warn` so that it's consistently used. Note that the only actual usage of the previous `warning` function was removed in an earlier commit in this series. This change was suggested in both the PR description and by others in review². ¹ The underlying warning category is unchanged (`UserWarning`) but we now have a stacklevel of 2 not 1. In the current usage I don't observe any differences. ² <#1756 (comment)>
This is in preparation for another util file wanting to have access to these files which would otherwise result in a circular dependency if they remained within `export_v2.py`. Over time we will hopefully make use of these functions in other parts of Augur. Note that `deprecationWarningsEmitted` is now a function and we've updated the comment in `configure_warnings()` to reflect that it can be used in any augur command, not just export.
This centralises auspice config parsing code and handles all deprecations at the time the config is parsed, rather than than scattering them through the `export_v2.py` code. This makes the export code easier to follow and is a pre-requisite for (upcoming) config file overlays/merging. At least two unreported bugs were fixed: 1. 'maintainer' (deprecated) would be used instead of 'maintainers' 2. 'color_options' - any option which used a 'legendTitle' but not a 'menuTitle' would trigger an uncaught `KeyError` The `read_config` function (now named `read_json`) is moved out of `utils.py` (without changes) so that the majority of config-related code is now in one place.
The structure of (display) defaults changed between auspice configs v1 &
v2. The top-level key was changed ('defaults' → 'display_defaults') and
the internal keys were also changed (e.g. 'colorBy' → 'color_by').
The previously released code remapped both levels of keys (good!) but
also incorrectly remapped the deprecated internal key names for a
v2-like top-level 'display_defaults'. Such a structure is invalid, and
would have failed validation.
Here we restore the intended behaviour which updates the structure of
config['defaults'] at the time we move this deprecated structure to the
(now current) config['display_defaults']
There are a large number of modifications and special-casing around 'author'/'authors' and 'numdate'/'num_date'. If author or temporal information is set on the metadata / node-data JSONs (under "author" or "authors", "numdate" or "num_date") we'll always export it on nodes in the tree. This means the need to specify author[s] or numdate/num_date in "metadata columns" is a no-op and we can simplify things by filtering them all out. For colorings we preserve the existing behaviour of renaming "authors" to "author". Going forward a big simplification would be to move the command-line parsing into the `read_auspice_config` function and generate a unified config structure ahead of time, but for now I'm only focusing on the config JSON.
This refactors and updates code in preparation for supporting multiple auspice config JSONs
The merging approach implemented here is specific to the config files and differs from other merging approaches we may use in the group. Specifically, when lists (e.g. 'colorings') are found in multiple configs we don't replace the first list but rather extend it, and use a customised approach to choosing when to replace elements in the original list with an updated version (e.g. for 'colorings' if the 'key' of an individual coloring matches a previously seen one it will replace it in the list).
Re-validate the merged config, as discussed in review¹, and optionally write out the merged JSON to a file. ¹ <#1756 (comment)>
Following instructions in <#1762>
b294cbc to
f8caa5c
Compare
Confirmed via Slack¹ that there are no breaking changes in #1756. ¹ <https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1744311108124139?thread_ts=1744287839.266899&cid=C01LCTT7JNN>
Closes #298
See commit messages for details, but there's a few parts I want to call out explicitly:
Merging approach
Specifically, merging of lists is done by extending the original list, rather than simply replacing it. We use a custom approach to decide whether new elements should be added to the list or replace already-present elements. As an example:
{ "colorings": [ {"key": "num_date", "title": "date", "type": "continuous"} ] }{ "colorings": [ {"key": "country", "title": "Country", "type": "categorical"}, {"key": "num_date", "title": "Sampling Date", "type": "continuous"} ] }Here the second JSON's
num_datecoloring replaces the first JSONs one, and thecountrycoloring is added in, so we end up with:{ "colorings": [ {"key": "num_date", "title": "Sampling Date", "type": "continuous"} {"key": "country", "title": "Country", "type": "categorical"}, ] }I think this is the only practical way to combine colorings in multiple configs, but we also use lists for many other options, e.g. 'panels', 'maintainers'. This PR still uses the same approach, but I'm less convinced for non-coloring options such as these whether we should extend lists or simply replace them.
Note that we don't really do deep merging here at all. This only applies in one situation as far as I can tell,
config.display_defaults.panels: list[str]where overlay configs will completely replace the original list, and this is different from howconfig.panelscurrently behaves.Refactoring of warnings code
This changed all usages of (export v2's)
warn()towarning(), which in practice only changes the stacklevel, which changes... nothing? Thewarning()helper function was added years ago but saw ~no usage, so I think we should either use it consistently (as done here) or remove it and just use the defaultwarn().Validate then merge
The approach here is to validate each config JSON independently, rather than merging then validating. (This works well since our schema has no required properties.)
augur validate auspice-config-v2hasn't been updated to allow multiple auspice configs, but we could do so if we want (requires a little refactoring to avoid circular imports). We could additionally validate the merged config if people want to, although the current implementation guarantees that valid configs produce a valid merged config.Long term direction
This work paves the way to merging in all config-related command line args within
auspice_config.pywhich should help clean upexport_v2.pyquite a bit.I didn't add an option to write out the merged config, but I know this is a subject that's been discussed quite a bit in the context of snakemake config merging so we could add this here if wanted.
Still to do: