pre-commit.ci autofix |
for more information, see https://pre-commit.ci
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main     #172      +/-   ##
==========================================
+ Coverage   95.72%   96.25%   +0.52%
==========================================
  Files          18       18
  Lines        1287     1467     +180
==========================================
+ Hits         1232     1412     +180
  Misses         55       55
cinnabar/plotting.py (outdated)
    x = np.array([node[1]["exp_DG"] for node in graph.nodes(data=True)])
    y = np.array([node[1]["calc_DG"] for node in graph.nodes(data=True)])
    # we need to shift the arrays to both be centered around zero
Sorry, likely asking a stupid question, but I've been staring at this for a while this morning and I can't work out why this doesn't introduce a bias in the results, especially if my dG results don't come from an MLE.
For example: if I have some dG results calculated by a method that works really poorly for most of my values but well for a few (say ~20%), then with this shift my few good values are going to be shifted by my bad ones, so I end up with an ECDF that yields really bad errors even for that 20%?
I think I agree with Irfan. Maybe this could be an optional thing? In addition to the above, this removes the global offset, e.g. in ABFEs, which you may sometimes want to plot, and sometimes you may also want to use the raw data without shifting?
Thanks both, this is a good catch; I was focusing on the DGs from MLE only here! How about we add a centralise flag (which shifts the predictions and experimental values by their means), similar to how plot_DGs currently works?
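To make the concern in this thread concrete, here is a minimal sketch of what mean-centring does to per-ligand absolute errors. It is not the code from this PR: the helper name `absolute_errors`, its `centralizing` argument, and the numbers are hypothetical and chosen only for illustration. One of five predictions is exact (20%) and the other four are each 5 kcal/mol too high.

```python
from statistics import fmean


def absolute_errors(exp, calc, centralizing=True):
    """Per-ligand |calc - exp|, optionally after shifting each series to zero mean.

    Hypothetical helper for illustration; not part of cinnabar.
    """
    if centralizing:
        exp_mean, calc_mean = fmean(exp), fmean(calc)
        exp = [x - exp_mean for x in exp]
        calc = [y - calc_mean for y in calc]
    return [abs(y - x) for x, y in zip(exp, calc)]


# Made-up data: the first prediction is exact, the rest are 5 kcal/mol too high.
exp = [-10.0, -9.0, -8.0, -7.0, -6.0]
calc = [-10.0, -4.0, -3.0, -2.0, -1.0]

raw = absolute_errors(exp, calc, centralizing=False)    # [0.0, 5.0, 5.0, 5.0, 5.0]
shifted = absolute_errors(exp, calc, centralizing=True)  # [4.0, 1.0, 1.0, 1.0, 1.0]
```

Centring subtracts the mean signed error (4 kcal/mol here) from every point, so the one exact prediction picks up an apparent 4 kcal/mol error while the poor ones appear to improve. That is harmless when only relative values are meaningful, as with MLE-derived DGs, but misleading when the absolute values carry information.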
hannahbaumann left a comment
Thanks @jthorton! This overall looks great, just some small things!
cinnabar/plotting.py (outdated)
    def ecdf_plot_all_DDGs(
        graphs: list[FEMap | nx.MultiDiGraph],
        labels: list[str],
        title: str | None = "ECDF of Absolute Errors",
Maybe also add all-to-all here so the difference from the other plot is obvious at first glance.
    with pytest.raises(ValueError, match="At least one dataset is required to plot an ECDF."):
        plotting.ecdf_plot({})
Maybe you could add a test for the shifting behavior in the ecdf DG plots?
Added a test for the new centralizing flag!
Co-authored-by: Irfan Alibay <IAlibay@users.noreply.github.com>
Co-authored-by: Hannah Baumann <43765638+hannahbaumann@users.noreply.github.com>
# Conflicts:
#	cinnabar/plotting.py
#	cinnabar/tests/test_plotting.py
IAlibay left a comment
My main question is whether the centralizing flag should default to True or False.
Other comments are nits, please ignore if you want.
cinnabar/plotting.py (outdated)
    # if the experimental value is missing, add a nan so we can filter it out
    x = np.array([x[2].get("exp_DDG", np.nan) for x in graph.edges(data=True)])
    y = np.array([x[2]["calc_DDG"] for x in graph.edges(data=True)])
Is it worth more gracefully checking if there's "calc_DDG" data in your graph?
Good idea, added some basic checking and an error message!
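As a rough sketch of the kind of check discussed here (not the code actually added in the PR), the snippet below validates edge data before building the arrays. The function name `collect_ddgs` and the error message are hypothetical. It takes plain `(node_a, node_b, data)` tuples, the shape that `graph.edges(data=True)` yields in the reviewed snippet, so it runs without networkx.

```python
import math


def collect_ddgs(edges):
    """Split (node_a, node_b, data) edge tuples into experimental and calculated DDGs.

    Hypothetical helper for illustration; not part of cinnabar.
    """
    edges = list(edges)
    # Fail early with a readable message instead of a bare KeyError.
    missing = [(a, b) for a, b, data in edges if "calc_DDG" not in data]
    if missing:
        raise ValueError(
            f"{len(missing)} edge(s) have no 'calc_DDG' value, first: {missing[0]}"
        )
    exp, calc = [], []
    for _, _, data in edges:
        exp_value = data.get("exp_DDG", math.nan)
        if math.isnan(exp_value):
            continue  # no experimental reference, so no error can be computed
        exp.append(exp_value)
        calc.append(data["calc_DDG"])
    return exp, calc


edges = [
    ("lig1", "lig2", {"exp_DDG": 1.0, "calc_DDG": 1.4}),
    ("lig2", "lig3", {"calc_DDG": -0.3}),  # experimental value missing
]
exp, calc = collect_ddgs(edges)
```

Edges without an experimental value are filtered out, as in the reviewed code, whereas a missing calculated value is treated as an error because there is nothing to plot for that edge.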
cinnabar/plotting.py (outdated)
    # if the experimental value is missing, add a nan so we can filter it out
    x = np.array([node[1].get("exp_DG", np.nan) for node in graph.nodes(data=True)])
    y = np.array([node[1]["calc_DG"] for node in graph.nodes(data=True)])
Same as above on the graceful checking.
        labels: list[str],
        title: str | None = "ECDF of Nodewise Absolute Errors",
        filename: str | None = None,
        centralizing: bool = True,
I'm not 100% sure here, but it feels a bit risky to set the default to True. My thinking is that centralizing when you don't need it will give you more subtly wrong results, whilst not centralizing when you need it will give you plain wrong results.
Given it's a specific case for MLEs, I would maybe suggest setting it to False by default and then checking for the source in the FEMap and erroring out if you need it?
Yeah, this is a tricky one. The current default matches the other plot DG function, so I was making their behaviour consistent; might that be a good idea?
For the error, if we had a graph with results derived via MLE but there was an absolute value in the graph that pulled the results to the correct range, we wouldn't want to raise the error. Maybe it's better to just ensure the default is well-documented, whichever we pick?
"making behaviour consistent"
Ok yeah, in the first instance I see that being a reasonable thing to do. Could you maybe open an issue that says we should revisit that default for everything? I think it's very context-based, i.e. will most users be using MLEs or not, etc...
IAlibay left a comment
Looks like all my remaining comments were things we might revisit in the future - lgtm.
hannahbaumann left a comment
Thanks, LGTM! Are you planning on adding this to the docs in a separate PR?
Thanks both, let's add the docs in another PR!
Description
Fixes #163 by adding a general ECDF plot function which can compare multiple datasets, plus convenience functions which can plot directly from an FEMap or legacy graph.
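For readers unfamiliar with the quantity being plotted, the following is a minimal sketch of an empirical CDF over absolute errors for several datasets. It is an illustration under assumptions, not the function added in this PR: the name `ecdf`, the dataset labels, and the error values are all made up.

```python
def ecdf(values):
    """Return the sorted values and, for each, the fraction of samples <= it.

    Hypothetical helper for illustration; not part of cinnabar.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("At least one value is required to compute an ECDF.")
    return xs, [(i + 1) / n for i in range(n)]


# Made-up absolute errors (kcal/mol) for two hypothetical methods.
datasets = {
    "method A": [0.2, 1.5, 0.7, 0.4],
    "method B": [1.1, 0.9, 2.3, 0.3],
}
curves = {label: ecdf(errors) for label, errors in datasets.items()}
```

Each curve can then be drawn as a step plot of fraction against error. A curve that rises faster has a larger share of its predictions below any given error threshold.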
API example
An example comparison plot

Todos
Notable points that this PR has either accomplished or will accomplish.
Questions
Checklist
news entry for new features, bug fixes, or other user-facing changes.
Status