Add ECDF plots based on industry benchmarks by jthorton · Pull Request #172 · OpenFreeEnergy/cinnabar

jthorton · 2026-01-15T11:04:31Z

Description

Fixes #163 by adding a general ecdf plot function which can compare multipule datasets and convence functions which can plot directly from an FEMap or legacy graph.

API example

# given two FEMaps each with experimental data plot the ddgs
from cinnabar import plotting

fig = plotting.ecdf_plot_DDGs(graphs=[femap_1, femap_2], labels=["FE Map 1", "FE Map 2"], filename="ecdf_ddg_comp.png")

An example comparison plot

Todos

Notable points that this PR has either accomplished or will accomplish.

add docs once happy with the API

Questions

Do we want to change all plotting functions to acept FEMaps and move towards not using the legacy graph API?

Checklist

Added a news entry for new features, bug fixes, or other user facing changes.

Status

Ready to go

Tips

Comment "pre-commit.ci autofix" to have pre-commit.ci atomically format your PR.
Since this will create a commit, it is best to make this comment when you are finished with your work.

jthorton · 2026-01-15T11:05:07Z

pre-commit.ci autofix

for more information, see https://pre-commit.ci

codecov · 2026-01-15T11:08:04Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.25%. Comparing base (bd2d6f2) to head (93e6cb1).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #172      +/-   ##
==========================================
+ Coverage   95.72%   96.25%   +0.52%     
==========================================
  Files          18       18              
  Lines        1287     1467     +180     
==========================================
+ Hits         1232     1412     +180     
  Misses         55       55

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

IAlibay

Just a quick initial review

cinnabar/plotting.py

IAlibay · 2026-01-15T13:03:56Z

cinnabar/plotting.py

+
+        x = np.array([node[1]["exp_DG"] for node in graph.nodes(data=True)])
+        y = np.array([node[1]["calc_DG"] for node in graph.nodes(data=True)])
+        # we need to shift the arrays to both be centered around zero


Sorry likely asking a stupid question but I've been staring at this for a while this morning and I can't seem to work out why this doesn't introduce a bias in the results, especially if my dG results don't come from an MLE.

For example - if I have some dG results calculated by a method that works really poorly for most of my values but well for a few (say ~ 20%), if I do this, then my few good values are going to be shifted by my bad ones.. so I end up with an ECDF that yields really bad errors even for that 20%?

I think I agree with Irfan. Maybe this could be an optional thing? In addition to the above, this removes the global offset e.g. in ABFEs which sometimes you may want to plot, but sometimes you may also want to use the raw data without shifting?

Thanks both this is a good catch I was focusing on the DGs from MLE only here! How about we add a centralise flag (which shifts the predictions and experimental values by their means) similar to how plot_DGs currently works?

hannahbaumann

Thanks @jthorton ! This overall looks great, just some small things!

cinnabar/plotting.py

hannahbaumann · 2026-01-16T09:04:19Z

cinnabar/plotting.py

+def ecdf_plot_all_DDGs(
+    graphs: list[FEMap | nx.MultiDiGraph],
+    labels: list[str],
+    title: str | None = "ECDF of Absolute Errors",


Maybe also add all-to-all here so the difference to the other plot is very obvious from a first glance.

Added thanks!

hannahbaumann · 2026-01-16T09:10:49Z

cinnabar/plotting.py

+
+        x = np.array([node[1]["exp_DG"] for node in graph.nodes(data=True)])
+        y = np.array([node[1]["calc_DG"] for node in graph.nodes(data=True)])
+        # we need to shift the arrays to both be centered around zero


I think I agree with Irfan. Maybe this could be an optional thing? In addition to the above, this removes the global offset e.g. in ABFEs which sometimes you may want to plot, but sometimes you may also want to use the raw data without shifting?

hannahbaumann · 2026-01-16T09:13:23Z

cinnabar/tests/test_plotting.py

+    with pytest.raises(ValueError, match="At least one dataset is required to plot an ECDF."):
+        plotting.ecdf_plot({})
+
+


Maybe you could add a test for the shifting behavior in the ecdf DG plots?

Added a test for the new centralizing flag!

Co-authored-by: Irfan Alibay <IAlibay@users.noreply.github.com>

Co-authored-by: Hannah Baumann <43765638+hannahbaumann@users.noreply.github.com>

jthorton · 2026-01-20T11:16:26Z

pre-commit.ci autofix

for more information, see https://pre-commit.ci

# Conflicts: # cinnabar/plotting.py # cinnabar/tests/test_plotting.py

jthorton · 2026-01-20T11:32:30Z

pre-commit.ci autofix

for more information, see https://pre-commit.ci

IAlibay

Main question I have is re: should the centralizing flag default to True or False.

Other comments are nits, please ignore if you want.

IAlibay · 2026-01-22T13:24:56Z

cinnabar/plotting.py

+
+        # if the experimental value is missing, add a nan so we can filter it out
+        x = np.array([x[2].get("exp_DDG", np.nan) for x in graph.edges(data=True)])
+        y = np.array([x[2]["calc_DDG"] for x in graph.edges(data=True)])


Is it worth more gracefully checking if there's "calc_DDG" data in your graph?

Good idea, added some basic checking and an error message!

IAlibay · 2026-01-22T13:25:20Z

cinnabar/plotting.py

+
+        # if the experimental value is missing, add a nan so we can filter it out
+        x = np.array([node[1].get("exp_DG", np.nan) for node in graph.nodes(data=True)])
+        y = np.array([node[1]["calc_DG"] for node in graph.nodes(data=True)])


Same as above on the graceful checking.

IAlibay · 2026-01-22T13:30:27Z

cinnabar/plotting.py

+    labels: list[str],
+    title: str | None = "ECDF of Nodewise Absolute Errors",
+    filename: str | None = None,
+    centralizing: bool = True,


I'm not 100% sure here, but it feels a bit risky to set the default to be true. My thinking here is that centeralizing when you don't need it will give you more subtly wrong results, whilst not centralizing when you need it will give you plain wrong results.

Given it's a specific case for MLEs, I would maybe suggest setting it to False by default and then checking for the source in the FEMap and erroring out if you need it?

Yeah this is a tricky one. The current default matches the other plot DG function so I was making there behaviour consistent, might be a good idea?

For the error, if we had a graph with results derived via MLE but there was an absolute value in the graph that pulled the results to the correct range, we wouldn't want to raise the error. Maybe it's better to just ensure the default is well-documented, whichever we pick?

making behaviour consistent

Ok yeah, in the first instance I see that being a reasonable thing to do. Could you maybe open an issue that says we should revisit that default for everything? I think it's very context-based, i.e. will most users be using MLEs or not, etc...

jthorton · 2026-01-27T12:00:56Z

pre-commit.ci autofix

for more information, see https://pre-commit.ci

IAlibay

Looks like all my remaining comments were things we might revisit in the future - lgtm.

hannahbaumann

Thanks, LGTM! Are you planning on adding this to the docs in a separate PR?

jthorton · 2026-01-30T14:18:31Z

Thanks both lets add the docs to another PR!

add ecdf plots for femaps and tests

ab63bce

[pre-commit.ci] auto fixes from pre-commit.com hooks

d278549

for more information, see https://pre-commit.ci

jthorton requested review from IAlibay and hannahbaumann January 15, 2026 11:09

IAlibay requested changes Jan 15, 2026

View reviewed changes

hannahbaumann reviewed Jan 16, 2026

View reviewed changes

jthorton and others added 4 commits January 20, 2026 10:38

Update cinnabar/plotting.py

c474a1e

Co-authored-by: Irfan Alibay <IAlibay@users.noreply.github.com>

Update cinnabar/plotting.py

312620b

Co-authored-by: Hannah Baumann <43765638+hannahbaumann@users.noreply.github.com>

Update cinnabar/plotting.py

ee13e2d

Co-authored-by: Hannah Baumann <43765638+hannahbaumann@users.noreply.github.com>

PR feedback

9d0698a

pre-commit-ci bot and others added 2 commits January 20, 2026 11:16

[pre-commit.ci] auto fixes from pre-commit.com hooks

50562f2

for more information, see https://pre-commit.ci

Merge branch 'main' into ecdf_plots

f7b9935

# Conflicts: # cinnabar/plotting.py # cinnabar/tests/test_plotting.py

[pre-commit.ci] auto fixes from pre-commit.com hooks

5bff7b1

for more information, see https://pre-commit.ci

jthorton requested review from IAlibay and hannahbaumann January 20, 2026 11:52

IAlibay reviewed Jan 22, 2026

View reviewed changes

check if calculated values are present

fff794c

jthorton mentioned this pull request Jan 26, 2026

Update defaults on everything #179

Open

[pre-commit.ci] auto fixes from pre-commit.com hooks

3924701

for more information, see https://pre-commit.ci

IAlibay approved these changes Jan 28, 2026

View reviewed changes

hannahbaumann approved these changes Jan 28, 2026

View reviewed changes

jthorton added 2 commits January 30, 2026 14:25

Merge branch 'refs/heads/main' into ecdf_plots

76c25ec

add news

93e6cb1

jthorton merged commit d17fc69 into main Jan 30, 2026
10 checks passed

jthorton deleted the ecdf_plots branch January 30, 2026 14:35

		with pytest.raises(ValueError, match="At least one dataset is required to plot an ECDF."):
		plotting.ecdf_plot({})

Conversation

jthorton commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Todos

Questions

Checklist

Status

Uh oh!

jthorton commented Jan 15, 2026

Uh oh!

codecov bot commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

IAlibay left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jthorton Jan 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hannahbaumann left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jthorton commented Jan 20, 2026

Uh oh!

jthorton commented Jan 20, 2026

Uh oh!

IAlibay left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jthorton commented Jan 27, 2026

Uh oh!

IAlibay left a comment

Choose a reason for hiding this comment

Uh oh!

hannahbaumann left a comment

Choose a reason for hiding this comment

Uh oh!

jthorton commented Jan 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

jthorton commented Jan 15, 2026 •

edited

Loading

codecov bot commented Jan 15, 2026 •

edited

Loading

jthorton Jan 20, 2026 •

edited

Loading