-
Notifications
You must be signed in to change notification settings - Fork 31
Make aggregation function selection consistent across Run and Chart pages #42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make aggregation function selection consistent across Run and Chart pages #42
Conversation
# When aggregating multiple samples, it becomes unclear which sample to use for | ||
# associated data like the run date, the order, etc. Use the index of the closest | ||
# value in all the samples. | ||
closest_value = sorted(values, key=lambda val: abs(val - agg_value))[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the change I called out about reimplementing how we select related data associated to the aggregated point. Previously, for min
we'd use the actual data point that we extracted the minimum value from. For mean
we would always pick the first point's data (see how agg_mean
returned 0
for the index).
Now, we select the point closest to the value we selected. I can't really imagine a case where this would give us an incorrect answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't all samples have the same run and so the same order? Or do we end up aggregating samples from different runs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% certain I follow. Are you suggesting that we could select any index at random and it wouldn't matter (assuming all samples are from the same run)?
So far, I have been operating under the assumption that you could have different runs that provide samples for the same test and the same order. For example, you could run benchmark foo
today on commit 123abc
and submit a run for that:
run | order | test | execution_time |
---|---|---|---|
1 | 123abc | foo | 10 |
Then, a few days later you decide to do more runs for the same test and the same commit since you find that point to be a bit noisy, and want to take the median-of-3 instead. You can do that by submitting your results with lnt import /path/to/db report.json --merge append
. Note that lnt import
defaults to --merge replace
, which will replace the data for the existing run for the same order, which IMO is a bit strange but that's how it works. Your database now looks like this:
run | order | test | execution_time |
---|---|---|---|
1 | 123abc | foo | 10 |
... | ... | ... | ... |
45 | 123abc | foo | 11 |
... | ... | ... | ... |
78 | 123abc | foo | 8 |
The samples you have for foo
at commit 123abc
all come from different runs, and the question is what run information you'll pick to display on the graph. With my patch:
- if you aggregate using
median
(which is10
), we'd end up selecting run information for run 1, which is correct - if you aggregate using
mean
(which is9.67
), we'd end up selecting run information for run 1 since that's the closest data point to9.67
. I feel there's not a perfect choice - if you aggregate using
min
, we'll pick run 78, which is correct - if you aggregate using
max
, we'll pick run 45, which is correct - if you hypothetically had another sample of value
10
and were aggregating withmedian
(which would be10 == median(8, 10, 10, 11)
), we'd pick the run information for either of these two 10 samples. That seems acceptable.
I just tested this with synthetic data and it seems to behave the way I described above. Let me know if that makes sense to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll merge assuming you're happy with this explanation. If not, let's talk and I can address that in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah interesting, your explanation makes sense. The way that we use aggregation on cc-perf.igalia.com is that we use --exec-multisample=3
to get multiple samples in a single run.
But I didn't consider that you might end up with multiple samples over multiple runs if you use --merge append
. No problems with the fix here, thanks for talking it through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, tried this out locally and it works great. Btw very happy to see others hacking on LNT again!
# When aggregating multiple samples, it becomes unclear which sample to use for | ||
# associated data like the run date, the order, etc. Use the index of the closest | ||
# value in all the samples. | ||
closest_value = sorted(values, key=lambda val: abs(val - agg_value))[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't all samples have the same run and so the same order? Or do we end up aggregating samples from different runs?
…ages Currently, the Run page (e.g. /db_default/v4/nts/<runid>) provides a way to select the aggregation function used when there are multiple samples in the run (that option is located under "View Options"). The supported options are min and median. Similarly, the Graph page (e.g. /db_default/v4/nts/graph?<params>) provides a way to use mean instead of min to aggregate multiple data points for the same order on the graph. That is provided as a checkbox under the "Settings 🔧" menu. This patch makes both selection mechanisms consistent by using a dropdown menu that provides min, max, mean and median in both cases. As a drive-by, it reimplements how we select the sample to use as a reference for related data: we now select the point closest to the aggregated value instead of selecting the first point unconditionally.
32fa495
to
6dafde9
Compare
Currently, the Run page (e.g.
/db_default/v4/nts/<runid>
) provides a way to select the aggregation function used when there are multiple samples in the run (that option is located under "View Options"). The supported options are min and median.Similarly, the Graph page (e.g.
/db_default/v4/nts/graph?<params>
) provides a way to use mean instead of min to aggregate multiple data points for the same order on the graph. That is provided as a checkbox under the "Settings 🔧" menu.This patch makes both selection mechanisms consistent by using a dropdown menu that provides min, max, mean and median in both cases.
As a drive-by, it reimplements how we select the sample to use as a reference for related data: we now select the point closest to the aggregated value instead of selecting the first point unconditionally.