Skip to content

Updated Comparison Tool Parts 1-3: Refactored for New ALARA Output Parser#154

Merged
gonuke merged 11 commits intosvalinn:mainfrom
eitan-weinstein:alarajoy_std_plots
Dec 1, 2025
Merged

Updated Comparison Tool Parts 1-3: Refactored for New ALARA Output Parser#154
gonuke merged 11 commits intosvalinn:mainfrom
eitan-weinstein:alarajoy_std_plots

Conversation

@eitan-weinstein
Copy link
Contributor

@eitan-weinstein eitan-weinstein commented Nov 10, 2025

Closes #168 .
Closes #169 .

This PR implements changes made to tools/alara_output_parser.py for alarajoy_QA, as far as the most up-to-date version, as well as the newest plotting suggestions otherwise in #144 . This PR partially completes #151, however, I will be making adjustments to PRs #145, #146, and #147 as well to get them up to date with the new methods developed in #153.

@gonuke
Copy link
Member

gonuke commented Nov 12, 2025

Does this need a rebase for the output_parser additions?

Copy link
Member

@gonuke gonuke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just reviewed things in the QA script here while we wait for a rebase.

else:
element, A = isotope.split('-')
element = element.capitalize()
return f'$^{{{A}}}\\mathrm{{{element}}}$'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be simpler and equivalent since mathrm is basically the same as non-math

Suggested change
return f'$^{{{A}}}\\mathrm{{{element}}}$'
return f'$^{{{A}}}${element}'

Comment on lines +280 to +296
# Preprocess data to user specifications
all_labels = set()
all_data = []
for df_dict in df_dicts:
adf = df_dict[data_key]
times = adf.process_time_vals(seconds=seconds)
adf = adf.T
for col in adf.columns:
label_text = f"{adf[col].iloc[0]}"
all_labels.add(label_text)

all_data.append((df_dict, adf, times))

labels_sorted = sorted(all_labels)

cmap = plt.cm.get_cmap('Dark2')
color_map = {lbl: cmap(i % cmap.N) for i, lbl in enumerate(labels_sorted)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be good in a function of its own

Comment on lines +307 to +318
color = color_map[label_text]
label = label_text
if data_comp:
label = f"{label_text} ({df_dict['Run Label']})"

ax.plot(
times,
list(adf[col])[1:],
label=label,
color=color,
linestyle=linestyle,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
color = color_map[label_text]
label = label_text
if data_comp:
label = f"{label_text} ({df_dict['Run Label']})"
ax.plot(
times,
list(adf[col])[1:],
label=label,
color=color,
linestyle=linestyle,
)
label_suffix = ""
if data_comp:
label_suffix = f" ({df_dict['Run Label']})"
ax.plot(
times,
list(adf[col])[1:],
label=label_text + label_suffix,
color=color_map[label_text],
linestyle=linestyle,
)


ax.set_title(title_prefix + title_suffix)
if not relative:
ax.set_ylabel(f'{df_dict['Variable']} [{df_dict['Unit']}]')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the ylabel if it is relative?

Copy link
Member

@gonuke gonuke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work on this. I have a lot of high-level thoughts about the data model, and maybe they'll evolve over the development of this capability.

import subprocess
from string import Template
from pathlib import Path
import matplotlib.pyplot as plt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no longer needed

adf = df_dict[data_key]
times = adf.process_time_vals(seconds=seconds)
adf = adf.T
for col in adf.columns:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that this is transposed, each column is a different nuclide, right? (and possibly a total)

Maybe we can note that:

Suggested change
for col in adf.columns:
for nuc in adf.columns:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we didn't transpose first, could we just perform this operation on all the entries in the first column? The transpose is not a necessary step yet (although it may make life easier for plotting)

(Defaults to True)

Returns:
all_data (list of tuples): A list of all relevant data for each table
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if instead of returning a list of tuples, this method added a new entry for the times to the df_dict for each entry in df_dicts, and collected the labels to make the color_map.

Notwithstanding the discussion about putting some of the metadata into the tables, I like the idea of having the df_dicts cache the processed metadata rather than making new data structures that may be less transparent.

# Plot data
for i, (df_dict, adf, times) in enumerate(all_data):
linestyle = line_styles[i % len(line_styles)]
for col in adf.columns:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These columns represent different nuclides

Suggested change
for col in adf.columns:
for nuc in adf.columns:

@eitan-weinstein eitan-weinstein force-pushed the alarajoy_std_plots branch 2 times, most recently from 1c1f99b to cd6e1d3 Compare November 19, 2025 20:19
@gonuke gonuke added this to the ALARA postprocessing tools milestone Nov 21, 2025
@eitan-weinstein eitan-weinstein force-pushed the alarajoy_std_plots branch 4 times, most recently from 1bad73f to 37c67e8 Compare November 25, 2025 23:02
@gonuke
Copy link
Member

gonuke commented Nov 26, 2025

It looks like you tried a rebase here, but all the changes from the previous PR are still here.
Maybe its easier to just merge main into this branch.

@eitan-weinstein
Copy link
Contributor Author

It looks like you tried a rebase here, but all the changes from the previous PR are still here. Maybe its easier to just merge main into this branch.

I just rebased again to upstream/main and I think it looks right now with the most recent changes.

Copy link
Member

@gonuke gonuke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good.

Mostly minor comments here.

Comment on lines +72 to +74
if head:
sort_by_time = aop.extract_time_vals([sort_by_time])[0]
piv = piv.sort_values(sort_by_time, ascending=False).head(head)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. why does this sorting only happen if head?
  2. the sorting would be a good stand alone function as I anticipate a user wanting to sort a dataframe and although it's only a couple of lines, it's not super obvious how

Comment on lines +94 to +95
for nuc in piv.index:
all_nucs.add(nuc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for nuc in piv.index:
all_nucs.add(nuc)
all_nucs.add(set(piv.index))

all_nucs.add(nuc)

nucs_sorted = sorted(all_nucs)
cmap = plt.cm.get_cmap('Dark2')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a cmap as an input to this function, with default of 'Dark2'?

nucs_sorted = sorted(all_nucs)
cmap = plt.cm.get_cmap('Dark2')

return {lbl: cmap(i % cmap.N) for i, lbl in enumerate(nucs_sorted)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might not need nucs_sorted if it's only used once

Suggested change
return {lbl: cmap(i % cmap.N) for i, lbl in enumerate(nucs_sorted)}
return {lbl: cmap(i % cmap.N) for i, lbl in enumerate(sorted(all_nucs))}

else:
element, A = isotope.split('-')
element = element.capitalize()
return f'$^{{{A}}}${element}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my hero! 🦸‍♂️

Comment on lines +285 to +286
for run_lbl, times, filtered, piv, linestyle in data_list:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for run_lbl, times, filtered, piv, linestyle in data_list:
for run_lbl, times, filtered, piv, linestyle in data_list:

)
data_list.append((run_lbl, times, filtered, piv, linestyle))

color_map = build_color_map([piv for (_, _, _, piv, _) in data_list])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
color_map = build_color_map([piv for (_, _, _, piv, _) in data_list])
color_map = build_color_map([data[3] for data in data_list])

Comment on lines +294 to +295
sorted(times),
piv.loc[nuc].tolist(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorting the times separately from the values seems fragile


if data_comp:
title_prefix = (
f'{run_lbls[0]}, {run_lbls[1]} Comparison: \n'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to support more than 2?

Copy link
Member

@gonuke gonuke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks @eitan-weinstein

@gonuke gonuke merged commit acb98fb into svalinn:main Dec 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Method to plot multiple time series results on the same plot Method to plot a single result from an ALARADFrame

2 participants