Conversation
gonuke
left a comment
I really like this change and thanks for introducing the object-oriented approach - it will make future use of this all much cleaner!!!
I've made a few design suggestions based on these changes. Happy to discuss more if you have concerns.
tools/alara_output_processing.py
import argparse
from io import StringIO

class TableParser:
This is probably more of a FileParser
tools/alara_output_processing.py
# ---------- Utility and Helper Methods ----------

@staticmethod
def normalize_header(header_line: str):
Might want all of these "internal" methods to start with a "_" to indicate that they are intended for internal use only
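The underscore convention suggested above could look like this; the class context and the body of `_normalize_header` are illustrative guesses, not taken from the PR:

```python
class FileParser:
    # ---------- Utility and Helper Methods ----------

    @staticmethod
    def _normalize_header(header_line: str):
        # The leading underscore marks this as internal-only by convention;
        # the normalization shown here (trim + lowercase) is a placeholder.
        return header_line.strip().lower()
```

Callers outside the class would then treat `_normalize_header` as off-limits, and linters and wildcard imports also respect the non-public marker.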
tools/alara_output_processing.py
key = f'{current_parameter} - {current_block}'
self.results[key] = df

def parse_output(self):
Maybe we can call this extract_tables()
tools/alara_output_processing.py
}

@staticmethod
def process_time_vals(df, seconds=True):
It occurs to me that we could make a new class that is derived from DataFrame and make some of these methods be applied to those objects.
class ALARADFrame(DataFrame):
Then your parser could create ALARADFrame objects and we could call
times = adf.process_time_vals()
without passing the DataFrame.
I saw this with the various methods (and potentially more in the future?) that all take a DataFrame as their first argument.
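A minimal sketch of the suggested `ALARADFrame` subclass, using pandas' standard `_constructor` subclassing hook; the `time` column and the hours-to-seconds conversion are illustrative assumptions, not from the PR:

```python
import pandas as pd

class ALARADFrame(pd.DataFrame):
    @property
    def _constructor(self):
        # pandas calls this whenever an operation returns a new frame,
        # so slices and copies stay ALARADFrame instead of DataFrame.
        return ALARADFrame

    def process_time_vals(self, seconds=True):
        # Hypothetical body: assume a 'time' column stored in hours.
        factor = 3600 if seconds else 1
        return self['time'] * factor

adf = ALARADFrame({'time': [1, 2], 'value': [0.1, 0.2]})
times = adf.process_time_vals()  # no DataFrame argument needed
```

The parser would then build `ALARADFrame` objects instead of plain `DataFrame`s, and all of the per-frame processing methods drop their `df` first argument.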
tools/alara_output_processing.py
other_row[df.columns[0]] = 'Other'
other_row[cols] = df.loc[small_mask, cols].sum()

mask_df = df.loc[~small_mask].reset_index(drop=True)
Does this change the df and also copy it to mask_df? That is, will df be changed by this method still?
It does not change the df, it operates on a copy.
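A quick check confirming the answer above: boolean `.loc` indexing followed by `reset_index(drop=True)` produces a new object, so mutating the result leaves the original frame untouched (the column names here are made up for the demo):

```python
import pandas as pd

df = pd.DataFrame({'nuclide': ['H-3', 'Co-60', 'Fe-55'],
                   'activity': [1.0, 0.01, 5.0]})
small_mask = df['activity'] < 0.1

# .loc with a boolean mask returns a copy, not a view of df.
mask_df = df.loc[~small_mask].reset_index(drop=True)
mask_df.loc[0, 'activity'] = 99.0  # mutate the copy only

print(df.loc[0, 'activity'])  # original value is preserved
```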
tools/alara_output_processing.py
dfs (list of dicts): List of dictionaries containing ALARA output
    DataFrames and their metadata, of the form:
    df_dict = {
        'Data Source' : (Either 'fendl2' or 'fendl3'),
We probably want to generalize this away from the notion of Data Source. I think this method still makes sense for comparing data that comes from different runs of ALARA for whatever reason, but those differences may not be due to Data Source.
tools/alara_output_processing.py
if isinstance(data_source, pd.DataFrame):
    dfs.update(
        cls.make_entry(
            inp_datalib, inp_variable, inp_unit, data_source
        )
    )
return dfs
Maybe we just have users call make_entry directly in these cases, and only use this method for processing multiple output files? Then we could have one method called make_entry and one called make_entries? Or something like that...
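One way the suggested split could look; the key format and argument names are hypothetical, not taken from the PR:

```python
class DataProcessing:
    @classmethod
    def make_entry(cls, datalib, variable, unit, df):
        # Single-frame form: one dictionary entry keyed by its metadata.
        return {f'{datalib} - {variable} [{unit}]': df}

    @classmethod
    def make_entries(cls, sources):
        # Batch form for multiple output files: merge the single entries
        # produced from an iterable of (datalib, variable, unit, df) tuples.
        dfs = {}
        for datalib, variable, unit, df in sources:
            dfs.update(cls.make_entry(datalib, variable, unit, df))
        return dfs
```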
tools/alara_output_processing.py
filename = self.sanitize_filename(key) + '.csv'
df.to_csv(filename, index=False)

class DataProcessing:
If we separate out the ALARADFrame to its own class, then maybe this one can be called DataLibrary?
Did you mean for this PR to now also include some changes to …?
Oh no, that was accidental. I am working on another PR for the implementation of the changes in ….
gonuke
left a comment
One last little change for future readability
tools/alara_output_processing.py
parser = FileParser(args().filepath[0])
parser.extract_tables()
parser.write_csv_files()
Minor suggestion on naming to make it easier to understand what's happening
- parser = FileParser(args().filepath[0])
- parser.extract_tables()
- parser.write_csv_files()
+ alara_data = FileParser(args().filepath[0])
+ alara_data.extract_tables()
+ alara_data.write_csv_files()
gonuke
left a comment
LGTM - thanks @eitan-weinstein
Closes #150.
This PR changes the current Python script tools/alara_output_parser.py into a more robust postprocessor, tools/alara_output_processing.py. This is done by joining some of the pure data processing methods from tools/ALARAJOYWrapper/alarajoy_QA.py that have already been merged, and from #144.

To combine these two scripts, I also rewrote alara_output_processing.py as object-oriented, per our discussion yesterday (although we hadn't come to a firm conclusion on whether or not it would be necessary). I created two classes, TableParser and DataProcessing, to further separate concerns between the actual reading of the ALARA output data and the handling of that data by the former QA methods.

Currently, the __main__ method only encapsulates the parsing functionality, more or less as it did before. My thinking is that this is valuable as a standalone usage for converting the output tables to CSVs, whereas the actual data processing would be best used within other applications, such as a new QA Jupyter notebook that I am going to work on in upcoming PRs, or in other postprocessing applications that the user wants to create.

I'll note that this is my first time converting a program to an object-oriented paradigm, so it's possible that I've missed some programmatic conventions that I am unaware of, but I think in principle this change is worthwhile to create this combined module.