@bhazelton
Member

Description

This follows some of the ideas in UVBase._select_along_axis. It could probably be generalized to UVBase, but I wanted to get UVData's version done to work out the ideas before moving it up to UVBase. I also wanted to at least get a draft up for @rlbyrne to try out.

Motivation and Context

fixes #1595

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation change (documentation changes only)
  • Version change
  • Build or continuous integration change
  • Other

Checklist:

Bug fix checklist:

  • My fix includes a new test that breaks as a result of the bug (if possible).
  • I have updated the CHANGELOG.

@codecov

codecov bot commented Aug 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.93%. Comparing base (65efcce) to head (347cfdd).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1606      +/-   ##
==========================================
- Coverage   99.93%   99.93%   -0.01%     
==========================================
  Files          67       67              
  Lines       22688    22670      -18     
==========================================
- Hits        22674    22656      -18     
  Misses         14       14              

@bhazelton
Member Author

bhazelton commented Aug 18, 2025

@kartographer this is a slightly different approach than you took in the Telescope.__add__ method. Before I start thinking about moving it up to UVBase I'd like to get your thoughts about whether there is something that could be done better.

@bhazelton bhazelton force-pushed the add_concat_use_axis branch 3 times, most recently from f6b1427 to 33b50d9 Compare August 19, 2025 23:38
@bhazelton bhazelton requested a review from kartographer August 25, 2025 23:08
@bhazelton bhazelton force-pushed the add_concat_use_axis branch 5 times, most recently from 945b3a6 to 34449d2 Compare September 24, 2025 00:32
@bhazelton bhazelton force-pushed the add_concat_use_axis branch 2 times, most recently from a1f87dc to 26ed06f Compare December 11, 2025 22:29
Contributor

@steven-murray steven-murray left a comment

Thanks @bhazelton and sorry for the very slow review! I have a few comments. Mostly I think we can make some of these functions a bit more clear in what they are doing. They seem like they will become quite central in how UVxxx objects operate, so we should make them as clear as possible for our own future reference.

It might also be useful to do some quick profiling of these functions, as I expect they could be bottlenecks in some applications (especially in terms of memory).

return np.asarray(inds)


def flt_ind_str_arr(
Contributor

should this be flt_int_str_arr? And can we just make it float_int_to_str_array?

Member Author

done.

    flt_first: bool = True,
) -> StrArray:
    """
    Create a string array built from float and integer arrays for matching.
Contributor

Would be good to have an example like "E.g. for float 3.7 and int 3 this will create an entry in the output array as '3.7_3'"

Member Author

done.
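
For readers following the thread, the example requested above ("3.7" and 3 becoming '3.7_3') can be sketched as follows. This is a minimal illustration, not the function in the PR: the name `float_int_to_str_array_sketch` and the `decimals` argument are hypothetical, and only `flt_first` appears in the diff fragment quoted above.

```python
import numpy as np


def float_int_to_str_array_sketch(flt_arr, int_arr, decimals=1, flt_first=True):
    """Join each (float, int) pair into one string key such as '3.7_3'.

    Hypothetical stand-in for the helper under review; `decimals` is an
    assumed way of fixing the float precision so that keys compare reliably.
    """
    flt_arr = np.asarray(flt_arr, dtype=float)
    int_arr = np.asarray(int_arr, dtype=int)
    if flt_arr.shape != int_arr.shape:
        raise ValueError("flt_arr and int_arr must have the same shape")

    keys = []
    for flt, integer in zip(flt_arr.ravel(), int_arr.ravel()):
        flt_str = f"{flt:.{decimals}f}"
        int_str = str(integer)
        # flt_first controls which value leads the combined key.
        keys.append(f"{flt_str}_{int_str}" if flt_first else f"{int_str}_{flt_str}")
    return np.array(keys, dtype=str).reshape(flt_arr.shape)


# Matching entries between two objects then reduces to string comparison.
keys_a = float_int_to_str_array_sketch([3.7, 1.2], [3, 5])
keys_b = float_int_to_str_array_sketch([1.2, 9.9], [5, 1])
common = np.intersect1d(keys_a, keys_b)
```

Combining both values into one key lets a single set operation find entries that agree in both the float and the integer value.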


def _get_param_axis(self, axis_name: str, single_named_axis: bool = False):
    """
    Get a mapping of parameters that have a given axis to the axis number.
Contributor

Can we use "attributes" instead of "parameters" here? I was confused what this function was doing until I realized you were talking about attributes.

Probably this will be even more easily cleared up by including a single simple example.

Member Author

There is a problem here in that the attributes are UVParameter objects. I'll work on the wording and an example.

Member Author

Ok, I updated the name and the docstring. Let me know if it's more clear.
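
As an illustration of the kind of mapping discussed in this thread, here is a small sketch. It assumes each parameter describes its shape with a tuple of axis names (with plain integers for fixed-length axes); the function name, the `param_forms` dict, and the exact meaning given to `single_named_axis` are assumptions for illustration, not the behaviour of the method in the PR.

```python
def get_param_axis_sketch(param_forms, axis_name, single_named_axis=False):
    """Map parameter names to the index of `axis_name` in their shape.

    param_forms: dict of parameter name -> tuple describing the shape,
        e.g. ("Nblts", "Nfreqs", "Npols") or ("Nblts", 3).
    single_named_axis: assumed here to mean "keep only parameters whose
        sole named axis is `axis_name`".
    """
    mapping = {}
    for name, form in param_forms.items():
        if axis_name not in form:
            continue
        named_axes = [entry for entry in form if isinstance(entry, str)]
        if single_named_axis and len(named_axes) != 1:
            continue
        mapping[name] = form.index(axis_name)
    return mapping


forms = {
    "data_array": ("Nblts", "Nfreqs", "Npols"),
    "freq_array": ("Nfreqs",),
    "time_array": ("Nblts",),
    "uvw_array": ("Nblts", 3),
}
freq_axes = get_param_axis_sketch(forms, "Nfreqs")
freq_only = get_param_axis_sketch(forms, "Nfreqs", single_named_axis=True)
```

With these inputs, `freq_axes` records that the frequency axis is axis 1 of `data_array` and axis 0 of `freq_array`, while `freq_only` keeps just `freq_array`.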

self,
other,
axis_name: str,
other_inds: IntArray,
Contributor

should this also have a default of None, corresponding to including all entries from the other object?

Member Author

It could. That is supported by get_from_form, which this is passed to, and I'm willing to add it. At the moment it wouldn't actually get used, because this is called after we have checked for duplicates, so we already know which indices we want. But it could be a path to a performance improvement when there's no down-selection, so I'll add it.

Member Author

done.
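
A minimal sketch of the behaviour agreed on here, where `other_inds=None` means "include every entry from the other object". The function name and signature are hypothetical and operate on bare arrays rather than on UVParameter objects.

```python
import numpy as np


def concat_with_optional_inds_sketch(this_arr, other_arr, axis, other_inds=None):
    """Concatenate along `axis`, optionally down-selecting `other_arr` first."""
    if other_inds is not None:
        # Only pay for the selection (and its copy) when one is requested.
        other_arr = np.take(other_arr, other_inds, axis=axis)
    return np.concatenate([this_arr, other_arr], axis=axis)


this_arr = np.arange(6).reshape(2, 3)
other_arr = np.arange(6, 12).reshape(2, 3)
everything = concat_with_optional_inds_sketch(this_arr, other_arr, axis=1)
selected = concat_with_optional_inds_sketch(this_arr, other_arr, axis=1, other_inds=[0, 2])
```

Skipping the selection in the `None` case avoids one intermediate copy of the other object's array, which is the possible performance path mentioned above.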

new_array = np.concatenate(
    [
        getattr(self, param),
        getattr(other, "_" + param).get_from_form(other_form_dict),
Contributor

Is there a reason we couldn't just use np.take(getattr(other, param), other_inds, axis) here?

Member Author

Maybe. I didn't write get_from_form, it looks like it tries to do a slicing if possible and if it can't it just uses np.take, so I think it's already somewhat optimized? But probably worth doing some performance testing on.
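
The "slice if possible, otherwise np.take" idea described in this reply can be sketched as below. This is not the get_from_form implementation, only an illustration of the general technique; the function name is hypothetical. Basic slicing returns a view without copying, whereas np.take always copies, which is why the distinction matters for memory.

```python
import numpy as np


def take_or_slice_sketch(arr, inds, axis):
    """Select `inds` along `axis`, using a slice (a view) when possible."""
    arr = np.asarray(arr)
    if inds is None:
        return arr
    inds = np.asarray(inds, dtype=int)
    if inds.size > 0 and inds[0] >= 0:
        steps = np.diff(inds)
        # Evenly spaced, increasing indices are equivalent to a slice.
        if inds.size == 1 or (steps[0] > 0 and np.all(steps == steps[0])):
            step = 1 if inds.size == 1 else int(steps[0])
            index = [slice(None)] * arr.ndim
            index[axis] = slice(int(inds[0]), int(inds[-1]) + 1, step)
            return arr[tuple(index)]
    # Irregular or reordered indices need fancy indexing, which copies.
    return np.take(arr, inds, axis=axis)


cube = np.arange(24).reshape(2, 3, 4)
regular = take_or_slice_sketch(cube, [1, 2], axis=1)
irregular = take_or_slice_sketch(cube, [2, 0], axis=1)
```

Note that np.concatenate copies its inputs anyway, so any saving from the view is limited to the intermediate selection; profiling would be needed to confirm whether it matters in practice.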

Comment on lines +923 to +949
if param not in multi_axis_params:
    continue
Contributor

Why are we not also padding single-axis objects? This is confusing

Member Author

It is confusing! Among other things, this handles the case where the data are divided into chunks along more than one axis and we're trying to combine them. If you combine first along one axis and then start adding in the first chunk along the next axis, you have to pad out the multi-dimensional arrays with zeros (and flags) for the corners where you don't have actual information yet. That doesn't come up for single-axis parameters.

Imagine a 2D array split into quadrants. The first two (along any axis) add fine. When you add in the third, you need to pad the fourth quadrant with zeros (and flags). Then when we add in the fourth, if all the data are zero and flagged we allow it to overwrite that quadrant. Does that make sense? Happy to add some more comments or docstrings if you think it'd be helpful.
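
The quadrant picture above can be walked through with plain NumPy arrays. This is a toy sketch under the assumptions stated in the comment (a 4x4 array, a boolean flag array, and zero as the placeholder value); it does not use any pyuvdata API.

```python
import numpy as np

# A 4x4 "truth" array split into four 2x2 quadrants (values start at 1 so
# that a zero can only ever be padding).
full = np.arange(1, 17, dtype=float).reshape(4, 4)
q00, q01 = full[:2, :2], full[:2, 2:]
q10, q11 = full[2:, :2], full[2:, 2:]

# Step 1: the first two quadrants combine along axis 1 with no padding.
data = np.concatenate([q00, q01], axis=1)
flags = np.zeros(data.shape, dtype=bool)

# Step 2: adding the third quadrant along axis 0 leaves a corner with no
# information, so it is padded with zeros and marked as flagged.
new_rows = np.zeros((2, 4))
new_flags = np.ones((2, 4), dtype=bool)
new_rows[:, :2] = q10
new_flags[:, :2] = False
data = np.concatenate([data, new_rows], axis=0)
flags = np.concatenate([flags, new_flags], axis=0)
padded_corner_was_flagged = bool(flags[2:, 2:].all())

# Step 3: the fourth quadrant may overwrite the corner only because that
# corner is entirely zero and entirely flagged (i.e. it is pure padding).
corner = (slice(2, 4), slice(2, 4))
if np.all(flags[corner]) and np.all(data[corner] == 0):
    data[corner] = q11
    flags[corner] = False
```

A one-dimensional parameter has no such corner: every concatenation along its only axis fills the array completely, which is why only multi-axis parameters need the padding step.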

Comment on lines +951 to +978
order_dict : dict
    dict giving the final sort indices for each axis (keys are axes, values
    are index arrays for sorting).
Contributor

Since this sorting occurs in multiple functions, might it be better to just have a standalone "arbitrary axis sorting helper"?

Member Author

Hmm, possibly. I think the reason I did it this way is that each UVParameter is only affected by one of these functions, so the sorting is currently done only once per parameter. It's just done in a different function depending on whether it's a single-axis or multi-axis array. For a single-axis array, the sorting can be done at the same time as the indexing with the np.take call, so I think that's why I did it this way. But I'm open to refactoring if it's not a performance hit.
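
To make the two code paths in this exchange concrete, here is a sketch of what a standalone helper might look like next to the single-axis shortcut. The helper name is hypothetical; `order_dict` follows the docstring quoted above (keys are axes, values are index arrays).

```python
import numpy as np


def apply_axis_orders_sketch(arr, order_dict):
    """Reorder `arr` along each axis listed in `order_dict`."""
    for axis, order in order_dict.items():
        # Each np.take makes a full copy, so this costs one copy per axis.
        arr = np.take(arr, order, axis=axis)
    return arr


# Multi-axis case: reorder rows and columns of a 2D array.
grid = np.array([[3, 1, 2], [6, 4, 5]])
sorted_grid = apply_axis_orders_sketch(grid, {0: np.array([1, 0]), 1: np.array([1, 2, 0])})

# Single-axis case: selection and sorting fold into one np.take call by
# composing the index arrays first.
values = np.array([30, 10, 20])
keep = np.array([0, 2])               # entries to keep
order = np.argsort(values[keep])      # order of the kept entries
selected_sorted = np.take(values, keep[order])
```

The single-axis shortcut touches the data once, while the generic helper copies once per sorted axis, which is the performance trade-off a refactor would need to check.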

@bhazelton bhazelton force-pushed the add_concat_use_axis branch from 26ed06f to 347cfdd Compare January 27, 2026 23:24
Development

Successfully merging this pull request may close these issues.

Parameter scan_number_array is not updated during concatenation

3 participants