Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##           development     #605   +/-   ##
===============================================
- Coverage      71.81%   67.74%   -4.07%
===============================================
  Files             38       36       -2
  Lines           3136     3150      +14
  Branches         426      430       +4
===============================================
- Hits            2252     2134     -118
- Misses           774      932     +158
+ Partials         110       84      -26
☔ View full report in Codecov by Sentry.
@mfisher87 Any idea if this was rebased after merging #604? Trying to determine where to start for fixing merge conflicts and moving this forward.
icepyx/core/spatial.py
def geodataframe(
    extent_type: ExtentType,
    spatial_extent: Union[str, list[float]],

- spatial_extent: Union[str, list[float]],
+ spatial_extent: Union[str, list[float, tuple[float]], Polygon],
Is this the proper way to do this? The list can be a list of floats or a list of tuples (which contain floats)?
Just to confirm, should this also accept a Polygon?
Assuming that's the case, I think this is what we would want:
- spatial_extent: Union[str, list[float]],
+ spatial_extent: Union[str, list[float], list[tuple[float, ...]], Polygon],
This accepts either a string, a list of floats, a list of tuples containing floats, or a Polygon.
Edit: I tried adding this annotation to the code and ran pyright. This change resulted in 23 errors that will need to be resolved.
I'm tackling this incrementally. Added support for Polygon input with: 1c1c490
And this adds support for the list[tuple[float, float]] case: 52037e9. I also added unit tests for each of these new cases.
Hmmm... I'm surprised this resulted in so many errors (I haven't gotten fully up to speed on pyright, so I was just trying to make the typing match what the code already did). It should have already been handling Polygons (now circa line 137) and lists (of either floats or tuples, now circa line 140). I wonder if pyright is choking on the fact that which types are valid for spatial_extent depends on the input for extent_type?
Thanks for adding those tests!
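The question above, whether the valid types for spatial_extent depend on extent_type, can be expressed to a type checker with `typing.overload` and `Literal`. The following is only a minimal sketch: `summarize_extent` is a hypothetical function, not part of icepyx, the Polygon case is left out to keep the example dependency-free, and the rule that only the polygon case accepts a file name is an assumption drawn from this thread.

```python
from typing import Literal, Union, overload


# Each overload ties one extent_type value to the inputs it accepts.
@overload
def summarize_extent(
    extent_type: Literal["bounding_box"], spatial_extent: list[float]
) -> str: ...
@overload
def summarize_extent(
    extent_type: Literal["polygon"], spatial_extent: Union[str, list[float]]
) -> str: ...
def summarize_extent(extent_type: str, spatial_extent: Union[str, list[float]]) -> str:
    """Hypothetical helper: report which input combination was received."""
    if extent_type == "bounding_box":
        if isinstance(spatial_extent, str):
            # Assumption: only the polygon case accepts a file name.
            raise TypeError("bounding_box requires a list of floats")
        return f"bounding_box with {len(spatial_extent)} values"
    if extent_type == "polygon":
        kind = "file" if isinstance(spatial_extent, str) else "coordinate list"
        return f"polygon from {kind}"
    raise ValueError(f"unsupported extent_type: {extent_type!r}")
```

With overloads like these, a call such as `summarize_extent("bounding_box", "shape.gpkg")` should be flagged by a type checker before it ever reaches the runtime check.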
I think the only thing still outstanding for this PR is resolving the duplicate functionality for handling Polygons (lines 85 + 135) and lists (92 + 140) that was added in your above-linked commits. I think the original ordering got confusing when @mfisher87 moved the `if file is True` bit to the top: it went from handling all extent_type==bounding_box cases and then all extent_type==polygon cases to handling extent_type==polygon if it's from a file, then extent_type==bounding_box, and then the rest of the extent_type==polygon cases. I don't think it particularly matters what order we do things in, so perhaps moving the rest of the polygon handling above the bounding_box handling is the clearest thing to do?
I'll start taking a look at this now! I'll be working on getting as much done with icepyx/harmony integration between now and our discussion Thursday as possible. Look forward to chatting then :)
I think another part of the confusion here is that the extent_type and what's allowed by spatial_extent are closely tied together. Adding typechecking has found some code paths that are possible, but do not technically work.
For example, the check_dateline function:
- If `extent_type` is `"bounding_box"`, then `spatial_extent` must be a list of floats (`list[float]`).
- If `extent_type` is `polygon`, `spatial_extent` must be either a list or a tuple (`list | tuple`), and the function assumes that the list/tuple contains pairs of lon/lat.
This function does not support some of the other input types that have been typed on the geodataframe. There we have spatial_extent accepting a string, a list of floats, a list of tuples, or a Polygon. So we need to do some pre-processing to make the input spatial_extent match what called functions expect (like check_dateline) or we need to update those functions to accept the full list of possible extent_types.
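The pre-processing option described above could look roughly like the sketch below. `to_flat_floats` is a hypothetical name, not the function added in this PR and not one of icepyx's existing validators, and the flat `[lon, lat, lon, lat, ...]` ordering is an assumption based on the "pairs of lon/lat" wording.

```python
from typing import Union

Pair = tuple[float, float]


def to_flat_floats(spatial_extent: Union[list[float], list[Pair]]) -> list[float]:
    """Hypothetical helper: normalize coordinates to a flat list of floats."""
    flat: list[float] = []
    for item in spatial_extent:
        if isinstance(item, (tuple, list)):
            # A (lon, lat) pair; unpacking raises ValueError for other lengths.
            lon, lat = item
            flat.extend((float(lon), float(lat)))
        else:
            # Already a bare coordinate value.
            flat.append(float(item))
    return flat
```

Callers that expect `list[float]`, such as check_dateline, would then receive the normalized value, while the public signature can stay permissive.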
Ok, here's a commit that refactors the geodataframe function to more clearly show what is and is not supported: c89df21. Note that I removed the "redundant" handling I added in previous commits.
Most of the code assumes that the spatial_extent is a list of floats.
spatial_extent can be a shapely Polygon only if the extent_type is "polygon" and xdateline is passed in as a parameter. If None is given (the default) for xdateline, the function will fail.
Otherwise, spatial_extent must be a string (the file case) or a list of floats. A list of tuples or a list of lists (e.g., [(lon, lat), (lon, lat)]) is not supported by this code and will require additional effort to support. I'll see what I can do, but may need to put this down in favor of focusing on Harmony adoption soon.
@JessicaS11 Ok, I added a function that does the conversion to list[float], and I now track the result as a separate variable that gets used wherever a list of floats is expected. I think things are working as expected now!
Note that I did have to update some unit tests because test data contained lists of int. I converted those to list[float]. Maybe we should support list[float] and list[int]? Currently, the code will fail if a list of ints is passed instead of the list of floats, as expected.
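On the list[int] question: type checkers accept an int wherever a float is annotated, but `list` is invariant, so a value already inferred as list[int] is not assignable to list[float]. Annotating the parameter as `Sequence[float]` (which is covariant) is one way to accept both. The sketch below assumes that approach; `as_float_list` is a hypothetical helper, not code from this PR.

```python
from collections.abc import Sequence
from numbers import Real


def as_float_list(values: Sequence[float]) -> list[float]:
    """Hypothetical helper: accept ints or floats and return plain floats."""
    floats: list[float] = []
    for value in values:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise TypeError(f"expected a real number, got {type(value).__name__}")
        floats.append(float(value))
    return floats


# A list[int] passes type checking here because Sequence is covariant.
int_coords: list[int] = [-55, 68, -48, 71]
coords = as_float_list(int_coords)
```

Checking against `numbers.Real` rather than `(int, float)` should also let NumPy scalar types such as np.int64 through, though that is not exercised in this sketch.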
> I think another part of the confusion here is that the extent_type and what's allowed by spatial_extent are closely tied together. Adding typechecking has found some code paths that are possible, but do not technically work.
Makes sense. I think another [major] part of the issues we're running into (which I didn't fully realize until today) is how much of the context in which these functions are actually called isn't being accounted for. As a result, we're trying to type geodataframe for ways it wasn't intended to be used. Some of the docstrings throughout this module were definitely not quite right, but the geodataframe one was almost spot on (with the critical caveats you need in order to interpret it correctly, based on your above statement, which is definitely NOT ideal, and it was missing that we handle shapely.Polygon types too). For instance, if extent_type == 'polygon', then the only options for spatial_extent are a string (if it's a filename), a list (of int, float, or np.int64, if it's a list of coordinates), or a valid Polygon object. The conversion of a list of tuples or a list of lists has already happened in validate_polygon_pairs and validate_polygon_list, which are called during the __init__.
> Otherwise spatial_extent must be a string (the file case) or a list of floats. A list of tuples or a list of lists (e.g., [(lon, lat), (lon, lat)]) is not supported by this code and will require additional effort to support.
>
> Ok, I added a function that does the conversion to list[float] and track that as a separate variable that gets used where a list of floats is expected. I think things are working as expected now!
As I alluded to above, at a glance some of these additions are now duplicating the functionality that's in the validate_... set of functions, which are set up to handle all of the possible user inputs and turn them into one common format (a list of floats, in the case of a polygon input) that is what would actually be passed into geodataframe.
> I'll see what I can do, but may need to put this down in favor of focusing on Harmony adoption soon.
Please don't hesitate to do so. This PR has turned into a way bigger effort than anticipated, and has brought up some interesting discussions more broadly (e.g. related to earthaccess-dev/earthaccess#804) about what inputs we/earthaccess should/shouldn't accept and/or check for the user. The spatial module was our first effort towards isolating handling the various input types and returning something uniform that icepyx could latch on to for any further spatial formatting needs. It clearly needs some streamlining and updating (and my hope is to move some revised version of it upstream to earthaccess, which I'm starting conversations about there).
Thanks, @mfisher87! This PR is pretty close: a few suggestions where we may need to make type adjustments (would love a second set of eyes on these; EDIT: DONE) and some notes on docstring and error message edits.
@JessicaS11 just a heads-up that I am beginning to review and work on the Harmony integration tasks that @mfisher87 started, beginning with this PR. I'll be looking over things here and will try to respond to your comments this afternoon!
for more information, see https://pre-commit.ci
Also added tests for the Polygon and `list[tuple[float, float]]` case.
Trying to resolve code coverage complaints...
Not sure about the failing Aside from that issue, this should be ready for re-review!
Co-authored-by: Jessica Scheick <JessicaS11@users.noreply.github.com>