-
Notifications
You must be signed in to change notification settings - Fork 100
SAMOS rolling window improvement #2289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SAMOS rolling window improvement #2289
Conversation
…act a fixed length time window rather than looking over a fixed number of array points. Also add option of trailing rolling window calculations. Modify unit tests as required.
…and estimate_samos_gams_from_table CLIs. Fix related acceptance tests.
…iling rolling window calculations.
…os_rolling_window_v2
I don't think it is unreasonable. The purpose is to make you consider whether what you've done is reasonable, which it has achieved. The passing of this test is not compulsory for merging a PR. |
MoseleyS
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see how this is going to work. Mostly doc-strings to improve.
improver_tests/calibration/samos_calibration/test_TrainGAMsForSAMOS.py
Outdated
Show resolved
Hide resolved
improver_tests/acceptance/test_estimate_samos_gams_from_table.py
Outdated
Show resolved
Hide resolved
maxwhitemet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting plugin changes @brhooper. I've suggested some minor adjustments to docstrings and tests.
improver_tests/calibration/samos_calibration/test_TrainGAMsForSAMOS.py
Outdated
Show resolved
Hide resolved
|
Thanks @maxwhitemet and @MoseleyS for providing such useful reviews so quickly. I think I've responded to all of your comments. |
MoseleyS
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty happy with this. Just a couple of suggestions.
improver_tests/calibration/samos_calibration/test_TrainGAMsForSAMOS.py
Outdated
Show resolved
Hide resolved
improver_tests/calibration/samos_calibration/test_TrainGAMsForSAMOS.py
Outdated
Show resolved
Hide resolved
maxwhitemet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @brhooper. Happy with the changes made. Approved 👍
| window_td = datetime.timedelta(days=self.window_length) | ||
|
|
||
| for tp in input_cube.coord("time").points: | ||
| time_point = datetime.datetime.fromtimestamp(tp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach is not timezone safe. I would get a different answer using the same cube depending upon the timezone I am in. We use an HPC that is set always to UTC, so we'll probably get away with it even when there is a switch to BST, but running on spice (which honours BST) we would get different behaviour.
The problem is the fromtimestamp method on a a purely integer representation of time given by the .points (no timezone awareness at all). The fromtimestamp method must use your current system timezone to interpret the integer.
The simplest way to see the impact here is to read in the engl_orography.nc file from our ancillaries. Print the cube and note the time. Then perform the step:
print(datetime.datetime.fromtimestamp(cube.coord("time").points[0]))
You should see that you get a different time to that which is intended by the cube metadata (7am instead of the intended 6am).
You can test this further by mocking the timezone and applying the process to a cube. For example the following will give you two different answers.
import os, time, iris, datetime
cube = iris.load_cube(<path to some cube>)
print(datetime.datetime.fromtimestamp(cube.coord("time").points[0]))
os.environ['TZ'] = 'America/Los_Angeles'
time.tzset()
print(datetime.datetime.fromtimestamp(cube.coord("time").points[0]))
Does this matter if everything is done in local time? Probably not on the HPC, and elswhere only at points where it changes. So on spice, if the cube from which you are taking the time is in April, but the historic data is mostly in March, your constraint would be offset by an hour from your intended times.
Anyway, a long way around to say you could use the following instead:
for tp in cube.coord("time").cells():
time_point = tp.point._to_real_datetime()

Addresses https://github.com/metoppv/mo-blue-team/issues/991, https://github.com/metoppv/mo-blue-team/issues/1051
This PR modifies the internal SAMOS rolling window calculations to use time-based rolling windows (e.g all data within a 3 day period) rather than using the iris rolling window methods, which are based on using n-neighbouring points in an array.
This change will facilitate the use of using these rolling window calculations for non-uniformly spaced data.
Effect of changes
The effect of changing the paradigm used for constructing each window is that the meaning of the

window_lengthargument is changed. Previously,window_lengthdictated how many neighbouring points in an array would be used to construct a window. In the new method, all points within a time window are used. This means that with data points spaced equally, 24-hours apart, the old method required awindow_lengthone greater than the new method in order to arrive at the same result. I've attempted to illustrate why this is in the graphic below.Testing: