-
Notifications
You must be signed in to change notification settings - Fork 13
Reduce the time taken to import cfdm
#362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce the time taken to import cfdm
#362
Conversation
|
missed a few imports :) 82b6cb2 A note in the thread lock in |
|
... and a few more: 9909c2f |
|
Sorry, Sadie (although I know you haven't looked, yet :)) - converting to draft as I'm still fiddling! |
sadielbartholomew
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very hot off the press, a really nice thing to note first of all is that, by Python 3.15 (though we'd probably need to wait until that was our minimum to avoid lots of conditional logic on imports, so many years...) thanks to 'PEP 810 – Explicit lazy imports' which was officially accepted yesterday there will be no need to move any heavier imports to first run-time use, but instead they can be imported at the top-level with use of a new lazy keyword, e.g. lazy from scipy.sparse import issparse. So by then we can move all of the shifted imports back home 😃
In the meantime - and concerning the other aspect, our rather niche dostring rewriting - great PR. Does what it says on the tin, with import time reduced on my system from:
$ git checkout main # before
$ python -X importtime -c "import cfdm"
...
...
import time: 4581 | 1130718 | cfdm
$ git checkout david/docstring-rewrite-speed # after
$ python -X importtime -c "import cfdm"
...
...
import time: 2022 | 291805 | cfdmso by my calculation 291805/1130718 * 100 = 25.807053571270643 % of the previous import time - nice! (Looks like imports are slower on my system than yours, at least with the wind blowing in whatever direction it was at the time, but it's a similar pattern of improvement comparing main to this branch. Not sure there's much point doing multiple tries since there may be some caching effects reducing the time taken for any further attempts.) Tests still pass and functionality unaffected as far as I can otherwise tell.
Only minor comments raised, except regarding the removal of the version constraint checks in __init.py - see in-line comment. Once you've considered those, please merge!
| __cf_version__ = core.__cf_version__ | ||
| __version__ = core.__version__ | ||
|
|
||
| _requires = core._requires + ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to get rid of all of this version checking? We can still do it without having to actually import said modules, with use of importlib.metadata to see what is available in the existing environment, e.g:
>>> from importlib import metadata
>>> metadata.version("cftime")
'1.6.4'
>>> metadata.version("netCDF4")
'1.7.2'
>>> metadata.version("dask")
'2025.7.0'(Good old metadata coming in useful!) It would be a shame to lose the useful checks in the name of import speed...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - I didn't know about this, thanks. I'll try it out and make a new commit if all goes well (not today!). Playing on the command line, each metadata.version call takes ~0.5 milliseconds - so not too expensive :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... Having chatted about this offline, we decided to leave things as they are here for now, but open another issue to look into what we want to do/don't in this area (which I will do soon).
Co-authored-by: Sadie L. Bartholomew <sadie.bartholomew@ncas.ac.uk>
Co-authored-by: Sadie L. Bartholomew <sadie.bartholomew@ncas.ac.uk>
Co-authored-by: Sadie L. Bartholomew <sadie.bartholomew@ncas.ac.uk>
Co-authored-by: Sadie L. Bartholomew <sadie.bartholomew@ncas.ac.uk>
Co-authored-by: Sadie L. Bartholomew <sadie.bartholomew@ncas.ac.uk>
Co-authored-by: Sadie L. Bartholomew <sadie.bartholomew@ncas.ac.uk>
|
Thanks for the review, Sadie. |
Fixes #361
Lots of files touched, but most of them are just moving the odd import from the module level to within a method.
Three main areas:
1. Doc string rewriting
2. Importing external modules that themselves have a slow import
3. Refactor CONSTANTS
As a consequence of 2., we can't initialise
chunksizeat import time (it needsdask). This prompted a bit of a refactor ofcfdm.constants,cfdm.functions.ConstantAccess, andcfdm.configuration. Essentially, the originalCONSTANTSdictionary is now removed, and replaced with the dictionarycfdm.functions.ConstantAccess._constants. This dictionary starts off empty, and gets populated as and when configuration parameters are accessed. In particular, whencfdm.configurationis called, it makes a special effort to full populate the dictionary (there could be a way of doing this without the pre-populating, but I couldn't easily find one that didn't play havoc with the setting of constants in a context manager.)