chore(cache): for generating test outputs#786
chore(cache): for generating test outputs#786HaoZeke wants to merge 10 commits intometatensor:mainfrom
Conversation
b506039 to
c3cde39
Compare
tests/resources/generate-outputs.sh
Outdated
| mtt train options.yaml -o model-32-bit.pt -r base_precision=32 | ||
| mtt train options.yaml -o model-64-bit.pt -r base_precision=64 | ||
| mtt train options-pet.yaml -o model-pet.pt | ||
| FORCE_REGENERATE=false |
There was a problem hiding this comment.
We also want to regenerate if the code changed, in case one of the change broke the tests
There was a problem hiding this comment.
makes sense, but this is a bit trickier? rn I pushed something relative to the last commit but then of course it doesn't run if the user ended up making an uncommitted change which would have broken tests..
PicoCentauri
left a comment
There was a problem hiding this comment.
I generally like the idea of caching a lot! I was always to lazy to implement it. Thanks!
| if [ "$FORCE_REGENERATE" = true ] || [ ! -f "model-32-bit.pt" ]; then | ||
| mtt train options.yaml -o model-32-bit.pt -r base_precision=32 | ||
| fi | ||
|
|
||
| if [ "$FORCE_REGENERATE" = true ] || [ ! -f "model-64-bit.pt" ]; then | ||
| mtt train options.yaml -o model-64-bit.pt -r base_precision=64 | ||
| fi | ||
|
|
||
| if [ "$FORCE_REGENERATE" = true ] || [ ! -f "model-pet.pt" ]; then | ||
| mtt train options-pet.yaml -o model-pet.pt | ||
| fi | ||
|
|
||
| if [ "$FORCE_REGENERATE" = true ]; then | ||
| echo "Saving current git commit hash to version the data." | ||
| git rev-parse HEAD > "$HASH_FILE" | ||
| fi |
There was a problem hiding this comment.
Maybe we loop this? If we add more files.
BTW: we also have some files to generate for the docs 🙈
There was a problem hiding this comment.
Maybe we loop this? If we add more files.
+1
BTW: we also have some files to generate for the docs 🙈
makes sense to design it once here then! I'm thinking now of maybe watching the status of some sources with git status as well. Or @Luthaf suggested we make the cache opt-in instead of opt-out.
There was a problem hiding this comment.
@PicoCentauri looping is a bit weird because of the different arguments, so I have 527daae
but I'd rather revert that one 😅
There was a problem hiding this comment.
Yes, I'm going to push a bit for making this opt-in for now. This way we don't break anything in the default case
There was a problem hiding this comment.
Yes, I'm going to push a bit for making this opt-in for now. This way we don't break anything in the default case
Made an opt-in variant.
This reverts commit 527daae.
714ada9 to
3ec9dd6
Compare
| # CI should always generate test files | ||
| FORCE_REGENERATE: true |
There was a problem hiding this comment.
Since this is now opt in. We don't need to declare these variables here, right?
| package.json | ||
|
|
||
| # caching githash | ||
| .data_version.txt |
There was a problem hiding this comment.
I would rather create this file inside .tox/<env-using-the-cached-data>
|
|
||
| By default, the main test suite regenerates the necessary model files every time | ||
| it runs. For faster local development, you can **opt-in** to caching these files | ||
| by setting the ``USE_CACHE`` environment variable to ``1``: |
There was a problem hiding this comment.
| by setting the ``USE_CACHE`` environment variable to ``1``: | |
| by setting the ``MTT_CACHE_TEST_DATA`` environment variable to ``1``: |
would be a lot clearer
(or MTT_CACHE_TOX_DATA to also apply to the docs things?)
Not really sure it matters to CI but locally it is nicer.
📚 Documentation preview 📚: https://metatrain--786.org.readthedocs.build/en/786/