2 changes: 2 additions & 0 deletions .github/workflows/architecture-tests.yml
@@ -33,5 +33,7 @@ jobs:
       - name: run architecture tests
         run: tox -e ${{ matrix.architecture-name }}-tests
         env:
+          # CI should always generate test files
+          FORCE_REGENERATE: true
           # Use the CPU only version of torch when building/running the code
           PIP_EXTRA_INDEX_URL: https://download.pytorch.org/whl/cpu
2 changes: 2 additions & 0 deletions .github/workflows/build.yml
@@ -20,5 +20,7 @@ jobs:
       - name: Test build integrity
         run: tox -e build
         env:
+          # CI should always generate test files
+          FORCE_REGENERATE: true
           # Use the CPU only version of torch when building/running the code
           PIP_EXTRA_INDEX_URL: https://download.pytorch.org/whl/cpu
2 changes: 2 additions & 0 deletions .github/workflows/docs.yml
@@ -23,6 +23,8 @@ jobs:
       - name: build documentation
         run: tox -e docs
         env:
+          # CI should always generate test files
+          FORCE_REGENERATE: true
           # Use the CPU-only version of torch
           PIP_EXTRA_INDEX_URL: https://download.pytorch.org/whl/cpu

2 changes: 2 additions & 0 deletions .github/workflows/tests.yml
@@ -44,6 +44,8 @@ jobs:
           tox -e tests
           coverage xml --data-file tests/.coverage
         env:
+          # CI should always generate test files
+          FORCE_REGENERATE: true
Comment on lines +47 to +48
Contributor

Since this is now opt-in, we don't need to declare these variables here, right?
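
A minimal sketch of why the workflow-level variable appears to have no effect, reduced from the logic in `tests/resources/generate-outputs.sh` in this PR. The function name `decide_regeneration` and passing cache validity as an argument are illustrative assumptions; the real script derives validity from git state.

```shell
# Hypothetical reduction of the decision logic in generate-outputs.sh.
# $1 stands in for CACHE_IS_VALID, which the real script computes itself.
decide_regeneration() {
    cache_is_valid="$1"
    # The script assigns this unconditionally, so a FORCE_REGENERATE value
    # inherited from the environment (e.g. from the workflow) is overwritten.
    FORCE_REGENERATE=true
    if [ "${USE_CACHE:-0}" = "1" ] && [ "$cache_is_valid" = true ]; then
        FORCE_REGENERATE=false
    fi
    echo "$FORCE_REGENERATE"
}
```

With `USE_CACHE` unset, the function prints `true` whatever value of `FORCE_REGENERATE` the caller exported, which is the behavior CI relies on.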

           # Use the CPU only version of torch when building/running the code
           PIP_EXTRA_INDEX_URL: https://download.pytorch.org/whl/cpu
           HUGGINGFACE_TOKEN_METATRAIN: ${{ secrets.HUGGINGFACE_TOKEN }}
3 changes: 3 additions & 0 deletions .gitignore
@@ -179,3 +179,6 @@ docs/src/examples
 node_modules/
 package-lock.json
 package.json
+
+# caching githash
+.data_version.txt
Member

I would rather create this file inside .tox/<env-using-the-cached-data>
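
One way this could look, as a rough sketch. It assumes tox exports `TOX_ENV_DIR` to the commands it runs (worth checking against the tox version in use), and `hash_file_path` is a hypothetical helper name, not something in the PR.

```shell
# Hypothetical helper: choose where the cached commit hash is stored.
hash_file_path() {
    # Assumption: TOX_ENV_DIR points at .tox/<env> when running under tox.
    # Outside tox, fall back to the current directory.
    dir="${TOX_ENV_DIR:-.}"
    # Strip one trailing slash so the joined path has a single separator.
    printf '%s\n' "${dir%/}/.data_version.txt"
}

HASH_FILE="$(hash_file_path)"
```

Keeping the file under the tox environment directory would mean that removing `.tox/` also drops the cache marker, and the extra `.gitignore` entry might no longer be needed.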

11 changes: 11 additions & 0 deletions CONTRIBUTING.rst
@@ -89,6 +89,17 @@ testing it. Also, you may want to setup your editor to automatically apply the `
 are plugins to do this with `all major editors
 <https://black.readthedocs.io/en/stable/editor_integration.html>`_.

+By default, the main test suite regenerates the necessary model files every time
+it runs. For faster local development, you can **opt in** to caching these files
+by setting the ``USE_CACHE`` environment variable to ``1``:
Member

Suggested change
-by setting the ``USE_CACHE`` environment variable to ``1``:
+by setting the ``MTT_CACHE_TEST_DATA`` environment variable to ``1``:

This would be a lot clearer (or `MTT_CACHE_TOX_DATA`, so that it also applies to the docs?)
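
A hedged sketch of how the script could read the renamed variable while still honoring the old name during a transition. `cache_enabled` is a hypothetical helper, and the fallback to `USE_CACHE` is an assumption for illustration, not part of the suggestion.

```shell
# Hypothetical helper: succeeds (exit status 0) when caching is opted in.
cache_enabled() {
    # Prefer the proposed name; if it is unset or empty, fall back to
    # USE_CACHE (assumed transition aid), and finally to "0" (disabled).
    [ "${MTT_CACHE_TEST_DATA:-${USE_CACHE:-0}}" = "1" ]
}

if cache_enabled; then
    echo "Caching enabled. Attempting to use cached data."
fi
```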


+
+.. code-block:: bash
+
+    USE_CACHE=1 tox -e tests
+
+When caching is enabled, the script will skip regeneration as long as the cached
+files exist and the underlying source code has not changed.
+
 If you want to test a specific architecture, you can do that as well. For example:

 .. code-block:: bash
49 changes: 46 additions & 3 deletions tests/resources/generate-outputs.sh
@@ -7,9 +7,52 @@ ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)

cd "$ROOT_DIR"

-mtt train options.yaml -o model-32-bit.pt -r base_precision=32
-mtt train options.yaml -o model-64-bit.pt -r base_precision=64
-mtt train options-pet.yaml -o model-pet.pt
+HASH_FILE=".data_version.txt"
+WATCH_PATHS=":/src/"  # ":/" anchors the pathspec at the repo root; a bare "src/" would resolve relative to $ROOT_DIR
+FORCE_REGENERATE=true
+
+if [[ "${USE_CACHE:-0}" == "1" ]]; then
+    echo "USE_CACHE=1 detected. Attempting to use cached data."
+    CACHE_IS_VALID=true
+    if [ -n "$(git status --porcelain -- $WATCH_PATHS)" ]; then
+        echo "Cache is invalid due to uncommitted changes. Must regenerate."
+        CACHE_IS_VALID=false
+    elif [ ! -f "$HASH_FILE" ]; then
+        echo "Cache is invalid: version file not found. Must regenerate."
+        CACHE_IS_VALID=false
+    else
+        SAVED_HASH=$(cat "$HASH_FILE")
+        CURRENT_HASH=$(git rev-parse HEAD)
+        if [ "$SAVED_HASH" != "$CURRENT_HASH" ]; then
+            echo "Cache is invalid: code version has changed. Must regenerate."
+            CACHE_IS_VALID=false
+        fi
+    fi
+
+    # If all checks passed, we can rely on the cache.
+    if [ "$CACHE_IS_VALID" = true ]; then
+        echo "Cache is valid. Will skip regeneration for existing files."
+        FORCE_REGENERATE=false
+    fi
+fi
+
+# Regenerate if regeneration is forced (default) OR if a file is missing.
+if [ "$FORCE_REGENERATE" = true ] || [ ! -f "model-32-bit.pt" ]; then
+    mtt train options.yaml -o model-32-bit.pt -r base_precision=32
+fi
+
+if [ "$FORCE_REGENERATE" = true ] || [ ! -f "model-64-bit.pt" ]; then
+    mtt train options.yaml -o model-64-bit.pt -r base_precision=64
+fi
+
+if [ "$FORCE_REGENERATE" = true ] || [ ! -f "model-pet.pt" ]; then
+    mtt train options-pet.yaml -o model-pet.pt
+fi
+
+if [ "$FORCE_REGENERATE" = true ]; then
+    echo "Saving current git commit hash to version the data."
+    git rev-parse HEAD > "$HASH_FILE"
+fi

set +x # disable command echoing for sensitive private token check
TOKEN_PRESENT=false