For the past week or so, the GPU tests have been failing randomly with OOM errors. Examples of this failure:
https://github.com/NumericalEarth/NumericalEarth.jl/actions/runs/24768246993/job/72467804322
https://github.com/NumericalEarth/NumericalEarth.jl/actions/runs/24667361659/job/72128408807
Maybe we are at the limit of what the GPU worker can handle? Should we increase the cache size or clear the cache before starting a job?
cc @awiteck @giordano