Commit 4506887
Add automatic script for cluster verification
Add a script that runs automatic verifications to assert the validity of
the current code.
The verifications are not automatic unit tests: the script automatically
checks that the process executed successfully, but the administrator
still needs to verify manually, by reading the logs, that the requested
resources were provided.
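The split between the automatic check and the manual check can be sketched
as follows. This is only an illustration, not the committed script: the
names VerificationResult and run_verification are hypothetical, and the
example command is a plain Python subprocess rather than a real cluster job.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class VerificationResult:
    name: str
    succeeded: bool  # automatic part: did the process exit with code 0?
    log: str         # manual part: the administrator reads this output


def run_verification(name, command):
    """Run a command, record whether it exited cleanly, and keep its output."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return VerificationResult(
        name=name,
        succeeded=completed.returncode == 0,
        log=completed.stdout + completed.stderr,
    )


if __name__ == "__main__":
    # Hypothetical stand-in for a job that reports what it was allocated.
    result = run_verification(
        "example", [sys.executable, "-c", "print('allocated: 1 CPU')"]
    )
    print(result.name, "OK" if result.succeeded else "FAILED")
    print(result.log)
```

The exit code only shows that the process ran; whether the reported
allocation matches the request is still left to whoever reads the log.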
Verifications can easily be combined, building on top of each other,
from complex ones to simpler ones.
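One way to picture this layering is to describe each verification as a
resource request and derive the larger requests from the smaller ones. The
sketch below is an assumption about how such composition could look, not the
committed code: every function name and dictionary key is hypothetical.

```python
def simple_task():
    """Smallest request: one task with one CPU and no GPU."""
    return {"tasks": 1, "cpus_per_task": 1, "gpus_per_task": 0}


def with_gpus(request, gpus):
    """Derive a request that adds GPUs to each task."""
    return {**request, "gpus_per_task": gpus}


def with_cores(request, cores):
    """Derive a request that gives each task several CPU cores."""
    return {**request, "cpus_per_task": cores}


def many_tasks(request, tasks):
    """Derive a request that runs the same task several times."""
    return {**request, "tasks": tasks}


def total_resources(request):
    """Return the total (CPUs, GPUs) implied by a request."""
    return (
        request["tasks"] * request["cpus_per_task"],
        request["tasks"] * request["gpus_per_task"],
    )


if __name__ == "__main__":
    # Four tasks with one GPU each, built on top of the simple task.
    request = many_tasks(with_gpus(simple_task(), 1), 4)
    print(request, total_resources(request))
```

Each helper returns a new dictionary, so a simpler request can be reused as
the base of several larger ones without being modified.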
Here is a list of all the verifications currently implemented for Slurm
clusters:
1. verify_simple_task (1 CPU)
2. verify_simple_task_with_one_gpu (1 CPU 1 GPU)
3. verify_simple_task_with_many_gpus (1 CPU X GPU)
4. verify_many_task (X CPU)
5. verify_many_task_with_many_cores (XY CPU)
6. verify_many_task_with_one_gpu (X CPU X GPU)
7. verify_many_task_with_many_gpus (X CPU Y GPU)
8. verify_simple_task_with_autoresume_unneeded (1 CPU)
9. verify_simple_task_with_autoresume_needed (1 CPU)
10. verify_many_task_with_autoresume_needed (X CPU)
1 parent f734fb3
1 file changed, +477 -0 lines changed