@donaldkuck

Add MPS support and centralize device selection

Add get_device() function in pytorch_utils.py supporting CUDA, MPS, and CPU
Refactor 26 PyTorch model files to use centralized device selection
Enable automatic MPS device selection on Apple Silicon devices
The device selection priority is: CUDA > MPS > CPU
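
A minimal sketch of what such a centralized helper could look like, assuming the priority order stated above. The name `pick_device_name` and the zero-argument `get_device()` signature are illustrative assumptions; the actual function in pytorch_utils.py may accept arguments (for example a GPU index) that are not shown here.

```python
def pick_device_name(cuda_available: bool, mps_available: bool) -> str:
    """Apply the stated priority: CUDA > MPS > CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def get_device():
    """Return a torch.device chosen by the priority above.

    Hypothetical signature: the real get_device() in pytorch_utils.py
    may differ from this sketch.
    """
    import torch  # imported lazily so the pure helper works without PyTorch

    # torch.backends.mps is absent in older PyTorch builds, so probe defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_available = mps_backend is not None and mps_backend.is_available()
    return torch.device(pick_device_name(torch.cuda.is_available(), mps_available))
```

Keeping the priority logic in a pure function makes it testable on machines that have neither CUDA nor MPS; a model file would then call `get_device()` once and pass the result to `.to(device)`.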

@donaldkuck donaldkuck changed the title feat: pytorch benchmarks support mps device feat: pytorch benchmarks support mps device Nov 27, 2025
@donaldkuck
Author

FAILED model/test_general_nn.py::TestNN::test_both_dataset - RuntimeError: MPS backend out of memory (MPS allocated: 8.00 MiB, other allocations: 16.00 KiB, max allowed: 7.93 GiB). Tried to allocate 256 bytes on shared pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
FAILED test_contrib_model.py::TestAllFlow::test_0_initialize - RuntimeError: MPS backend out of memory (MPS allocated: 8.00 MiB, other allocations: 0 bytes, max allowed: 7.93 GiB). Tried to allocate 256 bytes on shared pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
= 2 failed, 51 passed, 1 skipped, 10 deselected, 23 warnings in 288.46s (0:04:48) =

I don't think the failure is caused by the code changes.

@SunsetWolf
Collaborator

Hi @donaldkuck, I don't think so. The pytest run passes in the current CI on the main branch but reports errors in this PR, which indicates that the changes affected the test results.
The error message also points to the problem: RuntimeError: MPS backend out of memory, caused by MPS's very fragile memory management.

@donaldkuck
Author

> Hi @donaldkuck, I don't think so. The pytest run passes in the current CI on the main branch but reports errors in this PR, which indicates that the changes affected the test results. The error message also points to the problem: RuntimeError: MPS backend out of memory, caused by MPS's very fragile memory management.

I think the Mac machine in CI may have only 8 GB of memory in total, so there may not be enough memory for MPS when running pytest?
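
One way to probe the memory hypothesis is the setting that the error message itself names, `PYTORCH_MPS_HIGH_WATERMARK_RATIO`. The sketch below sets it from test setup code; the helper name `relax_mps_watermark` is hypothetical, and calling it before `torch` is imported is a conservative assumption about when the variable is read. Whether this changes the CI result is untested, and the error message warns that disabling the limit may cause system failure.

```python
import os

# Variable name taken from the MPS out-of-memory error message.
MPS_WATERMARK_VAR = "PYTORCH_MPS_HIGH_WATERMARK_RATIO"


def relax_mps_watermark(env=None):
    """Set the watermark ratio to 0.0 unless a value is already present.

    `env` defaults to os.environ; passing a dict allows testing without
    touching the real process environment.
    """
    env = os.environ if env is None else env
    # setdefault keeps any value the CI configuration has already chosen.
    env.setdefault(MPS_WATERMARK_VAR, "0.0")
    return env[MPS_WATERMARK_VAR]
```

If the two tests still fail with the limit disabled, the 8 GB explanation becomes less likely and the device-selection change itself deserves a closer look.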
