@specture724
Collaborator

@specture724 specture724 commented Dec 2, 2025

Resolves #60.
Safetensors files have an aligned storage layout. If the safetensors files are in /dev/shm, we can pin them in place without copying them, which avoids doubling memory usage.

Copilot AI left a comment

Pull request overview

This pull request adds an optimization for loading safetensors checkpoint files stored in /dev/shm/: the files are pinned in place, which avoids copying the data and so avoids holding a second, pinned copy in memory. When safetensors files are detected in /dev/shm/, the code now pins the memory-mapped file directly instead of allocating separate pinned memory and copying the tensors into it.
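As a rough illustration of the detection step described above, the check could look something like the sketch below. The function name `can_pin_inplace`, the `SHM_ROOT` constant, and the `.safetensors` suffix test are assumptions made for this example and are not taken from the PR's code.

```python
import os

SHM_ROOT = "/dev/shm"  # assumption: the tmpfs mount named in the PR description


def can_pin_inplace(path: str, shm_root: str = SHM_ROOT) -> bool:
    """Return True if `path` is a .safetensors file located under `shm_root`.

    Hypothetical helper: the real PR may detect eligible files differently.
    """
    real = os.path.realpath(path)      # resolve symlinks and ".." segments
    root = os.path.realpath(shm_root)
    # commonpath compares whole path components, so "/dev/shmfoo" does not match
    inside = os.path.commonpath([real, root]) == root
    return inside and real.endswith(".safetensors")
```

Files for which such a check returns False would go through the existing copy-based loading path.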

Key changes:

  • Added an in-place pin-memory path for safetensors files in /dev/shm/ using CUDA's cudaHostRegister
  • Implemented manual safetensors header parsing to extract tensor metadata without loading through the safetensors library
  • Parallelized the in-place pinning operations using ThreadPoolExecutor
  • Preserved the existing checkpoint loading path as a fallback for non-safetensors files or files outside /dev/shm/
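To make the header-parsing and parallelization points concrete, here is a minimal sketch based on the public safetensors file layout (an 8-byte little-endian header length, a JSON header, then the raw tensor bytes). The function names `read_tensor_spans` and `read_all_spans` are hypothetical, and the actual pinning call (cudaHostRegister on the mapped region) is deliberately left out; this is not the PR's implementation.

```python
import json
import struct
from concurrent.futures import ThreadPoolExecutor


def read_tensor_spans(path):
    """Map tensor name -> (dtype, shape, absolute_start, absolute_end) for one file."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        header = json.loads(f.read(header_len))
    data_start = 8 + header_len  # data_offsets are relative to this position
    spans = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional string map, not a tensor
            continue
        begin, end = info["data_offsets"]
        spans[name] = (
            info["dtype"],
            tuple(info["shape"]),
            data_start + begin,
            data_start + end,
        )
    return spans


def read_all_spans(paths, max_workers=4):
    """Parse several files concurrently; returns {path: spans}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(paths, pool.map(read_tensor_spans, paths)))
```

With absolute byte ranges in hand, a loader can build tensors as views over the memory-mapped file rather than copies, which is what makes registering the mapping as pinned memory sufficient.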

@specture724 specture724 force-pushed the feat/inplace-pin-memory branch from 648e61b to 272355d Compare December 5, 2025 07:35
@specture724 specture724 self-assigned this Dec 5, 2025
@specture724 specture724 force-pushed the feat/inplace-pin-memory branch 2 times, most recently from 93f4470 to 3b3371b Compare December 5, 2025 08:06
@specture724 specture724 force-pushed the feat/inplace-pin-memory branch from 9ce3f31 to 37d8f0b Compare December 8, 2025 10:31
Collaborator

@blahgeek blahgeek left a comment

lgtm

@specture724 specture724 force-pushed the feat/inplace-pin-memory branch 2 times, most recently from 15e8dba to 93f3fa9 Compare December 11, 2025 06:00
@specture724 specture724 force-pushed the feat/inplace-pin-memory branch from 93f3fa9 to 4d68fb3 Compare December 11, 2025 06:05
@blahgeek blahgeek merged commit e88d462 into MoonshotAI:main Dec 11, 2025
2 checks passed
Development

Successfully merging this pull request may close these issues.

Support in-place pin memory when checkpoint files are in /dev/shm

2 participants