reduce peak RAM usage during conversion #5

Open
spudone wants to merge 1 commit into FastFlowLM:main from spudone:memory_opt

spudone commented Apr 11, 2026

Do not merge to main.

This may be worth merging into its own branch, though I don't have time to do an in-depth review. I have done a successful run on a 32GB model file with these changes, but I have not extensively tested model output or behavior.

The changes use zero-copy access and chunked disk offload during conversion. Peak memory usage during conversion is about 1.2x the model size, versus 2-2.5x for the original code.
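For readers unfamiliar with the two techniques, here is a minimal, stdlib-only Python sketch of the general idea. It is not the code in this PR. The helper name `convert_f32_to_f16_chunked`, the flat float32 input layout, and the float32-to-float16 step are all assumptions chosen for illustration. The source file is memory-mapped so it is never read into RAM in one piece, and each converted chunk is written straight to disk instead of being accumulated in memory.

```python
import mmap
import struct


def convert_f32_to_f16_chunked(src_path, dst_path, chunk_elems=1 << 20):
    """Hypothetical helper: stream a flat float32 file to float16.

    Assumes src_path is a non-empty file of raw float32 values in native
    byte order. Returns the number of elements converted.
    """
    converted = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # mmap maps the file lazily; pages are pulled in by the OS on
        # demand, so opening the source does not copy it into RAM.
        with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # cast("f") reinterprets the mapped bytes as float32 in place.
            with memoryview(mapped) as raw, raw.cast("f") as view:
                for start in range(0, len(view), chunk_elems):
                    # Slicing a memoryview is also zero-copy.
                    with view[start:start + chunk_elems] as chunk:
                        # Only this chunk is materialized as Python floats.
                        # struct raises OverflowError if a value does not
                        # fit in float16.
                        packed = struct.pack(f"={len(chunk)}e", *chunk)
                        # "Offload": the result goes to disk immediately.
                        dst.write(packed)
                        converted += len(chunk)
    return converted
```

With this pattern the extra memory held by the converter is bounded by `chunk_elems` rather than by the model size, which is the property the 1.2x figure above depends on. A real converter would more likely operate on tensors through NumPy or a similar library, but the structure is the same.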

The goal was to allow conversion of larger models without running out of memory (OOM) or crashing.

If someone finds it useful, please pick it up and continue.
