reduce peak RAM usage during conversion by spudone · Pull Request #5 · FastFlowLM/FLM_Q4NX_Converter

spudone · 2026-04-11T00:33:12Z

Do not merge to main.

This may be worth merging to its own branch, though I don't have time to do an in-depth review. I have done a successful run on a 32 GB model file with these changes, but I have not extensively tested model output or behavior.

The changes use zero-copy techniques and chunked disk offload during conversion. Peak memory usage during conversion is about 1.2x the model size, versus 2x to 2.5x with the original code.
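For readers unfamiliar with the approach, here is a minimal sketch of the general idea, not the code in this PR. It memory-maps the source file so chunks are read as zero-copy `memoryview` slices, and writes each converted chunk straight to disk so only one chunk's output is held in RAM at a time. The names `iter_chunks`, `convert_chunked`, and `invert_bytes` are hypothetical, and `invert_bytes` is only a stand-in for the real per-tensor conversion.

```python
import mmap


def iter_chunks(path, chunk_bytes):
    """Yield zero-copy memoryview slices of a memory-mapped file.

    Each slice is only valid until the next iteration; it is released
    before the generator advances so the mapping can be closed cleanly.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            view = memoryview(mm)
            try:
                for start in range(0, len(view), chunk_bytes):
                    chunk = view[start:start + chunk_bytes]  # no data copied
                    try:
                        yield chunk
                    finally:
                        chunk.release()
            finally:
                view.release()


def invert_bytes(chunk):
    """Placeholder transform (assumption): stands in for real conversion."""
    return bytes(b ^ 0xFF for b in chunk)


def convert_chunked(src_path, dst_path, transform, chunk_bytes=1 << 20):
    """Convert src_path to dst_path one chunk at a time.

    Converted chunks are offloaded to disk immediately, so peak extra
    memory is roughly one transformed chunk rather than the whole output.
    Returns the number of bytes written.
    """
    written = 0
    with open(dst_path, "wb") as out:
        for chunk in iter_chunks(src_path, chunk_bytes):
            written += out.write(transform(chunk))
    return written
```

A call such as `convert_chunked("model.bin", "model.out", invert_bytes)` would process the file in 1 MiB pieces; the file names here are illustrative. A real converter would chunk on tensor boundaries rather than fixed byte offsets.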

The goal was to allow conversion of larger models without running out of memory (OOM) or crashing.

If someone finds it useful, please pick it up and continue.

reduce peak RAM usage during conversion

1e33592

spudone wants to merge 1 commit into FastFlowLM:main from spudone:memory_opt