Conversation

@mmnga (Contributor) commented Nov 16, 2025

This PR adds support for the PLaMo-3 series (2B, 8B, and 31B base models):

PLaMo-3 uses a hybrid architecture with Sliding Window Attention (SWA) and standard full attention layers, as well as a custom FFN layout. This PR wires those pieces into llama.cpp so that the official checkpoints can be converted to GGUF and run with the usual backends.
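To illustrate the hybrid layout described above, here is a minimal sketch (not llama.cpp code) of how a stack might interleave sliding-window and full-attention layers. The layer pattern (`swa_pattern`) and helper names here are assumptions for illustration; the actual PLaMo-3 layer schedule comes from the checkpoint's config.

```python
from typing import List, Optional


def is_swa_layer(layer_idx: int, swa_pattern: int = 4) -> bool:
    """Assumed pattern: every swa_pattern-th layer is full attention,
    the rest use SWA. The real schedule is model-config driven."""
    return (layer_idx + 1) % swa_pattern != 0


def attention_mask(n_tokens: int, sliding_window: Optional[int]) -> List[List[bool]]:
    """Causal visibility mask; optionally restricted to a sliding window.

    mask[q][k] is True when query position q may attend to key position k.
    With a window of size w, a query only sees the last w positions."""
    mask = []
    for q in range(n_tokens):
        row = []
        for k in range(n_tokens):
            visible = k <= q  # causal: no attending to the future
            if sliding_window is not None:
                visible = visible and (q - k < sliding_window)
            row.append(visible)
        mask.append(row)
    return mask
```

Under this assumed pattern, `is_swa_layer(3)` is `False` (a full-attention layer), and `attention_mask(4, 2)` limits each query to itself and the previous token, which is the mechanism that keeps SWA layers' KV cache bounded.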

@github-actions github-actions bot added model Model specific python python script changes labels Nov 16, 2025
@mmnga mmnga closed this Nov 16, 2025
@mmnga mmnga reopened this Nov 16, 2025
@mmnga mmnga marked this pull request as ready for review November 17, 2025 09:49
@mmnga mmnga requested review from CISC and ggerganov as code owners November 17, 2025 09:49
@CISC (Collaborator) commented Nov 17, 2025

Any non-gated models available?

@mmnga (Contributor, Author) commented Nov 17, 2025

There are no non-gated models available at the moment.

@mmnga (Contributor, Author) commented Nov 17, 2025

Sorry, the checks failed, so I’m reverting it to draft for now.

@mmnga mmnga marked this pull request as draft November 17, 2025 13:34
@CISC (Collaborator) commented Nov 17, 2025

> Sorry, the checks failed, so I’m reverting it to draft for now.

The nvidia-vulkan-cm CI failures are unrelated if that's what you're referring to...

@mmnga mmnga marked this pull request as ready for review November 18, 2025 15:32
@mmnga
Copy link
Contributor Author

mmnga commented Nov 18, 2025

I’ve reopened this PR. Thank you in advance.
