@MohammedTaherMcW
Ticket

Link to Github Issue

Problem description

Enable support for the gemma-3-4b-it model.

What's changed

Added support for the gemma-3-4b-it model.
Updated model_config.py to support gemma-3-4b-it, including end-of-sequence (EoS) token handling.
Updated load_checkpoints.py to support gemma-3-4b-it weight loading.
Modified the apply_scaling logic to handle both LLaMA and gemma-3-4b-it models.
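
For readers unfamiliar with what an apply_scaling change of this kind involves, here is a minimal sketch. It is not the code from this PR. It assumes the LLaMA path follows the publicly documented Llama 3.1 RoPE frequency rescaling, and that the Gemma path uses plain linear scaling (dividing every frequency by the factor). The `rope_type` parameter, its string values, and the default constants are illustrative assumptions, not names taken from the repository.

```python
import math


def apply_scaling(freqs, scale_factor, rope_type="llama3",
                  low_freq_factor=1.0, high_freq_factor=4.0,
                  original_context_len=8192):
    """Rescale RoPE frequencies (illustrative sketch, not the PR's code).

    rope_type and the default constants are assumptions for this example.
    """
    if rope_type == "linear":
        # Assumed Gemma-style path: every frequency is divided by the factor.
        return [f / scale_factor for f in freqs]
    if rope_type != "llama3":
        raise ValueError(f"unsupported rope_type: {rope_type}")

    # Llama 3.1-style path: only long wavelengths are rescaled.
    low_freq_wavelen = original_context_len / low_freq_factor
    high_freq_wavelen = original_context_len / high_freq_factor
    scaled = []
    for f in freqs:
        wavelen = 2 * math.pi / f
        if wavelen < high_freq_wavelen:
            # High-frequency band: left unchanged.
            scaled.append(f)
        elif wavelen > low_freq_wavelen:
            # Low-frequency band: fully scaled down.
            scaled.append(f / scale_factor)
        else:
            # Middle band: interpolate between scaled and unscaled.
            smooth = (original_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            scaled.append((1 - smooth) * f / scale_factor + smooth * f)
    return scaled
```

Dispatching on a single parameter keeps one entry point for both model families; whether the merged change is structured this way is not stated in this conversation.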

Checklist

@MohammedTaherMcW force-pushed the mcw/gemma_3_4b/pr_1_experimental branch from fc60390 to da70e34 on August 19, 2025, 13:11
@arginugaTT
Hi @MohammedTaherMcW

Please add your accuracy tests, symlinked under tests/nightly/single_card/model_name, and add an entry for them in .github/workflows/fast-dispatch-full-regressions-and-models-impl.yaml.

Add a code-owners entry for this model in .github/CODEOWNERS, listing Pavle Petrovic as the owner.

Then run the frequent model test workflow and attach the results to the ticket.

For reference: https://github.com/tenstorrent/tt-metal/pull/20690/files

@jschuhmacher
For visibility, this was merged as tenstorrent#26924
