Conversation

@MohammedTaherMcW MohammedTaherMcW commented Jul 29, 2025

Ticket

Link to GitHub Issue

Problem description

Enable support for the Gemma-3-1b-it model.

What's changed

  • Added support for the gemma-3-1b-it model
  • Updated model_config.py to support the BF16 data type for gemma-3-1b-it, including end-of-sequence (EoS) token handling and loading via Gemma3CausalLM.
  • Updated load_checkpoints.py to support gemma-3-1b-it weight loading.
  • Modified the apply_scaling logic to handle both LLaMA and gemma-3-1b-it models.
  • Added compute kernel config with HiFi4 and FP32 support.
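The apply_scaling change above could look roughly like the following minimal sketch. It assumes the LLaMA-3.1-style piecewise RoPE frequency scaling and a simple pass-through for gemma-3-1b-it (which ships without RoPE scaling); the function names, branch condition, and default constants are illustrative, not taken from the PR.

```python
import math

# Hypothetical sketch; constants follow the published LLaMA-3.1 scheme,
# the dispatch-by-model-name shape is an assumption about this PR.

def apply_scaling_llama(
    freq: float,
    scale_factor: float = 8.0,
    low_freq_factor: float = 1.0,
    high_freq_factor: float = 4.0,
    original_context_len: int = 8192,
) -> float:
    """LLaMA-3.1-style piecewise RoPE frequency scaling."""
    low_freq_wavelen = original_context_len / low_freq_factor
    high_freq_wavelen = original_context_len / high_freq_factor
    wavelen = 2 * math.pi / freq
    if wavelen < high_freq_wavelen:
        return freq                  # high-frequency band: unscaled
    if wavelen > low_freq_wavelen:
        return freq / scale_factor   # low-frequency band: fully scaled
    # smooth interpolation between the two bands
    smooth = (original_context_len / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor
    )
    return (1 - smooth) * freq / scale_factor + smooth * freq

def apply_scaling(freq: float, model_name: str) -> float:
    """Dispatch on model family: gemma-3-1b-it uses unscaled RoPE."""
    if model_name.startswith("gemma-3"):
        return freq
    return apply_scaling_llama(freq)
```

The point of the dispatch is that one code path can serve both model families without duplicating the rotary-embedding setup.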

Checklist

@MohammedTaherMcW MohammedTaherMcW changed the title google/gemma-3-1b-it Bringup Add experimental model bringup for google/gemma-3-1b-it Jul 31, 2025
@willwray
Thanks, reviewing.
To review the first commit, it would be useful to know the origin or 'provenance' of the new files, in order to compare them for TT-specific changes. We discussed this in last week's call, and it sounded like there is no clean-slate origin; but if you can say where the work started, that will help.

@jennychristopher jennychristopher force-pushed the mcw/gemma_3_1b/pr_1_experimental branch from d114559 to ade214f on August 4, 2025 15:48

willwray commented Aug 7, 2025

Thanks Mohammed,

I rebased on a new branch and submitted to TT with no changes:

tenstorrent#26438
tenstorrent#26439
tenstorrent#26440
tenstorrent#26441
