@MohammedTaherMcW

Ticket

Link to GitHub Issue

Problem description

Migration of Gemma-3-1b-it from experimental setup to TT-Transformers Library

What's changed

Moved the experimental Gemma-3-1b-it to TT-Transformers
Added unit-offset support to RMSNorm
Added activation_type support to MLP
Added sliding_window support to Attention
Set the ACTIVATION dtype to bfloat16 in model_config
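
For reviewers unfamiliar with these three options, here is a minimal pure-Python sketch of what each typically means for Gemma-style models. It is not the TT-Transformers code: the function names (`rms_norm`, `gated_mlp_inner`, `sliding_window_causal_mask`), the `add_unit_offset` flag, and the activation keys are hypothetical and chosen only for illustration. The window convention used (a query sees itself plus the previous `sliding_window - 1` keys) is also an assumption.

```python
import math

# Illustrative sketch only; all names below are hypothetical, not the PR's API.

def rms_norm(x, weight, eps=1e-6, add_unit_offset=False):
    """RMS-normalize vector x, then scale by weight (or by 1 + weight)."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # With the unit offset the stored weight acts as a delta around 1.0,
    # so an all-zero weight leaves the normalized values unscaled.
    scale = [(1.0 + w) if add_unit_offset else w for w in weight]
    return [v * inv_rms * s for v, s in zip(x, scale)]

def silu(v):
    return v / (1.0 + math.exp(-v))

def gelu_tanh(v):
    # Tanh approximation of GELU.
    inner = math.sqrt(2.0 / math.pi) * (v + 0.044715 * v ** 3)
    return 0.5 * v * (1.0 + math.tanh(inner))

ACTIVATIONS = {"silu": silu, "gelu_tanh": gelu_tanh}

def gated_mlp_inner(gate, up, activation_type="silu"):
    """Element-wise act(gate) * up, with the activation chosen by name."""
    act = ACTIVATIONS[activation_type]
    return [act(g) * u for g, u in zip(gate, up)]

def sliding_window_causal_mask(seq_len, sliding_window=None):
    """mask[i][j] is True when query position i may attend to key position j."""
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            causal = j <= i
            # Assumed convention: keep keys at distance < sliding_window.
            in_window = sliding_window is None or (i - j) < sliding_window
            row.append(causal and in_window)
        mask.append(row)
    return mask
```

Passing `sliding_window=None` falls back to an ordinary causal mask, which is how a model can mix global and local attention layers behind a single code path.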

Checklist

@jennychristopher force-pushed the mcw/gemma_3_1b/pr_1_experimental branch from d114559 to ade214f on August 4, 2025 15:48
@jennychristopher force-pushed the mcw/gemma_3_1b/pr_2_tt_transformers branch from 7b12efc to 0e17632 on August 4, 2025 16:12
2 participants