
Conversation

@martin-marek

Qwen3 0.6B, 1.7B, and 4B use tied embeddings. This PR updates model.py to support tied embeddings. It also fixes a bug in chkpt_utils.py: for whatever reason, the checkpoints of some tied-embedding Qwen3 models store both lm_head and model.embed_tokens even though these are identical tensors (the embeddings are tied). I added a check for this case: when both tensors are present and identical, the redundant lm_head tensor is simply deleted from the checkpoint.
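
A minimal sketch of the checkpoint check (not the actual chkpt_utils.py code; the helper name and the use of a plain PyTorch state dict are assumptions for illustration):

```python
import torch

def dedupe_tied_lm_head(state_dict: dict) -> dict:
    """Hypothetical helper: drop a redundant lm_head tensor from a
    tied-embedding checkpoint so only the embedding table remains."""
    lm_head = state_dict.get("lm_head.weight")
    embed = state_dict.get("model.embed_tokens.weight")
    # Some tied-embedding Qwen3 checkpoints store both tensors even though
    # they are identical; keep only model.embed_tokens in that case.
    if lm_head is not None and embed is not None and torch.equal(lm_head, embed):
        del state_dict["lm_head.weight"]
    return state_dict
```

With tied embeddings, the model then reuses the embedding matrix as the output projection (logits = hidden states @ embed_tokens.T) instead of keeping a separate lm_head weight.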
