@nikileshx

Ticket

Link to JIRA ticket

Problem description

Added experimental support for the mistralai/Mistral-Small-3.1-24B-Instruct-2503 model.

Changes Introduced

  • Integrated support for Mistral-Small-3.1-24B-Instruct-2503.
  • Updated model_config.py to handle the new model configuration.
  • Extended load_checkpoints.py to enable weight loading for the model.
  • Added new submodules to support the model’s vision component.
  • Reused the existing text model pipeline to integrate Mistral-Small-3.1-24B-Instruct-2503.

Notes

  • No code from tt-transformers was reused for the vision model bring-up.
  • All files under experimental were purpose-built or re-written specifically for Mistral-Small-3.1-24B-Instruct-2503.

Checklist

@nikileshx nikileshx force-pushed the mcw/mistral-small-3.1-24b-it/pr_1_experimental branch from f85bffd to 3d8da88 Compare August 19, 2025 19:59
@nikileshx nikileshx marked this pull request as ready for review August 20, 2025 03:55
@nikileshx nikileshx force-pushed the mcw/mistral-small-3.1-24b-it/pr_1_experimental branch 3 times, most recently from 56f3e6c to 514d756 Compare August 22, 2025 16:20
@nikileshx nikileshx requested a review from willwray August 22, 2025 17:37
@nikileshx nikileshx force-pushed the mcw/mistral-small-3.1-24b-it/pr_1_experimental branch 3 times, most recently from 7954126 to 4e6b224 Compare August 29, 2025 05:54
@MohammedTaherMcW MohammedTaherMcW force-pushed the mcw/mistral-small-3.1-24b-it/pr_1_experimental branch from 4e6b224 to c216b15 Compare October 24, 2025 19:01