
Issues: huggingface/transformers

[Community contributions] Model cards

#36979 opened Mar 25, 2025 by stevhliu

Open 80


1,063 Open 16,143 Closed


Issues list

Attention mask for multi-image input in gemma3 (label: bug)

#38053 opened May 9, 2025 by deval281shah

1 of 4 tasks

.to on a PreTrainedModel throws a Pyright type check error. What is the correct way to put a model to the device that does not throw type check errors? (label: bug)

#38052 opened May 9, 2025 by nickeisenberg

4 tasks

Removing GenerateMixin inheritance from PreTrainedModel class results in Phi4 load fail (label: bug)

#38050 opened May 9, 2025 by yatindrav

1 of 4 tasks

Modernbert 3D attention mask (label: Feature request)

#38040 opened May 9, 2025 by meetdoshi-iitb

Trainer API doesnt stop after the training has been completed (label: bug)

#38039 opened May 9, 2025 by Awaisn25

2 of 4 tasks

[BUG] Setting average_tokens_across_devices to True caused an error because it attempted to gather CPU tensors using NCCL.

#38035 opened May 9, 2025 by Jintao-Huang

transformers require torch >= 2.1.0 to run fp8 model, but im using 2.7.0 (label: bug)

#38034 opened May 9, 2025 by O5-7

2 of 4 tasks

RuntimeError when loading InternVL3-14B model: Embedding size mismatch

#38033 opened May 9, 2025 by wkzcml-1

Removing the modification of loss value due to rounding off to 4 digits (label: bug)

#38032 opened May 9, 2025 by harish6696

2 of 4 tasks

bug in new prefill_chunk_size implementation (label: bug)

#38028 opened May 8, 2025 by SmerkyG

2 of 4 tasks

TimeSformer assumes a fixed number of frames in its layers even though it interpolates temporal embeddings based on the input (label: bug)

#38027 opened May 8, 2025 by kamila-chay

1 of 4 tasks

eps is not passed in Mistral3RMSNorm

#38025 opened May 8, 2025 by sergiopaniego

while using trainer to train mnist model, 'ValueError: Found input variables with inconsistent numbers of samples: [10000, 8750]' (label: bug)

#38024 opened May 8, 2025 by HaoyaWHL

2 of 4 tasks

Maybe the vocab_size can be duplicated to the mainconfig for PEFT to pick up

#38017 opened May 8, 2025 by lancercat

Trainer Stuck at 0% Progress during Training on Multi-GPU Setup (label: bug)

#38008 opened May 8, 2025 by yanho824

2 of 4 tasks

Does Qwen_2_5_VL support variable length attention computation? (label: Feature request)

#38007 opened May 8, 2025 by yingtongxiong

Potential bug in Qwen 2/2.5 VL Image Preprocessor

#38003 opened May 7, 2025 by ritwickchaudhry

[bug] use_sliding_window doesn't work as expected (label: bug)

#38002 opened May 7, 2025 by ZhiyuLi-Nvidia

1 of 4 tasks

Why can't InternVL3-8B start vLLM after being converted to the Hugging Face format? It shows the error: `ValueError: 'limit_mm_per_prompt' is only supported for multimodal models.' (label: bug)

#38000 opened May 7, 2025 by FloSophorae

2 of 4 tasks

RuntimeError when converting and saving Flax ViT model to PyTorch (labels: bug, Flax)

#37999 opened May 7, 2025 by nobodyPerfecZ

4 tasks

Versions greater than 4.49 are not compatible with Ascend NPU (label: bug)

#37992 opened May 7, 2025 by 1737686924

4 tasks

Bug Report: Unexpected Keyword Argument 'padding_side' in PreTrainedTokenizerFast (label: bug)

#37989 opened May 7, 2025 by yunqianluo

1 of 4 tasks

Support saving tensors to a file in Model addition debuggers (label: Feature request)

#37983 opened May 6, 2025 by RyanMullins

Add pruna integration for loading model through transmorfers.from_pretrained / pipeline. (label: Feature request)

#37971 opened May 6, 2025 by davidberenstein1957

Inconsistency in installation instructions for venv and uv

#37956 opened May 5, 2025 by arjunaskykok
