Fix DINOv3 layer access for transformers >= 5.0 #156

Open
Eyalm321 wants to merge 1 commit into microsoft:main from Eyalm321:fix/transformers-5x-dinov3-layer-path

@Eyalm321

Summary

  • In transformers >= 5.0, DINOv3ViTModel wraps its encoder under .model, so the layer ModuleList lives at self.model.model.layer (the inner model is DINOv3ViTEncoder). Older transformers exposed .layer directly on the top-level model.
  • extract_features in trellis2/modules/image_feature_extractor.py only handled the old layout, breaking inference on a fresh install with a current transformers release.
  • Pick whichever exists via getattr(self.model, "model", self.model) so the path works on both old and new transformers.
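
The fallback described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: `OldStyleModel`, `NewStyleModel`, `InnerEncoder`, and `resolve_layers` are hypothetical stand-ins that only mimic the two attribute layouts, and no real transformers classes are used.

```python
# Hypothetical stand-ins that mimic the two layouts; NOT the real transformers classes.
class OldStyleModel:
    """Older layout: the layer list sits directly on the top-level model."""

    def __init__(self):
        self.layer = ["block0", "block1"]


class InnerEncoder:
    """Plays the role of the inner encoder that owns the layer list."""

    def __init__(self):
        self.layer = ["block0", "block1", "block2"]


class NewStyleModel:
    """Newer layout: the layer list sits on an inner `.model` attribute."""

    def __init__(self):
        self.model = InnerEncoder()


def resolve_layers(model):
    # Use the inner `.model` when it exists; otherwise fall back to the model itself.
    encoder = getattr(model, "model", model)
    return encoder.layer


for candidate in (OldStyleModel(), NewStyleModel()):
    for i, layer_module in enumerate(resolve_layers(candidate)):
        print(type(candidate).__name__, i, layer_module)
```

Because the third argument to `getattr` is the model itself, the old layout resolves to the same object and no version check is needed.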

Repro

On a fresh setup.sh --basic --flash-attn --nvdiffrast --nvdiffrec --cumesh --o-voxel install (transformers 5.6.2 from PyPI), running Trellis2ImageTo3DPipeline.run(image) raises:

File "trellis2/modules/image_feature_extractor.py", line 86, in extract_features
    for i, layer_module in enumerate(self.model.layer):
AttributeError: 'DINOv3ViTModel' object has no attribute 'layer'

The fix unblocks inference on transformers 5.6.2 without changing behavior on older versions.

Verified

Inference end-to-end on RTX A6000 + transformers 5.6.2 + torch 2.5.1+cu124 produced a valid GLB (504K verts / 945K faces) from a single 1024×1024 PNG input.

Test plan

  • Inference completes and exports GLB on transformers 5.6.2
  • No-op on transformers <5, where self.model.layer already exists: getattr falls back to self.model itself, so the same object is used and behavior is unchanged
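
Both test-plan items could also be covered by a lightweight check that needs neither a GPU nor model weights. The sketch below is an assumption about how such a check might look: `pick_encoder` is a hypothetical helper name, and the `SimpleNamespace` objects only imitate the attribute layout of the real models.

```python
from types import SimpleNamespace


def pick_encoder(model):
    # Same probe as in the fix: prefer the inner `.model`, else the model itself.
    return getattr(model, "model", model)


# Imitations of the two layouts (hypothetical; not real DINOv3 models).
old_style = SimpleNamespace(layer=["a", "b"])
new_style = SimpleNamespace(model=SimpleNamespace(layer=["a", "b"]))

# Old layout: the probe is a no-op and returns the very same object.
assert pick_encoder(old_style) is old_style

# New layout: the probe unwraps the inner encoder that holds the layers.
assert pick_encoder(new_style) is new_style.model

# Either way, the layer loop from extract_features can iterate as before.
assert [i for i, _ in enumerate(pick_encoder(new_style).layer)] == [0, 1]
```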

🤖 Generated with Claude Code

In recent transformers releases, `DINOv3ViTModel` no longer exposes the
encoder layer ModuleList directly on the top-level model. The encoder is
wrapped under `.model`, so the layers now live at
`self.model.model.layer` (the inner `model` is `DINOv3ViTEncoder`).

Older transformers versions exposed `self.model.layer` directly. Probe
for both via `getattr(self.model, "model", self.model)` so the change is
backward compatible.

Repro before the fix (transformers 5.6.2 on PyPI):

    AttributeError: 'DINOv3ViTModel' object has no attribute 'layer'

raised at trellis2/modules/image_feature_extractor.py:86 during
`Trellis2ImageTo3DPipeline.run()` when extracting DINOv3 features from
the input image.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>