Backbone motivation #162

sergiev · 2025-01-14T18:53:54Z

Hello! First of all, thank you for sharing such high-quality work.

I'd like to know the reasons behind the backbone selection. Straight to the point: why was the first version of the shifted-window transformer (swin_v1) preferred as the backbone over swin_v2?

ZhengPeng7 · 2025-01-15T11:04:20Z

I also wanted to try swin_v2, but at that time I couldn't find an implementation of it that supports inputs of arbitrary size (they were originally trained at 384x384 and 224x224 on ImageNet).
If you know of a good implementation, I'm still willing to try it in the future.

sergiev · 2025-01-17T12:22:14Z

From the Swin v2 paper abstract:

We present techniques for scaling Swin Transformer [35] up to 3 billion parameters and making it capable of training with images of up to 1,536×1,536 resolution.

We tackle issues of training instability, and study how to effectively transfer models pre-trained at low resolutions to higher resolution ones...

ZhengPeng7 · 2025-01-17T13:13:02Z

Thanks. I'm aware of that and did try it before, but I ran into some errors that were not easy to resolve.
I'll try it again when I have time.
