Hello! First of all, thank you for sharing such high-quality work.
I'd like to know the reasoning behind the backbone selection. Straight to the point: why was the first version of the shifted-window transformer (swin_v1) backbone preferred over swin_v2?
I also wanted to try swin_v2, but at the time I couldn't find an implementation of it that accepts inputs of arbitrary sizes (the original models are trained at 384×384 and 224×224 on ImageNet).
If you have a good version of it, I'm still willing to try it in the future.
From the Swin Transformer V2 paper:

> We present techniques for scaling Swin Transformer [35] up to 3 billion parameters and making it capable of training with images of up to 1,536×1,536 resolution. We tackle issues of training instability, and study how to effectively transfer models pre-trained at low resolutions to higher resolution ones...
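One concrete reason resolution transfer is awkward in swin_v1: its attention uses a learned, discrete relative position bias table whose size is tied to the window size, so checkpoints trained with one window size need bias interpolation to run with another (swin_v2 replaces this with a log-spaced continuous position bias produced by a small MLP). A minimal sketch of the table-size dependency (illustrative only, not code from either repo):

```python
def relative_bias_table_size(window_size: int) -> int:
    # Swin v1 learns a discrete relative position bias table with
    # (2*M - 1)**2 entries per attention head for an M x M window,
    # since relative offsets along each axis range over [-(M-1), M-1].
    return (2 * window_size - 1) ** 2

# Changing the window size (e.g. when fine-tuning at a higher
# resolution) changes the table size, so v1 checkpoints require
# interpolating the learned bias table:
print(relative_bias_table_size(7))   # 169 entries (M=7, typical 224x224 pretraining)
print(relative_bias_table_size(12))  # 529 entries (M=12, typical 384x384 fine-tuning)
```

This is why an off-the-shelf swin_v2 implementation supporting truly arbitrary input sizes was the missing piece, rather than the architecture itself being unsuitable.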