Backbone motivation #162

Open
sergiev opened this issue Jan 14, 2025 · 3 comments

@sergiev
sergiev commented Jan 14, 2025

Hello! First of all, thank you for sharing such high-quality work.

I'd like to know the reasons behind the backbone selection. Straight to the point: why was the first version of the shifted-window transformer (swin_v1) chosen as the backbone over swin_v2?

@ZhengPeng7
Owner

I also wanted to try swin_v2, but at that time I couldn't find an implementation of it that supports inputs of arbitrary size (the models were originally trained at 384x384 and 224x224 on ImageNet).
If you know of a good implementation, I'm still willing to try it in the future.

@sergiev
Author

sergiev commented Jan 17, 2025

From the SWINv2 paper abstract:

We present techniques for scaling Swin Transformer [35] up to 3 billion parameters and making it capable of training with images of up to 1,536×1,536 resolution.

We tackle issues of training instability, and study how to effectively transfer models pre-trained at low resolutions to higher resolution ones...

@ZhengPeng7
Owner

Thanks. I'm aware of that and did try it before, but I ran into some errors that were not easy to solve.
I'll try it again when I have time.


2 participants