How can this be deployed with the original Swin?

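Hi, I noticed that the attention part of the model includes a masked self-attention computation. The original Swin's shifted-window attention also applies a mask. If this is used with the original Swin, would the two masks overlap and make deployment difficult?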
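
To make the question concrete, here is a minimal NumPy sketch of how I picture the two masks interacting. It assumes both masks are additive (added to the attention scores before softmax), which is how the original Swin applies its shifted-window mask; whether this model's mask works the same way is my assumption. The names `shift_mask`, `model_mask`, and `masked_attention` are hypothetical and do not come from either codebase.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_attention(q, k, v, *masks):
    """Scaled dot-product attention with any number of additive masks."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    for m in masks:
        scores = scores + m
    return softmax(scores) @ v

rng = np.random.default_rng(0)
n, d = 4, 8  # 4 tokens in one window, head dimension 8
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))

NEG = -100.0  # large negative fill, as in Swin's shifted-window mask

# Hypothetical shifted-window mask: tokens 0-1 and 2-3 come from different regions.
region = np.array([0, 0, 1, 1])
shift_mask = np.where(region[:, None] == region[None, :], 0.0, NEG)

# Hypothetical model-specific mask: token 3 is hidden as a key.
model_mask = np.zeros((n, n))
model_mask[:, 3] = NEG

# Applying the two masks one after another vs. pre-summing them into one.
out_two = masked_attention(q, k, v, shift_mask, model_mask)
out_one = masked_attention(q, k, v, shift_mask + model_mask)
```

If both masks really are additive, the two outputs should match, which would mean the masks could be merged into a single tensor ahead of time. I am not sure that assumption holds for this model, hence the question.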