Hi, thank you for sharing this amazing work!
I was visualizing the attention masks as images and noticed that the height and width dimensions seem to be swapped.
I traced through the code and couldn't find any correction logic in the SelfAttention module, so I wanted to confirm whether this is a bug or if I'm misunderstanding something.
`utils/input.py` (lines 35-36):

```python
# modify attention mask for object idx based on the bounding box
def get_attmask_w_box(att_masks, idx, box, image_size):
    x1, y1, x2, y2 = int(np.round(box[0]*image_size)), int(np.round(box[1]*image_size)), int(np.round(box[2]*image_size)), int(np.round(box[3]*image_size))
    att_masks[idx][x1:x2, y1:y2] = 1
    return att_masks
```
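For what it's worth, here is the minimal check I ran (a sketch that reimplements the function quoted above; the box coordinates are made up for illustration). A box spanning the full image width but only the top quarter of its height ends up filling a tall, narrow region of the mask instead of a wide, short one:

```python
import numpy as np

def get_attmask_w_box(att_masks, idx, box, image_size):
    x1, y1, x2, y2 = [int(np.round(c * image_size)) for c in box]
    att_masks[idx][x1:x2, y1:y2] = 1
    return att_masks

masks = np.zeros((1, 64, 64))
# Box with x-extent 0.0-1.0 (full width) and y-extent 0.0-0.25 (top quarter):
# the expected mask region is 16 rows by 64 columns.
masks = get_attmask_w_box(masks, 0, [0.0, 0.0, 1.0, 0.25], 64)
print(np.flatnonzero(masks[0].any(axis=1)))  # filled rows:    0..63 (tall)
print(np.flatnonzero(masks[0].any(axis=0)))  # filled columns: 0..15 (narrow)
```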
`dataset/decode_item.py` (lines 702-703):

```python
if self.return_att_masks:
    box = boxes[i]
    image_size = 64
    x1, y1, x2, y2 = int(np.round(box[0]*image_size)), int(np.round(box[1]*image_size)), int(np.round(box[2]*image_size)), int(np.round(box[3]*image_size))
    att_masks[i][x1:x2, y1:y2] = 1
```
Since NumPy arrays are indexed as `[row, column]`, i.e. `[y, x]`, shouldn't the indexing be `[y1:y2, x1:x2]`? Something like the sketch below.
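```python
# sketch of the proposed swap, assuming masks are laid out as [height, width]:
att_masks[idx][y1:y2, x1:x2] = 1  # rows <- y range, columns <- x range
```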
Could you please confirm if this is intentional or if I'm missing something?
Thank you!