Seeking suggestion regarding combining VC features with other visual features

Hi, 
Thanks for your fantastic work! I am trying to apply it to videos by combining the VC features with I3D features, and I am running into a few challenges. First, for each frame of a video I get VC features of size Nx1024, where N is the number of bounding boxes detected in the frame, and this does not match the size of the I3D features. So I have been summing the features of the N bounding boxes elementwise to get a single feature vector of shape 1024.
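To make the setup concrete, here is a minimal sketch of the pooling described above, assuming the features are NumPy arrays. The function names, the random inputs, and the I3D vector size are hypothetical placeholders, not code from the repository. Mean pooling is included only for comparison, since a plain sum grows with the number of boxes N.

```python
import numpy as np


def pool_boxes(vc_feats, mode="sum"):
    """Collapse per-box features of shape (N, D) into one (D,) vector."""
    vc_feats = np.asarray(vc_feats, dtype=np.float32)
    if vc_feats.ndim != 2:
        raise ValueError("expected an array of shape (N, D)")
    if vc_feats.shape[0] == 0:
        # No boxes detected in this frame: fall back to a zero vector.
        return np.zeros(vc_feats.shape[1], dtype=np.float32)
    if mode == "sum":
        return vc_feats.sum(axis=0)   # elementwise addition over boxes
    if mode == "mean":
        return vc_feats.mean(axis=0)  # same direction, independent of N
    raise ValueError(f"unknown mode: {mode}")


def combine(vc_vec, i3d_vec):
    """Concatenate the pooled VC vector with an I3D vector (one option)."""
    return np.concatenate([vc_vec, i3d_vec])


# Hypothetical inputs: 5 boxes for one frame, and an I3D vector whose
# size (1024 here) is an assumption made only for this illustration.
rng = np.random.default_rng(0)
vc_feats = rng.standard_normal((5, 1024)).astype(np.float32)
i3d_vec = rng.standard_normal(1024).astype(np.float32)

frame_vec = pool_boxes(vc_feats, mode="sum")
fused = combine(frame_vec, i3d_vec)
print(frame_vec.shape, fused.shape)
```

Concatenation is used for the fusion step only because it works even when the two feature sizes differ; elementwise fusion would require matching dimensions.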

Do you think this is a good idea? Will the information in the features be preserved if I add them like this? If not, do you have a better suggestion for how to reduce them so that I can combine them with the I3D features?

Thanks!
