Hi,
Thank you for your excellent work!
I have a question about the tuning procedure for the Grounding DINO model mentioned in your paper. Could you please provide more details on how it was done, e.g., the batch size, the GPU devices used, and whether the text-processing network was frozen?
Best,
Rui