Hello,
The report mentions that the NLQ is based on GroundNLQ. Could you explain specifically how it is implemented?
Are you replacing the InternVideo features with EgoVideo features?
If so, I would like to conduct a replication experiment using the GroundNLQ code. Could you explain the steps for replacing InternVideo features with EgoVideo features?