I trained the model on GrailQA and evaluated it on the dev set. Here are the exact match (EM) results:
Overall : 75.9
IID : 81.8
Comp : 68.82
Zshot : 76.29
Although the overall number matches the one reported in the paper, the IID, Comp, and Zshot numbers seem to follow a different trend than on the test data. Can you please confirm whether the above numbers are correct?
Also, can you please elaborate on how "grail_combined_tiara.json" was created?