PyTorch implementation of the paper - VQA: Visual Question Answering (https://arxiv.org/pdf/1505.00468.pdf).
$ git clone https://github.com/tbmoon/basic_vqa.git
2. Download and unzip the dataset from the official VQA URL: https://visualqa.org/download.html
$ cd basic_vqa/utils
$ chmod +x download_and_unzip_datasets.csh
$ ./download_and_unzip_datasets.csh
$ python resize_images.py --input_dir='../datasets/Images' --output_dir='../datasets/Resized_Images'
$ python make_vacabs_for_questions_answers.py --input_dir='../datasets'
$ python build_vqa_inputs.py
$ python train.py
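To give a feel for what the vocabulary step prepares before training, here is a minimal sketch of building a question vocabulary and encoding a question with it. It is not the code in make_vacabs_for_questions_answers.py: the function names (tokenize, build_vocab, encode), the regex tokenizer, the min_count threshold, and placing '<UNK>' at index 0 are all assumptions made for illustration.

```python
import re
from collections import Counter

UNK = "<UNK>"  # assumed spelling and position (index 0) of the unknown token


def tokenize(sentence):
    """Lowercase a sentence and split it into simple word tokens (assumed scheme)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def build_vocab(questions, min_count=1):
    """Return a word list with '<UNK>' first, keeping words seen at least min_count times."""
    counts = Counter(tok for q in questions for tok in tokenize(q))
    words = sorted(w for w, c in counts.items() if c >= min_count)
    return [UNK] + words


def encode(question, vocab):
    """Map each token to its vocabulary index; unseen tokens map to the '<UNK>' index."""
    index = {word: i for i, word in enumerate(vocab)}
    return [index.get(tok, index[UNK]) for tok in tokenize(question)]


if __name__ == "__main__":
    vocab = build_vocab(["What color is the cat?", "Is the cat black?"], min_count=2)
    print(vocab)
    print(encode("Is the dog black?", vocab))
```

Words that fall below the threshold, or that never appear in the training questions, all collapse to the same '<UNK>' index.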
Loss and Accuracy on VQA datasets v2
Experiment 0: A prediction of '<UNK>' is accepted as an answer.
Experiment 1: A prediction of '<UNK>' is not accepted as an answer.
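One way to read the difference between the two experiments is sketched below: in the second setting, '<UNK>' is dropped from the candidate answers before the best-scoring one is picked. This is an illustration only and assumes that interpretation. The helper names (pick_answer, accuracy) and the toy scores are hypothetical, and the repository's train.py may implement it differently.

```python
UNK = "<UNK>"  # assumed spelling of the unknown-answer token


def pick_answer(scores, vocab, accept_unk=True):
    """Return the highest-scoring answer, optionally skipping '<UNK>'."""
    candidates = [
        (score, answer)
        for score, answer in zip(scores, vocab)
        if accept_unk or answer != UNK
    ]
    return max(candidates, key=lambda pair: pair[0])[1]


def accuracy(all_scores, targets, vocab, accept_unk=True):
    """Fraction of examples whose picked answer equals the target answer."""
    correct = sum(
        pick_answer(scores, vocab, accept_unk) == target
        for scores, target in zip(all_scores, targets)
    )
    return correct / len(targets)


if __name__ == "__main__":
    vocab = [UNK, "yes", "no"]
    all_scores = [[0.6, 0.3, 0.1], [0.1, 0.7, 0.2]]  # toy per-answer scores
    targets = ["yes", "yes"]
    print(accuracy(all_scores, targets, vocab, accept_unk=True))
    print(accuracy(all_scores, targets, vocab, accept_unk=False))
```

Under this reading, the second setting can only match or beat the first on questions whose target is a real answer, because a wasted '<UNK>' guess is replaced by the next-best candidate.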
Paper implementation
- Paper: VQA: Visual Question Answering
- URL: https://arxiv.org/pdf/1505.00468.pdf
PyTorch tutorial
Preprocessing:
- TensorFlow implementation of N2NMN
- GitHub: https://github.com/ronghanghu/n2nmn
