- Change 'coin_howto_overlap_captions.pickle' in generate_freq_words_script.py to where the captions are stored.
- Run
  python generate_freq_words_script.py
  in the same directory as xdc_main.py and video_main.py.
- For training, change the files listed in the next section.
- Then run
  python xdc_main.py --exp {path experiment results} --cliplen 64 --batch 5 --workers 16 --modal joint --k 100 --verbose --contrastive --alpha 0.5
  Adding the --contrastive flag runs the contrastive approach. --alpha is the weight of the cross entropy loss; 1 - alpha is the weight of the contrastive loss during training.
- For single-modal approaches, run
  python video_main.py --nmb_cluster 70 --epochs 100 --batch 5 --sobel --workers 16 --modal video_only --exp ./video_only_64 --lr 0.0001 --verbose --cliplen 64
  Switch to --modal text_only for text only.
- Change paths in build_paths in Utils.py
- COIN_Dataset.py defines the Dataset object (the COIN class)
- How to load data:
from COIN_Dataset import COIN
from Utils import build_paths
root, dictionary_pickle, metadata_path = build_paths()
train_dataset = COIN(root, dictionary_pickle, metadata_path, train=True, do_crop=False)
val_dataset = COIN(root, dictionary_pickle, metadata_path, train=False, do_crop=False)
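
The --alpha weighting described for xdc_main.py above can be illustrated with a minimal sketch. It assumes the two losses are blended as a simple weighted sum, as the description suggests. The function name combine_losses is hypothetical and is not taken from the repository, and plain floats stand in for the per-batch loss values.

```python
def combine_losses(cross_entropy_loss, contrastive_loss, alpha=0.5):
    """Blend two loss values.

    alpha weights the cross entropy loss and (1 - alpha) weights the
    contrastive loss. Hypothetical helper for illustration only; the
    actual training code may combine the terms differently.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * cross_entropy_loss + (1.0 - alpha) * contrastive_loss


# Plain floats stand in for the per-batch loss values.
total = combine_losses(cross_entropy_loss=2.0, contrastive_loss=4.0, alpha=0.5)
print(total)  # 3.0
```

Under this reading, --alpha 1.0 would train on the cross entropy loss alone and --alpha 0.0 on the contrastive loss alone.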