Code for the Comics Pick-a-Panel dataset and baselines.
The dataset is available on Hugging Face: VLR-CVC/ComicsPAP
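As a rough sketch of how the dataset could be pulled from the Hub with the `datasets` library: the dataset ID and the five skill names come from this README, but treating each skill as a separate dataset configuration is an assumption that should be checked against the dataset card. `load_skill` is a hypothetical helper, not part of this repository.

```python
"""Sketch: load one ComicsPAP skill from the Hugging Face Hub."""

DATASET_ID = "VLR-CVC/ComicsPAP"

# Skill names as they appear in the evaluate.py / sft.py commands.
SKILLS = (
    "sequence_filling",
    "char_coherence",
    "visual_closure",
    "text_closure",
    "caption_relevance",
)


def load_skill(skill, split="val", cache_dir=None):
    """Load one skill subset.

    Assumption: each skill is exposed as its own dataset configuration,
    so the skill name is passed as the second argument to load_dataset.
    """
    if skill not in SKILLS:
        raise ValueError(f"unknown skill {skill!r}; expected one of {SKILLS}")
    # Imported lazily so the validation above works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, skill, split=split, cache_dir=cache_dir)
```

Calling `load_skill("sequence_filling", split="val")` would download that subset on first use; `cache_dir` is intended to play the same role as PATH_TO_SAVE_DATASET in the commands.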
Evaluation:

python evaluate.py --split val --skill sequence_filling char_coherence visual_closure text_closure caption_relevance --model PATH_TO_MODEL --eval_batch_size BATCH_SIZE --dataset_cache PATH_TO_SAVE_DATASET --single_image

Training:

python sft.py --skill sequence_filling char_coherence visual_closure text_closure caption_relevance --model PATH_TO_MODEL --batch_size BATCH_SIZE --max_steps TOTAL_STEPS --eval_steps EVAL_AND_SAVE_STEPS --dataset_cache PATH_TO_SAVE_DATASET --single_image