With the newest commit, I am seeing this error when running run_sample_image.sh:
ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as pad_token (tokenizer.pad_token = tokenizer.eos_token e.g.) or add a new pad token via tokenizer.add_special_tokens({'pad_token': '[PAD]'}).
I have a fix for you - add these lines to vision_generation.py at line 60:
if tokenizer.pad_token is None:
tokenizer.pad_token = tokenizer.eos_token
if prefix_tokenizer.pad_token is None:
prefix_tokenizer.pad_token = prefix_tokenizer.eos_token