I hope the authors can address this issue, but if not:
Consider this a warning to anyone thinking about trying out this code: training a model with it takes an inordinate amount of resources and time. Training appears to be set up for at least 4 GPUs at RTX A6000 level (I have access to 2 of them, and their 48 GB of GPU RAM each is not enough). After reducing the batch size to something workable, training takes several times longer than even large-scale models like LXMERT.
Be warned that you'll likely need to invest considerable effort into the code to make it run efficiently.
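For anyone stuck on smaller GPUs, one generic workaround (not part of the authors' code, just a standard technique) is gradient accumulation: run several small micro-batches and average their gradients before stepping, which reproduces the update of one large batch for a mean loss. The toy numbers below are purely illustrative:

```python
# Sketch: gradient accumulation reproduces a large effective batch on small GPUs.
# All values here are hypothetical -- nothing is taken from the repository.
def grad_mse(w, xs, ys):
    """Gradient of mean squared error for the model y = w * x over a batch."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
w = 0.5

# One full batch of 4 (what a 4x A6000 setup might fit in memory at once).
full = grad_mse(w, xs, ys)

# Two equal micro-batches of 2 (what fits on less memory), gradients averaged.
micro = (grad_mse(w, xs[:2], ys[:2]) + grad_mse(w, xs[2:], ys[2:])) / 2

assert abs(full - micro) < 1e-12  # same update direction as the full batch
```

The equivalence holds exactly only for equal-sized micro-batches and a mean-reduced loss; it trades memory for extra forward/backward passes, so it addresses the out-of-memory problem but not the long training time.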