Training demands unrealistic amounts of resources #3

@dreichCSL

Description

I hope the authors can address this issue, but if not, this can at least serve as a warning to anyone thinking about trying out this code.

Training a model with the given code takes an inordinate amount of resources and time. Training appears to be set up for at least four RTX A6000-class GPUs (I have access to two, and their 48 GB of GPU RAM each is not enough). When the batch size is reduced to something workable, training takes a multiple of the time that even large-scale models like LXMERT need.
Be warned that you'll likely need to invest considerable effort into the code to make it run efficiently.
