Open
Description
Hi, thank you very much for the code. I recently tried to train this model but ran into problems when enabling distributed training. When I run train_torch.py directly with the python command, I get the error KeyError: 'LOCAL_RANK'. The environment variables LOCAL_RANK, RANK, and WORLD_SIZE do not seem to be set in my environment. When I set them manually, I can only use one GPU. Did you set any additional parameters to enable distributed training when training this model? Or do you have any insight into the problem I encountered? Thank you very much for your time.
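The KeyError is consistent with a script that reads os.environ['LOCAL_RANK'] directly. The torchrun launcher sets LOCAL_RANK, RANK, and WORLD_SIZE for each worker process, whereas a plain python invocation sets none of them. Below is a minimal sketch of a lookup that falls back to single-process values when the variables are missing. The helper name read_dist_env is hypothetical and not taken from the repository, and whether train_torch.py reads the variables this way is an assumption.

```python
import os


def read_dist_env(env=None):
    """Return (local_rank, rank, world_size) from environment variables.

    Hypothetical helper for illustration. Falls back to single-process
    defaults (0, 0, 1) when the launcher has not set the variables,
    instead of raising KeyError.
    """
    env = os.environ if env is None else env
    local_rank = int(env.get("LOCAL_RANK", 0))
    rank = int(env.get("RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    return local_rank, rank, world_size


if __name__ == "__main__":
    local_rank, rank, world_size = read_dist_env()
    print(f"local_rank={local_rank} rank={rank} world_size={world_size}")
```

If the script is meant to be started by a launcher, something like `torchrun --nproc_per_node=2 train_torch.py` would start two workers with the variables set per process. The worker count of 2 is only an example, and it is an assumption that the script needs no further arguments. Setting the variables by hand for a single python process starts only that one process, which would explain why only one GPU is used.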