I am having trouble with multi-GPU training. These are the shapes I observe:
graph_map.shape in fn _train_batch file base: torch.Size([2, 5479]) cuda:0
graph_map.shape: in fn forward file solver torch.Size([1, 5479]) cuda:0
graph_map.shape: in fn forward file solver torch.Size([1, 5479]) cuda:1
It seems that DataParallel splits graph_map along the wrong dimension. Simply setting dim=1 for DataParallel obviously does not work either.
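As far as I understand, DataParallel scatters each input tensor with a chunk along its `dim` argument (default 0), one chunk per GPU. A minimal sketch of why neither dimension works for a [2, 5479] graph_map (the shapes below are just my reading of the logs above, not output from the actual model):

```python
import torch

graph_map = torch.zeros(2, 5479)  # shape reported in _train_batch

# Default dim=0: each replica receives a [1, 5479] slice,
# which matches the [1, 5479] shapes seen inside forward.
chunks_dim0 = torch.chunk(graph_map, chunks=2, dim=0)
print([tuple(c.shape) for c in chunks_dim0])  # [(1, 5479), (1, 5479)]

# dim=1: each replica receives roughly half the columns ([2, 2740] / [2, 2739]),
# which splits the graph itself and invalidates the per-node indices.
chunks_dim1 = torch.chunk(graph_map, chunks=2, dim=1)
print([tuple(c.shape) for c in chunks_dim1])  # [(2, 2740), (2, 2739)]
```

So neither scatter dimension preserves a whole graph per replica, which is presumably why changing `dim` alone cannot fix it.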
How can I train the model on multiple GPUs?