Description
Hi, I have some questions about training speed when using multiple GPUs with MXNet as the backend for Keras. According to https://mxnet.incubator.apache.org/how_to/multi_devices.html: "By default, MXNet partitions a data batch evenly among the available GPUs. Assume a batch size b and assume there are k GPUs, then in one iteration each GPU will perform forward and backward on b/k examples. The gradients are then summed over all GPUs before updating the model." My understanding is that, with the batch size b fixed, each GPU computes gradients on only b/k examples, which should take less time than computing gradients on all b examples on a single GPU. So for the same batch size, each weight update should be faster with multiple GPUs than with one. In my experiments, however, training with multiple GPUs is slower than training with a single GPU.
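To make my expectation concrete, this is the naive per-step cost model I had in mind, plus the gradient-synchronization term I may be underestimating (all numbers below are made up purely for illustration, not measurements):

```python
# Toy cost model: per-step time ~= compute on b/k examples + cost of
# summing gradients across k devices. The constants are invented.
def step_time(batch, n_gpus, t_per_example=1.0, sync_cost_per_gpu=40.0):
    compute = (batch / n_gpus) * t_per_example   # forward+backward on b/k examples
    sync = sync_cost_per_gpu * (n_gpus - 1)      # gradient sum across devices
    return compute + sync

print(step_time(128, 1))  # 128.0 -> single GPU, no sync term
print(step_time(128, 4))  # 152.0 -> compute shrinks to 32, but sync adds 120
```

If the sync term is large relative to the compute term (a big dense model like mine has a lot of gradient data to exchange), this toy model would predict exactly the slowdown I'm seeing, but I'm not sure whether that's the real explanation.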
Below is the relevant part of my code, which builds a fully-connected network:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD

model = Sequential()
model.add(Dropout(0.1, input_shape=(2056,)))
model.add(Dense(2800, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2800, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2800, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(257))
model.summary()

opt = SGD()

# Build the keras-mxnet context list: ['gpu(0)', 'gpu(1)', 'gpu(2)', 'gpu(3)']
NUM_GPU = 4
gpu_list = []
for i in range(NUM_GPU):
    gpu_list.append('gpu(%d)' % i)

batch_size = 128
# my_loss is my own custom loss function
model.compile(loss=my_loss, optimizer=opt, context=gpu_list)
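One thing I can think of is that with batch_size=128 split across 4 GPUs, each device only processes 32 examples per step, which may be too small to keep a GPU busy. A variant I could test (just a sketch of my idea, continuing from the code above; x_train/y_train stand in for my data) keeps the per-GPU batch at 128 by scaling the global batch instead:

```python
# Hypothetical variant: keep 128 examples per GPU by scaling the global batch,
# so with data parallelism each of the NUM_GPU devices sees batch_size / NUM_GPU = 128.
batch_size = 128 * NUM_GPU   # 512 total across 4 GPUs
model.compile(loss=my_loss, optimizer=opt, context=gpu_list)
model.fit(x_train, y_train, batch_size=batch_size)  # x_train/y_train: placeholders for my data
```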
I don't know whether my understanding is right. Why is there no speed improvement with multiple GPUs? Can anyone help me with this? Thanks!