I ran the experiment on CIFAR-100 using a single GPU:
```
channel_search_distributed.py --dataset=cifar100 --dataset_dir=$DATA_DIR$ --gpu=0 --batch_size=128 --learning_rate=0.15 --arch=resnet_cifar --depth=20 --drop_rate=0.05 --base_drop_rate=0.05
```
But the result I got is:
```
2020-05-28 20:49:53,457 epoch 19 lr 1.000000e-03
2020-05-28 20:49:53,457 drop rates:
2020-05-28 20:49:53,457 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
2020-05-28 20:49:53,999 train 000 9.374524e-01 72.656250 94.531250
2020-05-28 20:49:56,114 train 100 1.043171e+00 69.477104 92.844988
2020-05-28 20:49:58,226 train 200 1.062567e+00 69.212531 92.502332
2020-05-28 20:50:00,341 train 300 1.066392e+00 69.235361 92.366591
2020-05-28 20:50:01,448 train acc 69.302222
2020-05-28 20:50:01,976 valid 000 1.466250e+00 55.468750 85.937500
2020-05-28 20:50:02,282 valid acc 59.540000
2020-05-28 20:50:02,834 valid 000 1.335375e+00 59.375000 87.500000
2020-05-28 20:50:03,379 test acc 60.660000
2020-05-28 20:50:03,936 valid 000 1.663021e+00 54.687500 84.375000
```
The result reported in the paper is 71.57% after search. Can 71.57% be reached by using more epochs and more search iterations?
Another question: after the channel search, the code trains the model from scratch. Can we finetune the searched model instead of training it from scratch?
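To clarify what I mean by finetuning, here is a minimal sketch (the model builder, data loader, and checkpoint path are hypothetical placeholders, not this repo's actual API): keep the weights from the search phase and continue training them at a lower learning rate, instead of re-initializing.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# `build_searched_model` and `train_loader` are hypothetical placeholders,
# not names from this repo.
model = build_searched_model()

# Restore the weights kept from the search phase instead of re-initializing.
# strict=False because the pruned layers may differ in shape from the
# search-time network.
state = torch.load('search_checkpoint.pth', map_location='cpu')
model.load_state_dict(state, strict=False)

criterion = nn.CrossEntropyLoss()
# Use a smaller learning rate than the from-scratch schedule, since the
# weights are already partially trained.
optimizer = optim.SGD(model.parameters(), lr=0.01,
                      momentum=0.9, weight_decay=5e-4)

finetune_epochs = 60  # placeholder budget
for epoch in range(finetune_epochs):
    for images, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
```

Would something along these lines be expected to match the from-scratch accuracy, or is retraining from scratch necessary?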
Thanks for your help.