I trained ResNet-18 (a4w4) with the latest version of the repo using the following command: `python main.py -a resnet18 --bit 4 --gpu 0 -b 256`. The best top-1 accuracy I get is only **69.01**, which is about 1.7 points lower than the 70.7 reported in the paper. What could explain the gap?