Do the authors have any insight into whether nGPT quantizes better than standard GPT?
The faster convergence in FP16, together with all weights and activations being normalized, suggests that it likely would.
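To illustrate the intuition (this is my own toy sketch, not the authors' code, and the outlier pattern is a hypothetical stand-in for the outlier channels often seen in standard transformers): with per-tensor int8 quantization, a weight matrix whose rows have wildly different norms suffers badly, because a few large rows dominate the quantization scale, whereas unit-norm rows, as nGPT maintains, keep the dynamic range tight.

```python
import numpy as np

rng = np.random.default_rng(0)
# Random weight matrix with a couple of large-norm "outlier" rows,
# a hypothetical stand-in for unnormalized GPT weights.
w = rng.normal(size=(64, 256))
w[:2] *= 100.0
# nGPT-style: every row rescaled to unit L2 norm.
w_unit = w / np.linalg.norm(w, axis=1, keepdims=True)

def quantize_int8(x):
    # Symmetric per-tensor int8 quantization: one scale for the whole tensor.
    scale = np.abs(x).max() / 127.0
    return np.round(x / scale).clip(-127, 127) * scale

def mean_row_rel_err(x):
    # Mean per-row relative reconstruction error after quantization.
    xq = quantize_int8(x)
    return float(np.mean(np.linalg.norm(x - xq, axis=1) / np.linalg.norm(x, axis=1)))

err_raw = mean_row_rel_err(w)
err_unit = mean_row_rel_err(w_unit)
print(f"outlier rows: {err_raw:.3f}  unit-norm rows: {err_unit:.4f}")
```

In this toy setting the small rows of the unnormalized matrix are nearly destroyed by the coarse scale, while the row-normalized matrix quantizes with negligible error; of course, whether this carries over to trained nGPT checkpoints is exactly the question.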
Did the authors try this with any of their trained models?