Isn't this a bit misleading? How can you interpret this as moment matching when you're just adding noise from the high variance of the estimator? The whole paper is framed this way, but if you're going to write it that way, shouldn't you show that the method also works for larger M? It's a big oversell...