Description
The standard deviation/variance, as calculated in the GaussianMixtureModel class from the DNN output, does not scale with the range of the data. So, if the data's variance at an x is only 0.001, the variance bottoms out before it can reach that value. When it bottoms out, say at 0.05, the parameters/weights/biases become saturated and prevent the means and mixing coefficients from being properly modeled. If the variance scaled with the output range, this could be fixed: a variance of 0.001, in a data set ranging over only 0.01, could scale to something like 0.1, well within the range of the network's output.
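For concreteness, a minimal numeric sketch of the mismatch described above (the floor value of 0.05 and the variance of 0.001 are the illustrative numbers from this description, not measurements from the notebook):

```python
import numpy as np

# The data's local variance is ~0.001 (std dev ~0.032), but the network's
# variance output bottoms out around 0.05 (std dev ~0.224), so every
# predicted Gaussian is roughly 7x too wide for the data.
sigma_true = np.sqrt(0.001)
sigma_floor = np.sqrt(0.05)
print(sigma_floor / sigma_true)  # ~7.1
```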
Scaling from the tanh range could occur either before or after the loss function is calculated. I think it should happen after the loss is calculated, with the variance computed so that it can be accurate within that small range. All output of the GMM must then be scaled back up from the tanh range, since the network will have been trained for that range. Additionally, the component means will need to have their scaling removed for training. A rough sketch of this workflow is shown below.
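One possible shape for the "scale after the loss" approach. This is only a sketch: the function names and the commented-out GMM loss call are assumptions for illustration, not the actual GaussianMixtureModel API from the notebook.

```python
import numpy as np

def to_tanh_space(y, y_min, y_max):
    # Affine map from the data range [y_min, y_max] into tanh's [-1, 1].
    return 2.0 * (y - y_min) / (y_max - y_min) - 1.0

def from_tanh_space(mu_t, sigma_t, y_min, y_max):
    # Undo the affine map for the mixture means; standard deviations only
    # pick up the linear factor (half the data range).
    half_range = 0.5 * (y_max - y_min)
    mu = (mu_t + 1.0) * half_range + y_min
    sigma = sigma_t * half_range
    return mu, sigma

# Training: targets (and component means) stay in tanh space, so the loss
# sees variances that are large relative to [-1, 1] and never bottom out.
# y_t = to_tanh_space(y, y.min(), y.max())
# loss = gmm_negative_log_likelihood(pi, mu_t, sigma_t, y_t)  # hypothetical call

# Inference: every mean/std dev the GMM reports is mapped back to data units.
# mu, sigma = from_tanh_space(mu_t, sigma_t, y_min, y_max)
```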
For more information, see the Jupyter notebook running the GMM.