Description
Some model hyperparameters can be encoded as trainable parameters, and thus explored during hypernet training. For example, the number of hidden channels in a neural network can be encoded using a channel/dropout mask that is itself a parameter of the network.
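As a minimal sketch of the channel-mask idea (all sizes and the soft sigmoid gating are assumptions, not a fixed design): a learnable logit per hidden channel gates that channel's activation, so channels whose gate trains toward zero are effectively pruned and the hidden width becomes a learned quantity.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 inputs, 8 candidate hidden channels.
W = rng.normal(size=(4, 8))
mask_logits = rng.normal(size=8)  # trainable; sigmoid(mask_logits) gates each channel

def hidden(x):
    # Soft channel mask: a gate near 0 effectively prunes its channel,
    # so the "number of hidden channels" is explored during training
    # rather than fixed up front.
    return np.maximum(x @ W, 0.0) * sigmoid(mask_logits)

h = hidden(rng.normal(size=(2, 4)))
assert h.shape == (2, 8)
```

In practice one might anneal the gates toward hard 0/1 values, or add an L1 penalty on the gates to encourage actual pruning; those choices are left open here.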
Other approaches we might consider:
- skip connections and attention mechanisms could select the number of layers
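One way the skip-connection idea could work (a sketch, with assumed sizes and a simple sigmoid gate per layer, not a prescribed mechanism): give each residual layer a learnable gate, so a gate near zero reduces the layer to the identity and the effective depth is selected by training.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(1)
dim, n_layers = 4, 3
Ws = [rng.normal(size=(dim, dim)) for _ in range(n_layers)]
depth_logits = rng.normal(size=n_layers)  # trainable, one gate per layer

def forward(x):
    for W, g in zip(Ws, sigmoid(depth_logits)):
        # Gated residual connection: when g ~ 0 the layer collapses to
        # the identity (skip), so the number of active layers is learned.
        x = x + g * np.tanh(x @ W)
    return x

y = forward(rng.normal(size=(2, dim)))
assert y.shape == (2, dim)
```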
We might also consider this an entirely stochastic component that enriches the variety of hypernetworks: along with sampling z (the stochastic seed), we could sample Bernoulli variables that select different hyperparameters, for instance whether to use batch normalization. This is an attractive avenue because it could encode almost any hyperparameter. However, we would probably need to condition z on the hyperparameter selection: if batch norm is selected, we should only sample seeds that generate batchnorm-relevant weights (theta).
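A sketch of the conditioning step described above (the z dimension, the Bernoulli probability, and conditioning-by-concatenation are all assumptions for illustration): sample the Bernoulli hyperparameter choice first, then make the seed depend on it, so the hypernetwork only ever sees seeds consistent with the selected option.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_architecture_and_seed(z_dim=8, p_batchnorm=0.5):
    # Bernoulli hyperparameter choice (here: use batch norm or not).
    use_bn = bool(rng.random() < p_batchnorm)
    # Condition z on the choice by appending the indicator, so seeds
    # that should generate batchnorm-relevant weights are distinguishable
    # from those that should not.
    z = rng.normal(size=z_dim)
    return np.concatenate([z, [float(use_bn)]]), use_bn

z_cond, use_bn = sample_architecture_and_seed()
assert z_cond.shape == (9,)
```

Concatenation is the simplest form of conditioning; one could instead maintain separate seed distributions per architecture choice, which the issue's wording ("only sample seeds that generate batchnorm relevant weights") also suggests.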