Utility of hypernets for integrated hyper-parameter selection  #9

@nathanieljevans

Description

Some model hyper-parameters can be encoded as parameters, and thus can be explored during hypernet training. For example, the number of hidden channels in a neural network can be encoded using a channel/dropout mask that is a parameter of $f_{\theta}$. While users will still need a validation set to select the best-performing functions after training the IEN, they do not have to run multiple training sessions with unique hyper-parameters. This can improve ease of use and may be more efficient (we should evaluate this).
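As a minimal sketch of the channel-mask idea (all names and shapes here are hypothetical, not from our codebase): a sigmoid gate vector is trained alongside the rest of $\theta$, and gates pushed toward 0 prune hidden channels, so the effective width becomes an ordinary parameter of $f_{\theta}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sketch: a hidden layer whose effective width is controlled by a
# learnable gate vector. Gates near 0 prune channels; gates near 1 keep them,
# so "number of hidden channels" is explored as a parameter of f_theta.
MAX_CHANNELS = 8

W1 = rng.normal(size=(4, MAX_CHANNELS))      # input -> hidden weights
W2 = rng.normal(size=(MAX_CHANNELS, 2))      # hidden -> output weights
gate_logits = rng.normal(size=MAX_CHANNELS)  # learnable, trained with the rest of theta

def f_theta(x, gate_logits):
    mask = 1.0 / (1.0 + np.exp(-gate_logits))  # soft channel mask in (0, 1)
    h = np.maximum(x @ W1, 0.0) * mask         # masked hidden activations
    return h @ W2

x = rng.normal(size=(3, 4))
y = f_theta(x, gate_logits)
```

After training, thresholding the gates would recover a discrete channel count for each candidate function, which is what the validation-set selection step would compare.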

Other approaches we might consider:

  • skip connections + attention mechanisms could choose the number of layers
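The skip-connection idea above can be sketched with gated residual blocks (again a hypothetical toy, not our implementation): each block computes `x + g * block(x)`, so a gate near 0 collapses the block to an identity skip and the effective depth becomes a selectable quantity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sketch: depth selection via gated residual blocks. Each block
# computes x + g_i * block_i(x); a gate g_i of 0 reduces the block to an
# identity skip connection, so the effective number of layers is itself a
# trainable (or sampled) quantity.
DIM, N_BLOCKS = 4, 3
Ws = [rng.normal(size=(DIM, DIM)) for _ in range(N_BLOCKS)]

def forward(x, gates):
    for W, g in zip(Ws, gates):
        x = x + g * np.tanh(x @ W)  # g = 0 -> identity (layer skipped)
    return x

x = rng.normal(size=(2, DIM))
shallow = forward(x, gates=[0.0, 0.0, 0.0])  # all blocks skipped: identity
deep = forward(x, gates=[1.0, 1.0, 1.0])     # all blocks active
```

An attention mechanism could produce the gate values instead of fixing them, which is one way to let the network "choose" its own depth.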

We might also treat this as an entirely stochastic component that enriches the variety of hypernetworks: along with sampling z (the stochastic seed), we could randomly sample Bernoulli variables that select different hyper-parameters - for instance, whether to use batch normalization. This is an attractive avenue because we could encode almost any hyper-parameter this way. However, we would probably need to condition z (the stochastic seed) on the hyper-parameter selection: if we use batchnorm, then we should only sample seeds that generate batchnorm-relevant weights ($\theta$).
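One simple way to realize the conditioning described above (a sketch under assumed shapes; the linear "hypernet" stands in for the real one) is to concatenate the sampled Bernoulli flags to z before feeding the hypernetwork, so the generated $\theta$ is always consistent with the sampled configuration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sketch: sample Bernoulli hyper-parameter flags (e.g. "use
# batchnorm?") alongside the stochastic seed z, and condition the hypernet on
# them by concatenation, so it only emits weights matching that configuration.
Z_DIM, N_FLAGS, THETA_DIM = 6, 2, 10
H = rng.normal(size=(Z_DIM + N_FLAGS, THETA_DIM))  # stand-in for the hypernet

def sample_theta():
    z = rng.normal(size=Z_DIM)                  # stochastic seed
    flags = rng.binomial(1, 0.5, size=N_FLAGS)  # Bernoulli hyper-parameter choices
    theta = np.concatenate([z, flags]) @ H      # theta conditioned on (z, flags)
    return flags, theta

flags, theta = sample_theta()
```

At selection time, the validation set would then be used to pick both the best flag configuration and the best seeds within it.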

Labels: enhancement (New feature or request)