
Utility of hypernetworks for global exploration of multimodal loss surfaces  #12

@nathanieljevans

Description


General idea: the hypernetwork effectively initializes an infinite family of weights (finitely many in practice). Optimization should therefore explore a far wider range of local optima. See the figure for a toy example of this idea.

(figure: toy example of hypernetwork weight samples on a multimodal loss surface)
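A minimal NumPy sketch of the idea above: a 1-D multimodal loss, with many sampled initializations standing in for the hypernetwork's weight samples. Running plain gradient descent from each sample ends in several distinct basins rather than a single local optimum. The loss function and all parameters here are illustrative assumptions, not part of the actual implementation.

```python
import numpy as np

# Toy multimodal loss with several local minima of different depths.
def loss(w):
    return np.sin(3.0 * w) + 0.1 * w ** 2

def grad(w):
    return 3.0 * np.cos(3.0 * w) + 0.2 * w

# Stand-in for the hypernetwork: sample many candidate initializations.
rng = np.random.default_rng(0)
w = rng.uniform(-6.0, 6.0, size=200)

# Plain gradient descent from every sample simultaneously.
for _ in range(500):
    w -= 0.05 * grad(w)

# Count the distinct basins reached (rounding merges near-identical minima).
basins = np.unique(np.round(w, 1))
print(len(basins))  # several distinct local optima are explored
```

With a single initialization we would land in exactly one of these basins; sampling many initializations is what buys the global exploration.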

We should do two things to explore this:

  1. Examine the diversity of the learned weights: given a task (dataset + target) with a known multimodal loss, we should be able to use UMAP to visualize the final trained weights and see that the sampled weights cluster in distinct optima. We could also use this to visualize the loss surface by plotting the loss of each weight sample in the UMAP embedding space; that would be a very cool figure.

  2. We can also add a post-hoc weight-selection step that selects learned theta samples minimizing a validation loss. This can be seen as exploring multiple minima and then selecting the weights at the global optimum. See the figure above for a toy example; we select the weights in the deepest trough.
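A sketch of the visualization in item 1, using synthetic weight samples and a plain NumPy PCA projection as a stand-in for UMAP (in practice `umap-learn`'s `UMAP().fit_transform` would replace the projection step). The clusters, dimensions, and loss proxy are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for trained weight samples: three clusters in a
# 10-D weight space, as if samples settled into three distinct optima.
centers = rng.normal(0.0, 5.0, size=(3, 10))
samples = np.vstack([c + rng.normal(0.0, 0.2, size=(100, 10)) for c in centers])

# Hypothetical per-sample loss (distance to the nearest optimum here).
losses = np.min(np.linalg.norm(samples[:, None, :] - centers[None], axis=-1), axis=1)

# 2-D linear projection via PCA; UMAP would replace this step.
X = samples - samples.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
embedding = X @ Vt[:2].T  # (300, 2); scatter-plot these, colored by `losses`
```

Coloring the embedded points by their loss gives the approximate loss-surface figure described above.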
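The post-hoc selection in item 2 reduces to an argmin over per-sample validation losses. A minimal sketch, with hypothetical placeholder arrays for the trained theta samples and their validation losses:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical trained weight samples and their validation losses.
theta_samples = rng.normal(size=(50, 8))     # 50 sampled weight vectors
val_losses = rng.uniform(0.1, 2.0, size=50)  # one validation loss per sample

# Post-hoc selection: keep the sample from the deepest trough.
best = np.argmin(val_losses)
theta_star = theta_samples[best]
```

Evaluating the loss on held-out validation data (rather than training loss) is what makes this a selection among minima instead of just reading off the final training objective.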
