Description
General idea: the hypernetwork effectively initializes infinitely many weights (a large number in practice). Optimization should therefore explore a far wider range of local optima. See the figure for a toy example of this idea.
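A minimal sketch of this idea, under assumptions not in the original: the "hypernetwork" is just a fixed affine map from a latent code to a scalar weight, the loss is a hypothetical 1-D multimodal function, and optimization is plain gradient descent. Sampling many latent codes yields initializations that descend into several distinct local optima rather than one.

```python
import math
import random

def loss(w):
    # Toy multimodal loss: the sine term creates many local minima,
    # the quadratic term keeps iterates bounded. Purely illustrative.
    return math.sin(3.0 * w) + 0.1 * w * w

def grad(w):
    # Analytic gradient of the toy loss above.
    return 3.0 * math.cos(3.0 * w) + 0.2 * w

def hypernet(z, a=2.5, b=0.0):
    # Stand-in "hypernetwork": a fixed affine map from latent z to a weight.
    return a * z + b

random.seed(0)
finals = []
for _ in range(200):
    z = random.gauss(0.0, 1.0)    # sample a latent code
    w = hypernet(z)               # hypernetwork produces an initial weight
    for _ in range(500):          # plain gradient descent on the toy loss
        w -= 0.01 * grad(w)
    finals.append(w)

# Distinct basins reached, rounded to merge near-identical endpoints.
optima = sorted(set(round(w, 2) for w in finals))
print(optima)
```

The printed list contains several distinct values, i.e. the sampled initializations populate multiple minima instead of collapsing to a single one.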
We should do two things to explore this:
- Examine the diversity in learned weights: given a task with a known multimodal loss (dataset + target), we should be able to use UMAP to visualize the final trained weights and see that sampled weights cluster in distinct optima. We could even use this to visualize the loss surface by plotting the loss of each weight sample in the UMAP embedding space; that would be a very cool figure.
- We can also add a post-hoc weight selection step that selects the learned theta samples minimizing a validation loss. This can be seen as exploring multiple minima and then selecting the weights from the global optimum. See the figure above for a toy example; we select the weights in the deepest trough.
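The post-hoc selection step can be sketched as an argmin over sampled weights, scored on held-out data. Everything here is a hypothetical stand-in: a 1-D linear model, synthetic validation pairs, and a hand-picked list of theta samples meant to represent weights recovered from distinct optima.

```python
import random

def val_loss(theta, val_data):
    # Mean squared error of a 1-D linear model y = theta * x on held-out pairs.
    return sum((theta * x - y) ** 2 for x, y in val_data) / len(val_data)

random.seed(1)
true_theta = 1.5
# Synthetic held-out data from the data-generating weight, plus small noise.
val_data = [(x, true_theta * x + random.gauss(0.0, 0.01))
            for x in [-2.0, -1.0, 0.5, 1.0, 2.0]]

# Hypothetical theta samples, as if drawn from distinct trained optima.
theta_samples = [-1.4, 0.2, 1.48, 3.0]

# Post-hoc selection: keep the sample with the lowest validation loss.
best = min(theta_samples, key=lambda t: val_loss(t, val_data))
print(best)  # 1.48, the sample closest to the data-generating weight
```

In the real setup the candidates would be weights sampled from the trained hypernetwork, but the selection rule is the same one-liner: argmin of validation loss over samples.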
