Exploding gradients #2

Description

@xiyichen

Hi, thanks for your awesome work! I am trying to apply this work to my 4-task regression problem, where the labels for each task span a different range (from roughly 10^-2 to 10^2). To compensate, I normalize the model outputs so that each task's outputs lie in (0, 1). However, once I do this, the gradients explode to NaN during ParetoMTL training. Without the normalization, training runs, but the losses for the task with the smallest labels (~10^-2) are huge. Do you have any suggestions on how I might fix this?
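For context, here is a minimal sketch of one common workaround, assuming a standard PyTorch setup: standardize each task's *labels* to zero mean and unit variance (instead of squashing the model outputs into (0, 1)) and clip gradient norms as a guard against NaNs. All names here (`fit_label_scaler`, `train_labels_per_task`, `preds`, `targets`, `model`) are hypothetical illustrations, not part of this repository's API.

```python
import torch
import torch.nn.functional as F

def fit_label_scaler(train_labels: torch.Tensor):
    """Per-task mean/std, computed once from the training labels."""
    mean = train_labels.mean()
    std = train_labels.std().clamp_min(1e-8)  # guard against near-constant labels
    return mean, std

def scale(y: torch.Tensor, mean: torch.Tensor, std: torch.Tensor) -> torch.Tensor:
    """Map raw labels to zero mean / unit variance."""
    return (y - mean) / std

def unscale(y: torch.Tensor, mean: torch.Tensor, std: torch.Tensor) -> torch.Tensor:
    """Map standardized predictions back to the original label range."""
    return y * std + mean

# Hypothetical usage inside a multi-task training step:
# scalers = [fit_label_scaler(y) for y in train_labels_per_task]
# losses = [F.mse_loss(pred, scale(y, *s))      # per-task losses now share a scale
#           for pred, y, s in zip(preds, targets, scalers)]
# ... ParetoMTL combines `losses` into a single update ...
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```

Scaling the targets once (rather than rescaling outputs every step) keeps each task's MSE, and hence each task's gradient magnitude, on a comparable scale, which may avoid both the NaN blow-up and the dominance of the large-label task.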
