Hi, thanks for your awesome work! I am trying to apply this work to my 4-task regression problem, where the labels for each task have their own range (from 10^-2 to 10^2). To compensate, I am normalizing my outputs so that each task's outputs fall in (0, 1). However, once I do this, the gradients explode to NaN during training for ParetoMTL. When I don't normalize, training goes through, but the task with the smallest labels (10^-2) ends up with a huge loss. Do you have any suggestions on how I might fix this?
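Not from the issue, but for context: a minimal sketch of one common workaround, assuming a PyTorch setup like ParetoMTL's. It min-max normalizes each task's targets onto a comparable scale and clips gradients to guard against NaNs. The helper names (fit_minmax, normalize) and the training-step variables (model, optimizer, x, y) are hypothetical, not the repository's API.

import torch

def fit_minmax(targets: torch.Tensor):
    """Compute per-task (min, max) from the training targets.

    targets: tensor of shape (n_samples, n_tasks).
    """
    t_min = targets.min(dim=0).values
    t_max = targets.max(dim=0).values
    return t_min, t_max

def normalize(targets: torch.Tensor, t_min, t_max, eps=1e-8):
    """Map each task's targets into roughly (0, 1); eps avoids division by zero."""
    return (targets - t_min) / (t_max - t_min + eps)

# Hypothetical usage inside a training step:
# preds = model(x)                                        # shape (batch, 4)
# y_norm = normalize(y, t_min, t_max)
# loss_per_task = ((preds - y_norm) ** 2).mean(dim=0)     # one loss per task
# total = loss_per_task.sum()                             # or ParetoMTL's weighting
# total.backward()
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
# optimizer.step()

With ranges as skewed as 10^-2 to 10^2, log-transforming the targets before scaling may also help keep the per-task gradients numerically stable.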