aux and l1 loss may not be worthwhile for topksae #1

@Lewington-pitsos

Description

Just a quick comment: I did a bit of profiling with this code (thanks, by the way; it's great to have a clean reference), and it seems that, for the TopKSAE at least, the aux_loss and the l1_loss do not significantly alter training performance. The l2_loss alone seems to work just as well.

      l2_loss = (x_reconstruct.float() - x.float()).pow(2).mean()  # reconstruction error
      variance = ((x - x.mean(0)) ** 2).mean()                     # input variance (useful for FVU logging)
      l1_norm = acts_topk.float().abs().sum(-1).mean()
      l1_loss = self.cfg["l1_coeff"] * l1_norm                     # sparsity penalty
      l0_norm = (acts_topk > 0).float().sum(-1).mean()             # already capped at k by the TopK activation
      aux_loss = self.get_auxiliary_loss(x, x_reconstruct, acts)
      loss = l2_loss + l1_loss + aux_loss

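For intuition on why the l1 term may be redundant here: the TopK activation already bounds the l0 norm at k by construction, so an extra l1 penalty has little sparsity left to enforce. A minimal NumPy sketch (weights, shapes, and the `topk_mask` helper are all hypothetical, not from the repo) illustrating that the TopK step alone caps the number of active features:

```python
import numpy as np

def topk_mask(acts, k):
    # Zero out all but the k largest activations in each row.
    drop_idx = np.argsort(acts, axis=-1)[:, :-k]  # indices of the smallest (n - k) entries
    out = acts.copy()
    np.put_along_axis(out, drop_idx, 0.0, axis=-1)
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))        # batch of inputs (hypothetical shapes)
W_enc = rng.normal(size=(8, 16))   # encoder weights
W_dec = rng.normal(size=(16, 8))   # decoder weights

acts = np.maximum(x @ W_enc, 0.0)        # ReLU pre-activations
acts_topk = topk_mask(acts, k=4)         # TopK enforces sparsity directly
x_reconstruct = acts_topk @ W_dec

l2_loss = np.mean((x_reconstruct - x) ** 2)
l0_norm = np.mean((acts_topk > 0).sum(-1))  # never exceeds k, with or without an l1 term
```

Under this sketch, training on `l2_loss` alone still yields activations with at most k nonzeros per example, which is consistent with the profiling observation above.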