
Conversation

@ChinChangYang
Contributor

Fix the high numerical error in the mish activation (#2359).

Algorithm:

e = exp(x)
mish = x / (1 + 2 / (e * (e + 2)))
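
For context, this is algebraically equivalent to the usual definition mish(x) = x * tanh(softplus(x)): with e = exp(x), tanh(ln(1 + e)) = e * (e + 2) / (e * (e + 2) + 2) = 1 / (1 + 2 / (e * (e + 2))). The following is a minimal NumPy sketch of the two forms (an illustration only, not the Core ML builder code changed in this PR):

import numpy as np

def mish_reference(x):
    # Textbook definition: x * tanh(softplus(x))
    return x * np.tanh(np.log1p(np.exp(x)))

def mish_rewritten(x):
    # Rewritten form used in this change: x / (1 + 2 / (e * (e + 2)))
    e = np.exp(x)
    return x / (1 + 2 / (e * (e + 2)))

x = np.linspace(-10, 10, 1001)
print(np.max(np.abs(mish_reference(x) - mish_rewritten(x))))  # ~1e-16 in float64

One property of the rewritten form worth noting: if exp(x) overflows to +Inf for large positive x, the denominator collapses to 1 and the result is simply x, which matches the asymptotic behavior of mish.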

Evaluation:

In the following experiments, the mean absolute errors are computed using the method described in #2359 (comment).

Before this change, the Neural Engine (NE) backend produces a high numerical error:

Mean Absolute Errors Across Samples:
  var_17:
    NE:  2.955052
    GPU: 0.000998

With the new algorithm, the NE error is comparable to the GPU error:

Mean Absolute Errors Across Samples:
  var_17:
    NE:  0.001744
    GPU: 0.001516

A tester reported that the new mish function generates NaN only when x is -Inf in the float16 format.
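
That behavior follows directly from the formula: exp(-Inf) evaluates to 0, so the denominator 1 + 2 / (e * (e + 2)) becomes +Inf, and -Inf / +Inf is NaN under IEEE 754 rules. A minimal float16 check (an illustration only, not part of the PR):

import numpy as np

with np.errstate(divide="ignore", invalid="ignore"):
    x = np.float16(-np.inf)
    e = np.exp(x)                  # exp(-inf) -> 0.0
    denom = 1 + 2 / (e * (e + 2))  # 2 / 0 -> +inf, so the denominator is +inf
    print(x / denom)               # -inf / +inf -> nan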

Performance:

This change has been adopted in the KataGo Core ML backend (ChinChangYang/KataGo#7). The performance of the KataGo model with the new mish activation (7.15 ms) is similar to that of the original mish implementation (7.03 ms).

Conclusion:

Overall, the change enhances the accuracy and reliability of the mish activation in Core ML models.
