pow fixup instruction? #51

I found something interesting in the code generated for the `pow` function:
```sh
echo "kernel void test(uint pos [[thread_position_in_grid]], device float* out, const device float2* in) { out[pos] = metal::pow(in[pos].x, in[pos].y); }" | python3 compiler_explorer.py -
```
```
   0: 72091004             get_sr           r2, sr80 (thread_position_in_grid.x)
   4: 0501440e00c43200     device_load      0, i32, xy, r0_r1, u2_u3, r2, unsigned, lsl 1
   c: 3800                 wait             0
   e: 8a0d80c6             log2             r3.cache, r0.cache.abs
  12: 9a8dc6222800         fmul32           r3.cache, r3.discard, r1.cache
  18: 8a0dc6d2             exp2             r3.cache, r3.discard
  1c: 3a81c0222cc61200     fmadd32          r0, r0.discard, r1.discard, r3.discard
  24: 4501400e00c01200     device_store     0, i32, x, r0, u0_u1, r2, unsigned, 0
  2c: 8800                 stop             
```
The only differences from `powr` (which doesn't handle negative x) are the `.abs` on the `log2` input and the `fmadd32` at the end.
I don't think adding the product of the inputs will magically fix up the result for negative numbers (and I ran the function to make sure it actually does calculate `pow`).
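To make that concrete, here is a rough sketch in Python of what the sequence would compute if the final `fmadd32` were an ordinary fused multiply-add. The function names are made up for illustration, the arithmetic runs in double precision rather than float32, and reading the last instruction as plain `x * y + r` is exactly the assumption being questioned:

```python
import math

def powr_sequence(x: float, y: float) -> float:
    """Mirror the log2 / fmul32 / exp2 instructions: 2 ** (y * log2(|x|))."""
    return 2.0 ** (y * math.log2(abs(x)))

def plain_fma_guess(x: float, y: float) -> float:
    """Hypothetical: treat the trailing fmadd32 as an ordinary x * y + r."""
    return x * y + powr_sequence(x, y)

x, y = -2.0, 3.0
print(powr_sequence(x, y))    # magnitude only, sign of x is lost
print(plain_fma_guess(x, y))  # what a plain multiply-add would give
print(math.pow(x, y))         # reference result
```

For `x = -2`, `y = 3` the plain multiply-add reading does not reproduce the reference result, which fits the suspicion that the instruction is doing something other than a normal `fmadd32`.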
The `fmadd32` has bit 52 set (which is currently unused in our decoder), so maybe that's what makes it special?
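A quick way to double-check the bit claim from the hex dump, as a sketch only: it assumes the bytes in the listing are in memory order and that bits are numbered from the least significant bit of the little-endian word. `bit_is_set` is a helper invented here, not part of the decoder:

```python
def bit_is_set(encoding_hex: str, bit: int) -> bool:
    """Return True if `bit` is set in the little-endian instruction word."""
    # Assumption: the listing prints bytes in memory order (lowest address first).
    word = int.from_bytes(bytes.fromhex(encoding_hex), "little")
    return bool((word >> bit) & 1)

# Encoding of the fmadd32 at offset 0x1c, copied from the disassembly above.
FMADD32 = "3a81c0222cc61200"

print(bit_is_set(FMADD32, 52))
```

Under those assumptions the check agrees with the observation: bit 52 falls in the seventh byte (`0x12`) and is set.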

I'm not currently running an OS supported by hwtestbed, so I'll leave actually testing this to someone who is.
