use correct LLVM intrinsic for min/max on floats #153343

Merged
rust-bors[bot] merged 1 commit into rust-lang:main from RalfJung:min-max-fix on Mar 15, 2026
@RalfJung
Member

@RalfJung RalfJung commented Mar 3, 2026

The Rust minnum/maxnum intrinsics are documented to return the other argument when one input is an SNaN. However, the LLVM lowering we currently choose for them does not match those semantics: we lower them to minnum/maxnum, which (since llvm/llvm-project#172012) is documented to non-deterministically return the other argument or NaN when one input is an SNaN.

LLVM does have an intrinsic with the intended semantics: minimumnum/maximumnum. Let's use that instead. We can set the nsz flag since we treat signed zero ordering as non-deterministic.

Also rename the intrinsics to follow the IEEE 754-2019 naming, since that is mostly (and in particular, as far as NaNs are concerned) now what we do. Also, minimum_number and minimum are less easy to mix up than minnum and minimum.
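For readers unfamiliar with the intended behaviour, here is a minimal reference model of the semantics described above, written as plain Rust. The function name `minimum_number_model` is hypothetical and this is only a sketch of the documented behaviour, not the compiler's lowering. It picks one fixed answer for the signed-zero case, which the description above treats as non-deterministic.

```rust
/// Reference model of the documented `minimum_number` behaviour for `f32`.
/// Hypothetical helper for illustration; not the compiler's implementation.
fn minimum_number_model(x: f32, y: f32) -> f32 {
    if x.is_nan() {
        // x is a NaN (quiet or signaling): return the other argument.
        y
    } else if y.is_nan() {
        // y is a NaN: return the other argument.
        x
    } else if x < y {
        x
    } else {
        // Also covers x == y. For -0.0 vs +0.0 either answer is allowed;
        // this model happens to return y.
        y
    }
}

fn main() {
    // A signaling NaN: exponent all ones, quiet bit (bit 22) clear, non-zero payload.
    let snan = f32::from_bits(0x7fa0_0000);
    assert_eq!(minimum_number_model(snan, -9.0), -9.0);
    assert_eq!(minimum_number_model(-9.0, snan), -9.0);
    assert_eq!(minimum_number_model(1.0, 2.0), 1.0);
    // Only when both inputs are NaN is the result a NaN.
    assert!(minimum_number_model(snan, f32::NAN).is_nan());
    println!("ok");
}
```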

r? @nikic
Cc @tgross35
Fixes #149537
Fixes #151286
(The issues are only fixed when using the latest supported LLVM, but I don't think we usually track problems specific to people compiling rustc with old versions of LLVM)

@rustbot
Collaborator

rustbot commented Mar 3, 2026

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @oli-obk, @lcnr

Some changes occurred to the CTFE machinery

cc @oli-obk, @lcnr

rustbot added the A-LLVM, S-waiting-on-review, T-compiler, and T-libs labels on Mar 3, 2026
);
// `nsz` in minimumnum/maximumnum is special: its only effect is to make signed-zero
// ordering non-deterministic.
unsafe { llvm::LLVMRustSetNoSignedZeros(call) };
Member Author

I have no idea if the way I wired up nsz is correct.^^

Contributor

@nikic nikic left a comment

r=me once the questions for other backends are answered.

  if (auto I = dyn_cast<Instruction>(unwrap<Value>(V))) {
    I->setHasNoSignedZeros(true);
  }
}
Contributor

The C bindings have a native LLVMSetFastMathFlags(), we should probably switch to that. But I guess we should do that consistently for the existing LLVMRustSetAlgebraicMath/LLVMRustSetAllowReassoc/LLVMRustSetFastMath as well, so I don't particularly mind this in the meantime.

Member Author

Yeah I don't know why we have a separate wrapper for each flag configuration here, but I figured I'd follow the existing pattern.

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

There are some odd things happening in CI

2026-03-03T12:31:32.4740066Z rustc-LLVM ERROR: Cannot select: 0xff67d41c22a0: f128 = fcanonicalize nsz 0xff67d41c2a10
2026-03-03T12:31:32.4740629Z   0xff67d41c2a10: f128 = AArch64ISD::CSEL 0xff67d41c5660, 0xff67d41c52e0, Constant:i32<11>, 0xff67d41c2930:1

Why did fcanonicalize end up with nsz? That was meant just for minimumnum.
And also it seems like f128 minimumnum is just broken on aarch64?

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

That was on the aarch64-gnu-llvm-20-1 runner. Maybe we have to fall back to minnum/maxnum for old LLVM versions?

@nikic
Contributor

nikic commented Mar 3, 2026

> That was on the aarch64-gnu-llvm-20-1 runner. Maybe we have to fall back to minnum/maxnum for old LLVM versions?

Ah yes, that's a good point. I believe minimumnum used to have some selection failures that were only fixed in LLVM 22.
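A version gate of the kind discussed here might look roughly like the sketch below. Everything in it is hypothetical: the `Lowering` enum, the `choose_lowering` function, and the constant are illustrative names, and the threshold of 22 is taken only from the remark above that the selection failures were fixed in LLVM 22. It does not show how rustc actually makes this decision.

```rust
/// How a float min/max might be emitted. Hypothetical type for illustration.
#[derive(Debug, PartialEq, Eq)]
enum Lowering {
    /// Use LLVM's `minimumnum`/`maximumnum` intrinsic (with `nsz`).
    Intrinsic,
    /// Use some fallback that avoids the intrinsic on older LLVM.
    Fallback,
}

/// Assumed threshold, based on the comment that the selection failures
/// were only fixed in LLVM 22.
const FIRST_GOOD_LLVM_MAJOR: u32 = 22;

/// Pick a lowering from the LLVM major version the compiler was built against.
fn choose_lowering(llvm_major: u32) -> Lowering {
    if llvm_major >= FIRST_GOOD_LLVM_MAJOR {
        Lowering::Intrinsic
    } else {
        Lowering::Fallback
    }
}

fn main() {
    for major in [20, 21, 22] {
        println!("LLVM {major}: {:?}", choose_lowering(major));
    }
}
```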

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

I guess that makes sense, the test fails on old LLVM where we (have to) use the wrong intrinsic.

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

@bors try jobs=x86_64-gnu,aarch64-gnu

rust-bors bot pushed a commit that referenced this pull request Mar 3, 2026
rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering


try-job: x86_64-gnu
try-job: aarch64-gnu
@RalfJung
Member Author

RalfJung commented Mar 3, 2026

This is very strange: I tested the fallback implementation locally and it passes. Why does it fail on the aarch64 runner?

And it's also a very strange return value. The inputs are from_bits(0x7fbfffff) and -9.0 and the output is from_bits(0x7fffffff)?!?
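One way to read those two bit patterns is to decode them by hand. The sketch below uses only integer arithmetic and assumes the usual IEEE 754 binary32 layout in which bit 22 is the quiet bit (true on x86 and AArch64). The helper names are hypothetical. Under that assumption the input is a signaling NaN and the output is the same NaN with the quiet bit set, which suggests (though it does not prove) that the NaN was quieted and returned instead of -9.0.

```rust
/// Exponent field of an `f32` (bits 23..=30).
const EXP_MASK: u32 = 0x7f80_0000;
/// Mantissa field of an `f32` (bits 0..=22).
const MANT_MASK: u32 = 0x007f_ffff;
/// Most significant mantissa bit; assumed to be the NaN "quiet" bit.
const QUIET_BIT: u32 = 0x0040_0000;

/// A NaN has an all-ones exponent and a non-zero mantissa.
fn is_nan_bits(bits: u32) -> bool {
    (bits & EXP_MASK) == EXP_MASK && (bits & MANT_MASK) != 0
}

/// A signaling NaN is a NaN whose quiet bit is clear.
fn is_signaling_nan_bits(bits: u32) -> bool {
    is_nan_bits(bits) && (bits & QUIET_BIT) == 0
}

/// Quieting a NaN sets the quiet bit and keeps the rest of the payload.
fn quieted(bits: u32) -> u32 {
    bits | QUIET_BIT
}

fn main() {
    let input: u32 = 0x7fbf_ffff;
    let output: u32 = 0x7fff_ffff;
    assert!(is_signaling_nan_bits(input));
    assert!(is_nan_bits(output) && !is_signaling_nan_bits(output));
    assert_eq!(quieted(input), output);
    println!("{input:#010x} quieted is {output:#010x}");
}
```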

@rust-bors
Contributor

rust-bors bot commented Mar 3, 2026

☀️ Try build successful (CI)
Build commit: 5fc4d3f (5fc4d3f9142818f2f2b292605ba61c2b9b55f112, parent: 1b7d722f429f09c87b08b757d89c689c6cf7f6e7)

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

now with the commit that always forces the fallback impl to be used
@bors try jobs=x86_64-gnu,aarch64-gnu,x86_64-gnu-gcc

rust-bors bot pushed a commit that referenced this pull request Mar 3, 2026
rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering


try-job: x86_64-gnu
try-job: aarch64-gnu
try-job: x86_64-gnu-gcc
@RalfJung
Member Author

RalfJung commented Mar 3, 2026

Seems like LLVM 20 straight-up miscompiles code like this:

fn minimum_num(x: f32, y: f32) -> f32 {
    if x.is_nan() || y >= x {
        y
    } else {
        // Either y < x or y is a NaN.
        x
    }
}

const SNAN: f32 = f32::from_bits(f32::NAN.to_bits() - 1);

fn main() {
    dbg!(minimum_num(-9.0, std::hint::black_box(SNAN)));
}

I tried this on an aarch64 dev desktop: with Rust 1.87, an optimized build prints NaN; with the latest stable Rust, it prints -9.0.
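The reason -9.0 is the expected answer is that every ordered comparison involving a NaN evaluates to false, so the condition in the snippet is false for `x = -9.0` and a NaN `y`, and the `else` branch returns `x`. A small check of that reasoning, using a hypothetical helper `takes_else_branch` that mirrors the snippet's condition (a quiet NaN is used here for simplicity):

```rust
/// Mirrors the condition of the snippet above: returns true when the
/// `else` branch (which returns `x`) would be taken.
/// Hypothetical helper for illustration only.
fn takes_else_branch(x: f32, y: f32) -> bool {
    !(x.is_nan() || y >= x)
}

fn main() {
    let nan = f32::NAN;
    // y is NaN: `y >= x` is false, so the else branch returns x.
    assert!(takes_else_branch(-9.0, nan));
    // x is NaN: the first branch returns y.
    assert!(!takes_else_branch(nan, -9.0));
    // Ordinary numbers: y >= x returns y, otherwise x.
    assert!(!takes_else_branch(1.0, 2.0));
    assert!(takes_else_branch(2.0, 1.0));
    println!("ok");
}
```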

How do we handle library tests that trigger miscompilations on old LLVM versions...? We could just remove the black_box, but -- it'd be a shame to reduce test coverage on newer LLVM just because we also still test old LLVM.

Are we anywhere close to dropping LLVM 20? :D

rust-bors bot added the S-waiting-on-author label on Mar 15, 2026
@rust-bors
Contributor

rust-bors bot commented Mar 15, 2026

💔 Test for ac14465 failed: CI. Failed jobs:

@rustbot
Collaborator

rustbot commented Mar 15, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@RalfJung
Member Author

Well, let's just land it with the fallback impl for LLVM 21 then, since minimumnum apparently still has issues. #153866 landed so this should be good to go.

@bors r=nikic rollup=never

@rust-bors
Contributor

rust-bors bot commented Mar 15, 2026

📋 This PR cannot be approved because it currently has the following label: S-blocked.

RalfJung removed the S-waiting-on-author and S-blocked labels on Mar 15, 2026
@RalfJung
Member Author

@bors r=nikic rollup=never

@rust-bors
Contributor

rust-bors bot commented Mar 15, 2026

📌 Commit c7220f4 has been approved by nikic

It is now in the queue for this repository.

rust-bors bot added the S-waiting-on-bors label on Mar 15, 2026
rust-bors bot added the merged-by-bors label and removed the S-waiting-on-bors label on Mar 15, 2026
@rust-bors
Contributor

rust-bors bot commented Mar 15, 2026

☀️ Test successful - CI
Approved by: nikic
Duration: 3h 9m 54s
Pushing f125037 to main...

@rust-bors rust-bors bot merged commit f125037 into rust-lang:main Mar 15, 2026
12 checks passed
@rustbot rustbot added this to the 1.96.0 milestone Mar 15, 2026
@github-actions
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 9e973d8 (parent) -> f125037 (this PR)

Test differences

Show 17 test diffs

Stage 1

  • [ui] tests/ui/float/minmax.rs: [missing] -> ignore (ignored when the LLVM version 21.1.2 is older than 22.0.0) (J2)

Stage 2

  • [ui] tests/ui/float/minmax.rs: [missing] -> ignore (ignored when the LLVM version 21.1.2 is older than 22.0.0) (J0)
  • [ui] tests/ui/float/minmax.rs: [missing] -> pass (J1)

Additionally, 14 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard f125037ccddbeb162bce09213548314988da97a6 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-various-2: 39m 12s -> 45m 8s (+15.1%)
  2. test-various: 1h 51m -> 2h 8m (+14.6%)
  3. pr-check-2: 40m 19s -> 45m 19s (+12.4%)
  4. x86_64-gnu-llvm-21-2: 1h 31m -> 1h 42m (+12.0%)
  5. aarch64-gnu-llvm-21-2: 47m 15s -> 52m 44s (+11.6%)
  6. tidy: 2m 30s -> 2m 45s (+10.1%)
  7. optional-x86_64-gnu-parallel-frontend: 2h 32m -> 2h 47m (+9.6%)
  8. aarch64-gnu: 2h 5m -> 2h 17m (+9.2%)
  9. aarch64-gnu-debug: 1h 9m -> 1h 15m (+8.0%)
  10. dist-s390x-linux: 1h 34m -> 1h 27m (-7.5%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (f125037): comparison URL.

Overall result: ❌ regressions - no action needed

@rustbot label: -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.0%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (secondary 1.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.4% | [0.8%, 1.9%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results (primary -2.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -2.8% | [-2.8%, -2.8%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -2.8% | [-2.8%, -2.8%] | 1 |

Binary size

Results (primary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.2%, 0.2%] | 3 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.1% | [-0.1%, -0.1%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.1% | [-0.1%, 0.2%] | 4 |

Bootstrap: 484.216s -> 479.863s (-0.90%)
Artifact size: 394.84 MiB -> 394.80 MiB (-0.01%)

@RalfJung RalfJung deleted the min-max-fix branch March 18, 2026 07:07
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Mar 28, 2026
…alebzulawski,antoyo

simd_fmin/fmax: make semantics and name consistent with scalar intrinsics

This is the SIMD version of rust-lang#153343: change the documented semantics of the SIMD float min/max intrinsics to that of the scalar intrinsics, and also make the name consistent. The overall semantic change this amounts to is that we restrict the non-determinism: the old semantics effectively mean "when one input is an SNaN, the result non-deterministically is a NaN or the other input"; the new semantics say that in this case the other input must be returned. For all other cases, old and new semantics are equivalent. This means all users of these intrinsics that were correct with the old semantics are still correct: the overall set of possible behaviors has become smaller, no new possible behaviors are being added.

In terms of providers of this API:
- Miri, GCC, and cranelift already implement the new semantics, so no changes are needed.
- LLVM is adjusted to use `minimumnum nsz` instead of `minnum`, thus giving us the new semantics.

In terms of consumers of this API:
- Portable SIMD almost certainly wants to match the scalar behavior, so this is strictly a bugfix here.
- Stdarch mostly stopped using the intrinsic, except on nvptx, where arguably the new semantics are closer to what we actually want than the old semantics (rust-lang/stdarch#2056).

Q: Should there be an `f` in the intrinsic name to indicate that it is for floats? E.g., `simd_fminimum_number_nsz`?

Also see rust-lang#153395.
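The refinement argument in this commit message (the new semantics allow a subset of the behaviours the old semantics allowed) can be illustrated with a toy model. The `Outcome` enum and the three functions below are hypothetical names for this sketch. They model only the single case the message singles out, where exactly one input is an SNaN.

```rust
/// Possible results of a lane when exactly one input is a signaling NaN.
/// Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    /// The non-NaN input is returned.
    OtherInput,
    /// Some NaN is returned.
    Nan,
}

/// Old semantics: either outcome may be observed.
fn old_semantics() -> &'static [Outcome] {
    &[Outcome::OtherInput, Outcome::Nan]
}

/// New semantics: the other input must be returned.
fn new_semantics() -> &'static [Outcome] {
    &[Outcome::OtherInput]
}

/// `new` refines `old` if every behaviour `new` allows was already allowed by `old`.
fn refines(new: &[Outcome], old: &[Outcome]) -> bool {
    new.iter().all(|outcome| old.contains(outcome))
}

fn main() {
    // Code that was correct for every old behaviour stays correct.
    assert!(refines(new_semantics(), old_semantics()));
    // The reverse does not hold: the old semantics allowed more.
    assert!(!refines(old_semantics(), new_semantics()));
    println!("new semantics refine old semantics");
}
```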
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Mar 28, 2026
…alebzulawski,antoyo

simd_fmin/fmax: make semantics and name consistent with scalar intrinsics

rust-timer added a commit that referenced this pull request Mar 29, 2026
Rollup merge of #154043 - RalfJung:simd-min-max, r=Amanieu,calebzulawski,antoyo

simd_fmin/fmax: make semantics and name consistent with scalar intrinsics

github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Mar 29, 2026
…ski,antoyo

simd_fmin/fmax: make semantics and name consistent with scalar intrinsics

Labels

A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. merged-by-bors This PR was explicitly merged by bors. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

8 participants