MIPS/expandAtomicBinOp: Remove trailing kill/dead register operands #186055

Merged
wzssyqa merged 1 commit into llvm:main from wzssyqa:mips-compiler-rt-atomic
Mar 13, 2026
@wzssyqa
Contributor

@wzssyqa wzssyqa commented Mar 12, 2026

MachineInstr::addRegisterKilled or MachineInstr::addRegisterDead may append trailing kill/dead register operands to the instruction. These operands name registers that overlap the registers in operands 1-4.

Remove them here, because the assertion that follows expects exactly 5 operands.
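
The trimming idea can be sketched outside of LLVM with a toy model. Everything below is an assumption made for illustration: `ToyOperand`, `toyRegsOverlap`, and `trimTrailingKillDead` are hypothetical names, not LLVM APIs, and the overlap rule (register N overlaps register N+100) only stands in for what `TargetRegisterInfo::regsOverlap` answers for real sub/super-registers. Unlike the patch, the sketch treats every register as physical and stops at a non-register trailing operand.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a machine operand; NOT the LLVM MachineOperand class.
struct ToyOperand {
  bool IsReg = false;
  unsigned Reg = 0;    // hypothetical physical register number
  bool IsKill = false; // models a <kill> flag
  bool IsDead = false; // models a <dead> flag
};

// Assumed overlap rule for the toy model only: register N and its
// "super-register" N+100 overlap, and every register overlaps itself.
bool toyRegsOverlap(unsigned A, unsigned B) { return A % 100 == B % 100; }

// Drop trailing operands beyond `Expected` while they look like implicit
// kill/dead markers: either the trailing operand carries a kill/dead flag
// itself, or it overlaps a kill/dead-flagged register in operands 1-4.
void trimTrailingKillDead(std::vector<ToyOperand> &Ops, std::size_t Expected) {
  while (Ops.size() > Expected) {
    const ToyOperand &Last = Ops.back();
    if (!Last.IsReg)
      break; // the sketch leaves a non-register trailing operand in place

    bool OverlapsKilled = false;
    for (std::size_t I = 1; I <= 4 && I < Ops.size(); ++I) {
      const ToyOperand &Ref = Ops[I];
      if (Ref.IsReg && (Ref.IsKill || Ref.IsDead) &&
          toyRegsOverlap(Last.Reg, Ref.Reg)) {
        OverlapsKilled = true;
        break;
      }
    }

    bool MarkedKillOrDead = Last.IsKill || Last.IsDead;
    if (!OverlapsKilled && !MarkedKillOrDead)
      break; // an unrelated extra operand: stop and leave it for the assert
    Ops.pop_back();
  }
}
```

In this model, a six-operand instruction whose last operand is a killed super-register of operand 2 shrinks back to five operands, while an unflagged, non-overlapping extra operand is left in place, so a later "exactly 5 operands" check would still catch it.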

@llvmbot
Member

llvmbot commented Mar 12, 2026

@llvm/pr-subscribers-backend-mips

Author: YunQiang Su (wzssyqa)


Full diff: https://github.com/llvm/llvm-project/pull/186055.diff

2 Files Affected:

  • (modified) llvm/lib/Target/Mips/MipsExpandPseudo.cpp (+27)
  • (added) llvm/test/CodeGen/Mips/atomic-min-max-LiveVariables.ll (+131)
diff --git a/llvm/lib/Target/Mips/MipsExpandPseudo.cpp b/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
index 78f2e5db40f9d..b39eb9863b042 100644
--- a/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
+++ b/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
@@ -882,6 +882,33 @@ bool MipsExpandPseudo::expandAtomicBinOp(MachineBasicBlock &BB,
   assert((OldVal != Ptr) && "Clobbered the wrong ptr reg!");
   assert((OldVal != Incr) && "Clobbered the wrong reg!");
   if (IsMin || IsMax) {
+    // Remove trailing kill/dead register operands added by
+    // MachineInstr::addRegisterKilled. These are super-register markers that
+    // must correspond to one of the physical registers in operands 1-4.
+    // The kill/dead markers may also appear on preceding subregs.
+    const TargetRegisterInfo *TRI = STI->getRegisterInfo();
+    while (I->getNumOperands() > 5) {
+      auto &Op = I->getOperand(I->getNumOperands() - 1);
+      // Check if this register overlaps with any physical register in
+      // operands 1-4 that has kill/dead marker (i.e., it's a super-register
+      // marker for subregs).
+      bool HasOverlapWithKill = false;
+      for (unsigned i = 1; i <= 4; ++i) {
+        auto &RefOp = I->getOperand(i);
+        if (RefOp.isReg() && RefOp.getReg().isPhysical() &&
+            TRI->regsOverlap(Op.getReg(), RefOp.getReg()) &&
+            (RefOp.isKill() || RefOp.isDead())) {
+          HasOverlapWithKill = true;
+          break;
+        }
+      }
+      // Remove if HasOverlapWithKill is true or Op has kill/dead marker.
+      bool HasKillOrDead = Op.isReg() && Op.getReg().isPhysical() &&
+                           (Op.isKill() || Op.isDead());
+      if (!HasOverlapWithKill && !HasKillOrDead)
+        break;
+      I->removeOperand(I->getNumOperands() - 1);
+    }
 
     assert(I->getNumOperands() == 5 &&
            "Atomics min|max|umin|umax use an additional register");
diff --git a/llvm/test/CodeGen/Mips/atomic-min-max-LiveVariables.ll b/llvm/test/CodeGen/Mips/atomic-min-max-LiveVariables.ll
new file mode 100644
index 0000000000000..d07c7f61ece41
--- /dev/null
+++ b/llvm/test/CodeGen/Mips/atomic-min-max-LiveVariables.ll
@@ -0,0 +1,131 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=mips64-elf -O2 -mcpu=mips64r2 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=R2-O2
+; RUN: llc -mtriple=mips64-elf -O3 -mcpu=mips64r2 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=R2-O3
+; RUN: llc -mtriple=mips64el-elf -O2 -mcpu=mips64r2 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=R2-O2
+; RUN: llc -mtriple=mips64el-elf -O3 -mcpu=mips64r2 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=R2-O3
+; RUN: llc -mtriple=mips64-elf -O2 -mcpu=mips64r6 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=R6-O2
+; RUN: llc -mtriple=mips64-elf -O3 -mcpu=mips64r6 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=R6-O3
+; RUN: llc -mtriple=mips64el-elf -O2 -mcpu=mips64r6 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=R6-O2
+; RUN: llc -mtriple=mips64el-elf -O3 -mcpu=mips64r6 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=R6-O3
+
+declare void @exit(i32) noreturn
+
+define void @test_func(ptr %ptr, i32 %val, i1 %cond) nounwind {
+; R2-O2-LABEL: test_func:
+; R2-O2:       # %bb.0: # %entry
+; R2-O2-NEXT:    move $2, $4
+; R2-O2-NEXT:    sll $1, $6, 0
+; R2-O2-NEXT:    andi $1, $1, 1
+; R2-O2-NEXT:    beqz $1, .LBB0_4
+; R2-O2-NEXT:    sll $4, $5, 0
+; R2-O2-NEXT:  # %bb.1: # %bb5
+; R2-O2-NEXT:    .insn
+; R2-O2-NEXT:  .LBB0_2: # %bb5
+; R2-O2-NEXT:    # =>This Inner Loop Header: Depth=1
+; R2-O2-NEXT:    ll $1, 0($2)
+; R2-O2-NEXT:    sltu $5, $1, $4
+; R2-O2-NEXT:    move $3, $1
+; R2-O2-NEXT:    movn $3, $4, $5
+; R2-O2-NEXT:    sc $3, 0($2)
+; R2-O2-NEXT:    beqz $3, .LBB0_2
+; R2-O2-NEXT:    nop
+; R2-O2-NEXT:  # %bb.3: # %bb5
+; R2-O2-NEXT:    jr $ra
+; R2-O2-NEXT:    nop
+; R2-O2-NEXT:  .LBB0_4: # %bb4
+; R2-O2-NEXT:    daddiu $sp, $sp, -16
+; R2-O2-NEXT:    sd $ra, 8($sp) # 8-byte Folded Spill
+; R2-O2-NEXT:    jal exit
+; R2-O2-NEXT:    nop
+;
+; R2-O3-LABEL: test_func:
+; R2-O3:       # %bb.0: # %entry
+; R2-O3-NEXT:    sll $1, $6, 0
+; R2-O3-NEXT:    move $2, $4
+; R2-O3-NEXT:    andi $1, $1, 1
+; R2-O3-NEXT:    beqz $1, .LBB0_4
+; R2-O3-NEXT:    sll $4, $5, 0
+; R2-O3-NEXT:  # %bb.1: # %bb5
+; R2-O3-NEXT:    .insn
+; R2-O3-NEXT:  .LBB0_2: # %bb5
+; R2-O3-NEXT:    # =>This Inner Loop Header: Depth=1
+; R2-O3-NEXT:    ll $1, 0($2)
+; R2-O3-NEXT:    sltu $5, $1, $4
+; R2-O3-NEXT:    move $3, $1
+; R2-O3-NEXT:    movn $3, $4, $5
+; R2-O3-NEXT:    sc $3, 0($2)
+; R2-O3-NEXT:    beqz $3, .LBB0_2
+; R2-O3-NEXT:    nop
+; R2-O3-NEXT:  # %bb.3: # %bb5
+; R2-O3-NEXT:    jr $ra
+; R2-O3-NEXT:    nop
+; R2-O3-NEXT:  .LBB0_4: # %bb4
+; R2-O3-NEXT:    daddiu $sp, $sp, -16
+; R2-O3-NEXT:    sd $ra, 8($sp) # 8-byte Folded Spill
+; R2-O3-NEXT:    jal exit
+; R2-O3-NEXT:    nop
+;
+; R6-O2-LABEL: test_func:
+; R6-O2:       # %bb.0: # %entry
+; R6-O2-NEXT:    move $2, $4
+; R6-O2-NEXT:    sll $1, $6, 0
+; R6-O2-NEXT:    andi $1, $1, 1
+; R6-O2-NEXT:    beqz $1, .LBB0_4
+; R6-O2-NEXT:    sll $4, $5, 0
+; R6-O2-NEXT:  # %bb.1: # %bb5
+; R6-O2-NEXT:    .insn
+; R6-O2-NEXT:  .LBB0_2: # %bb5
+; R6-O2-NEXT:    # =>This Inner Loop Header: Depth=1
+; R6-O2-NEXT:    ll $1, 0($2)
+; R6-O2-NEXT:    sltu $5, $1, $4
+; R6-O2-NEXT:    seleqz $3, $1, $5
+; R6-O2-NEXT:    selnez $5, $4, $5
+; R6-O2-NEXT:    or $3, $3, $5
+; R6-O2-NEXT:    sc $3, 0($2)
+; R6-O2-NEXT:    beqzc $3, .LBB0_2
+; R6-O2-NEXT:    nop
+; R6-O2-NEXT:  # %bb.3: # %bb5
+; R6-O2-NEXT:    jrc $ra
+; R6-O2-NEXT:  .LBB0_4: # %bb4
+; R6-O2-NEXT:    daddiu $sp, $sp, -16
+; R6-O2-NEXT:    sd $ra, 8($sp) # 8-byte Folded Spill
+; R6-O2-NEXT:    jal exit
+; R6-O2-NEXT:    nop
+;
+; R6-O3-LABEL: test_func:
+; R6-O3:       # %bb.0: # %entry
+; R6-O3-NEXT:    sll $1, $6, 0
+; R6-O3-NEXT:    move $2, $4
+; R6-O3-NEXT:    andi $1, $1, 1
+; R6-O3-NEXT:    beqz $1, .LBB0_4
+; R6-O3-NEXT:    sll $4, $5, 0
+; R6-O3-NEXT:  # %bb.1: # %bb5
+; R6-O3-NEXT:    .insn
+; R6-O3-NEXT:  .LBB0_2: # %bb5
+; R6-O3-NEXT:    # =>This Inner Loop Header: Depth=1
+; R6-O3-NEXT:    ll $1, 0($2)
+; R6-O3-NEXT:    sltu $5, $1, $4
+; R6-O3-NEXT:    seleqz $3, $1, $5
+; R6-O3-NEXT:    selnez $5, $4, $5
+; R6-O3-NEXT:    or $3, $3, $5
+; R6-O3-NEXT:    sc $3, 0($2)
+; R6-O3-NEXT:    beqzc $3, .LBB0_2
+; R6-O3-NEXT:    nop
+; R6-O3-NEXT:  # %bb.3: # %bb5
+; R6-O3-NEXT:    jrc $ra
+; R6-O3-NEXT:  .LBB0_4: # %bb4
+; R6-O3-NEXT:    daddiu $sp, $sp, -16
+; R6-O3-NEXT:    sd $ra, 8($sp) # 8-byte Folded Spill
+; R6-O3-NEXT:    jal exit
+; R6-O3-NEXT:    nop
+entry:
+  br i1 %cond, label %bb5, label %bb4
+
+bb4:
+  call void @exit(i32 %val)
+  unreachable
+
+bb5:
+  %old = atomicrmw umax ptr %ptr, i32 %val monotonic, align 4
+  ret void
+}
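
For readers less familiar with the `ll`/`sc` retry loop that the CHECK lines above expect for `atomicrmw umax`, here is a rough C++ analogue. It is only a sketch under stated assumptions: `fetchUMax` is a hypothetical helper, `std::memory_order_relaxed` is used as the closest C++ counterpart of `monotonic`, and a compare-exchange loop is swapped in for the load-linked/store-conditional pair. It is not the code the backend emits.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: atomically stores max(Target, Val) using unsigned
// comparison and returns the previous value, like `atomicrmw umax`.
std::uint32_t fetchUMax(std::atomic<std::uint32_t> &Target, std::uint32_t Val) {
  std::uint32_t Old = Target.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak reloads Old, so each retry recomputes
  // the maximum from a fresh value, much like branching back to the `ll`.
  while (!Target.compare_exchange_weak(Old, Old < Val ? Val : Old,
                                       std::memory_order_relaxed,
                                       std::memory_order_relaxed)) {
  }
  return Old;
}
```

As in the expected assembly, the store is attempted even when the maximum equals the old value; the loop exits only once the store succeeds.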

@wzssyqa wzssyqa added this to the LLVM 22.x Release milestone Mar 13, 2026
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status Mar 13, 2026
@wzssyqa wzssyqa merged commit d2c6e4c into llvm:main Mar 13, 2026
12 checks passed
@github-project-automation github-project-automation bot moved this from Needs Triage to Done in LLVM Release Status Mar 13, 2026
@wzssyqa
Contributor Author

wzssyqa commented Mar 13, 2026

/cherry-pick d2c6e4c

@llvmbot
Member

llvmbot commented Mar 13, 2026

/pull-request #186288

@llvm-ci

llvm-ci commented Mar 13, 2026

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime-2 running on rocm-worker-hw-02 while building llvm at step 10 "Add check check-libc-amdgcn-amd-amdhsa".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/24463

Here is the relevant piece of the build log for the reference
Step 10 (Add check check-libc-amdgcn-amd-amdhsa) failure: test (failure)
...
[ RUN      ] LlvmLibcStrtoumaxTest.DecodeInOtherBases
[       OK ] LlvmLibcStrtoumaxTest.DecodeInOtherBases (70 ms)
[ RUN      ] LlvmLibcStrtoumaxTest.CleanBaseSixteenDecode
[       OK ] LlvmLibcStrtoumaxTest.CleanBaseSixteenDecode (45 us)
[ RUN      ] LlvmLibcStrtoumaxTest.MessyBaseSixteenDecode
[       OK ] LlvmLibcStrtoumaxTest.MessyBaseSixteenDecode (9 us)
[ RUN      ] LlvmLibcStrtoumaxTest.AutomaticBaseSelection
[       OK ] LlvmLibcStrtoumaxTest.AutomaticBaseSelection (18 us)
Ran 7 tests.  PASS: 7  FAIL: 0
[3300/3335] Running hermetic test libc.test.src.inttypes.imaxabs_test.__hermetic__
FAILED: libc/test/src/inttypes/libc.test.src.inttypes.imaxabs_test.__hermetic__.__cmd__ /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-amdgcn-amd-amdhsa-bins/libc/test/src/inttypes/libc.test.src.inttypes.imaxabs_test.__hermetic__.__cmd__ 
cd /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-amdgcn-amd-amdhsa-bins/libc/test/src/inttypes && /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/bin/llvm-gpu-loader /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-amdgcn-amd-amdhsa-bins/libc/test/src/inttypes/libc.test.src.inttypes.imaxabs_test.__hermetic__.__build__
[==========] Running 3 tests from 1 test suite.
[ RUN      ] LlvmLibcImaxAbsTest.Zero
[       OK ] LlvmLibcImaxAbsTest.Zero (1 us)
[ RUN      ] LlvmLibcImaxAbsTest.Positive
[       OK ] LlvmLibcImaxAbsTest.Positive (1 us)
[ RUN      ] LlvmLibcImaxAbsTest.Negative
[       OK ] LlvmLibcImaxAbsTest.Negative (1 us)
Ran 3 tests.  PASS: 3  FAIL: 0
[3301/3335] Running hermetic test libc.test.src.locale.locale_test.__hermetic__
[==========] Running 1 test from 1 test suite.
[ RUN      ] LlvmLibcLocale.DefaultLocale
[       OK ] LlvmLibcLocale.DefaultLocale (2 us)
Ran 1 tests.  PASS: 1  FAIL: 0
[3302/3335] Running hermetic test libc.test.src.locale.localeconv_test.__hermetic__
[==========] Running 1 test from 1 test suite.
[ RUN      ] LlvmLibcLocale.DefaultLocale
[       OK ] LlvmLibcLocale.DefaultLocale (2 us)
Ran 1 tests.  PASS: 1  FAIL: 0
[3303/3335] Running hermetic test libc.test.src.time.nanosleep_test.__hermetic__
[==========] Running 1 test from 1 test suite.
[ RUN      ] LlvmLibcNanosleep.SmokeTest
[       OK ] LlvmLibcNanosleep.SmokeTest (1000 ms)
Ran 1 tests.  PASS: 1  FAIL: 0
ninja: build stopped: subcommand failed.
FAILED: runtimes/CMakeFiles/check-libc-amdgcn-amd-amdhsa /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/CMakeFiles/check-libc-amdgcn-amd-amdhsa 
cd /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-amdgcn-amd-amdhsa-bins && /usr/bin/cmake --build /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-amdgcn-amd-amdhsa-bins/ --target check-libc --config Release
ninja: build stopped: subcommand failed.

@alexrp
Member

alexrp commented Mar 13, 2026

Does this fix #179088?

@wzssyqa
Contributor Author

wzssyqa commented Mar 13, 2026

Oh, yes, it does.

albertbolt1 pushed a commit to albertbolt1/llvm-project that referenced this pull request Mar 13, 2026
…lvm#186055)
