
Conversation

@preames (Collaborator) commented on Oct 30, 2025

We have two mechanisms used in inline spilling for folding a load into a consuming instruction. One is used for stack reloads, the other for other load instructions (usually argument loads). We currently only implement the optimization for the first case, but stackmaps have generic support in target-independent code for the second. We can go ahead and set the flag to enable that optimization.

The primary motivation for this is that if we enable load rematerialization without it, we run into crashes where we can't make progress through rematerialization.

We probably should also enable the other foldMemoryOperand hook for RISCV-specific instructions, but that's a separate optimization.
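
For reference, the two mechanisms are the two foldMemoryOperand overloads in TargetInstrInfo, each backed by a corresponding target foldMemoryOperandImpl hook: one takes a stack slot (frame index), the other takes the load instruction itself. Below is a declaration sketch paraphrased from llvm/include/llvm/CodeGen/TargetInstrInfo.h; parameter lists may differ slightly from the current tree.

```cpp
// Declaration sketch only, paraphrased from TargetInstrInfo.h; not verbatim.
class TargetInstrInfo {
public:
  // Variant 1: fold a reload from a stack slot (FrameIndex) into MI.
  // This is the path RISC-V already implements (see getFoldedOpcode below).
  MachineInstr *foldMemoryOperand(MachineInstr &MI, ArrayRef<unsigned> Ops,
                                  int FrameIndex, LiveIntervals *LIS = nullptr,
                                  VirtRegMap *VRM = nullptr) const;

  // Variant 2: fold an arbitrary load instruction (e.g. an argument load)
  // into MI. Stackmap/patchpoint users are handled generically in
  // target-independent code; other users defer to the target hook.
  MachineInstr *foldMemoryOperand(MachineInstr &MI, ArrayRef<unsigned> Ops,
                                  MachineInstr &LoadMI,
                                  LiveIntervals *LIS = nullptr) const;

protected:
  // Target hooks mirroring the two variants. This patch does not add a
  // RISC-V implementation of the LoadMI form; it only sets canFoldAsLoad so
  // the generic stackmap support can kick in.
  virtual MachineInstr *foldMemoryOperandImpl(
      MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,
      MachineBasicBlock::iterator InsertPt, int FrameIndex,
      LiveIntervals *LIS = nullptr, VirtRegMap *VRM = nullptr) const;
  virtual MachineInstr *foldMemoryOperandImpl(
      MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,
      MachineBasicBlock::iterator InsertPt, MachineInstr &LoadMI,
      LiveIntervals *LIS = nullptr) const;
};
```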

@llvmbot (Member) commented on Oct 30, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Philip Reames (preames)

Changes

We have two mechanisms used in inline spilling for folding a load into a consuming instruction. One is used for stack reloads, the other for other load instructions (usually argument loads). We currently only implement the optimization for the first case, but stackmaps have generic support in target-independent code for the second. We can go ahead and set the flag to enable that optimization.

The primary motivation for this is that if we enable load rematerialization without it, we run into crashes where we can't make progress through rematerialization.

We probably should also enable the other foldMemoryOperand hook for RISCV-specific instructions, but that's a separate optimization.


Full diff: https://github.com/llvm/llvm-project/pull/165761.diff

5 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+1-1)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.td (+3-1)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoD.td (+1)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoF.td (+1)
  • (modified) llvm/test/CodeGen/RISCV/rv64-stackmap.ll (+4-4)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 912b82d294f44..069f2b0c1e885 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -869,7 +869,7 @@ std::optional<unsigned> getFoldedOpcode(MachineFunction &MF, MachineInstr &MI,
   }
 }
 
-// This is the version used during inline spilling
+// This is the version used during InlineSpiller::spillAroundUses
 MachineInstr *RISCVInstrInfo::foldMemoryOperandImpl(
     MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,
     MachineBasicBlock::iterator InsertPt, int FrameIndex, LiveIntervals *LIS,
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index 7c89686ebfb3c..9cb53fb27a2d2 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -768,7 +768,7 @@ def BGE : BranchCC_rri<0b101, "bge">;
 def BLTU : BranchCC_rri<0b110, "bltu">;
 def BGEU : BranchCC_rri<0b111, "bgeu">;
 
-let IsSignExtendingOpW = 1 in {
+let IsSignExtendingOpW = 1, canFoldAsLoad = 1 in {
 def LB : Load_ri<0b000, "lb">, Sched<[WriteLDB, ReadMemBase]>;
 def LH : Load_ri<0b001, "lh">, Sched<[WriteLDH, ReadMemBase]>;
 def LW : Load_ri<0b010, "lw">, Sched<[WriteLDW, ReadMemBase]>;
@@ -889,8 +889,10 @@ def CSRRCI : CSR_ii<0b111, "csrrci">;
 /// RV64I instructions
 
 let Predicates = [IsRV64] in {
+let canFoldAsLoad = 1 in {
 def LWU : Load_ri<0b110, "lwu">, Sched<[WriteLDW, ReadMemBase]>;
 def LD : Load_ri<0b011, "ld">, Sched<[WriteLDD, ReadMemBase]>;
+}
 def SD : Store_rri<0b011, "sd">, Sched<[WriteSTD, ReadStoreData, ReadMemBase]>;
 
 let IsSignExtendingOpW = 1 in {
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoD.td b/llvm/lib/Target/RISCV/RISCVInstrInfoD.td
index afac37d6337d4..4ffe3e62ac501 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoD.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoD.td
@@ -71,6 +71,7 @@ defvar DExtsRV64 = [DExt, ZdinxExt];
 //===----------------------------------------------------------------------===//
 
 let Predicates = [HasStdExtD] in {
+let canFoldAsLoad = 1 in
 def FLD : FPLoad_r<0b011, "fld", FPR64, WriteFLD64>;
 
 // Operands for stores are in the order srcreg, base, offset rather than
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoF.td b/llvm/lib/Target/RISCV/RISCVInstrInfoF.td
index 6571d998246a7..b30f8ec820c15 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoF.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoF.td
@@ -330,6 +330,7 @@ class PseudoFROUND<DAGOperand Ty, ValueType vt, ValueType intvt = XLenVT>
 //===----------------------------------------------------------------------===//
 
 let Predicates = [HasStdExtF] in {
+let canFoldAsLoad = 1 in
 def FLW : FPLoad_r<0b010, "flw", FPR32, WriteFLD32>;
 
 // Operands for stores are in the order srcreg, base, offset rather than
diff --git a/llvm/test/CodeGen/RISCV/rv64-stackmap.ll b/llvm/test/CodeGen/RISCV/rv64-stackmap.ll
index c3183a1a3e036..9aefa90684dd3 100644
--- a/llvm/test/CodeGen/RISCV/rv64-stackmap.ll
+++ b/llvm/test/CodeGen/RISCV/rv64-stackmap.ll
@@ -38,8 +38,8 @@
 ; CHECK-NEXT: .quad liveConstant
 ; CHECK-NEXT: .quad 0
 ; CHECK-NEXT: .quad 1
-; CHECK-NEXT: .quad spilledValue
-; CHECK-NEXT: .quad 144
+; CHECK-NEXT: .quad liveArgs
+; CHECK-NEXT: .quad 0
 ; CHECK-NEXT: .quad 1
 ; CHECK-NEXT: .quad directFrameIdx
 ; CHECK-NEXT: .quad 48
@@ -278,7 +278,7 @@ define void @liveConstant() {
 ;
 ; Verify 28 stack map entries.
 ;
-; CHECK-LABEL: .word .L{{.*}}-spilledValue
+; CHECK-LABEL: .word .L{{.*}}-liveArgs
 ; CHECK-NEXT: .half 0
 ; CHECK-NEXT: .half 28
 ;
@@ -290,7 +290,7 @@ define void @liveConstant() {
 ; CHECK-NEXT: .half 2
 ; CHECK-NEXT: .half 0
 ; CHECK-NEXT: .word
-define void @spilledValue(i64 %arg0, i64 %arg1, i64 %arg2, i64 %arg3, i64 %arg4, i64 %l0, i64 %l1, i64 %l2, i64 %l3, i64 %l4, i64 %l5, i64 %l6, i64 %l7, i64 %l8, i64 %l9, i64 %l10, i64 %l11, i64 %l12, i64 %l13, i64 %l14, i64 %l15, i64 %l16, i64 %l17, i64 %l18, i64 %l19, i64 %l20, i64 %l21, i64 %l22, i64 %l23, i64 %l24, i8 %l25, i16 zeroext %l26, i32 signext %l27) {
+define void @liveArgs(i64 %arg0, i64 %arg1, i64 %arg2, i64 %arg3, i64 %arg4, i64 %l0, i64 %l1, i64 %l2, i64 %l3, i64 %l4, i64 %l5, i64 %l6, i64 %l7, i64 %l8, i64 %l9, i64 %l10, i64 %l11, i64 %l12, i64 %l13, i64 %l14, i64 %l15, i64 %l16, i64 %l17, i64 %l18, i64 %l19, i64 %l20, i64 %l21, i64 %l22, i64 %l23, i64 %l24, i8 %l25, i16 zeroext %l26, i32 signext %l27) {
 entry:
   call void (i64, i32, ptr, i32, ...) @llvm.experimental.patchpoint.void(i64 11, i32 28, ptr null, i32 5, i64 %arg0, i64 %arg1, i64 %arg2, i64 %arg3, i64 %arg4, i64 %l0, i64 %l1, i64 %l2, i64 %l3, i64 %l4, i64 %l5, i64 %l6, i64 %l7, i64 %l8, i64 %l9, i64 %l10, i64 %l11, i64 %l12, i64 %l13, i64 %l14, i64 %l15, i64 %l16, i64 %l17, i64 %l18, i64 %l19, i64 %l20, i64 %l21, i64 %l22, i64 %l23, i64 %l24, i8 %l25, i16 %l26, i32 %l27)
   ret void
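
As a note on where the new canFoldAsLoad bit gets consumed: during inline spilling, the rematerialization path first checks that flag on the defining load and then tries to fold the load straight into its single user instead of materializing it into a fresh register. Below is a minimal sketch of that decision using a hypothetical tryFoldRematLoad helper; it is an illustration against the TargetInstrInfo API shown earlier, not the upstream InlineSpiller code.

```cpp
#include "llvm/ADT/ArrayRef.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/TargetInstrInfo.h"

using namespace llvm;

// Hypothetical helper for illustration: try folding the original load that
// defines a rematerializable value directly into its user. For stackmap and
// patchpoint users the folding itself is done by target-independent code,
// which is why just setting canFoldAsLoad on LB/LH/LW/LWU/LD/FLW/FLD is
// enough to unblock this path on RISC-V.
static bool tryFoldRematLoad(const TargetInstrInfo &TII, MachineInstr &UseMI,
                             ArrayRef<unsigned> Ops, MachineInstr &OrigLoad) {
  if (!OrigLoad.canFoldAsLoad())
    return false; // Flag not set: fall back to remat into a register.
  return TII.foldMemoryOperand(UseMI, Ops, OrigLoad) != nullptr;
}
```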
@preames requested review from lukel97 and topperc on October 30, 2025, 18:24
@lukel97 (Contributor) left a comment:

LGTM

@preames enabled auto-merge (squash) on October 31, 2025, 14:10
@preames merged commit e72876a into llvm:main on Oct 31, 2025
7 of 9 checks passed
@topperc (Collaborator) commented on Oct 31, 2025

Was the title supposed to say "Mark" instead of "Mask"?

@preames (Collaborator, Author) commented on Oct 31, 2025

> Was the title supposed to say "Mark" instead of "Mask"?

Yep, that was a typo.

DEBADRIBASAK pushed a commit to DEBADRIBASAK/llvm-project that referenced this pull request Nov 3, 2025
…lvm#165761)

We have two mechanisms used in inline spilling for folding a load into a consuming instruction. One is used for stack reloads, the other for other load instructions (usually argument loads). We currently only implement optimizations for the first case, but stackmaps have generic support in target independent code for the other. We can go ahead and set the flag to enable that optimization.

The primary motivation for this is that if we enable load rematerialization without it, we run into crashes where we can't make progress through rematerialization.

We probably should enable the other foldMemoryOperand hook for RISCV specific instructions, but that's a separate optimization.
lukel97 added a commit that referenced this pull request Nov 24, 2025
In some workloads we see an argument passed on the stack where it is loaded, only for it to be immediately spilled to a different slot on the stack and then reloaded from that spill slot later on. We can avoid the unnecessary spill by marking loads as rematerializable and just directly loading from where the argument was originally passed on the stack.

TargetTransformInfo::isReMaterializableImpl checks to make sure that any loads are `MI.isDereferenceableInvariantLoad()`, so we should be able to move the load down to the remat site.

This gives a 14.8% reduction in spills in 544.nab_r on rva23u64 -O3, and a few other smaller reductions on llvm-test-suite. I didn't find any benchmarks where the number of spills/reloads increased.

Related: #165761
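
The safety condition referenced in that commit message is queryable directly on the instruction. Below is a minimal sketch of the kind of check involved, using a hypothetical isSafeToRematLoad helper for illustration; it is not the actual remat hook, and exact signatures may differ between LLVM releases.

```cpp
#include "llvm/CodeGen/MachineInstr.h"

using namespace llvm;

// Hypothetical helper for illustration: a load can be rematerialized
// (re-executed at the remat site) only if the memory it reads is both
// dereferenceable and invariant, e.g. a fixed stack slot holding an incoming
// argument. MachineInstr::isDereferenceableInvariantLoad() inspects the
// instruction's memory operands for exactly that property.
static bool isSafeToRematLoad(const MachineInstr &MI) {
  return MI.mayLoad() && MI.isDereferenceableInvariantLoad();
}
```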
aadeshps-mcw pushed a commit to aadeshps-mcw/llvm-project that referenced this pull request Nov 26, 2025
In some workloads we see an argument passed on the stack where it is loaded, only for it to be immediately spilled to a different slot on the stack and then reloaded from that spill slot later on. We can avoid the unnecessary spill by marking loads as rematerializable and just directly loading from where the argument was originally passed on the stack.

TargetTransformInfo::isReMaterializableImpl checks to make sure that any loads are `MI.isDereferenceableInvariantLoad()`, so we should be able to move the load down to the remat site.

This gives a 14.8% reduction in spills in 544.nab_r on rva23u64 -O3, and a few other smaller reductions on llvm-test-suite. I didn't find any benchmarks where the number of spills/reloads increased.

Related: llvm#165761
Priyanshu3820 pushed a commit to Priyanshu3820/llvm-project that referenced this pull request Nov 26, 2025
In some workloads we see an argument passed on the stack where it is loaded, only for it to be immediately spilled to a different slot on the stack and then reloaded from that spill slot later on. We can avoid the unnecessary spill by marking loads as rematerializable and just directly loading from where the argument was originally passed on the stack.

TargetTransformInfo::isReMaterializableImpl checks to make sure that any loads are `MI.isDereferenceableInvariantLoad()`, so we should be able to move the load down to the remat site.

This gives a 14.8% reduction in spills in 544.nab_r on rva23u64 -O3, and a few other smaller reductions on llvm-test-suite. I didn't find any benchmarks where the number of spills/reloads increased.

Related: llvm#165761