[MachineScheduler][AMDGPU] Allow scheduling of single-MI regions #128739

lucas-rami · 2025-02-25T16:31:25Z

The MI scheduler skips regions containing a single MI during scheduling. This can prevent targets that perform multi-stage scheduling and move MIs between regions during some stages from reasoning correctly about the entire IR, since some MIs will not be assigned to a region at the beginning.

This makes the machine scheduler no longer skip single-MI regions. Only a few unit tests are affected (mainly those which check for the scheduler's debug output).
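As a rough illustration of the skip check described above, the sketch below compares the old condition (skip regions with zero or one instruction) with the new one (skip only empty regions). `Block`, `Region`, `makeRegion`, `skipsBefore`, `skipsAfter` and `countScheduled` are hypothetical names invented for this example and are not LLVM APIs; only the shape of the comparison against `std::prev` of the region end comes from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a basic block: each int is one instruction.
using Block = std::vector<int>;

// Hypothetical scheduling region: the half-open range [Begin, End).
struct Region {
  Block::const_iterator Begin;
  Block::const_iterator End;
};

// Builds the region covering instruction indices [First, Last) of MIs.
Region makeRegion(const Block &MIs, std::size_t First, std::size_t Last) {
  assert(First <= Last && Last <= MIs.size() && "region out of bounds");
  return {MIs.begin() + static_cast<std::ptrdiff_t>(First),
          MIs.begin() + static_cast<std::ptrdiff_t>(Last)};
}

// Check used before this PR: skip regions holding zero or one instruction.
// The first comparison short-circuits, so std::prev is never applied to an
// empty region.
bool skipsBefore(const Region &R) {
  return R.Begin == R.End || R.Begin == std::prev(R.End);
}

// Check after this PR: skip only empty regions.
bool skipsAfter(const Region &R) { return R.Begin == R.End; }

// Counts the regions that a given skip check lets through to the scheduler.
std::size_t countScheduled(const std::vector<Region> &Regions,
                           bool (*Skip)(const Region &)) {
  std::size_t N = 0;
  for (const Region &R : Regions)
    if (!Skip(R))
      ++N;
  return N;
}
```

With one empty, one single-instruction and one three-instruction region, the old check lets one region through and the new check lets two through; the single-instruction region is the only one whose treatment changes.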

llvmbot · 2025-02-25T16:31:51Z

@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-amdgpu

Author: Lucas Ramirez (lucas-rami)

Changes

The MI scheduler skips regions containing a single MI during scheduling. This can prevent targets that perform multi-stage scheduling and move MIs between regions during some stages from reasoning correctly about the entire IR, since some MIs will not be assigned to a region at the beginning.

This adds a flag to ScheduleDAGInstrs to tell the scheduler to not skip over single-MI regions. Only the AMDGPU target currently enables this flag, so scheduling behavior is unaffected for all other targets.

Full diff: https://github.com/llvm/llvm-project/pull/128739.diff

5 Files Affected:

(modified) llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h (+8)
(modified) llvm/lib/CodeGen/MachineScheduler.cpp (+2-1)
(modified) llvm/lib/CodeGen/ScheduleDAGInstrs.cpp (+3-2)
(modified) llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp (+4-1)
(modified) llvm/test/CodeGen/AMDGPU/debug-value-scheduler-liveins.mir (+2)

diff --git a/llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h b/llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
index aaa10e684687c..82240745c2772 100644
--- a/llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
+++ b/llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
@@ -124,6 +124,9 @@ namespace llvm {
     /// rescheduling).
     bool RemoveKillFlags;
+    /// True if regions with a single MI should be scheduled.
+    bool ScheduleSingleMIRegions = false;
+
     /// The standard DAG builder does not normally include terminators as DAG
     /// nodes because it does not create the necessary dependencies to prevent
     /// reordering. A specialized scheduler can override
@@ -288,6 +291,11 @@ namespace llvm {
       return Topo.IsReachable(SU, TargetSU);
     }
+    /// Whether regions with a single MI should be scheduled.
+    bool shouldScheduleSingleMIRegions() const {
+      return ScheduleSingleMIRegions;
+    }
+
     /// Returns an iterator to the top of the current scheduling region.
     MachineBasicBlock::iterator begin() const { return RegionBegin; }
diff --git a/llvm/lib/CodeGen/MachineScheduler.cpp b/llvm/lib/CodeGen/MachineScheduler.cpp
index 0da7535031a7d..b9903ee832d31 100644
--- a/llvm/lib/CodeGen/MachineScheduler.cpp
+++ b/llvm/lib/CodeGen/MachineScheduler.cpp
@@ -770,6 +770,7 @@ void MachineSchedulerBase::scheduleRegions(ScheduleDAGInstrs &Scheduler,
     MBBRegionsVector MBBRegions;
     getSchedRegions(&*MBB, MBBRegions, Scheduler.doMBBSchedRegionsTopDown());
+    bool ScheduleSingleMI = Scheduler.shouldScheduleSingleMIRegions();
     for (const SchedRegion &R : MBBRegions) {
       MachineBasicBlock::iterator I = R.RegionBegin;
       MachineBasicBlock::iterator RegionEnd = R.RegionEnd;
@@ -780,7 +781,7 @@ void MachineSchedulerBase::scheduleRegions(ScheduleDAGInstrs &Scheduler,
       Scheduler.enterRegion(&*MBB, I, RegionEnd, NumRegionInstrs);
       // Skip empty scheduling regions (0 or 1 schedulable instructions).
-      if (I == RegionEnd || I == std::prev(RegionEnd)) {
+      if (I == RegionEnd || (!ScheduleSingleMI && I == std::prev(RegionEnd))) {
         // Close the current region. Bundle the terminator if needed.
         // This invalidates 'RegionEnd' and 'I'.
         Scheduler.exitRegion();
diff --git a/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp b/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
index a26804707dd1f..cd652659dfdef 100644
--- a/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
+++ b/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
@@ -116,8 +116,9 @@ ScheduleDAGInstrs::ScheduleDAGInstrs(MachineFunction &mf,
                                      bool RemoveKillFlags)
     : ScheduleDAG(mf), MLI(mli), MFI(mf.getFrameInfo()),
       RemoveKillFlags(RemoveKillFlags),
-      UnknownValue(UndefValue::get(
-                             Type::getVoidTy(mf.getFunction().getContext()))), Topo(SUnits, &ExitSU) {
+      UnknownValue(
+          UndefValue::get(Type::getVoidTy(mf.getFunction().getContext()))),
+      Topo(SUnits, &ExitSU) {
   DbgValues.clear();
   const TargetSubtargetInfo &ST = mf.getSubtarget();
diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
index 176586e3fbbb6..dbab18b7ae46f 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
@@ -759,7 +759,10 @@ GCNScheduleDAGMILive::GCNScheduleDAGMILive(
       MFI(*MF.getInfo<SIMachineFunctionInfo>()),
       StartingOccupancy(MFI.getOccupancy()), MinOccupancy(StartingOccupancy),
       RegionLiveOuts(this, /*IsLiveOut=*/true) {
-
+  // We want regions with a single MI to be scheduled so that we can reason on
+  // them correctlt during scheduling stages that move MIs between regions (e.g.
+  // rematerialization).
+  ScheduleSingleMIRegions = true;
   LLVM_DEBUG(dbgs() << "Starting occupancy is " << StartingOccupancy << ".\n");
   if (RelaxedOcc) {
     MinOccupancy = std::min(MFI.getMinAllowedOccupancy(), StartingOccupancy);
diff --git a/llvm/test/CodeGen/AMDGPU/debug-value-scheduler-liveins.mir b/llvm/test/CodeGen/AMDGPU/debug-value-scheduler-liveins.mir
index d415346b49b28..2a08c52e447ba 100644
--- a/llvm/test/CodeGen/AMDGPU/debug-value-scheduler-liveins.mir
+++ b/llvm/test/CodeGen/AMDGPU/debug-value-scheduler-liveins.mir
@@ -2,6 +2,8 @@
 # RUN: llc -mtriple=amdgcn -mcpu=gfx908 -passes=machine-scheduler %s -o - -debug-only=machine-scheduler 2>&1 | FileCheck %s
 # REQUIRES: asserts
+# CHECK: ********** MI Scheduling **********
+# CHECK-NEXT: test_get_liveins:%bb.0
 # CHECK: ********** MI Scheduling **********
 # CHECK-NEXT: test_get_liveins:%bb.1
 # CHECK: Region live-in pressure: VGPRs: 1 AGPRs: 0, SGPRs: 0, LVGPR WT: 0, LSGPR WT: 0

llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp

Co-authored-by: Jay Foad <jay.foad@gmail.com>

arsenm

Testcase where this makes a difference?

arsenm · 2025-02-26T03:21:29Z

llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h

 bool RemoveKillFlags;

+ /// True if regions with a single MI should be scheduled.
+ bool ScheduleSingleMIRegions = false;


Over-configuration? Can you just unconditionally do this?

I removed the flag and made this the generic scheduler behavior. The only unit test that breaks in a way I cannot really understand is misched-branch-targets.mir. I "fixed" it anyway, but I am not sure whether the change breaks something there.

shiltian

Am I looking at the right diff here? The only code change shown here is the removal of I == std::prev(RegionEnd).

lucas-rami · 2025-02-26T17:51:41Z

Testcase where this makes a difference?

I spent some time trying to come up with one but didn't manage to. This change is motivated by ongoing pre-RA rematerialization work on the AMDGPU backend (see #125885) which requires that we are able to track regions with a single MI. The current implementation follows a different logic to identify rematerializable instructions, and I can't manage to make this fail/succeed depending on whether single-MI regions are filtered out/in.

Am I looking at the right diff here? The only code change shown here is the removal of I == std::prev(RegionEnd).

A few unit tests across various targets were also updated manually, and some previous changes reverted.

llvm/lib/CodeGen/MachineScheduler.cpp

arsenm

I removed the comment about the flag from the message

nikic · 2025-02-27T20:31:02Z

This has some compile-time impact (https://llvm-compile-time-tracker.com/compare.php?from=c0b5451129bba52e33cd7957d58af897a58d14c6&to=15e295d30aa356a0ab1d83e477375cf3ef314947&stat=instructions:u). Can we not do this on targets that don't need it?

jayfoad · 2025-02-28T10:42:09Z

The MI scheduler skips regions containing a single MI during scheduling. This can prevent targets that perform multi-stage scheduling and move MIs between regions during some stages from reasoning correctly about the entire IR, since some MIs will not be assigned to a region at the beginning.

I thought the scheduler only ever built the DAG for one region at a time. What's the mechanism for having DAG for multiple regions live at the same time?

lucas-rami · 2025-02-28T15:21:12Z

Can we not do this on targets that don't need it?

I am OK with reintroducing the flag so that this only happens for targets that care (off by default).

I thought the scheduler only ever built the DAG for one region at a time. What's the mechanism for having DAG for multiple regions live at the same time?

The AMDGPU backend doesn't actually schedule in the ScheduleDAGInstrs::schedule hook; it only collects the regions there. Scheduling only happens in the ScheduleDAGInstrs::finalizeSchedule hook, once all regions have been collected. In the end it still schedules regions one at a time, but there may be some instruction movement across regions between scheduling stages.
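The collect-then-finalize pattern described in this reply can be sketched roughly as follows. This is a simplified model and not the AMDGPU implementation: `RegionScheduler`, `DeferredScheduler`, `isTracked`, `log` and the stage names are all hypothetical, and regions are modeled as plain index ranges. It is only meant to show why a region that the driver never hands over stays invisible to later stages.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical region: half-open range [first, second) of instruction indices.
using Region = std::pair<std::size_t, std::size_t>;

// Hypothetical two-hook interface, loosely mirroring the hooks named above.
class RegionScheduler {
public:
  virtual ~RegionScheduler() = default;
  virtual void schedule(const Region &R) = 0; // invoked once per region
  virtual void finalizeSchedule() = 0;        // invoked once per function
};

// Records regions in schedule() and defers all work to finalizeSchedule().
class DeferredScheduler : public RegionScheduler {
public:
  explicit DeferredScheduler(std::vector<std::string> StageNames)
      : Stages(std::move(StageNames)) {}

  void schedule(const Region &R) override {
    assert(R.first <= R.second && "malformed region");
    Regions.push_back(R);
  }

  void finalizeSchedule() override {
    // Each stage visits every collected region, one region at a time.
    for (const std::string &Stage : Stages)
      for (const Region &R : Regions)
        Log.push_back(Stage + ":" + std::to_string(R.first) + "-" +
                      std::to_string(R.second));
  }

  // True if some collected region contains the instruction at index Idx.
  bool isTracked(std::size_t Idx) const {
    for (const Region &R : Regions)
      if (R.first <= Idx && Idx < R.second)
        return true;
    return false;
  }

  const std::vector<std::string> &log() const { return Log; }

private:
  std::vector<std::string> Stages;
  std::vector<Region> Regions;
  std::vector<std::string> Log;
};
```

If the driver passes only the regions [0, 3) and [4, 6) and filters out a single-instruction region at index 3, `isTracked(3)` returns false, so a stage that wants to move instructions across regions has no record of that instruction.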

Following 15e295d, the machine scheduler no longer filters out single-MI regions when emitting regions to schedule. While this has no functional impact at the moment, it generally has a negative compile-time impact (see #128739). Since no target other than AMDGPU needs this behavior, this introduces an off-by-default flag in `ScheduleDAGInstrs` that controls whether such regions are scheduled, effectively reverting 15e295d for all targets but AMDGPU (currently the only target enabling this flag).
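A minimal sketch of the off-by-default flag described in this commit message is given below. Only the member name `ScheduleSingleMIRegions` and the accessor `shouldScheduleSingleMIRegions()` are taken from the diff posted in this PR; `DAGBase`, `DefaultTargetDAG`, `OptInTargetDAG` and `countScheduledRegions` are hypothetical stand-ins, and regions are reduced to their instruction counts.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical base class. Only the flag and its accessor mirror names from
// the posted diff; everything else is invented for this sketch.
class DAGBase {
public:
  virtual ~DAGBase() = default;

  // Whether regions with a single MI should be scheduled.
  bool shouldScheduleSingleMIRegions() const {
    return ScheduleSingleMIRegions;
  }

protected:
  // Off by default, so targets that do not opt in keep the old behavior.
  bool ScheduleSingleMIRegions = false;
};

// A target that does not opt in.
class DefaultTargetDAG : public DAGBase {};

// A target that opts in from its constructor.
class OptInTargetDAG : public DAGBase {
public:
  OptInTargetDAG() { ScheduleSingleMIRegions = true; }
};

// Hypothetical driver: given the instruction count of each region, returns
// how many regions would be handed to the scheduler.
std::size_t countScheduledRegions(const DAGBase &DAG,
                                  const std::vector<std::size_t> &RegionSizes) {
  const bool ScheduleSingleMI = DAG.shouldScheduleSingleMIRegions();
  std::size_t N = 0;
  for (std::size_t Size : RegionSizes) {
    // Always skip empty regions; skip single-MI regions unless opted in.
    if (Size == 0 || (!ScheduleSingleMI && Size == 1))
      continue;
    ++N;
  }
  return N;
}
```

Reading the flag once before the loop keeps the extra work for targets that leave it off to a single boolean test per region.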

Allow scheduling of regions with single MI

aeed954

lucas-rami added backend:AMDGPU mi-sched machine instruction scheduler labels Feb 25, 2025

lucas-rami requested review from arsenm, bcahoon, davemgreen and jrbyrnes and removed request for arsenm and bcahoon February 25, 2025 16:32

Revert spurious formatting change

1d43cbf

jayfoad reviewed Feb 25, 2025

llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp

Fix comment typo

252bce6

Co-authored-by: Jay Foad <jay.foad@gmail.com>

arsenm reviewed Feb 26, 2025

Remove flag; this is now the default scheduler behavior

db36039

llvmbot added backend:ARM backend:PowerPC backend:X86 labels Feb 26, 2025

shiltian reviewed Feb 26, 2025

arsenm reviewed Feb 27, 2025

llvm/lib/CodeGen/MachineScheduler.cpp

Update comment in machine scheduler

0f41fe8

arsenm approved these changes Feb 27, 2025

lucas-rami merged commit 15e295d into llvm:main Feb 27, 2025
6 of 10 checks passed

huaatian mentioned this pull request Feb 28, 2025

fix live interval empty issue huaatian/llvm-project#1

Open

lucas-rami mentioned this pull request Mar 4, 2025

[MachineScheduler] Optional scheduling of single-MI regions #129704

Merged

lucas-rami deleted the single-mi-region-scheduling branch November 17, 2025 10:22
