Skip to content

Conversation

@khei4
Copy link
Contributor

@khei4 khei4 commented Oct 13, 2023

Fixes #66403

This patch did two things for the stack-move optimization reduction pattern.

  • Insert lifetime.start before the dominator of all src/dest uses.
  • Retain lifetime.end if they are possibly last use for some execution path
    • It's possibly the last use if it isn't post-dominated by any use of src or dest.
@khei4 khei4 requested a review from vitalybuka October 13, 2023 13:36
@khei4 khei4 requested a review from nikic as a code owner October 13, 2023 13:36
@khei4 khei4 force-pushed the perf/stack-move-multi-bb-lifetime-spans branch from 3c85b97 to f4781f1 Compare October 13, 2023 13:37
@llvmbot
Copy link
Member

llvmbot commented Oct 13, 2023

@llvm/pr-subscribers-llvm-transforms

Author: Kohei Asano (khei4)

Changes

Fixes #66403

This patch did two things for the stack-move optimization reduction pattern.

  • Insert lifetime.start before the dominator of all src/dest uses.
  • Retain lifetime.end if they are possibly last use for some execution path
    • It's possibly the last use if it isn't post-dominated by any use of src or dest.

Patch is 25.72 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/68990.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp (+93-6)
  • (modified) llvm/test/Transforms/MemCpyOpt/lifetime-missing.ll (+1)
  • (modified) llvm/test/Transforms/MemCpyOpt/stack-move.ll (+122)
diff --git a/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp b/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp index 688bcfa57158975..7e5cdd76ef514da 100644 --- a/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp +++ b/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp @@ -1424,6 +1424,46 @@ bool MemCpyOptPass::performMemCpyToMemSetOptzn(MemCpyInst *MemCpy, return true; } +namespace { + +using InsertionPt = PointerUnion<Instruction *, BasicBlock *>; +/// Find the nearest Instruction or BasicBlock that dominates both I1 and +/// I2. +static InsertionPt findNearestCommonDominator(InsertionPt I1, InsertionPt I2, + DominatorTree *DT) { + auto GetParent = [](InsertionPt I) { + if (auto *BB = dyn_cast<BasicBlock *>(I)) + return BB; + return cast<Instruction *>(I)->getParent(); + }; + BasicBlock *BB1 = GetParent(I1); + BasicBlock *BB2 = GetParent(I2); + if (BB1 == BB2) { + // BasicBlock InsertionPt means the terminator. + if (isa<BasicBlock *>(I1)) + return I2; + if (isa<BasicBlock *>(I2)) + return I1; + return cast<Instruction *>(I1)->comesBefore(cast<Instruction *>(I2)) ? I1 + : I2; + } + + // These checks are necessary, because findNearestCommonDominator for NodeT + // doesn't handle these. + if (!DT->isReachableFromEntry(BB2)) + return I1; + if (!DT->isReachableFromEntry(BB1)) + return I2; + + BasicBlock *DomBB = DT->findNearestCommonDominator(BB1, BB2); + if (BB2 == DomBB) + return I2; + if (BB1 == DomBB) + return I1; + return DomBB; +} + +} // namespace // Attempts to optimize the pattern whereby memory is copied from an alloca to // another alloca, where the two allocas don't have conflicting mod/ref. If // successful, the two allocas can be merged into one and the transfer can be @@ -1466,13 +1506,27 @@ bool MemCpyOptPass::performStackMoveOptzn(Instruction *Load, Instruction *Store, return false; // Check that src and dest are never captured, unescaped allocas. Also - // find the nearest common dominator and postdominator for all users in + // find the nearest common dominator and post-dominator for all users in // order to shrink wrap the lifetimes, and instructions with noalias metadata // to remove them. SmallVector<Instruction *, 4> LifetimeMarkers; + InsertionPt StartPt = nullptr; + SmallSet<Instruction *, 4> EndPts; SmallSet<Instruction *, 4> NoAliasInstrs; bool SrcNotDom = false; + auto InsertIfNotPDom = [&](Instruction *I) { + for (auto EndPt : EndPts) { + if (PDT->dominates(EndPt, I)) + return; + else if (PDT->dominates(I, EndPt)) { + EndPt = I; + return; + } + } + EndPts.insert(I); + return; + }; // Recursively track the user and check whether modified alias exist. auto IsDereferenceableOrNull = [](Value *V, const DataLayout &DL) -> bool { @@ -1514,6 +1568,13 @@ bool MemCpyOptPass::performStackMoveOptzn(Instruction *Load, Instruction *Store, Worklist.push_back(UI); continue; case UseCaptureKind::NO_CAPTURE: { + if (!StartPt) { + StartPt = UI; + EndPts.insert(UI); + } else { + StartPt = findNearestCommonDominator(StartPt, UI, DT); + InsertIfNotPDom(UI); + } if (UI->isLifetimeStartOrEnd()) { // We note the locations of these intrinsic calls so that we can // delete them later if the optimization succeeds, this is safe @@ -1623,12 +1684,38 @@ bool MemCpyOptPass::performStackMoveOptzn(Instruction *Load, Instruction *Store, // Drop metadata on the source alloca. SrcAlloca->dropUnknownNonDebugMetadata(); - // TODO: Reconstruct merged lifetime markers. - // Remove all other lifetime markers. if the original lifetime intrinsics - // exists. + // Reconstruction of lifetime markers. + // 1. Insert new lifetime.start to the dominator for all src and dest uses. + // 2. Remove original markers except possibly last lifetime.end  if (!LifetimeMarkers.empty()) { - for (Instruction *I : LifetimeMarkers) - eraseInstruction(I); + LLVMContext &C = SrcAlloca->getContext(); + IRBuilder<> Builder(C); + ConstantInt *AllocaSize = ConstantInt::get(Type::getInt64Ty(C), -1); + if (!Size.isScalable()) + AllocaSize = ConstantInt::get(Type::getInt64Ty(C), Size); + // Create a new lifetime start marker before the first user of src or alloca + // users. + MemoryAccess *StartMA; + if (auto *DomI = dyn_cast_if_present<Instruction *>(StartPt)) { + Builder.SetInsertPoint(DomI->getParent(), DomI->getIterator()); + auto *Start = Builder.CreateLifetimeStart(SrcAlloca, AllocaSize); + StartMA = MSSAU->createMemoryAccessBefore(Start, nullptr, + MSSA->getMemoryAccess(DomI)); + } else { + auto *DomB = cast<BasicBlock *>(StartPt); + Builder.SetInsertPoint(DomB->getTerminator()); + auto *Start = Builder.CreateLifetimeStart(SrcAlloca, AllocaSize); + StartMA = MSSAU->createMemoryAccessInBB( + Start, nullptr, Start->getParent(), MemorySSA::BeforeTerminator); + } + MSSAU->insertDef(cast<MemoryDef>(StartMA), /*RenameUses=*/true); + + // Remove all lifetime markers except . if the original lifetime intrinsics + // exists. + for (Instruction *I : LifetimeMarkers) { + if (EndPts.find(I) == EndPts.end()) + eraseInstruction(I); + } } // As this transformation can cause memory accesses that didn't previously diff --git a/llvm/test/Transforms/MemCpyOpt/lifetime-missing.ll b/llvm/test/Transforms/MemCpyOpt/lifetime-missing.ll index 0626f09702f7e21..e3190cee45d2da4 100644 --- a/llvm/test/Transforms/MemCpyOpt/lifetime-missing.ll +++ b/llvm/test/Transforms/MemCpyOpt/lifetime-missing.ll @@ -15,6 +15,7 @@ define void @test() { ; CHECK-NEXT: entry: ; CHECK-NEXT: [[AGG_TMP_SROA_14:%.*]] = alloca [20 x i8], align 4 ; CHECK-NEXT: [[AGG_TMP_SROA_14_128_SROA_IDX:%.*]] = getelementptr i8, ptr [[AGG_TMP_SROA_14]], i64 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 20, ptr [[AGG_TMP_SROA_14]]) ; CHECK-NEXT: call void @llvm.memset.p0.i64(ptr [[AGG_TMP_SROA_14_128_SROA_IDX]], i8 0, i64 1, i1 false) ; CHECK-NEXT: [[AGG_TMP3_SROA_35_128_SROA_IDX:%.*]] = getelementptr i8, ptr [[AGG_TMP_SROA_14]], i64 4 ; CHECK-NEXT: call void @llvm.memset.p0.i64(ptr inttoptr (i64 4 to ptr), i8 0, i64 1, i1 false) diff --git a/llvm/test/Transforms/MemCpyOpt/stack-move.ll b/llvm/test/Transforms/MemCpyOpt/stack-move.ll index 6089c0a4d7cf507..998b742e480bb73 100644 --- a/llvm/test/Transforms/MemCpyOpt/stack-move.ll +++ b/llvm/test/Transforms/MemCpyOpt/stack-move.ll @@ -24,9 +24,11 @@ declare i32 @use_writeonly(ptr noundef) memory(write) define void @basic_memcpy() { ; CHECK-LABEL: define void @basic_memcpy() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -67,9 +69,11 @@ define i32 @use_not_dominated_by_src_alloca() { define void @basic_memmove() { ; CHECK-LABEL: define void @basic_memmove() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -92,9 +96,11 @@ define void @basic_memmove() { define void @load_store() { ; CHECK-LABEL: define void @load_store() { ; CHECK-NEXT: [[SRC:%.*]] = alloca i32, align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 4, ptr [[SRC]]) ; CHECK-NEXT: store i32 42, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 4, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca i32, align 4 @@ -118,9 +124,11 @@ define void @load_store_scalable(<vscale x 4 x i32> %x) { ; CHECK-LABEL: define void @load_store_scalable ; CHECK-SAME: (<vscale x 4 x i32> [[X:%.*]]) { ; CHECK-NEXT: [[SRC:%.*]] = alloca <vscale x 4 x i32>, align 16 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 -1, ptr [[SRC]]) ; CHECK-NEXT: store <vscale x 4 x i32> [[X]], ptr [[SRC]], align 16 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 -1, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca <vscale x 4 x i32> @@ -144,9 +152,11 @@ define void @load_store_scalable(<vscale x 4 x i32> %x) { define void @align_up() { ; CHECK-LABEL: define void @align_up() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 8 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -169,10 +179,12 @@ define void @align_up() { define void @remove_extra_lifetime_intrinsics() { ; CHECK-LABEL: define void @remove_extra_lifetime_intrinsics() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[TMP3:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -221,11 +233,13 @@ define void @no_lifetime() { define void @alias_no_mod() { ; CHECK-LABEL: define void @alias_no_mod() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: [[DEST_ALIAS:%.*]] = getelementptr [[STRUCT_FOO]], ptr [[SRC]], i32 0, i32 0 ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[SRC_ALIAS:%.*]] = getelementptr [[STRUCT_FOO]], ptr [[SRC]], i32 0, i32 0 ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -258,9 +272,11 @@ define void @alias_no_mod() { define void @remove_scoped_noalias() { ; CHECK-LABEL: define void @remove_scoped_noalias() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]), !alias.scope !0 ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -282,9 +298,11 @@ define void @remove_scoped_noalias() { define void @remove_alloca_metadata() { ; CHECK-LABEL: define void @remove_alloca_metadata() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]), !alias.scope !0 ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4, !annotation !3 @@ -307,9 +325,11 @@ define void @remove_alloca_metadata() { define void @noalias_on_lifetime() { ; CHECK-LABEL: define void @noalias_on_lifetime() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]), !alias.scope !0 ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]), !noalias !0 ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -331,9 +351,11 @@ define void @noalias_on_lifetime() { define void @src_ref_dest_ref_after_copy() { ; CHECK-LABEL: define void @src_ref_dest_ref_after_copy() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_readonly(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_readonly(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -355,9 +377,11 @@ define void @src_ref_dest_ref_after_copy() { define void @src_mod_dest_mod_after_copy() { ; CHECK-LABEL: define void @src_mod_dest_mod_after_copy() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_writeonly(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_writeonly(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -378,6 +402,7 @@ define void @src_mod_dest_mod_after_copy() { define void @avoid_memory_use_last_user_crash() { ; CHECK-LABEL: define void @avoid_memory_use_last_user_crash() { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[V:%.*]] = load i32, ptr [[SRC]], align 4 ; CHECK-NEXT: ret void @@ -396,6 +421,7 @@ define void @avoid_memory_use_last_user_crash() { define void @terminator_lastuse() personality i32 0 { ; CHECK-LABEL: define void @terminator_lastuse() personality i32 0 { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: [[RV:%.*]] = invoke i32 @use_nocapture(ptr [[SRC]]) @@ -430,6 +456,7 @@ define void @multi_bb_memcpy(i1 %b) { ; CHECK-LABEL: define void @multi_bb_memcpy ; CHECK-SAME: (i1 [[B:%.*]]) { ; CHECK-NEXT: [[SRC:%.*]] = alloca i32, align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 4, ptr [[SRC]]) ; CHECK-NEXT: store i32 42, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: br label [[BB0:%.*]] @@ -437,6 +464,7 @@ define void @multi_bb_memcpy(i1 %b) { ; CHECK-NEXT: br label [[BB1:%.*]] ; CHECK: bb1: ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 4, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca i32, align 4 @@ -462,11 +490,13 @@ define void @multi_bb_load_store(i1 %b) { ; CHECK-LABEL: define void @multi_bb_load_store ; CHECK-SAME: (i1 [[B:%.*]]) { ; CHECK-NEXT: [[SRC:%.*]] = alloca i32, align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 4, ptr [[SRC]]) ; CHECK-NEXT: store i32 42, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) ; CHECK-NEXT: br label [[BB0:%.*]] ; CHECK: bb0: ; CHECK-NEXT: [[TMP2:%.*]] = call i32 @use_nocapture(ptr nocapture [[SRC]]) +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 4, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca i32, align 4 @@ -534,6 +564,7 @@ define void @multi_bb_simple_br(i1 %b) { ; CHECK-LABEL: define void @multi_bb_simple_br ; CHECK-SAME: (i1 [[B:%.*]]) { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: [[TMP1:%.*]] = call i32 @use_nocapture(ptr nocapture noundef [[SRC]]) ; CHECK-NEXT: br i1 [[B]], label [[BB0:%.*]], label [[BB1:%.*]] @@ -544,6 +575,7 @@ define void @multi_bb_simple_br(i1 %b) { ; CHECK-NEXT: [[TMP3:%.*]] = call i32 @use_nocapture(ptr nocapture noundef [[SRC]]) ; CHECK-NEXT: br label [[BB2]] ; CHECK: bb2: +; CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 12, ptr nocapture [[SRC]]) ; CHECK-NEXT: ret void ; %src = alloca %struct.Foo, align 4 @@ -574,6 +606,7 @@ define void @multi_bb_dom_test0(i1 %b) { ; CHECK-LABEL: define void @multi_bb_dom_test0 ; CHECK-SAME: (i1 [[B:%.*]]) { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: br i1 [[B]], label [[BB0:%.*]], label [[BB1:%.*]] ; CHECK: bb0: ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 @@ -657,6 +690,7 @@ define void @multi_bb_pdom_test0(i1 %b) { ; CHECK-LABEL: define void @multi_bb_pdom_test0 ; CHECK-SAME: (i1 [[B:%.*]]) { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: br i1 [[B]], label [[BB0:%.*]], label [[BB1:%.*]] ; CHECK: bb0: @@ -697,6 +731,7 @@ define void @multi_bb_pdom_test1(i1 %b) { ; CHECK-LABEL: define void @multi_bb_pdom_test1 ; CHECK-SAME: (i1 [[B:%.*]]) { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32 20, i32 30 }, ptr [[SRC]], align 4 ; CHECK-NEXT: br i1 [[B]], label [[BB0:%.*]], label [[BB1:%.*]] ; CHECK: bb0: @@ -735,6 +770,7 @@ define void @multi_bb_pdom_test2(i1 %b) { ; CHECK-LABEL: define void @multi_bb_pdom_test2 ; CHECK-SAME: (i1 [[B:%.*]]) { ; CHECK-NEXT: [[SRC:%.*]] = alloca [[STRUCT_FOO:%.*]], align 4 +; CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 12, ptr [[SRC]]) ; CHECK-NEXT: store [[STRUCT_FOO]] { i32 10, i32... [truncated] 
@github-actions
Copy link

github-actions bot commented Oct 13, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@khei4 khei4 marked this pull request as draft October 14, 2023 05:47
@khei4 khei4 force-pushed the perf/stack-move-multi-bb-lifetime-spans branch from f4781f1 to a720474 Compare October 14, 2023 06:18
@khei4 khei4 marked this pull request as ready for review October 14, 2023 07:58
@fmayer
Copy link
Contributor

fmayer commented Dec 22, 2023

Are you planning on submitting this eventually? For use-after-scope in sanitizers the more lifetime annotations the better.

@khei4
Copy link
Contributor Author

khei4 commented Dec 27, 2023

@fmayer
Sorry for being late for replying!

Are you planning on submitting this eventually? For use-after-scope in sanitizers the more lifetime annotations the better.

Yes, I believe this patch resolves #66403.

I'm also okay with you taking over this or an alternative patch if you're interested :). I'm a little busy for a few months.

@fmayer
Copy link
Contributor

fmayer commented Jan 3, 2024

I'm also okay with you taking over this or an alternative patch if you're interested :). I'm a little busy for a few months.

Can you elaborate a bit what is missing? Shepherding through review, or is it not done?

@khei4
Copy link
Contributor Author

khei4 commented Jan 4, 2024

Can you elaborate a bit what is missing?

Sure.

Shepherding through review, or is it not done?

Not done yet. So, I guess the first thing to be done is to verify this is correct to some kind. This patch introduces more lifetime markers than the current main, so if it's better to be based on this patch, checking that the sanitizer won't crash for sufficient tests may be sufficient (I only checked the correctness for this patch with ninja -Cbuild check-llvm including added test on this patch, but I'm not sure Address sanitizer tests works for e.g. stage-2 build).

Thank you for pinging, if you need more concrete things about this patch, please ping me again.

@mvelbaum
Copy link

@khei4 is there a PR here for https://reviews.llvm.org/D159075 or did that patch get lost?

@khei4
Copy link
Contributor Author

khei4 commented Jan 22, 2024

@khei4 is there a PR here for https://reviews.llvm.org/D159075, or did that patch get lost?

Thanks! Not yet. I'm now preparing to rebase the patch, and it'll be ported with more descriptions to Github in a few weeks :)
I think it's better to separate here the lifetime-marker missing regression, and the above patch, improving the algorithm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

4 participants