- Notifications
You must be signed in to change notification settings - Fork 15.3k
[Clang] Mark this pointer in destructors dead_on_return #166276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Clang] Mark this pointer in destructors dead_on_return #166276
Conversation
This helps to clean up any dead stores that come up at the end of the destructor. The motivating example was a refactoring in libc++'s basic_string implementation in 8dae17b that added a zeroing store into the destructor, causing a large performance regression on an internal workload.
7fb6339 to 7a3dec4 Compare | @llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang Author: Aiden Grossman (boomanaiden154) ChangesThis helps to clean up any dead stores that come up at the end of the destructor. The motivating example was a refactoring in libc++'s basic_string implementation in 8dae17b that added a zeroing store into the destructor, causing a large performance regression on an internal workload. We also saw a ~0.2% performance increase on an internal server workload when enabling this. I also tested this against all of the non-flaky tests in our large C++ codebase and found a minimal number of issues that all happened to be in user code. Patch is 5.29 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/166276.diff 109 Files Affected:
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index efacb3cc04c01..ee6e13fd1c1a5 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -2767,7 +2767,8 @@ void CodeGenModule::ConstructAttributeList(StringRef Name, } // Apply `nonnull`, `dereferenceable(N)` and `align N` to the `this` argument, - // unless this is a thunk function. + // unless this is a thunk function. Add dead_on_return to the `this` argument + // in base class destructors to aid in DSE. // FIXME: fix this properly, https://reviews.llvm.org/D100388 if (FI.isInstanceMethod() && !IRFunctionArgs.hasInallocaArg() && !FI.arg_begin()->type->isVoidPointerType() && !IsThunk) { @@ -2800,6 +2801,15 @@ void CodeGenModule::ConstructAttributeList(StringRef Name, .getAsAlign(); Attrs.addAlignmentAttr(Alignment); + if (isa_and_nonnull<CXXDestructorDecl>( + CalleeInfo.getCalleeDecl().getDecl())) { + auto *ClassDecl = dyn_cast<CXXRecordDecl>( + CalleeInfo.getCalleeDecl().getDecl()->getDeclContext()); + if (ClassDecl->getNumBases() == 0 && ClassDecl->getNumVBases() == 0) { + Attrs.addAttribute(llvm::Attribute::DeadOnReturn); + } + } + ArgAttrs[IRArgs.first] = llvm::AttributeSet::get(getLLVMContext(), Attrs); } diff --git a/clang/test/CodeGen/paren-list-agg-init.cpp b/clang/test/CodeGen/paren-list-agg-init.cpp index e30777ecc07d6..561bf2b5eb9c4 100644 --- a/clang/test/CodeGen/paren-list-agg-init.cpp +++ b/clang/test/CodeGen/paren-list-agg-init.cpp @@ -394,9 +394,9 @@ namespace gh61145 { // a.k.a. Vec::Vec(Vec&&) // CHECK-NEXT: call void @_ZN7gh611453VecC1EOS0_(ptr noundef nonnull align 1 dereferenceable(1) [[AGG_TMP_ENSURED]], ptr noundef nonnull align 1 dereferenceable(1) [[V]]) // a.k.a. S1::~S1() - // CHECK-NEXT: call void @_ZN7gh611452S1D1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[AGG_TMP_ENSURED]]) + // CHECK-NEXT: call void @_ZN7gh611452S1D1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[AGG_TMP_ENSURED]]) // a.k.a.Vec::~Vec() - // CHECK-NEXT: call void @_ZN7gh611453VecD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[V]]) + // CHECK-NEXT: call void @_ZN7gh611453VecD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[V]]) // CHECK-NEXT: ret void template <int I> void make1() { @@ -416,9 +416,9 @@ namespace gh61145 { // CHECK-NEXT: [[C:%.*c.*]] = getelementptr inbounds nuw [[STRUCT_S2]], ptr [[AGG_TMP_ENSURED]], i32 0, i32 // CHECK-NEXT: store i8 0, ptr [[C]], align 1 // a.k.a. S2::~S2() - // CHECK-NEXT: call void @_ZN7gh611452S2D1Ev(ptr noundef nonnull align 1 dereferenceable(2) [[AGG_TMP_ENSURED]]) + // CHECK-NEXT: call void @_ZN7gh611452S2D1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(2) [[AGG_TMP_ENSURED]]) // a.k.a. Vec::~Vec() - // CHECK-NEXT: call void @_ZN7gh611453VecD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[V]]) + // CHECK-NEXT: call void @_ZN7gh611453VecD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[V]]) // CHECK-NEXT: ret void template <int I> void make2() { diff --git a/clang/test/CodeGen/temporary-lifetime.cpp b/clang/test/CodeGen/temporary-lifetime.cpp index 04087292b2c70..44d1235f15c86 100644 --- a/clang/test/CodeGen/temporary-lifetime.cpp +++ b/clang/test/CodeGen/temporary-lifetime.cpp @@ -24,12 +24,12 @@ void Test1() { // CHECK-DTOR: call void @llvm.lifetime.start.p0(ptr nonnull %[[ADDR:.+]]) // CHECK-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR:[^ ]+]]) // CHECK-DTOR: call void @_Z3FooIRK1AEvOT_ - // CHECK-DTOR: call void @_ZN1AD1Ev(ptr nonnull {{[^,]*}} %[[VAR]]) + // CHECK-DTOR: call void @_ZN1AD1Ev(ptr dead_on_return nonnull {{[^,]*}} %[[VAR]]) // CHECK-DTOR: call void @llvm.lifetime.end.p0(ptr nonnull %[[ADDR]]) // CHECK-DTOR: call void @llvm.lifetime.start.p0(ptr nonnull %[[ADDR:.+]]) // CHECK-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR:[^ ]+]]) // CHECK-DTOR: call void @_Z3FooIRK1AEvOT_ - // CHECK-DTOR: call void @_ZN1AD1Ev(ptr nonnull {{[^,]*}} %[[VAR]]) + // CHECK-DTOR: call void @_ZN1AD1Ev(ptr dead_on_return nonnull {{[^,]*}} %[[VAR]]) // CHECK-DTOR: call void @llvm.lifetime.end.p0(ptr nonnull %[[ADDR]]) // CHECK-DTOR: } @@ -61,9 +61,9 @@ void Test2() { // CHECK-DTOR: call void @llvm.lifetime.start.p0(ptr nonnull %[[ADDR2:.+]]) // CHECK-DTOR: call void @_ZN1AC1Ev(ptr nonnull {{[^,]*}} %[[VAR2:[^ ]+]]) // CHECK-DTOR: call void @_Z3FooIRK1AEvOT_ - // CHECK-DTOR: call void @_ZN1AD1Ev(ptr nonnull {{[^,]*}} %[[VAR2]]) + // CHECK-DTOR: call void @_ZN1AD1Ev(ptr dead_on_return nonnull {{[^,]*}} %[[VAR2]]) // CHECK-DTOR: call void @llvm.lifetime.end.p0(ptr nonnull %[[ADDR2]]) - // CHECK-DTOR: call void @_ZN1AD1Ev(ptr nonnull {{[^,]*}} %[[VAR1]]) + // CHECK-DTOR: call void @_ZN1AD1Ev(ptr dead_on_return nonnull {{[^,]*}} %[[VAR1]]) // CHECK-DTOR: call void @llvm.lifetime.end.p0(ptr nonnull %[[ADDR1]]) // CHECK-DTOR: } @@ -155,12 +155,12 @@ void Test7() { // CHECK-DTOR: call void @llvm.lifetime.start.p0(ptr nonnull %[[ADDR:.+]]) // CHECK-DTOR: call void @_Z3BazI1AET_v({{.*}} %[[SLOT:[^ ]+]]) // CHECK-DTOR: call void @_Z3FooI1AEvOT_({{.*}} %[[SLOT]]) - // CHECK-DTOR: call void @_ZN1AD1Ev(ptr nonnull {{[^,]*}} %[[SLOT]]) + // CHECK-DTOR: call void @_ZN1AD1Ev(ptr dead_on_return nonnull {{[^,]*}} %[[SLOT]]) // CHECK-DTOR: call void @llvm.lifetime.end.p0(ptr nonnull %[[ADDR]]) // CHECK-DTOR: call void @llvm.lifetime.start.p0(ptr nonnull %[[ADDR:.+]]) // CHECK-DTOR: call void @_Z3BazI1AET_v({{.*}} %[[SLOT:[^ ]+]]) // CHECK-DTOR: call void @_Z3FooI1AEvOT_({{.*}} %[[SLOT]]) - // CHECK-DTOR: call void @_ZN1AD1Ev(ptr nonnull {{[^,]*}} %[[SLOT]]) + // CHECK-DTOR: call void @_ZN1AD1Ev(ptr dead_on_return nonnull {{[^,]*}} %[[SLOT]]) // CHECK-DTOR: call void @llvm.lifetime.end.p0(ptr nonnull %[[ADDR]]) // CHECK-DTOR: } Foo(Baz<A>()); diff --git a/clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp b/clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp index 3c2a624bd4f95..e05f8133321c7 100644 --- a/clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp +++ b/clang/test/CodeGenCXX/amdgcn-automatic-variable.cpp @@ -75,7 +75,7 @@ int x; // CHECK-NEXT: [[A:%.*]] = alloca [[CLASS_A:%.*]], align 4, addrspace(5) // CHECK-NEXT: [[A_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[A]] to ptr // CHECK-NEXT: call void @_ZN1AC1Ev(ptr noundef nonnull align 4 dereferenceable(4) [[A_ASCAST]]) -// CHECK-NEXT: call void @_ZN1AD1Ev(ptr noundef nonnull align 4 dereferenceable(4) [[A_ASCAST]]) +// CHECK-NEXT: call void @_ZN1AD1Ev(ptr dead_on_return noundef nonnull align 4 dereferenceable(4) [[A_ASCAST]]) // CHECK-NEXT: ret void // void func3() { diff --git a/clang/test/CodeGenCXX/amdgcn-func-arg.cpp b/clang/test/CodeGenCXX/amdgcn-func-arg.cpp index a5f83dc91b038..bc20c33ec4d0f 100644 --- a/clang/test/CodeGenCXX/amdgcn-func-arg.cpp +++ b/clang/test/CodeGenCXX/amdgcn-func-arg.cpp @@ -43,9 +43,9 @@ void func_with_indirect_arg(A a) { // CHECK-NEXT: call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[AGG_TMP_ASCAST]], ptr align 4 [[A_ASCAST]], i64 4, i1 false) // CHECK-NEXT: [[AGG_TMP_ASCAST_ASCAST:%.*]] = addrspacecast ptr [[AGG_TMP_ASCAST]] to ptr addrspace(5) // CHECK-NEXT: call void @_Z22func_with_indirect_arg1A(ptr addrspace(5) noundef [[AGG_TMP_ASCAST_ASCAST]]) -// CHECK-NEXT: call void @_ZN1AD1Ev(ptr noundef nonnull align 4 dereferenceable(4) [[AGG_TMP_ASCAST]]) +// CHECK-NEXT: call void @_ZN1AD1Ev(ptr dead_on_return noundef nonnull align 4 dereferenceable(4) [[AGG_TMP_ASCAST]]) // CHECK-NEXT: call void @_Z17func_with_ref_argR1A(ptr noundef nonnull align 4 dereferenceable(4) [[A_ASCAST]]) -// CHECK-NEXT: call void @_ZN1AD1Ev(ptr noundef nonnull align 4 dereferenceable(4) [[A_ASCAST]]) +// CHECK-NEXT: call void @_ZN1AD1Ev(ptr dead_on_return noundef nonnull align 4 dereferenceable(4) [[A_ASCAST]]) // CHECK-NEXT: ret void // void test_indirect_arg_auto() { @@ -61,7 +61,7 @@ void test_indirect_arg_auto() { // CHECK-NEXT: call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[AGG_TMP_ASCAST]], ptr align 4 addrspacecast (ptr addrspace(1) @g_a to ptr), i64 4, i1 false) // CHECK-NEXT: [[AGG_TMP_ASCAST_ASCAST:%.*]] = addrspacecast ptr [[AGG_TMP_ASCAST]] to ptr addrspace(5) // CHECK-NEXT: call void @_Z22func_with_indirect_arg1A(ptr addrspace(5) noundef [[AGG_TMP_ASCAST_ASCAST]]) -// CHECK-NEXT: call void @_ZN1AD1Ev(ptr noundef nonnull align 4 dereferenceable(4) [[AGG_TMP_ASCAST]]) +// CHECK-NEXT: call void @_ZN1AD1Ev(ptr dead_on_return noundef nonnull align 4 dereferenceable(4) [[AGG_TMP_ASCAST]]) // CHECK-NEXT: call void @_Z17func_with_ref_argR1A(ptr noundef nonnull align 4 dereferenceable(4) addrspacecast (ptr addrspace(1) @g_a to ptr)) // CHECK-NEXT: ret void // diff --git a/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp b/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp index 4eafa720e0cb4..a764ba31539eb 100644 --- a/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp +++ b/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp @@ -217,7 +217,7 @@ void ArrayInit() { // CHECK: [[ARRAY_DESTROY_BODY2]]: // CHECK-NEXT: %arraydestroy.elementPast = phi ptr [ %1, %cleanup ], [ %arraydestroy.element, %[[ARRAY_DESTROY_BODY2]] ] // CHECK-NEXT: %arraydestroy.element = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast, i64 -1 - // CHECK-NEXT: call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element) + // CHECK-NEXT: call void @_ZN6PrintyD1Ev(ptr dead_on_return noundef nonnull align 8 dereferenceable(8) %arraydestroy.element) // CHECK-NEXT: %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arr // CHECK-NEXT: br i1 %arraydestroy.done, label %[[ARRAY_DESTROY_DONE2]], label %[[ARRAY_DESTROY_BODY2]] @@ -265,7 +265,7 @@ void ArraySubobjects() { // CHECK: [[ARRAY_DESTROY_BODY]]: // CHECK-NEXT: %arraydestroy.elementPast = phi ptr [ %0, %if.then ], [ %arraydestroy.element, %[[ARRAY_DESTROY_BODY]] ] // CHECK-NEXT: %arraydestroy.element = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast, i64 -1 - // CHECK-NEXT: call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element) + // CHECK-NEXT: call void @_ZN6PrintyD1Ev(ptr dead_on_return noundef nonnull align 8 dereferenceable(8) %arraydestroy.element) // CHECK-NEXT: %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arr2 // CHECK-NEXT: br i1 %arraydestroy.done, label %[[ARRAY_DESTROY_DONE]], label %[[ARRAY_DESTROY_BODY]] @@ -277,7 +277,7 @@ void ArraySubobjects() { // CHECK: [[ARRAY_DESTROY_BODY2]]: // CHECK-NEXT: %arraydestroy.elementPast4 = phi ptr [ %1, %[[ARRAY_DESTROY_DONE]] ], [ %arraydestroy.element5, %[[ARRAY_DESTROY_BODY2]] ] // CHECK-NEXT: %arraydestroy.element5 = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast4, i64 -1 - // CHECK-NEXT: call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element5) + // CHECK-NEXT: call void @_ZN6PrintyD1Ev(ptr dead_on_return noundef nonnull align 8 dereferenceable(8) %arraydestroy.element5) // CHECK-NEXT: %arraydestroy.done6 = icmp eq ptr %arraydestroy.element5, [[ARRAY_BEGIN]] // CHECK-NEXT: br i1 %arraydestroy.done6, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2]] @@ -384,7 +384,7 @@ void NewArrayInit() { // CHECK: arraydestroy.body: // CHECK-NEXT: %arraydestroy.elementPast = phi ptr [ %{{.*}}, %if.then ], [ %arraydestroy.element, %arraydestroy.body ] // CHECK-NEXT: %arraydestroy.element = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast, i64 -1 - // CHECK-NEXT: call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element) + // CHECK-NEXT: call void @_ZN6PrintyD1Ev(ptr dead_on_return noundef nonnull align 8 dereferenceable(8) %arraydestroy.element) // CHECK-NEXT: %arraydestroy.done = icmp eq ptr %arraydestroy.element, %0 // CHECK-NEXT: br i1 %arraydestroy.done, label %arraydestroy.done{{.*}}, label %arraydestroy.body diff --git a/clang/test/CodeGenCXX/cxx2a-destroying-delete.cpp b/clang/test/CodeGenCXX/cxx2a-destroying-delete.cpp index 24b1a4dd42977..af29120feb5bb 100644 --- a/clang/test/CodeGenCXX/cxx2a-destroying-delete.cpp +++ b/clang/test/CodeGenCXX/cxx2a-destroying-delete.cpp @@ -41,11 +41,11 @@ void glob_delete_A(A *a) { ::delete a; } // CHECK: icmp eq ptr %[[a]], null // CHECK: br i1 -// CHECK-ITANIUM: call void @_ZN1AD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %[[a]]) +// CHECK-ITANIUM: call void @_ZN1AD1Ev(ptr dead_on_return noundef nonnull align 8 dereferenceable(8) %[[a]]) // CHECK-ITANIUM-NEXT: call void @_ZdlPvm(ptr noundef %[[a]], i64 noundef 8) -// CHECK-MSABI64: call void @"??1A@@QEAA@XZ"(ptr noundef nonnull align 8 dereferenceable(8) %[[a]]) +// CHECK-MSABI64: call void @"??1A@@QEAA@XZ"(ptr dead_on_return noundef nonnull align 8 dereferenceable(8) %[[a]]) // CHECK-MSABI64-NEXT: call void @"??3@YAXPEAX_K@Z"(ptr noundef %[[a]], i64 noundef 8) -// CHECK-MSABI32: call x86_thiscallcc void @"??1A@@QAE@XZ"(ptr noundef nonnull align 4 dereferenceable(4) %[[a]]) +// CHECK-MSABI32: call x86_thiscallcc void @"??1A@@QAE@XZ"(ptr dead_on_return noundef nonnull align 4 dereferenceable(4) %[[a]]) // CHECK-MSABI32-NEXT: call void @"??3@YAXPAXI@Z"(ptr noundef %[[a]], i32 noundef 4) struct B { diff --git a/clang/test/CodeGenCXX/for-range.cpp b/clang/test/CodeGenCXX/for-range.cpp index 088a34647c374..b9706855f658c 100644 --- a/clang/test/CodeGenCXX/for-range.cpp +++ b/clang/test/CodeGenCXX/for-range.cpp @@ -53,7 +53,7 @@ extern B array[5]; // CHECK: for.body: // CHECK-NEXT: [[TMP2:%.*]] = load ptr, ptr [[__BEGIN1]], align 8 // CHECK-NEXT: call void @_ZN1BC1ERKS_(ptr noundef nonnull align 1 dereferenceable(1) [[B]], ptr noundef nonnull align 1 dereferenceable(1) [[TMP2]]) -// CHECK-NEXT: call void @_ZN1BD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[B]]) #[[ATTR3:[0-9]+]] +// CHECK-NEXT: call void @_ZN1BD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[B]]) #[[ATTR3:[0-9]+]] // CHECK-NEXT: br label [[FOR_INC:%.*]] // CHECK: for.inc: // CHECK-NEXT: [[TMP3:%.*]] = load ptr, ptr [[__BEGIN1]], align 8 @@ -61,7 +61,7 @@ extern B array[5]; // CHECK-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN1]], align 8 // CHECK-NEXT: br label [[FOR_COND]] // CHECK: for.end: -// CHECK-NEXT: call void @_ZN1AD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[A]]) #[[ATTR3]] +// CHECK-NEXT: call void @_ZN1AD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[A]]) #[[ATTR3]] // CHECK-NEXT: ret void // void for_array() { @@ -81,10 +81,10 @@ void for_array() { // CHECK-NEXT: call void @_ZN1AC1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[A]]) // CHECK-NEXT: call void @_ZN1CC1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[REF_TMP]]) // CHECK-NEXT: store ptr [[REF_TMP]], ptr [[__RANGE1]], align 8 -// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[__RANGE1]], align 8 +// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[__RANGE1]], align 8, !nonnull [[META2:![0-9]+]] // CHECK-NEXT: [[CALL:%.*]] = call noundef ptr @_Z5beginR1C(ptr noundef nonnull align 1 dereferenceable(1) [[TMP0]]) // CHECK-NEXT: store ptr [[CALL]], ptr [[__BEGIN1]], align 8 -// CHECK-NEXT: [[TMP1:%.*]] = load ptr, ptr [[__RANGE1]], align 8 +// CHECK-NEXT: [[TMP1:%.*]] = load ptr, ptr [[__RANGE1]], align 8, !nonnull [[META2]] // CHECK-NEXT: [[CALL1:%.*]] = call noundef ptr @_Z3endR1C(ptr noundef nonnull align 1 dereferenceable(1) [[TMP1]]) // CHECK-NEXT: store ptr [[CALL1]], ptr [[__END1]], align 8 // CHECK-NEXT: br label [[FOR_COND:%.*]] @@ -94,12 +94,12 @@ void for_array() { // CHECK-NEXT: [[CMP:%.*]] = icmp ne ptr [[TMP2]], [[TMP3]] // CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY:%.*]], label [[FOR_COND_CLEANUP:%.*]] // CHECK: for.cond.cleanup: -// CHECK-NEXT: call void @_ZN1CD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[REF_TMP]]) #[[ATTR3]] +// CHECK-NEXT: call void @_ZN1CD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[REF_TMP]]) #[[ATTR3]] // CHECK-NEXT: br label [[FOR_END:%.*]] // CHECK: for.body: // CHECK-NEXT: [[TMP4:%.*]] = load ptr, ptr [[__BEGIN1]], align 8 // CHECK-NEXT: call void @_ZN1BC1ERKS_(ptr noundef nonnull align 1 dereferenceable(1) [[B]], ptr noundef nonnull align 1 dereferenceable(1) [[TMP4]]) -// CHECK-NEXT: call void @_ZN1BD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[B]]) #[[ATTR3]] +// CHECK-NEXT: call void @_ZN1BD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[B]]) #[[ATTR3]] // CHECK-NEXT: br label [[FOR_INC:%.*]] // CHECK: for.inc: // CHECK-NEXT: [[TMP5:%.*]] = load ptr, ptr [[__BEGIN1]], align 8 @@ -107,7 +107,7 @@ void for_array() { // CHECK-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN1]], align 8 // CHECK-NEXT: br label [[FOR_COND]] // CHECK: for.end: -// CHECK-NEXT: call void @_ZN1AD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[A]]) #[[ATTR3]] +// CHECK-NEXT: call void @_ZN1AD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[A]]) #[[ATTR3]] // CHECK-NEXT: ret void // void for_range() { @@ -127,10 +127,10 @@ void for_range() { // CHECK-NEXT: call void @_ZN1AC1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[A]]) // CHECK-NEXT: call void @_ZN1DC1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[REF_TMP]]) // CHECK-NEXT: store ptr [[REF_TMP]], ptr [[__RANGE1]], align 8 -// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[__RANGE1]], align 8 +// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[__RANGE1]], align 8, !nonnull [[META2]] // CHECK-NEXT: [[CALL:%.*]] = call noundef ptr @_ZN1D5beginEv(ptr noundef nonnull align 1 dereferenceable(1) [[TMP0]]) // CHECK-NEXT: store ptr [[CALL]], ptr [[__BEGIN1]], align 8 -// CHECK-NEXT: [[TMP1:%.*]] = load ptr, ptr [[__RANGE1]], align 8 +// CHECK-NEXT: [[TMP1:%.*]] = load ptr, ptr [[__RANGE1]], align 8, !nonnull [[META2]] // CHECK-NEXT: [[CALL1:%.*]] = call noundef ptr @_ZN1D3endEv(ptr noundef nonnull align 1 dereferenceable(1) [[TMP1]]) // CHECK-NEXT: store ptr [[CALL1]], ptr [[__END1]], align 8 // CHECK-NEXT: br label [[FOR_COND:%.*]] @@ -140,12 +140,12 @@ void for_range() { // CHECK-NEXT: [[CMP:%.*]] = icmp ne ptr [[TMP2]], [[TMP3]] // CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY:%.*]], label [[FOR_COND_CLEANUP:%.*]] // CHECK: for.cond.cleanup: -// CHECK-NEXT: call void @_ZN1DD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[REF_TMP]]) #[[ATTR3]] +// CHECK-NEXT: call void @_ZN1DD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[REF_TMP]]) #[[ATTR3]] // CHECK-NEXT: br label [[FOR_END:%.*]] // CHECK: for.body: // CHECK-NEXT: [[TMP4:%.*]] = load ptr, ptr [[__BEGIN1]], align 8 // CHECK-NEXT: call void @_ZN1BC1ERKS_(ptr noundef nonnull align 1 dereferenceable(1) [[B]], ptr noundef nonnull align 1 dereferenceable(1) [[TMP4]]) -// CHECK-NEXT: call void @_ZN1BD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[B]]) #[[ATTR3]] +// CHECK-NEXT: call void @_ZN1BD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[B]]) #[[ATTR3]] // CHECK-NEXT: br label [[FOR_INC:%.*]] // CHECK: for.inc: // CHECK-NEXT: [[TMP5:%.*]] = load ptr, ptr [[__BEGIN1]], align 8 @@ -153,7 +153,7 @@ void for_range() { // CHECK-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN1]], align 8 // CHECK-NEXT: br label [[FOR_COND]] // CHECK: for.end: -// CHECK-NEXT: call void @_ZN1AD1Ev(ptr noundef nonnull align 1 dereferenceable(1) [[A]]) #[[ATTR3]] +// CHECK-NEXT: call void @_ZN1AD1Ev(ptr dead_on_return noundef nonnull align 1 dereferenceable(1) [[A]]) #[[ATTR3]] // CHECK-NEXT: ret void // void for_member_range() { diff --git a/clang/test/CodeGenCXX/gh62818.cpp b/clang/test/CodeGenCXX/gh62818.cpp index ec91b40fca077..f903679cd6b68 100644 --- a/clang/test/CodeGenCXX/gh62818.cpp +++ b/clang/test/CodeGenCXX/gh62818.c... [truncated] |
| The actual change is in the first commit (7a3dec4). I've separated the test changes out into 6ec0952 to hopefully make review a bit easier. CC @philnik777 Who came up with the idea and requested this. |
| This optimization exploits the fact that it's undefined behavior to read from an object after its been destroyed. Given the overall shift in how the industry feels about compilers exploiting undefined behavior, I want to push to add an flag to control this. Think of the people who use I'd also like to better understand why base classes matter for this annotation. Until very recently, |
rnk left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, we should do it, this is a valuable optimization. (Commenting twice to push out the inline comments).
| .getAsAlign(); | ||
| Attrs.addAlignmentAttr(Alignment); | ||
| | ||
| if (isa_and_nonnull<CXXDestructorDecl>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code golf: You can CSE the CalleeInfo.getCalleeDecl().getDecl() if you use dyn_cast_or_nonnull.
| CalleeInfo.getCalleeDecl().getDecl())) { | ||
| auto *ClassDecl = dyn_cast<CXXRecordDecl>( | ||
| CalleeInfo.getCalleeDecl().getDecl()->getDeclContext()); | ||
| if (ClassDecl->getNumBases() == 0 && ClassDecl->getNumVBases() == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have to limit this to only destructors of classes with no bases? Whatever the reason (caution, incremental change, etc), comments here would be appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Virtual base subobjects aren't dead after a base subobject destructor call, but yeah, I can't think of a reason to limit this because of non-virtual bases alone. And even virtual bases are dead after a complete-object destructor call.
efriedma-quic left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the potential undefined behavior here checked by msan? Do we need to disable adding this attribute if msan is enabled?
Yes, use-after-destroy is checked by msan, but I don't believe we need to explicitly disable anything here to handle msan. Based on my understand, the msan instrumentation will add a call to CC @thurstond Who is more familiar with the internals of msan and would be able to confirm. |
This helps to clean up any dead stores that come up at the end of the destructor. The motivating example was a refactoring in libc++'s basic_string implementation in 8dae17b that added a zeroing store into the destructor, causing a large performance regression on an internal workload. We also saw a ~0.2% performance increase on an internal server workload when enabling this.
I also tested this against all of the non-flaky tests in our large C++ codebase and found a minimal number of issues that all happened to be in user code.