- Notifications
You must be signed in to change notification settings - Fork 15.3k
[CodeGen] Avoid generating trap instructions after exception restore intrinsics #109560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Right now, PrologEpilogInsertion uses `MBB.isReturnBlock()` to determine whether or not to restore CSRs. This normally works with `EH_RETURN` since the latter is marked as being a return instruction; however, the situation gets more complicated when `trap-unreachable` is involved since it's no longer the last instruction in the block. As such, with `trap-unreachable` enabled, PEI misses our block and does not emit a restore when necessary. This also causes a MachineVerifier failure as traps are not considered terminators. As there are potentially other, similar issues in other passes here, we fix this by marking `llvm.eh.return` as `noreturn` and then adding an additional check in SelectionDAG and GlobalISel before emitting trap instructions when visiting `unreachable`.
| @llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-backend-x86 Author: duk (duk-37) ChangesRight now, PrologEpilogInsertion uses As there are potentially other, similar issues in other passes here, we fix this by marking Full diff: https://github.com/llvm/llvm-project/pull/109560.diff 5 Files Affected:
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td index 0a74a217a5f010..5e4ee4276aa97f 100644 --- a/llvm/include/llvm/IR/Intrinsics.td +++ b/llvm/include/llvm/IR/Intrinsics.td @@ -1405,8 +1405,8 @@ let IntrProperties = [IntrNoMem, IntrSpeculatable, IntrWillReturn] in { // given function it is 'const' and may be CSE'd etc. def int_eh_typeid_for : Intrinsic<[llvm_i32_ty], [llvm_anyptr_ty], [IntrNoMem]>; -def int_eh_return_i32 : Intrinsic<[], [llvm_i32_ty, llvm_ptr_ty]>; -def int_eh_return_i64 : Intrinsic<[], [llvm_i64_ty, llvm_ptr_ty]>; +def int_eh_return_i32 : Intrinsic<[], [llvm_i32_ty, llvm_ptr_ty], [IntrNoReturn]>; +def int_eh_return_i64 : Intrinsic<[], [llvm_i64_ty, llvm_ptr_ty], [IntrNoReturn]>; // eh.exceptionpointer returns the pointer to the exception caught by // the given `catchpad`. diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp index 8e860a1f740295..7cf6e4d55c24a9 100644 --- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp +++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp @@ -3125,6 +3125,14 @@ bool IRTranslator::translateUnreachable(const User &U, MachineIRBuilder &MIRBuil // Do not emit an additional trap instruction. if (Call->isNonContinuableTrap()) return true; + // Do not emit trap instructions after EH_RETURN intrinsics. + // This can cause problems later down the line when other machine passes + // attempt to use the last instruction in a BB to determine terminator behavior. + if (const auto *II = dyn_cast<IntrinsicInst>(Call)) { + const auto IID = II->getIntrinsicID(); + if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64) + return true; + } } MIRBuilder.buildTrap(); diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 25213f587116d5..8ca1aa0278f127 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -3553,6 +3553,14 @@ void SelectionDAGBuilder::visitUnreachable(const UnreachableInst &I) { // Do not emit an additional trap instruction. if (Call->isNonContinuableTrap()) return; + // Do not emit trap instructions after EH_RETURN intrinsics. + // This can cause problems later down the line when other machine passes + // attempt to use the last instruction in a BB to determine terminator behavior. + if (const auto *II = dyn_cast<IntrinsicInst>(Call)) { + const auto IID = II->getIntrinsicID(); + if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64) + return; + } } DAG.setRoot(DAG.getNode(ISD::TRAP, getCurSDLoc(), MVT::Other, DAG.getRoot())); diff --git a/llvm/test/CodeGen/X86/eh-trap-unreachable-x86.ll b/llvm/test/CodeGen/X86/eh-trap-unreachable-x86.ll new file mode 100644 index 00000000000000..1101a89f68b0b4 --- /dev/null +++ b/llvm/test/CodeGen/X86/eh-trap-unreachable-x86.ll @@ -0,0 +1,15 @@ +; RUN: llc < %s -mtriple=i386-unknown-unknown -trap-unreachable -no-trap-after-noreturn=false -verify-machineinstrs | FileCheck %s + +;; Ensure we restore caller-saved registers before EH_RETURN, even with trap-unreachable enabled. +;; For now, we deliberately avoid generating traps altogether. + +; CHECK-LABEL: test32 +; CHECK: pushl +; CHECK: popl +; CHECK: eh_return +; CHECK-NOT: ud2 +define void @test32(i32 %offset, ptr %handler) { + call void @llvm.eh.unwind.init() + call void @llvm.eh.return.i32(i32 %offset, ptr %handler) + unreachable +} diff --git a/llvm/test/CodeGen/X86/eh-trap-unreachable-x86_64.ll b/llvm/test/CodeGen/X86/eh-trap-unreachable-x86_64.ll new file mode 100644 index 00000000000000..d95bd610b5622f --- /dev/null +++ b/llvm/test/CodeGen/X86/eh-trap-unreachable-x86_64.ll @@ -0,0 +1,15 @@ +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -trap-unreachable -no-trap-after-noreturn=false -verify-machineinstrs | FileCheck %s + +;; Ensure we restore caller-saved registers before EH_RETURN, even with trap-unreachable enabled. +;; For now, we deliberately avoid generating traps altogether. + +; CHECK-LABEL: test64 +; CHECK: pushq +; CHECK: popq +; CHECK: eh_return +; CHECK-NOT: ud2 +define void @test64(i64 %offset, ptr %handler) { + call void @llvm.eh.unwind.init() + call void @llvm.eh.return.i64(i64 %offset, ptr %handler) + unreachable +} |
You can test this locally with the following command:git-clang-format --diff 2c770675ce36402b51a320ae26f369690c138dc1 3daa4022a395ba939b26fc6f239b89a878456e27 --extensions cpp -- llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cppView the diff from clang-format here.diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp index 7cf6e4d55c..02f4d52738 100644 --- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp +++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp @@ -3127,7 +3127,8 @@ bool IRTranslator::translateUnreachable(const User &U, MachineIRBuilder &MIRBuil return true; // Do not emit trap instructions after EH_RETURN intrinsics. // This can cause problems later down the line when other machine passes - // attempt to use the last instruction in a BB to determine terminator behavior. + // attempt to use the last instruction in a BB to determine terminator + // behavior. if (const auto *II = dyn_cast<IntrinsicInst>(Call)) { const auto IID = II->getIntrinsicID(); if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64) diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 8ca1aa0278..ebae2517a5 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -3555,7 +3555,8 @@ void SelectionDAGBuilder::visitUnreachable(const UnreachableInst &I) { return; // Do not emit trap instructions after EH_RETURN intrinsics. // This can cause problems later down the line when other machine passes - // attempt to use the last instruction in a BB to determine terminator behavior. + // attempt to use the last instruction in a BB to determine terminator + // behavior. if (const auto *II = dyn_cast<IntrinsicInst>(Call)) { const auto IID = II->getIntrinsicID(); if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64) |
| def int_eh_return_i32 : Intrinsic<[], [llvm_i32_ty, llvm_ptr_ty], [IntrNoReturn]>; | ||
| def int_eh_return_i64 : Intrinsic<[], [llvm_i64_ty, llvm_ptr_ty], [IntrNoReturn]>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you still need to special case this in the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused as to what you mean. The Clang intrinsic, __builtin_eh_return is marked as NoReturn, everything essentially treats it as noreturn due to unreachable being emitted immediately after them, and the verifier would complain if you emitted code after them (which is exactly what this PR is trying to fix).
Marking these as noreturn accurately represents their behavior and makes it so that the patch can be a bit smaller since I don't need to reorganize logic in the instruction selectors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IR intrinsic noreturn attribute is separate from the machine verifier issue you are seeing. You can do this as a pre-commit (which is independently observable in IR optimizations).
You're also not making use of this attribute below?
| if (const auto *II = dyn_cast<IntrinsicInst>(Call)) { | ||
| const auto IID = II->getIntrinsicID(); | ||
| if (IID == Intrinsic::eh_return_i32 || IID == Intrinsic::eh_return_i64) | ||
| return true; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These intrinsics are missing from the langref and I don't know what's expected of them. Should these really be terminators in the first place? Can the insert point just be adjusted?
use the last instruction in a BB to determine terminator behavior.
This comment doesn't exactly make sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these really be terminators in the first place? Can the insert point just be adjusted?
Yes and no, respectively. They're old exception-handling intrinsics that lower down to ISD::EH_RETURN through the SelectionDAG, which is then lowered to target-specific EH_RETURN opcodes across different architectures. These opcodes are all marked as "return instructions" in various .td files.
The problem becomes that these are not actually return instructions, they're intrinsics. When they're lowered, we visit an unreachable instruction after the EH_RETURN opcode is emitted through the intrinsic, meaning that if trap-unreachable is enabled we emit a TRAP instruction after a terminator. You can see this behavior here: https://godbolt.org/z/5br41sWqf
We therefore also get blocks where MachineBasicBlock::isReturnBlock() returns false when it would have returned true before, as the function checks the last instruction in a block to figure out whether it's a "return block" or not. This is used, for example, in PrologEpilogInsertion to determine where to restore registers.
I'm not quite sure if this is a miscompile, but returning false in isReturnBlock is definitely not what we want to happen here. Also, the restore behavior is explicitly checked in tests, so I'm not going to bank on that. My goal here is essentially to just preserve the current behavior (and not crash in the machine verifier) when trap-unreachable is set to true.
Looking at this again, I'm not sure these intrinsics are even supported in GlobalISel on any architecture, but it would be good to keep this consistent between instruction selection backends.
arsenm left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be another case where we should move intrinsics that are really terminators into callbr or be first class terminators
| // Do not emit trap instructions after EH_RETURN intrinsics. | ||
| // This can cause problems later down the line when other machine passes | ||
| // attempt to use the last instruction in a BB to determine terminator behavior. | ||
| if (const auto *II = dyn_cast<IntrinsicInst>(Call)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be handled in isNonContinuableTrap?
Right now, PrologEpilogInsertion uses
MBB.isReturnBlock()to determine whether or not to restore CSRs. This normally works withEH_RETURNsince the latter is marked as being a return instruction; however, the situation gets more complicated whentrap-unreachableis involved since it's no longer the last instruction in the block. As such, withtrap-unreachableenabled, PEI misses our block and does not emit a restore when necessary. This also causes a MachineVerifier failure as traps are not considered terminators.As there are potentially other, similar issues in other passes here, we fix this by marking
llvm.eh.returnasnoreturnand then adding an additional check in SelectionDAG and GlobalISel before emitting trap instructions when visitingunreachable.