- Notifications
You must be signed in to change notification settings - Fork 15.3k
[XRay][X86] Fix stack alignment for custom event calls. #89360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @llvm/pr-subscribers-xray @llvm/pr-subscribers-backend-x86 Author: Ricky Zhou (rickyz) ChangesCalls to @llvm.xray.{custom,typed}event are lowered to an nop sled that can be This leads to crashes on x86, as the custom event hook can end up using SSE Fix this by wrapping custom event hooks in CALLSEQ_START/CALLSEQ_END as done for Full diff: https://github.com/llvm/llvm-project/pull/89360.diff 3 Files Affected:
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp index bedec0c8974a85..e1d9793813060d 100644 --- a/llvm/lib/Target/X86/X86ISelLowering.cpp +++ b/llvm/lib/Target/X86/X86ISelLowering.cpp @@ -36254,6 +36254,31 @@ X86TargetLowering::EmitSjLjDispatchBlock(MachineInstr &MI, return BB; } +MachineBasicBlock * +X86TargetLowering::emitPatchableEventCall(MachineInstr &MI, + MachineBasicBlock *BB) const { + // Wrap patchable event calls in CALLSEQ_START/CALLSEQ_END, as they + // are lowered into calls, which may require valid stack alignment. + const TargetInstrInfo &TII = *Subtarget.getInstrInfo(); + const MIMetadata MIMD(MI); + MachineFunction &MF = *BB->getParent(); + + // Emit CALLSEQ_START right before the instruction. + BB->getParent()->getFrameInfo().setAdjustsStack(true); + unsigned AdjStackDown = TII.getCallFrameSetupOpcode(); + MachineInstrBuilder CallseqStart = + BuildMI(MF, MIMD, TII.get(AdjStackDown)).addImm(0).addImm(0).addImm(0); + BB->insert(MachineBasicBlock::iterator(MI), CallseqStart); + + // Emit CALLSEQ_END right after the instruction. + unsigned AdjStackUp = TII.getCallFrameDestroyOpcode(); + MachineInstrBuilder CallseqEnd = + BuildMI(MF, MIMD, TII.get(AdjStackUp)).addImm(0).addImm(0); + BB->insertAfter(MachineBasicBlock::iterator(MI), CallseqEnd); + + return BB; +} + MachineBasicBlock * X86TargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI, MachineBasicBlock *BB) const { @@ -36484,7 +36509,7 @@ X86TargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI, case TargetOpcode::PATCHABLE_EVENT_CALL: case TargetOpcode::PATCHABLE_TYPED_EVENT_CALL: - return BB; + return emitPatchableEventCall(MI, BB); case X86::LCMPXCHG8B: { const X86RegisterInfo *TRI = Subtarget.getRegisterInfo(); diff --git a/llvm/lib/Target/X86/X86ISelLowering.h b/llvm/lib/Target/X86/X86ISelLowering.h index e348ba6e8ac085..78bac6e0774630 100644 --- a/llvm/lib/Target/X86/X86ISelLowering.h +++ b/llvm/lib/Target/X86/X86ISelLowering.h @@ -1808,6 +1808,9 @@ namespace llvm { MachineBasicBlock *EmitSjLjDispatchBlock(MachineInstr &MI, MachineBasicBlock *MBB) const; + MachineBasicBlock *emitPatchableEventCall(MachineInstr &MI, + MachineBasicBlock *MBB) const; + /// Emit flags for the given setcc condition and operands. Also returns the /// corresponding X86 condition code constant in X86CC. SDValue emitFlagsForSetcc(SDValue Op0, SDValue Op1, ISD::CondCode CC, diff --git a/llvm/test/CodeGen/X86/xray-leaf-stack-alignment.ll b/llvm/test/CodeGen/X86/xray-leaf-stack-alignment.ll new file mode 100644 index 00000000000000..f6dd8f527d39d7 --- /dev/null +++ b/llvm/test/CodeGen/X86/xray-leaf-stack-alignment.ll @@ -0,0 +1,18 @@ +; RUN: llc -mtriple=x86_64 < %s | FileCheck %s + +@leaf_func.event_id = internal constant i32 1, align 4 + +define void @leaf_func() "xray-instruction-threshold"="999" "frame-pointer"="none" nounwind { + ; CHECK-LABEL: leaf_func: + ; CHECK-NEXT: .Lfunc_begin0: + ; CHECK-NEXT: # %bb.0: + ; CHECK-NEXT: pushq %rax + ; CHECK-NEXT: movl $leaf_func.event_id, %eax + ; CHECK-NEXT: movl $8, %ecx + ; CHECK-NEXT: .p2align 1, 0x90 + ; CHECK-NEXT: .Lxray_event_sled_0: + call void @llvm.xray.customevent(ptr @leaf_func.event_id, i64 8) + ret void +} + +declare void @llvm.xray.customevent(ptr nocapture readonly, i64) |
7a27488 to 59e62e1 Compare Calls to @llvm.xray.{custom,typed}event are lowered to an nop sled that can be patched with a call to an instrumentation function at runtime. Prior to this change, x86 codegen did not guarantee that these patched calls run with the appropriate stack alignment (particularly in leaf functions, where normal stack alignment assumptions may not hold). This leads to crashes on x86, as the custom event hook can end up using SSE instructions that assume a 16-byte aligned stack. Fix this by wrapping custom event hooks in CALLSEQ_START/CALLSEQ_END as done for regular function calls. One downside of this approach is that on functions whose stacks aren't already aligned, we may end up running stack alignment fixup instructions even when instrumentation is disabled. An alternative could be to make the custom event assembly trampolines stack-alignment-agnostic. This was the case in the past, but llvm@b46c898 removed this due to complexity in maintaining CFI directives for these stack adjustments. Since we are already willing to pay the call argument setup cost for custom hooks when instrumentation is disabled, I am hoping that an extra push/pop in this hopefully uncommon unaligned stack case is tolerable. 59e62e1 to abb580e Compare | @llvm/pr-subscribers-xray since the paths at https://github.com/orgs/llvm/teams/pr-subscribers-xray don't seem to match this PR. |
MaskRay left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This is similar to EmitLoweredTLSAddr
| @@ -0,0 +1,18 @@ | |||
| ; RUN: llc -mtriple=x86_64 < %s | FileCheck %s | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A file-level comment ;; ... can be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test can be added to xray-custom-log.ll
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the test to xray-custom-log.ll ? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per discussions on https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178 , it's recommended for reviewers to resolve comments, even if they are really straightforward:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, moved, thanks (and thanks for the heads up re resolving comments - still getting used to using Github :-))
Add file comment to test.
Yeah, I copied the approach taken there - I kept them separate mostly because I couldn't think of a great name capturing a nice conceptual overlap ("emit thing that will turn into a function call, either at compile time or in the case of xray, conditionally at runtime") - I guess I can also imagine the two diverging in the future - but lmk if you prefer otherwise! |
Use MachineFunction::getFrameInfo
Move test into xray-custom-log.ll.
Adding a new function seems fine for now! |
Calls to @llvm.xray.{custom,typed}event are lowered to an nop sled that can be
patched with a call to an instrumentation function at runtime. Prior to this
change, x86 codegen did not guarantee that these patched calls run with the
appropriate stack alignment (particularly in leaf functions, where normal stack
alignment assumptions may not hold).
This leads to crashes on x86, as the custom event hook can end up using SSE
instructions that assume a 16-byte aligned stack.
Fix this by wrapping custom event hooks in CALLSEQ_START/CALLSEQ_END as done for
regular function calls. One downside of this approach is that on functions whose
stacks aren't already aligned, we may end up running stack alignment fixup
instructions even when instrumentation is disabled. An alternative could be to
make the custom event assembly trampolines stack-alignment-agnostic. This was
the case in the past, but
b46c898
removed this due to complexity in maintaining CFI directives for these stack
adjustments. Since we are already willing to pay the call argument setup cost
for custom hooks when instrumentation is disabled, I am hoping that an extra
push/pop in this hopefully uncommon unaligned stack case is tolerable.