lwshanbd commented Feb 3, 2025

Following #114339, this implements the isTruncateFree and isZExtFree methods for the NVPTX target to indicate that truncating from i64 to i32 and zero-extending from i32 to i64 are free operations. This can help the backend make more efficient code generation decisions.

lwshanbd commented Feb 3, 2025

llvmbot commented Feb 3, 2025

@llvm/pr-subscribers-backend-nvptx

Author: Baodi Shan (lwshanbd)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/125580.diff

2 Files Affected:

  • (modified) llvm/lib/Target/NVPTX/NVPTXISelLowering.h (+22)
  • (added) llvm/test/CodeGen/NVPTX/free-truncate-zext.ll (+20)
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
index 5adf69d621552f..f4a7ce9d45c0e4 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
@@ -150,6 +150,28 @@ class NVPTXTargetLowering : public TargetLowering {
            DstTy->getPrimitiveSizeInBits() == 32;
   }
 
+  bool isTruncateFree(EVT SrcVT, EVT DstVT) const override {
+    // Truncating from i64 to i32 is free
+    if (SrcVT.isInteger() && DstVT.isInteger())
+      return SrcVT.getSizeInBits() == 64 && DstVT.getSizeInBits() == 32;
+    return false;
+  }
+
+  bool isZExtFree(EVT FromVT, EVT ToVT) const override {
+    // Zero-extending from i32 to i64 is free
+    if (FromVT.isInteger() && ToVT.isInteger())
+      return FromVT.getSizeInBits() == 32 && ToVT.getSizeInBits() == 64;
+    return false;
+  }
+
+  bool isZExtFree(Type *SrcTy, Type *DstTy) const override {
+    // Zero-extending from i32 to i64 is free
+    if (SrcTy->isIntegerTy() && DstTy->isIntegerTy())
+      return SrcTy->getPrimitiveSizeInBits() == 32 &&
+             DstTy->getPrimitiveSizeInBits() == 64;
+    return false;
+  }
+
   EVT getSetCCResultType(const DataLayout &DL, LLVMContext &Ctx,
                          EVT VT) const override {
     if (VT.isVector())
diff --git a/llvm/test/CodeGen/NVPTX/free-truncate-zext.ll b/llvm/test/CodeGen/NVPTX/free-truncate-zext.ll
new file mode 100644
index 00000000000000..b32fe1b727eeda
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/free-truncate-zext.ll
@@ -0,0 +1,20 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_90 | FileCheck %s
+
+define i32 @test_trunc(i64 %x, i64 %y) {
+; CHECK-LABEL: test_trunc(
+; CHECK: {
+; CHECK-NEXT: .reg .b32 %r<4>;
+; CHECK-EMPTY:
+; CHECK-NEXT: // %bb.0:
+; CHECK-NEXT: ld.param.u32 %r1, [test_trunc_param_0];
+; CHECK-NEXT: ld.param.u32 %r2, [test_trunc_param_1];
+; CHECK-NEXT: mad.lo.s32 %r3, %r1, %r2, 123456789;
+; CHECK-NEXT: st.param.b32 [func_retval0], %r3;
+; CHECK-NEXT: ret;
+  %a = mul i64 %x, %y
+  %b = add i64 %a, 123456789
+  %c = and i64 %b, -1
+  %trunc = trunc i64 %c to i32
+  ret i32 %trunc
+}
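
The expected PTX in the test uses 32-bit parameter loads and a 32-bit mad.lo.s32 even though the IR multiplies and adds in i64. That is valid because trunc keeps only the low 32 bits, and the low 32 bits of a product or sum depend only on the low 32 bits of the operands. A minimal stand-alone C++ sketch of that arithmetic follows; it uses plain integers rather than any LLVM API, and the function names are hypothetical.

```cpp
#include <cstdint>

// Models the IR in test_trunc: 64-bit multiply-add, then trunc to i32.
// Unsigned types are used so that wraparound is well defined in C++.
inline uint32_t wideThenTrunc(uint64_t x, uint64_t y) {
  uint64_t a = x * y;              // mul i64
  uint64_t b = a + 123456789ULL;   // add i64
  return static_cast<uint32_t>(b); // trunc i64 to i32
}

// Models the expected PTX: keep only the low 32 bits of each parameter,
// then do the multiply-add entirely in 32 bits (as mad.lo does).
inline uint32_t truncThenNarrow(uint64_t x, uint64_t y) {
  uint32_t lx = static_cast<uint32_t>(x);
  uint32_t ly = static_cast<uint32_t>(y);
  return lx * ly + 123456789U;
}
```

Both functions return the same value for every input, which is why sinking the truncate toward the loads does not change the result.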
%c = and i64 %b, -1
%trunc = trunc i64 %c to i32
ret i32 %trunc
}
Contributor

Could you add a test for zext as well?


bool isTruncateFree(EVT SrcVT, EVT DstVT) const override {
// Truncating from i64 to i32 is free
if (SrcVT.isInteger() && DstVT.isInteger())
Member

I think you can just do something like SrcVT == MVT::i64


bool isZExtFree(Type *SrcTy, Type *DstTy) const override {
// Zero-extending from i32 to i64 is free
if (SrcTy->isIntegerTy() && DstTy->isIntegerTy())
Member

Something like this should work, I think: SrcTy->isIntegerTy(32)
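
The two simplifications suggested here and in the previous comment (comparing against MVT::i64 directly, and using SrcTy->isIntegerTy(32)) both replace an "is it an integer, and is it N bits" pair of checks with a single exact-type comparison. The sketch below models that shape in stand-alone C++. SimpleVT and the two function names are hypothetical stand-ins, not LLVM's MVT or TargetLowering; in the real patch the comparisons would presumably read along the lines of SrcVT == MVT::i64 && DstVT == MVT::i32.

```cpp
// Hypothetical stand-in for the few value types needed here;
// this is not LLVM's MVT class.
enum class SimpleVT { i1, i8, i16, i32, i64, f32, f64 };

// Truncation modelled as free only for i64 -> i32. Comparing against
// exact types makes a separate "is integer" guard unnecessary.
constexpr bool isTruncateFreeModel(SimpleVT Src, SimpleVT Dst) {
  return Src == SimpleVT::i64 && Dst == SimpleVT::i32;
}

// Zero-extension modelled as free only for i32 -> i64, as the patch claims.
constexpr bool isZExtFreeModel(SimpleVT From, SimpleVT To) {
  return From == SimpleVT::i32 && To == SimpleVT::i64;
}
```

Floating-point types fall out automatically: f64 to f32 is rejected because neither side equals the named integer type.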

Comment on lines +167 to +173
bool isZExtFree(Type *SrcTy, Type *DstTy) const override {
// Zero-extending from i32 to i64 is free
if (SrcTy->isIntegerTy() && DstTy->isIntegerTy())
return SrcTy->getPrimitiveSizeInBits() == 32 &&
DstTy->getPrimitiveSizeInBits() == 64;
return false;
}
Contributor

Can we write a test for this code?

Comment on lines +154 to +157
// Truncating from i64 to i32 is free
if (SrcVT.isInteger() && DstVT.isInteger())
return SrcVT.getSizeInBits() == 64 && DstVT.getSizeInBits() == 32;
return false;
Contributor

Suggested change
- // Truncating from i64 to i32 is free
- if (SrcVT.isInteger() && DstVT.isInteger())
-   return SrcVT.getSizeInBits() == 64 && DstVT.getSizeInBits() == 32;
- return false;
+ if (SrcVT.isInteger() && DstVT.isInteger())
+   return false;
+ return DstTy->getPrimitiveSizeInBits() % 32 == 0;
Contributor Author

Why does it need to return false when both are isInteger?

if (!SrcTy->isIntegerTy() || !DstTy->isIntegerTy())
return false;
return SrcTy->getPrimitiveSizeInBits() == 64 &&
DstTy->getPrimitiveSizeInBits() == 32;
Contributor

Suggested change
- DstTy->getPrimitiveSizeInBits() == 32;
+ if (SrcTy->getPrimitiveSizeInBits() <= DstTy->getPrimitiveSizeInBits())
+   return false;
+ return ToVT.getSizeInBits() % 32 == 0;
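
Read together with the guard lines shown just above it, this suggestion appears to generalize the rule from "exactly i64 to i32" to "any integer truncation whose destination width is a multiple of 32 bits". That reading is an interpretation, not something the thread confirms. It would also mean the integer check in the earlier suggestion was intended to be negated, which is what the author's question was about. A stand-alone C++ sketch of the rule under that assumption (the function name and the bool/width parameters are hypothetical stand-ins for the Type queries):

```cpp
// srcIsInt/dstIsInt stand in for Type::isIntegerTy();
// srcBits/dstBits stand in for getPrimitiveSizeInBits().
constexpr bool isTruncateFreeGeneral(bool srcIsInt, unsigned srcBits,
                                     bool dstIsInt, unsigned dstBits) {
  // Only integer-to-integer conversions are considered.
  if (!srcIsInt || !dstIsInt)
    return false;
  // Not a truncation unless the source is strictly wider.
  if (srcBits <= dstBits)
    return false;
  // Treated as free when the result fills whole 32-bit units.
  return dstBits % 32 == 0;
}
```

Under this rule i64 to i32 is still free, while i64 to i16 and i32 to i64 are not.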
Comment on lines +160 to +173
bool isZExtFree(EVT FromVT, EVT ToVT) const override {
// Zero-extending from i32 to i64 is free
if (FromVT.isInteger() && ToVT.isInteger())
return FromVT.getSizeInBits() == 32 && ToVT.getSizeInBits() == 64;
return false;
}

bool isZExtFree(Type *SrcTy, Type *DstTy) const override {
// Zero-extending from i32 to i64 is free
if (SrcTy->isIntegerTy() && DstTy->isIntegerTy())
return SrcTy->getPrimitiveSizeInBits() == 32 &&
DstTy->getPrimitiveSizeInBits() == 64;
return false;
}
Contributor

I'm no longer confident that zext is free in this case.

In this simple example an instruction is issued to zero out R3. Are there counter-examples that show why it is free?
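
One way to picture the asymmetry raised here is to assume that a 64-bit value occupies a pair of 32-bit registers. That is an assumption made for illustration and is not stated in the thread. Under it, truncation only has to read the low register, while zero-extension has to produce a zero in the high register, which would match the extra instruction the comment describes. A stand-alone C++ sketch, with a hypothetical RegPair type and function names:

```cpp
#include <cstdint>

// Hypothetical model: a 64-bit value held as two 32-bit registers.
struct RegPair {
  uint32_t lo;
  uint32_t hi;
};

// trunc i64 -> i32: read the low register; nothing new is computed.
inline uint32_t truncModel(const RegPair &v) { return v.lo; }

// zext i32 -> i64: the low register is reused, but the high register
// must be given the value 0, which is an extra write.
inline RegPair zextModel(uint32_t v) { return RegPair{v, 0u}; }

// Reassembles the pair into a single 64-bit integer for checking.
inline uint64_t toU64(const RegPair &v) {
  return (static_cast<uint64_t>(v.hi) << 32) | v.lo;
}
```

Whether the zeroing write can be folded away in practice depends on the surrounding code, which is the open question in this thread.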

lwshanbd commented Mar 5, 2025

Hi @justinfargnoli, sorry for the late follow-up.
It seems that isTruncateFree doesn’t work well in NVPTX.

If we don’t implement isZExtFree, I’m unable to generate PTX without cvt, even if I keep isTruncateFree returning true.
Specifically, for all trunc i64 %x to i32 operations, the resulting PTX always includes a cvt instruction.

I’m not sure how to properly implement zext, given that you mentioned it’s not free. Do you have any suggestions?

Thanks!


5 participants