- Notifications
You must be signed in to change notification settings - Fork 15.3k
[AMDGPU] add function attrbute amdgpu-lib-fun #74737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| @llvm/pr-subscribers-clang-codegen Author: Yaxun (Sam) Liu (yxsamliu) ChangesAdd a function attribute "amdgpu-lib-fun" to indicate that the function needs special handling in backend. Basically it will not be internalized so that it will not be removed by DCE after internalization. This is to keep the library functions that are not called by users' code but will be called by instructions generated by LLVM passes or instruction selection, e.g. sanitizers or lowering of 128 bit integer divisioin. Full diff: https://github.com/llvm/llvm-project/pull/74737.diff 5 Files Affected:
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td index 121ed203829ce..676faddd2d1ac 100644 --- a/clang/include/clang/Basic/Attr.td +++ b/clang/include/clang/Basic/Attr.td @@ -2011,6 +2011,13 @@ def AMDGPUNumVGPR : InheritableAttr { let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">; } +def AMDGPULibFun : InheritableAttr { + let Spellings = [Clang<"amdgpu_lib_fun">]; + let Documentation = [AMDGPULibFunDocs]; + let Subjects = SubjectList<[Function]>; + let SimpleHandler = 1; +} + def AMDGPUKernelCall : DeclOrTypeAttr { let Spellings = [Clang<"amdgpu_kernel">]; let Documentation = [Undocumented]; diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td index 88f7c65e6e847..c2c77d4d1d817 100644 --- a/clang/include/clang/Basic/AttrDocs.td +++ b/clang/include/clang/Basic/AttrDocs.td @@ -2693,6 +2693,17 @@ An error will be given if: }]; } +def AMDGPULibFunDocs : Documentation { + let Category = DocCatAMDGPUAttributes; + let Content = [{ +The ``amdgpu_lib_fun`` attribute can be applied to a function for AMDGPU target +to indicate it is a library function which are handled specially in backend. +An AMDGPU library function is not internalized and can be used to fullfill +calls generated by LLVM passes or instruction selection. Unused AMDGPU library +functions will be eliminated by the backend. + }]; +} + def DocCatCallingConvs : DocumentationCategory<"Calling Conventions"> { let Content = [{ Clang supports several different calling conventions, depending on the target diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp b/clang/lib/CodeGen/Targets/AMDGPU.cpp index b654e3f12af8d..1a56ee3692d07 100644 --- a/clang/lib/CodeGen/Targets/AMDGPU.cpp +++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp @@ -356,6 +356,8 @@ void AMDGPUTargetCodeGenInfo::setFunctionDeclAttributes( if (NumVGPR != 0) F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR)); } + if (FD->getAttr<AMDGPULibFunAttr>()) + F->addFnAttr("amdgpu-lib-fun"); } /// Emits control constants used to change per-architecture behaviour in the diff --git a/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu b/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu index 89add87919c12..e319cd4809e0d 100644 --- a/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu +++ b/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu @@ -8,6 +8,9 @@ // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \ // RUN: -o - -x hip %s -munsafe-fp-atomics \ // RUN: | FileCheck -check-prefix=NO-UNSAFE-FP-ATOMICS %s +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa \ +// RUN: -fcuda-is-device -emit-llvm -o - -x hip %s \ +// RUN: | FileCheck %s #include "Inputs/cuda.h" @@ -15,8 +18,13 @@ __device__ void test() { // UNSAFE-FP-ATOMICS: define{{.*}} void @_Z4testv() [[ATTR:#[0-9]+]] } +__attribute__((amdgpu_lib_fun)) __device__ void lib_fun() { +// CHECK: define{{.*}} void @_Z7lib_funv() [[LIB_FUN:#[0-9]+]] +} + // Make sure this is silently accepted on other targets. // NO-UNSAFE-FP-ATOMICS-NOT: "amdgpu-unsafe-fp-atomics" // UNSAFE-FP-ATOMICS-DAG: attributes [[ATTR]] = {{.*}}"amdgpu-unsafe-fp-atomics"="true" +// CHECK-DAG: attributes [[LIB_FUN]] = {{.*}}"amdgpu-lib-fun" diff --git a/clang/test/CodeGenOpenCL/amdgpu-attrs.cl b/clang/test/CodeGenOpenCL/amdgpu-attrs.cl index b0dfc97b53b2c..bce7739c7a429 100644 --- a/clang/test/CodeGenOpenCL/amdgpu-attrs.cl +++ b/clang/test/CodeGenOpenCL/amdgpu-attrs.cl @@ -147,6 +147,9 @@ kernel void default_kernel() { // CHECK: define{{.*}} amdgpu_kernel void @default_kernel() [[DEFAULT_KERNEL_ATTRS:#[0-9]+]] } +__attribute__((amdgpu_lib_fun)) void lib_fun() { +// CHECK: define{{.*}} void @lib_fun() [[LIB_FUN:#[0-9]+]] +} // Make sure this is silently accepted on other targets. // X86-NOT: "amdgpu-flat-work-group-size" @@ -191,3 +194,4 @@ kernel void default_kernel() { // CHECK-DAG: attributes [[A_FUNCTION]] = {{.*}} // CHECK-DAG: attributes [[DEFAULT_KERNEL_ATTRS]] = {{.*}} "amdgpu-flat-work-group-size"="1,256" +// CHECK-DAG: attributes [[LIB_FUN]] = {{.*}} "amdgpu-lib-fun" |
| @llvm/pr-subscribers-clang Author: Yaxun (Sam) Liu (yxsamliu) ChangesAdd a function attribute "amdgpu-lib-fun" to indicate that the function needs special handling in backend. Basically it will not be internalized so that it will not be removed by DCE after internalization. This is to keep the library functions that are not called by users' code but will be called by instructions generated by LLVM passes or instruction selection, e.g. sanitizers or lowering of 128 bit integer divisioin. Full diff: https://github.com/llvm/llvm-project/pull/74737.diff 5 Files Affected:
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td index 121ed203829ce..676faddd2d1ac 100644 --- a/clang/include/clang/Basic/Attr.td +++ b/clang/include/clang/Basic/Attr.td @@ -2011,6 +2011,13 @@ def AMDGPUNumVGPR : InheritableAttr { let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">; } +def AMDGPULibFun : InheritableAttr { + let Spellings = [Clang<"amdgpu_lib_fun">]; + let Documentation = [AMDGPULibFunDocs]; + let Subjects = SubjectList<[Function]>; + let SimpleHandler = 1; +} + def AMDGPUKernelCall : DeclOrTypeAttr { let Spellings = [Clang<"amdgpu_kernel">]; let Documentation = [Undocumented]; diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td index 88f7c65e6e847..c2c77d4d1d817 100644 --- a/clang/include/clang/Basic/AttrDocs.td +++ b/clang/include/clang/Basic/AttrDocs.td @@ -2693,6 +2693,17 @@ An error will be given if: }]; } +def AMDGPULibFunDocs : Documentation { + let Category = DocCatAMDGPUAttributes; + let Content = [{ +The ``amdgpu_lib_fun`` attribute can be applied to a function for AMDGPU target +to indicate it is a library function which are handled specially in backend. +An AMDGPU library function is not internalized and can be used to fullfill +calls generated by LLVM passes or instruction selection. Unused AMDGPU library +functions will be eliminated by the backend. + }]; +} + def DocCatCallingConvs : DocumentationCategory<"Calling Conventions"> { let Content = [{ Clang supports several different calling conventions, depending on the target diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp b/clang/lib/CodeGen/Targets/AMDGPU.cpp index b654e3f12af8d..1a56ee3692d07 100644 --- a/clang/lib/CodeGen/Targets/AMDGPU.cpp +++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp @@ -356,6 +356,8 @@ void AMDGPUTargetCodeGenInfo::setFunctionDeclAttributes( if (NumVGPR != 0) F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR)); } + if (FD->getAttr<AMDGPULibFunAttr>()) + F->addFnAttr("amdgpu-lib-fun"); } /// Emits control constants used to change per-architecture behaviour in the diff --git a/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu b/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu index 89add87919c12..e319cd4809e0d 100644 --- a/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu +++ b/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu @@ -8,6 +8,9 @@ // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \ // RUN: -o - -x hip %s -munsafe-fp-atomics \ // RUN: | FileCheck -check-prefix=NO-UNSAFE-FP-ATOMICS %s +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa \ +// RUN: -fcuda-is-device -emit-llvm -o - -x hip %s \ +// RUN: | FileCheck %s #include "Inputs/cuda.h" @@ -15,8 +18,13 @@ __device__ void test() { // UNSAFE-FP-ATOMICS: define{{.*}} void @_Z4testv() [[ATTR:#[0-9]+]] } +__attribute__((amdgpu_lib_fun)) __device__ void lib_fun() { +// CHECK: define{{.*}} void @_Z7lib_funv() [[LIB_FUN:#[0-9]+]] +} + // Make sure this is silently accepted on other targets. // NO-UNSAFE-FP-ATOMICS-NOT: "amdgpu-unsafe-fp-atomics" // UNSAFE-FP-ATOMICS-DAG: attributes [[ATTR]] = {{.*}}"amdgpu-unsafe-fp-atomics"="true" +// CHECK-DAG: attributes [[LIB_FUN]] = {{.*}}"amdgpu-lib-fun" diff --git a/clang/test/CodeGenOpenCL/amdgpu-attrs.cl b/clang/test/CodeGenOpenCL/amdgpu-attrs.cl index b0dfc97b53b2c..bce7739c7a429 100644 --- a/clang/test/CodeGenOpenCL/amdgpu-attrs.cl +++ b/clang/test/CodeGenOpenCL/amdgpu-attrs.cl @@ -147,6 +147,9 @@ kernel void default_kernel() { // CHECK: define{{.*}} amdgpu_kernel void @default_kernel() [[DEFAULT_KERNEL_ATTRS:#[0-9]+]] } +__attribute__((amdgpu_lib_fun)) void lib_fun() { +// CHECK: define{{.*}} void @lib_fun() [[LIB_FUN:#[0-9]+]] +} // Make sure this is silently accepted on other targets. // X86-NOT: "amdgpu-flat-work-group-size" @@ -191,3 +194,4 @@ kernel void default_kernel() { // CHECK-DAG: attributes [[A_FUNCTION]] = {{.*}} // CHECK-DAG: attributes [[DEFAULT_KERNEL_ATTRS]] = {{.*}} "amdgpu-flat-work-group-size"="1,256" +// CHECK-DAG: attributes [[LIB_FUN]] = {{.*}} "amdgpu-lib-fun" |
| @llvm/pr-subscribers-backend-amdgpu Author: Yaxun (Sam) Liu (yxsamliu) ChangesAdd a function attribute "amdgpu-lib-fun" to indicate that the function needs special handling in backend. Basically it will not be internalized so that it will not be removed by DCE after internalization. This is to keep the library functions that are not called by users' code but will be called by instructions generated by LLVM passes or instruction selection, e.g. sanitizers or lowering of 128 bit integer divisioin. Full diff: https://github.com/llvm/llvm-project/pull/74737.diff 5 Files Affected:
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td index 121ed203829cec..676faddd2d1aca 100644 --- a/clang/include/clang/Basic/Attr.td +++ b/clang/include/clang/Basic/Attr.td @@ -2011,6 +2011,13 @@ def AMDGPUNumVGPR : InheritableAttr { let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">; } +def AMDGPULibFun : InheritableAttr { + let Spellings = [Clang<"amdgpu_lib_fun">]; + let Documentation = [AMDGPULibFunDocs]; + let Subjects = SubjectList<[Function]>; + let SimpleHandler = 1; +} + def AMDGPUKernelCall : DeclOrTypeAttr { let Spellings = [Clang<"amdgpu_kernel">]; let Documentation = [Undocumented]; diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td index 88f7c65e6e847b..c2c77d4d1d8171 100644 --- a/clang/include/clang/Basic/AttrDocs.td +++ b/clang/include/clang/Basic/AttrDocs.td @@ -2693,6 +2693,17 @@ An error will be given if: }]; } +def AMDGPULibFunDocs : Documentation { + let Category = DocCatAMDGPUAttributes; + let Content = [{ +The ``amdgpu_lib_fun`` attribute can be applied to a function for AMDGPU target +to indicate it is a library function which are handled specially in backend. +An AMDGPU library function is not internalized and can be used to fullfill +calls generated by LLVM passes or instruction selection. Unused AMDGPU library +functions will be eliminated by the backend. + }]; +} + def DocCatCallingConvs : DocumentationCategory<"Calling Conventions"> { let Content = [{ Clang supports several different calling conventions, depending on the target diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp b/clang/lib/CodeGen/Targets/AMDGPU.cpp index b654e3f12af8d4..1a56ee3692d072 100644 --- a/clang/lib/CodeGen/Targets/AMDGPU.cpp +++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp @@ -356,6 +356,8 @@ void AMDGPUTargetCodeGenInfo::setFunctionDeclAttributes( if (NumVGPR != 0) F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR)); } + if (FD->getAttr<AMDGPULibFunAttr>()) + F->addFnAttr("amdgpu-lib-fun"); } /// Emits control constants used to change per-architecture behaviour in the diff --git a/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu b/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu index 89add87919c12d..e319cd4809e0dd 100644 --- a/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu +++ b/clang/test/CodeGenCUDA/amdgpu-func-attrs.cu @@ -8,6 +8,9 @@ // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \ // RUN: -o - -x hip %s -munsafe-fp-atomics \ // RUN: | FileCheck -check-prefix=NO-UNSAFE-FP-ATOMICS %s +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa \ +// RUN: -fcuda-is-device -emit-llvm -o - -x hip %s \ +// RUN: | FileCheck %s #include "Inputs/cuda.h" @@ -15,8 +18,13 @@ __device__ void test() { // UNSAFE-FP-ATOMICS: define{{.*}} void @_Z4testv() [[ATTR:#[0-9]+]] } +__attribute__((amdgpu_lib_fun)) __device__ void lib_fun() { +// CHECK: define{{.*}} void @_Z7lib_funv() [[LIB_FUN:#[0-9]+]] +} + // Make sure this is silently accepted on other targets. // NO-UNSAFE-FP-ATOMICS-NOT: "amdgpu-unsafe-fp-atomics" // UNSAFE-FP-ATOMICS-DAG: attributes [[ATTR]] = {{.*}}"amdgpu-unsafe-fp-atomics"="true" +// CHECK-DAG: attributes [[LIB_FUN]] = {{.*}}"amdgpu-lib-fun" diff --git a/clang/test/CodeGenOpenCL/amdgpu-attrs.cl b/clang/test/CodeGenOpenCL/amdgpu-attrs.cl index b0dfc97b53b2c5..bce7739c7a429f 100644 --- a/clang/test/CodeGenOpenCL/amdgpu-attrs.cl +++ b/clang/test/CodeGenOpenCL/amdgpu-attrs.cl @@ -147,6 +147,9 @@ kernel void default_kernel() { // CHECK: define{{.*}} amdgpu_kernel void @default_kernel() [[DEFAULT_KERNEL_ATTRS:#[0-9]+]] } +__attribute__((amdgpu_lib_fun)) void lib_fun() { +// CHECK: define{{.*}} void @lib_fun() [[LIB_FUN:#[0-9]+]] +} // Make sure this is silently accepted on other targets. // X86-NOT: "amdgpu-flat-work-group-size" @@ -191,3 +194,4 @@ kernel void default_kernel() { // CHECK-DAG: attributes [[A_FUNCTION]] = {{.*}} // CHECK-DAG: attributes [[DEFAULT_KERNEL_ATTRS]] = {{.*}} "amdgpu-flat-work-group-size"="1,256" +// CHECK-DAG: attributes [[LIB_FUN]] = {{.*}} "amdgpu-lib-fun" |
5f9af05 to b4a30a6 Compare | this patch is used by #74741 |
| ping |
| This sounds like it may be useful outside of AMDGPU back-end. |
| My use-case is more to be able to write functions like I believe the use-case here is a workaround for the issues caused by library ordering? I'm guessing this is related to the problems caused by prematurely optimizing out library functions that later passes wanted to depend on. |
| I was thinking of implementing libm/libc for nvptx, which would produce an IR library . We'll still need to keep the functions around if they are not used explicitly, because we may need them to fulfill libcalls later in the compilation pipeline. Sort of a libdevice replacement which can be used for libcall materialization. But you're right, with RDC object files used for offloading it's probably not necessary. |
That's one problem I'm unsure of how to solve currently. Right now when doing LTO, there's a list of "libfuncs" that backends can emit. If the function is one of these we can't interalize / optimize out the symbol. I was attempting to relax this in https://reviews.llvm.org/D154364 at some point, because ideally we don't want to do this if the backend doesn't use them, but we don't have that logic right now. Right now, the issue is how to handle divergence for different targets. So, for One reason I'd like this is because I'd really like to be able to provide my |
I am confused by the description of "internalized". Do you refer to LTO internalization? You can leverage |
My guess is that the function should be considered used and then thrown away by the backend. |
Yes I mean LTO internalization. We want keep them to the backend but we also want to remove them if they are not used by the backend. |
I wonder if we could just define another |
we need this attribute because AMDGPU target does not support ISA level linking, otherwise we could just link with a library after LLVM codegen. I doubt this attribute is generic enough to introduce something like llvm.used. Also, we do not need to prevent linker from discarding the symbol, therefore it is unnecessary to put them in some global variables. A simple function attribute is suffice. |
Add a function attribute "amdgpu-lib-fun" to indicate that the function needs special handling in backend. Basically it will not be internalized so that it will not be removed by DCE after internalization. This is to keep the library functions that are not called by users' code but will be called by instructions generated by LLVM passes or instruction selection, e.g. sanitizers or lowering of 128 bit integer divisioin.
b4a30a6 to 4264e7e Compare
Add a function attribute "amdgpu-lib-fun" to indicate that the function needs special handling in backend. Basically it will not be internalized so that it will not be removed by DCE after internalization. This is to keep the library functions that are not called by users' code but will be called by instructions generated by LLVM passes or instruction selection, e.g. sanitizers or lowering of 128 bit integer divisioin.