[SYCL][CUDA][HIP] Guidance for __launch_bounds__ in SYCL · #8080

@abagusetty

Is there a way to get or specify functionality equivalent to __launch_bounds__() in SYCL?

I am porting an optimized CUDA kernel to SYCL and keeping similar launch configuration (<<<....>>>) parameters. Without an equivalent of __launch_bounds__() in SYCL, the launch fails with the error below, because it specifies too many threads for the kernel's register count.

Any suggestions? (a) One solution is to adjust the global and local iteration space of the nd_range, but I wasn't sure whether this would be a portable and performant approach when switching to other devices (e.g., PVC, MI250x, etc.).
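To make option (a) concrete, here is a minimal sketch in plain C++ of clamping the requested work-group size to a per-device limit and rounding the global range up to a multiple of it. The names `LaunchShape` and `fit_launch` are hypothetical, and `device_limit` is an assumption standing in for a value queried at run time (in SYCL 2020, probably `sycl::info::kernel_device_specific::work_group_size` on the kernel object). Note that this only adapts the launch to whatever register count the compiler chose; it does not cap register usage the way __launch_bounds__() does.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper type: the 1-D shape to pass to sycl::nd_range<1>.
struct LaunchShape {
    std::size_t global;  // total work-items, padded to a multiple of local
    std::size_t local;   // work-group size actually used
};

// Clamp the work-group size carried over from the CUDA launch to the limit
// reported for this kernel on this device, then round the global range up.
// device_limit is an assumed input; a SYCL program would query it, e.g. via
// kernel.get_info<sycl::info::kernel_device_specific::work_group_size>(dev).
inline LaunchShape fit_launch(std::size_t work_items,
                              std::size_t requested_local,
                              std::size_t device_limit) {
    const std::size_t local =
        std::max<std::size_t>(1, std::min(requested_local, device_limit));
    const std::size_t groups = (work_items + local - 1) / local;  // ceil-div
    return {groups * local, local};
}
```

Because the global range is padded, the kernel body would need a guard such as `if (id < work_items)` so the extra work-items do nothing. Since the limit is queried per device, the same code should adapt across back-ends, although the resulting work-group size may not be the best-performing one on each.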

Error:

PI CUDA ERROR:
	Value: 701
	Name: CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES
	Description: too many resources requested for launch
	Function: cuda_piEnqueueKernelLaunch
	Source Location: ..../llvm_sycl/sycl/plugins/cuda/pi_cuda.cpp:3214

terminate called after throwing an instance of 'sycl::_V1::runtime_error'
  what(): Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES) -5 (PI_ERROR_OUT_OF_RESOURCES)
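For context, the arithmetic behind this kind of failure can be sketched as follows. The numbers are assumptions for illustration only: a budget of 65536 32-bit registers per block (common on NVIDIA GPUs, but device-dependent) and a hypothetical kernel compiled to 128 registers per thread. The function names are also hypothetical, and the sketch ignores register allocation granularity, so real limits may be lower.

```cpp
#include <cstdint>

// Assumed per-block register budget; the real value is device-dependent.
constexpr std::uint32_t kRegistersPerBlock = 65536;

// Largest block size whose total register demand fits in the budget.
constexpr std::uint32_t max_threads_for(std::uint32_t regs_per_thread) {
    return regs_per_thread == 0 ? 0 : kRegistersPerBlock / regs_per_thread;
}

// True if a block of `threads` threads fits within the register budget.
constexpr bool launch_fits(std::uint32_t threads,
                           std::uint32_t regs_per_thread) {
    return static_cast<std::uint64_t>(threads) * regs_per_thread
           <= kRegistersPerBlock;
}

// Hypothetical kernel at 128 registers per thread: 512 threads fit,
// while a 1024-thread launch would exceed the budget.
static_assert(max_threads_for(128) == 512, "65536 / 128");
static_assert(launch_fits(512, 128), "512 * 128 == 65536");
static_assert(!launch_fits(1024, 128), "1024 * 128 > 65536");
```

In CUDA, __launch_bounds__() tells the compiler the maximum block size up front, so it can limit registers per thread to keep such a launch valid. Without that hint the compiler is free to use more registers, and the same launch configuration can then fail at run time.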

Labels: cuda (CUDA back-end), enhancement (New feature or request)
