-3 votes
0 answers
64 views

I am currently writing a driver for the Intel Arc GPU series (specifically, I use the A750 for testing) for my own operating system. I am already able to execute compute kernels that use ...
Joel Marker
1 vote
0 answers
51 views

[Goal & Problem] I am trying to accelerate ONNX model inference on an RK3588 (Orange Pi 5) board using the Mali-G610 GPU. I have built OnnxRuntime (ORT) with the ACL (Compute Library) Execution ...
이호연
3 votes
2 answers
100 views

I tried to do an old Advent of Code problem in OpenCL, but it's very slow. const char *KernelSource_part_b = "\n" \ "typedef unsigned long uint64_t; ...
Richard Clubb
0 votes
0 answers
61 views

I want to compile and run OpenCL programs to do some parallel computing on my mobile device, an S24 FE (Exynos 2400e). I tried to compile clinfo, but it always reports 0 devices. I tried various ...
Lakshit Karsoliya
0 votes
1 answer
58 views

Consider the following OpenCL code in which each element in a vector-type variable gets its value via array lookup: float* tbl = get_data(); int4 offsets = get_offsets(); float4 my_elements = { ...
einpoklum
1 vote
1 answer
104 views

I am writing an OpenCL kernel that uses atomics. As I only need to synchronize groups of 192 threads, I figured using local atomics would be ideal. However, the change from global to local atomics ...
Edward Murphy
1 vote
0 answers
46 views

I'm using the OpenCL clBuildProgram() API function on a program created from a source string. The source is: kernel void foo(int val, write_only pipe int outPipe) { write_pipe(outPipe, &val); }...
einpoklum
0 votes
0 answers
17 views

Suppose I've allocated a region of memory with clSVMAlloc(). Looking at the clEnqueueSVMMap() function, we are told that it will "allow the host to update a region of a SVM buffer". Does ...
einpoklum
0 votes
0 answers
26 views

OpenCL has the mechanism of "shared virtual memory" (SVM), where the same memory region is available both in OpenCL kernel code and in host-side code - and updates on one side affect the ...
einpoklum
0 votes
0 answers
16 views

Most OpenCL API calls return a status/error value, either directly or via an out-parameter (example: clCreateBuffer()). While that is not as informative as a long-form string description, it can tell ...
einpoklum
0 votes
1 answer
33 views

OpenCL C supports "vector data types" - a fixed number of scalar types which may be operated on together, as though they were a single scalar, mostly: we can apply arithmetic and logic ...
einpoklum
0 votes
1 answer
37 views

I'm looking at the clEnqueueWaitForEvents() OpenCL API function. As I see it, this is a real boon. You see, almost all clEnqueueXXX functions take an array-of-events, and the size of that array, to ...
einpoklum
0 votes
1 answer
46 views

The OpenCL API has one object which is sort of a "kitchen sink" for a lot of stuff: The program (with handle type cl_program). It can hold: A textual program source ( ...
einpoklum
1 vote
1 answer
46 views

In the following program, I compile a kernel for the first device on the first platform: const char* kernel_source_code = R"( __kernel void vectorAdd( __global float * __restrict C, ...
einpoklum
0 votes
1 answer
99 views

I'm trying to run a basic kernel in OpenCL. See the snippet attached: const char kernel_source[] = "__kernel void matmul(__global float* A, __global float* B, __global float* C) { int row = ...
T3chstop
