9,136 questions
1 vote
0 answers
34 views
Microsoft.ML C#: GPU not found in K8s/Docker container
I have created a .NET app that uses Microsoft.ML.OnnxRuntime.Gpu for inference. Now I'm trying to integrate it with Azure Kubernetes. We have set it up with a Tesla T4 GPU and confirmed it's ...
-3 votes
0 answers
64 views
Intel ARC GPU hangs when performing an untyped surface read [closed]
I am currently writing a driver for the Intel ARC GPU series (specifically I use the A750 for testing purposes) for my own operating system. I am already able to execute compute kernels that use ...
-2 votes
0 answers
66 views
Slow collision detection in Python [closed]
I am making a particle simulator in Python and noticed that my collision detection hurts performance the most. I am not even sure if this is a thing, but is it possible to tell the GPU to do ...
0 votes
0 answers
46 views
Taking advantage of memory contiguousness in HLSL
This is a bit of a slog, so bear with me. I'm currently writing a 3D S(moothed) P(article) H(ydrodynamics) simulation in Unity with a parallel HLSL backend. It's a Lagrangian method of fluid simulation,...
Tooling
0 votes
0 replies
24 views
ComfyUI + Flux 1 dev + limited RAM + same workflow: With 2 GPUs?
I am running the Flux 1 dev text-to-image model through ComfyUI on Kaggle. Everything works, but I noticed that Kaggle offers a second GPU inside the notebook. If I try to run two instances of the ComfyUI ...
-4 votes
0 answers
31 views
TensorFlow GPU use in Python 3.13 [duplicate]
I'm trying to use TensorFlow with a GPU on my Windows device; I have a Python 3.13 venv. Do newer versions of TensorFlow support GPU acceleration on Windows? I've read that it stopped in TensorFlow version 2....
Tooling
0 votes
0 replies
85 views
Which LLMs can I run locally on GTX 1080 8GB with 48GB RAM?
I'm exploring options for running large language models locally on my workstation and would appreciate guidance on suitable models given my hardware constraints. Hardware specifications: CPU: Intel ...
1 vote
1 answer
72 views
Is passing ray resources as options when calling the function equivalent to setting them in the function's decorator?
Is @ray.remote def run_experiment(...): (...) if __name__ == '__main__': ray.init() exp_config = sys.argv[1] params_tuples, num_cpus, num_gpus = load_exp_config(exp_config) ray.get(...
0 votes
0 answers
52 views
Problems with fencing sporadic command buffer submission in Vulkan
I am invoking a compute shader, writing to it, then reading it to then write to disk. According to renderdoc the image is properly generated. Additionally, when compiled in debug mode I get the right ...
2 votes
0 answers
115 views
Implementing Arbitrary Precision Arithmetic in CubeCL for Infinite Zoom Fractals
Context I'm implementing a Julia set fractal renderer using CubeCL (a Rust GPU compute framework). I want to achieve "infinite zoom" similar to deep Mandelbrot zoom videos, which requires ...
3 votes
0 answers
112 views
How does one log the operations done on a GPU during the execution of Python code?
I have encountered a particular problem while executing a function from the transformers library of huggingface on an Intel GPU wheel of torch. Since I am doing something I normally shouldn't be ...
1 vote
0 answers
65 views
How to force allocated D3D12 resource to reside in VRAM and not be demoted to shared RAM?
For testing purposes I need a tool that will occupy some amount of VRAM, leaving less VRAM available to the rest of the applications. I implemented a version that somewhat works using the D3D12 API, ...
0 votes
0 answers
64 views
Utilizing GPU with RNN models that take their own output as input [torch]
I have a machine-translation model. In this model, I calculate a vector for a given sentence, aggregate this vector with each generated output of the RNN, and feed the result into the RNN again for ...
0 votes
0 answers
90 views
i686 compiler with GNU and SDL3 failing to claim window
I am using an i686 system with the MinGW g++ compiler. I can run code that creates a GPU device and attaches it to a window fine on that machine. However, when I attempt to run it on my i686 Windows ...
0 votes
1 answer
99 views
CPU-GPU producer-consumer pattern using unified memory but GPU is in spin loop
I am trying to implement the producer-consumer problem between a GPU and a CPU; it is required for another project. The GPU requests some data from the CPU via unified memory. The CPU copies that data to a specific location in global ...