Question 1

I have created a .NET app that uses Microsoft.ML.OnnxRuntime.Gpu for inference. Now I'm trying to integrate it with Azure Kubernetes. We have made the setup with Tesla T4 GPU and we confirmed it's ...

Question 2

I am currently writing a driver for the Intel ARC GPU series (specifically I use the A750 for testing purposes) for my own operating system. I am already able to execute compute kernels that use ...

Question 3

I am making a particle simulator in Python and noticed that my collision detection is hurting performance the most. I am not even sure if this is a thing, but is it possible to tell the GPU to do ...

Question 4

This is a bit of a slog so bear with me. I'm currently writing a 3D S(moothed) P(article) H(ydrodynamics) simulation in Unity with a parallel HLSL backend. It's a Lagrangian method of fluid simulation,...

Question 5

I am running the Flux 1 dev text-to-image model through ComfyUI on Kaggle. Everything works, but I noticed that Kaggle offers a second GPU inside the notebook. If I try to run two instances of the ComfyUI ...

Question 6

I'm trying to use TensorFlow with GPU on my Windows device; I have a Python 3.13 venv. Do newer versions of TensorFlow support GPU acceleration on Windows? I've read that it stopped in TensorFlow version 2....

Question 7

I'm exploring options for running large language models locally on my workstation and would appreciate guidance on suitable models given my hardware constraints. Hardware specifications: CPU: Intel ...

Question 8

Is @ray.remote def run_experiment(...): (...) if __name__ == '__main__': ray.init() exp_config = sys.argv[1] params_tuples, num_cpus, num_gpus = load_exp_config(exp_config) ray.get(...

Question 9

I am invoking a compute shader, writing to it, then reading it to then write to disk. According to RenderDoc the image is properly generated. Additionally, when compiled in debug mode I get the right ...

Question 10

Context I'm implementing a Julia set fractal renderer using CubeCL (a Rust GPU compute framework). I want to achieve "infinite zoom" similar to deep Mandelbrot zoom videos, which requires ...

Question 11

I have encountered a particular problem while executing a function from the transformers library of huggingface on an Intel GPU wheel of torch. Since I am doing something I normally shouldn't be ...

Question 12

For testing purposes I need a tool that will occupy some amount of VRAM, leaving a reduced available VRAM to the rest of the applications. I implemented a version that somewhat works using D3D12 API, ...

Question 13

I have a machine-translation model. In this model, I calculate a vector for a given sentence, aggregate this vector with each generated output of the RNN, and feed it into the RNN again for ...

Question 14

I am using an i686 system with the MinGW g++ compiler. I can run code that creates a GPU device and attaches it to a window fine on that machine. However, when I attempt to run it on my i686 Windows ...

Question 15

I am trying to implement the producer-consumer problem between GPU and CPU; it is required for another project. The GPU requests some data from the CPU via unified memory. The CPU copies that data to a specific location in global ...

Microsoft.ML C#: GPU not found in K8s/Docker container

Intel ARC GPU hangs when performing an untyped surface read [closed]

Slow collision detection in Python [closed]

Taking advantage of memory contiguousness in HLSL

ComfyUI + Flux 1 dev + limited RAM + same workflow: With 2 GPUs?

Tensorflow GPU use in python 3.13 [duplicate]

Which LLMs can I run locally on RTX 1080 8GB with 48GB RAM?

Is passing ray resources as options when calling the function equivalent to setting them in the function's decorator?

Problems with fencing sporadic command buffer submission in Vulkan

Implementing Arbitrary Precision Arithmetic in CubeCL for Infinite Zoom Fractals

How does one log the operations done on a GPU during the execution of Python code?

How to force allocated D3D12 resource to reside in VRAM and not be demoted to shared RAM?

Utilizing GPU with RNN models which take their own output as input [torch]

i686 compiler with GNU and SDL3 failing to claim window

CPU-GPU producer-consumer pattern using unified memory but GPU is in spin loop
