
@lxe lxe commented Sep 9, 2025

Implement a PEP 517 build backend that handles acceleration and wheel repair options via --config-settings instead of relying on environment variables. This works much better with caching front-ends such as uv.

Usage: pip install . --config-settings="accel=cuda" or uv add . -Caccel=cuda

  • Small changes to repair CMake cache file corruption.
  • Update README with unified hardware acceleration documentation.
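
For readers unfamiliar with how a PEP 517 backend receives these options, here is a minimal sketch, not the code in this PR. It assumes an in-tree backend that wraps `setuptools.build_meta`, and the names `ACCEL_FLAGS` and `resolve_build_options` are hypothetical. The flag mapping is illustrative, using whisper.cpp/ggml CMake option names, and the sketch assumes `setup.py` forwards those environment variables to CMake.

```python
import os

# Illustrative mapping from an `accel` value to CMake options. This is an
# assumption for the sketch, not copied from the PR's backend.
ACCEL_FLAGS = {
    "cuda": {"GGML_CUDA": "1"},
    "vulkan": {"GGML_VULKAN": "1"},
    "coreml": {"WHISPER_COREML": "1"},
    "openblas": {"GGML_BLAS": "1", "GGML_BLAS_VENDOR": "OpenBLAS"},
    "openvino": {"WHISPER_OPENVINO": "1"},
}

_FALSE_VALUES = {"0", "false", "no", "off"}


def _last(value):
    """Front-ends may pass a list when a key is repeated; keep the last item."""
    if isinstance(value, list):
        return value[-1] if value else None
    return value


def resolve_build_options(config_settings=None):
    """Return (cmake_flags, repair_wheel) for the given config settings."""
    settings = config_settings or {}
    flags = {}
    accel = _last(settings.get("accel"))
    if accel:
        key = str(accel).strip().lower()
        if key not in ACCEL_FLAGS:
            raise ValueError(f"unknown accel value: {accel!r}")
        flags = dict(ACCEL_FLAGS[key])
    repair_raw = _last(settings.get("repair"))
    # Wheel repair stays enabled unless the user passes repair=false.
    repair = repair_raw is None or str(repair_raw).strip().lower() not in _FALSE_VALUES
    return flags, repair


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    """PEP 517 hook: export the flags, then delegate to setuptools."""
    from setuptools import build_meta  # imported lazily; requires setuptools

    flags, _repair = resolve_build_options(config_settings)
    os.environ.update(flags)  # assumes setup.py forwards these to CMake
    return build_meta.build_wheel(wheel_directory, config_settings, metadata_directory)
```

With a backend like this declared in pyproject.toml (using `build-backend` plus `backend-path`), `pip install . --config-settings="accel=cuda"` would reach `build_wheel` as `config_settings={"accel": "cuda"}`. Because the option is an explicit build input and not ambient environment state, a caching front-end can take it into account.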
@absadiki
Owner

Thanks a lot @lxe for the PR! The PEP 517 implementation and config-settings approach make sense.
However, I ran the actions and noticed some failures. Could you please take a look at the failing jobs?

@lxe
Author

lxe commented Sep 11, 2025

Checking this out

@lxe
Author

lxe commented Sep 11, 2025

Adding CPU backend variant ggml-cpu: -mcpu=native+dotprod+noi8mm+nosve 

but then

 /Users/runner/work/pywhispercpp/pywhispercpp/whisper.cpp/ggml/src/ggml-cpu/ggml-cpu-quants.c:1818:88: error: always_inline function 'vmmlaq_s32' requires target feature 'i8mm', but would be inlined into function 'ggml_vec_dot_q4_0_q8_0' that is compiled without support for 'i8mm' 
@lxe
Author

lxe commented Sep 11, 2025

Looks like all the macOS jobs are failing:

https://github.com/absadiki/pywhispercpp/actions/runs/17462051185/job/49588808069

This happens because the build detects a CPU without i8mm support (the GGML_MACHINE_SUPPORTS_i8mm check fails) when compiling ggml.

Did something change recently in the GitHub Actions runners?

@lxe
Author

lxe commented Sep 11, 2025

The last successful one was https://github.com/absadiki/pywhispercpp/actions/runs/17568832696/job/50100390553?pr=131

And it's using Xcode 15 instead of 16, so maybe that's the difference? I'm not sure; I don't have a Mac to test this on.

@absadiki
Owner

I don’t have a Mac either, so I’m relying on GitHub Actions 😅.
You’re right, the macOS version was updated on the GitHub Actions runners. I first tried updating the whisper.cpp submodule to see if that fixed it, but it didn’t. I then downgraded the runner to macos-14, and it seems to be working now.

Could you please pull the latest commits from main so we can run CI with your PR?

  • Implement custom PEP 517 build backend to handle acceleration options
  • Support CUDA, CoreML, Vulkan, OpenBLAS, and OpenVINO via --config-settings
  • Usage: pip install . --config-settings="accel=cuda" or uv add . -Caccel=cuda
  • Add wheel repair control via config-settings (repair=false to disable)
  • Improve CMake build robustness:
      • Clean corrupted CMake cache files automatically
      • Filter environment variables to only pass safe, relevant ones
      • Add alternative Python executable setting for better compatibility
  • Update README with unified hardware acceleration documentation
  • Remove legacy environment variable approach in favor of config-settings
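
The commit does not spell out how a corrupted CMake cache is detected, so the following is only a sketch of one plausible approach. It assumes "corrupted" means the cache file is empty, or records a different source tree than the one being built. That mismatch can occur when a front-end builds from a fresh temporary copy but reuses a build directory. The helper names `cache_is_stale` and `clean_stale_cache` are hypothetical. `CMAKE_HOME_DIRECTORY` is the entry CMake writes into CMakeCache.txt to record the source directory.

```python
import shutil
from pathlib import Path


def cache_is_stale(build_dir, source_dir):
    """Return True if CMakeCache.txt exists but cannot be reused for source_dir."""
    cache = Path(build_dir) / "CMakeCache.txt"
    if not cache.is_file():
        return False  # nothing to clean
    text = cache.read_text(errors="replace")
    if not text.strip():
        return True  # empty or truncated cache
    for line in text.splitlines():
        if line.startswith("CMAKE_HOME_DIRECTORY:INTERNAL="):
            recorded = line.split("=", 1)[1].strip()
            return Path(recorded).resolve() != Path(source_dir).resolve()
    return True  # no source directory recorded: treat as unusable


def clean_stale_cache(build_dir, source_dir):
    """Delete the cache and CMakeFiles/ when stale; return True if cleaned."""
    if not cache_is_stale(build_dir, source_dir):
        return False
    (Path(build_dir) / "CMakeCache.txt").unlink()
    shutil.rmtree(Path(build_dir) / "CMakeFiles", ignore_errors=True)
    return True
```

Calling `clean_stale_cache(build_dir, source_dir)` just before invoking CMake keeps incremental builds intact when the cache is valid, and forces a clean configure only when it is not.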
@lxe
Author

lxe commented Sep 14, 2025

> Could you please pull the latest commits from main so we can run CI with your PR?

Done! cc @absadiki

@absadiki
Owner

Thanks a lot @lxe, the CI now passes cleanly, and the PEP 517 backend approach looks solid.

However, I don’t see a reason to restrict the user to a limited "safe_env_vars" set. For example, I tried building with CUDA support and, for some reason, CMake wasn’t able to detect my GPU architecture. I had to set CMAKE_CUDA_ARCHITECTURES as per the docs, but it was never applied because it wasn’t in safe_env_vars.

So I’d suggest:

  • Always set user-specified --config-settings into the environment, filtering only obviously unsafe characters.
  • Allow any argument passed to be applied directly.

This would make the backend more flexible and fully respect user-provided build options, while still supporting caching front-ends such as uv.

What do you think?
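
A rough sketch of what that suggestion could look like, assuming every config-settings key is exported as an environment variable of the same name. The function name `config_settings_to_env` is hypothetical, and "obviously unsafe" is interpreted here as control characters such as NUL and newlines. Semicolons are deliberately allowed, because CMake uses them as list separators (for example `CMAKE_CUDA_ARCHITECTURES=75;86`).

```python
import re

# Conservative environment-variable name: letters, digits, underscore.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Control characters (NUL, newline, carriage return, ...) are rejected.
_CONTROL_RE = re.compile(r"[\x00-\x1f\x7f]")


def config_settings_to_env(config_settings, base_env=None):
    """Merge user-provided config settings into a copy of base_env."""
    env = dict(base_env or {})
    for key, value in (config_settings or {}).items():
        # Front-ends may pass a list when a key is repeated; keep the last item.
        if isinstance(value, list):
            value = value[-1] if value else ""
        name, text = str(key), str(value)
        if not _NAME_RE.fullmatch(name):
            raise ValueError(f"invalid setting name: {name!r}")
        if _CONTROL_RE.search(text):
            raise ValueError(f"control character in value for {name}")
        env[name] = text  # ';' is kept: CMake list separator
    return env
```

Under this scheme, something like `pip install . --config-settings="CMAKE_CUDA_ARCHITECTURES=86"` would reach the CMake step without needing an allow-list entry.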
