Pull requests: huggingface/text-generation-inference
Set uv UV_PYTHON_INSTALL_DIR explicitly
#3197 opened Apr 27, 2025 by sebastianliebscher updated Nov 27, 2025
1 of 5 tasks
Trtllm backend improvements
#3231 opened May 17, 2025 by leejuyuu updated Oct 12, 2025
1 of 5 tasks
Remove once_cell dependency from multiple Cargo.toml files and update usage in validation.rs to use std::sync::LazyLock instead of once_cell::sync::Lazy.
#3334 opened Sep 28, 2025 by htiennv updated Sep 28, 2025
5 tasks
feat: expose GPU energy consumption (mJ) in responses
#3315 opened Aug 28, 2025 by JulienDelavande updated Aug 28, 2025
2 of 5 tasks
feat: support logit bias in chat request
#3186 opened Apr 22, 2025 by drbh updated Aug 14, 2025
Add dedicated CPU-only Dockerfile and update documentation for CPU/…
#3310 opened Aug 7, 2025 by jakubgajski updated Aug 7, 2025
2 of 5 tasks
Update links Inferentia refer docs
#3154 opened Apr 9, 2025 by guspan-tanadi updated Jul 21, 2025
1 of 5 tasks
Retrieve the correct cached model batch size in Neuron config checker for Neuron Backend
#3300 opened Jul 19, 2025 by jimburtoft updated Jul 19, 2025
3 tasks
feat: allow json_schema in response format and add test
#3276 opened Jun 25, 2025 by drbh updated Jul 7, 2025
fix: enable defs references in tool calls
#3291 opened Jul 7, 2025 by drbh updated Jul 7, 2025
Disable mamba in CPU platform
#3266 opened Jun 13, 2025 by casassg updated Jun 13, 2025
3 of 5 tasks
feat: improve llava next pooling for granite vision
#3255 opened Jun 4, 2025 by drbh updated Jun 6, 2025
display available cached versions in TGI server error message of Neuron backend
#3063 opened Feb 26, 2025 by jimburtoft updated May 21, 2025
4 tasks
Kvrouter that will increase the kv-cache hits in case of multiple routing strategy
#2965 opened Jan 29, 2025 by Narsil updated Apr 30, 2025
5 tasks
README: minimum Python version is 3.10
#3194 opened Apr 25, 2025 by Frenzie updated Apr 25, 2025
1 of 5 tasks
Fix flashinfer plan call to use positional arguments for #3165
#3166 opened Apr 11, 2025 by ruckc updated Apr 17, 2025
2 of 5 tasks
Add chunked attn for L4
#3162 opened Apr 10, 2025 by mht-sharma • Draft updated Apr 11, 2025
2 of 7 tasks