Pull requests: mistralai/mistral-inference
Pull requests list
Fix generate_mamba() argument error in interactive mode
#260 opened Nov 27, 2025 by fede-kamel
2 of 3 tasks
Fix NCCL broadcast error on CPU tensors in distributed inference
#257 opened Oct 1, 2025 by Pratham-Nayak1
feat(model-service): add OpenAI-compatible wrapper (+ pm2 + env example) and update ignores
#254 opened Aug 27, 2025 by MCVelasquez45
Optimize main.py for inference efficiency and GPU throughput (torch.compile, memory tuning, warp alignment)
#253 opened Aug 3, 2025 by abdullatifcodes
Fix: Proper JSON chunk handling in streaming response (OpenRouter API)
#248 opened Jul 4, 2025 by ktdjiren
[fix] Correctly pass mask in TransformerBlock.forward in transformer_layers.py
#218 opened Sep 18, 2024 by MarcSzafraniec
Fix device error when using cuda device other than cuda:0
#216 opened Aug 28, 2024 by cornzz
fix(README.md): correct verb agreement in model support statement
#166 opened May 30, 2024 by CharlesCNorton
Add CPU support to one_file_ref.py (the one file implementation)
#129 opened Feb 22, 2024 by kikirizki
Update README.md: Fix page not found for link to guardrailing
#105 opened Dec 31, 2023 by martin0258