Serverless LLM Serving for Everyone.
cuda pytorch model-serving model-as-a-service huggingface-transformers large-language-models serverless-inference
Updated Nov 27, 2025 - Python
LLM Inference on AWS Lambda
Python SDK and CLI for modelz.ai, a developer-first platform for prototyping and deploying machine learning models.