⚡ Fast GPU transcription using faster-whisper on RunPod serverless.
- Create RunPod Endpoint: Serverless → New Endpoint
- GitHub Integration: Select this repository
- GPU: RTX 4090 or RTX 3080
- Environment Variables:
```
WHISPER_MODEL=medium
WHISPER_COMPUTE_TYPE=float16
```
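The handler can pick these up at startup with standard environment lookups; a minimal sketch (the defaults shown are illustrative assumptions, not necessarily what `rp_handler.py` uses):

```python
import os

# Model size and compute type, overridable via the endpoint's environment
# variables. The fallback defaults here are assumptions for illustration.
WHISPER_MODEL = os.getenv("WHISPER_MODEL", "medium")
WHISPER_COMPUTE_TYPE = os.getenv("WHISPER_COMPUTE_TYPE", "float16")

print(WHISPER_MODEL, WHISPER_COMPUTE_TYPE)
```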
Test the endpoint with:

```bash
curl -X POST "https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/runsync" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d @test_input.json
```

- RTF: 0.02-0.05 (20-50x faster than real-time)
- 2-minute audio: ~2-6 seconds processing
- Cold start: ~10-30 seconds
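The processing-time estimate above follows directly from the real-time factor (RTF); a quick sanity check:

```python
# RTF = processing_time / audio_duration, so estimated processing time
# is simply audio duration multiplied by the RTF.
def processing_time(audio_seconds: float, rtf: float) -> float:
    return audio_seconds * rtf

two_minutes = 120.0
fastest = processing_time(two_minutes, 0.02)  # ~2.4 s
slowest = processing_time(two_minutes, 0.05)  # ~6.0 s
print(f"{fastest:.1f}-{slowest:.1f} s")  # matches the ~2-6 second figure above
```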
- `rp_handler.py` - Main transcription handler
- `requirements.txt` - Minimal dependencies
- `Dockerfile` - Container setup
- `test_input.json` - Test payload
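RunPod serverless wraps all job parameters in a top-level `"input"` object, so a `test_input.json` along these lines could be generated as follows (the `"audio"` field name and URL are placeholder assumptions; check `rp_handler.py` for the exact input schema):

```python
import json

# RunPod serverless delivers job parameters under an "input" key.
# The "audio" key and URL below are illustrative, not the confirmed schema.
payload = {"input": {"audio": "https://example.com/sample.wav"}}

# Write the payload so it can be sent with `curl -d @test_input.json`.
with open("test_input.json", "w") as f:
    json.dump(payload, f, indent=2)

print(json.dumps(payload))
```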