AI-powered tool to turn long videos into short, viral-ready clips. Combines transcription, speaker diarization, scene detection & 9:16 resizing — perfect for creators & smart automation.
Updated Apr 2, 2025 - Python
Speech-to-text GUI for various models (mostly Whisper, also Voxtral) and backends, including whisper.cpp, mlx-whisper, faster-whisper, and ctranslate2; uses pyannote for diarization.
A Tool to Transcribe Audio Files with Speaker Diarization
🎥 Transform long videos into short, shareable clips effortlessly using AI-driven tools for creators and educators.
Multimedia context generation tool built from off-the-shelf components. Uses several local ML/AI tools for transcription, context clues, and LLM-driven tasks. Designed with extensibility in mind. Also a dataset preparation tool: adds context to video and audio inputs.
An intelligent Streamlit application to transcribe and analyze multi-speaker medical consultations. This tool automatically identifies who spoke when (diarization), transcribes their speech (ASR), and assigns their role (Clinician or Patient), even in conversations that mix English and other languages like Hindi or Tamil.
WebSocket-based Python implementation that streams live audio to the Deepgram API for real-time transcription and speaker diarization.
This repository experiments with integrating @ggml-org/whisper.cpp for offline STT with pyannote/speaker-diarization-3.1.
A simple protocol manager for your audio recordings
AI-powered meeting transcription tool with speaker diarization using Whisper and pyannote-audio.
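Several of the tools listed above pair a transcript (ASR) with "who spoke when" output (diarization). A minimal sketch of that merging step follows, assuming both stages have already produced timestamped results. The `Turn` and `Segment` classes and the `assign_speakers` helper are hypothetical names for illustration; they are not taken from pyannote-audio, Whisper, or any listed repository, and the maximum-overlap rule is one common heuristic, not necessarily what these projects use.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """Hypothetical diarization result: one speaker active from start to end (seconds)."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """Hypothetical ASR result: a piece of text spoken from start to end (seconds)."""
    start: float
    end: float
    text: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Label each transcript segment with the speaker whose turns overlap it most."""
    labeled = []
    for seg in segments:
        totals = {}  # speaker -> total overlapping seconds with this segment
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # Fall back to a placeholder label when no turn overlaps the segment.
        speaker = max(totals, key=totals.get) if totals else unknown
        labeled.append((speaker, seg.text))
    return labeled


# Illustrative, made-up timestamps.
turns = [Turn(0.0, 4.0, "SPEAKER_00"), Turn(4.0, 9.0, "SPEAKER_01")]
segments = [
    Segment(0.5, 3.5, "Hello"),
    Segment(3.0, 8.0, "Hi there"),
    Segment(10.0, 11.0, "Bye"),
]
for speaker, text in assign_speakers(segments, turns):
    print(f"{speaker}: {text}")
```

Summing overlap per speaker, rather than taking the single longest turn, keeps the label stable when a diarizer splits one speaker's speech into several short turns.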