pyannote-audio

Here are 10 public repositories matching this topic...

alperensumeroglu / ai-clips-maker

AI-powered tool to turn long videos into short, viral-ready clips. Combines transcription, speaker diarization, scene detection & 9:16 resizing — perfect for creators & smart automation.

Updated Apr 2, 2025
Python

CrispStrobe / Susurrus

Speech-to-text GUI for various models (mostly Whisper, also Voxtral) and backends, including whisper.cpp, mlx-whisper, faster-whisper, and ctranslate2; applies pyannote for diarization.

speech-to-text stt whisper pyannote-audio diarization pyannote whisper-cpp whisper-ai whispercpp ctranslate2 voxtral

Updated Nov 1, 2025
Python

Global-Health-Engineering / ghe_transcribe

A Tool to Transcribe Audio Files with Speaker Diarization

multilingual speech-to-text transcription speaker-recognition pyannote-audio diarization faster-whisper

Updated Nov 12, 2025
Python

Nidurshan / ai-clips-maker

🎥 Transform long videos into short, shareable clips effortlessly using AI-driven tools for creators and educators.

audio-analysis automatic-speech-recognition face-tracking speaker-diarization media-processing pyannote-audio temporal-segmentation ml-pipeline ffmpeg-python deep-learning-pipelines video-scene-detection video-transcription openai-whisper huggingface-pipelines multimodal-ai video-resizing ai-video-summarization video-clip-generation

Updated Dec 3, 2025
Python

Multimedia context generation tool built from off-the-shelf components. Uses several local ML/AI tools for transcription, context clues, and LLM-driven tasks. Designed with extensibility in mind. Also serves as a dataset preparation tool; adds context to video and audio inputs.

ffmpeg cuda pytorch stt whisper speech-processing audio-processing audio-processing-with-python pyannote-audio parakeet dataset-generator accessibility-tools audio-context lmstudio

Updated Aug 7, 2025
Python

d-kavinraja / Multilingual-Speaker-Diarization-Role-Labeling

An intelligent Streamlit application to transcribe and analyze multi-speaker medical consultations. This tool automatically identifies who spoke when (diarization), transcribes their speech (ASR), and assigns their role (Clinician or Patient), even in conversations that mix English and other languages like Hindi or Tamil.

pytorch whisper nlp-machine-learning langdetect pyannote-audio asr-model streamlit-webapp