This project provides tools for downloading YouTube videos, performing speaker diarization, and transcribing the audio. It leverages state-of-the-art models for accurate speaker identification and transcription.
- Download audio from YouTube videos
- Perform speaker diarization to identify different speakers in the audio
- Transcribe the audio to text
- Save detailed protocols of speaker segments and their transcriptions
For transcription, we use the Whisper turbo model from OpenAI. Whisper is a general-purpose speech recognition model that is trained on a large dataset of diverse audio.
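Combining Whisper's transcription with diarization output requires matching each transcribed segment to a speaker. A minimal sketch of one way to do this, assuming Whisper's `transcribe()` segment format (`start`, `end`, `text` keys); the maximum-overlap heuristic and function names are our own illustration, not the project's actual implementation:

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals, in seconds."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def assign_speakers(segments, turns):
    """Label each Whisper segment with the diarization speaker it overlaps most.

    segments: dicts with "start", "end", "text" (Whisper's transcribe() output).
    turns:    (start, end, speaker) tuples from diarization.
    """
    labeled = []
    for seg in segments:
        best = max(
            turns,
            key=lambda t: overlap(seg["start"], seg["end"], t[0], t[1]),
            default=None,
        )
        labeled.append({**seg, "speaker": best[2] if best else "UNKNOWN"})
    return labeled

# The transcription itself would look roughly like this (requires the
# openai-whisper package; not run here):
# import whisper
# model = whisper.load_model("turbo")
# segments = model.transcribe("audio.wav")["segments"]
```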
For downloading audio from YouTube, we use the pytubefix library.
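A sketch of what the download step might look like with pytubefix. The `video_id` helper is our own addition for illustration; `download_audio` assumes pytubefix's `YouTube`/`get_audio_only` API:

```python
from urllib.parse import urlparse, parse_qs

def video_id(url):
    """Extract the YouTube video ID from a watch or youtu.be URL (our own helper)."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/")
    return parse_qs(parsed.query).get("v", [None])[0]

def download_audio(url, output_folder):
    """Download the audio-only stream with pytubefix; returns the file path."""
    from pytubefix import YouTube  # imported lazily so the helper above works standalone
    yt = YouTube(url)
    stream = yt.streams.get_audio_only()
    return stream.download(output_path=output_folder)
```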
For speaker diarization, we use the pyannote.audio library. pyannote.audio is a toolkit for speaker diarization that provides pre-trained models for speaker identification and segmentation.
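A sketch of running a pretrained pyannote.audio pipeline, assuming the 3.x `Pipeline` API and the `pyannote/speaker-diarization-3.1` model; the JSON-friendly output shape in `turns_to_json` is our own choice, not necessarily the format this project writes:

```python
import os

def diarize(wav_path):
    """Run a pretrained pyannote pipeline and return plain (start, end, speaker) tuples."""
    from pyannote.audio import Pipeline  # lazy import; requires a Hugging Face token
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1",
        use_auth_token=os.environ["HUGGING_FACE_TOKEN"],
    )
    diarization = pipeline(wav_path)
    return [(turn.start, turn.end, speaker)
            for turn, _, speaker in diarization.itertracks(yield_label=True)]

def turns_to_json(turns):
    """Serialize diarization turns into JSON-friendly dicts (shape is illustrative)."""
    return [{"start": round(s, 3), "end": round(e, 3), "speaker": spk}
            for s, e, spk in turns]
```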
To install the required dependencies, run:
```
uv sync
```

You need to set the `HUGGING_FACE_TOKEN` environment variable. You can do this by adding the following line to your shell configuration file (e.g., `.bashrc`, `.zshrc`):
```
export HUGGING_FACE_TOKEN=<your_hugging_face_token>
```

To use Celery, you need to have Redis installed and running. You can install Redis using Homebrew:
```
brew install redis
```

Start the Redis server:
```
brew services start redis
```

To monitor Celery tasks, you can use Flower. Install Flower using uv:
```
uv add flower
```

To monitor Celery tasks and run Celery workers, follow these steps:
- Start the Flower server:

  ```
  uv run celery -A diarization.celery_task flower --port=5555
  ```

- Access the Flower dashboard by navigating to http://localhost:5555 in your web browser.

- Start the Celery worker:

  ```
  uv run celery -A diarization.celery_task worker --loglevel=info -P threads
  ```
To convert the audio to WAV format, you need to have ffmpeg installed. You can install ffmpeg using Homebrew:
```
brew install ffmpeg
```

To download and process a single YouTube video, run:
```
uv run diarization.py <YouTube_URL> <output_folder>
```

To process multiple YouTube URLs from a text file (one link per line), run:
```
uv run diarization.py --file <file_with_urls> <output_folder>
```

To run Celery tasks for downloading and transcribing, start the Celery worker:
```
uv run celery -A diarization.celery_task worker --loglevel=info -P threads
```

Then, you can call the task from your code:
```python
from diarization.celery_task import download_and_transcribe

result = download_and_transcribe.delay("<YouTube_URL>", "<output_folder>")
print(result.get())
```

To run the Streamlit app for a user-friendly interface, use the following command:
```
uv run streamlit run streamlit_app.py
```

The output will include:
- The downloaded audio file in WAV format
- A JSON file with diarization results
- A text file with detailed protocols of speaker segments and their transcriptions
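The protocol text file could be produced from speaker-labeled segments along these lines. The `[MM:SS - MM:SS] SPEAKER: text` line format is our own illustration; the project's actual protocol layout may differ:

```python
def format_protocol(labeled_segments):
    """Render speaker-labeled segments as '[MM:SS - MM:SS] SPEAKER: text' lines."""
    def mmss(seconds):
        minutes, secs = divmod(int(seconds), 60)
        return f"{minutes:02d}:{secs:02d}"

    return "\n".join(
        f"[{mmss(seg['start'])} - {mmss(seg['end'])}] {seg['speaker']}: {seg['text'].strip()}"
        for seg in labeled_segments
    )
```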
For example:

```
uv run diarization.py https://www.youtube.com/watch?v=example <output_folder>
```

The diarization results can be used for various applications, including:
- Training large language models (LLMs) with speaker-specific data
- Implementing Retrieval-Augmented Generation (RAG) applications for more accurate and context-aware responses
This project is licensed under the MIT License.