0417keito (0417itsuki) / Starred

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,997 3,389 Updated Mar 27, 2026

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Python 11,378 1,675 Updated Mar 27, 2026

rany2 / edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Python 10,398 984 Updated Mar 22, 2026

AIGC-Audio / AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Python 10,211 862 Updated Jul 6, 2024

espnet / espnet

End-to-End Speech Processing Toolkit

Python 9,788 2,387 Updated Mar 27, 2026

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,726 798 Updated Mar 25, 2026

facebookresearch / ImageBind

ImageBind: One Embedding Space to Bind Them All

Python 9,003 845 Updated Nov 21, 2025

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,926 605 Updated May 3, 2024

fishaudio / Bert-VITS2

vits2 backbone with multilingual-bert

Python 8,715 1,271 Updated Mar 23, 2026

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 8,455 772 Updated May 31, 2024

openai / jukebox

Code for the paper "Jukebox: A Generative Model for Music"

Python 8,043 1,459 Updated Jun 19, 2024

Plachtaa / VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,955 778 Updated Feb 11, 2024

jaywalnut310 / vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python 7,842 1,387 Updated Dec 6, 2023

openai / guided-diffusion

Python 7,324 900 Updated Jul 2, 2024

mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,206 394 Updated Jul 11, 2024

kohya-ss / sd-scripts

Python 6,961 1,167 Updated Mar 22, 2026

codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation

Python 6,521 1,326 Updated Sep 15, 2023

openai / consistency_models

Official repo for consistency models.

Python 6,475 433 Updated Mar 22, 2024

Lightning-AI / lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Python 6,083 521 Updated Jul 1, 2025

OpenGVLab / LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Python 5,933 381 Updated Mar 14, 2024

spotipy-dev / spotipy

A lightweight Python library for the Spotify Web API

Python 5,403 975 Updated Mar 11, 2026

microsoft / muzic

Muzic: Music Understanding and Generation with Artificial Intelligence

Python 4,901 496 Updated Oct 12, 2024

MoonInTheRiver / DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,752 798 Updated Mar 19, 2025

luosiallen / latent-consistency-model

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Python 4,612 234 Updated Jun 14, 2024
