203 stars written in Python

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 55,679 9,492 Updated Nov 12, 2025

TensorFlow code and pre-trained models for BERT

Python 39,944 9,708 Updated Jul 23, 2024

Let us control diffusion models!

Python 33,777 3,005 Updated Feb 25, 2024

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 33,185 6,876 Updated Mar 27, 2026

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 30,258 4,003 Updated Jul 17, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,618 2,747 Updated Aug 12, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,997 3,389 Updated Mar 27, 2026

A PyTorch-based Speech Toolkit

Python 11,378 1,675 Updated Mar 27, 2026

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Python 10,398 984 Updated Mar 22, 2026

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Python 10,211 862 Updated Jul 6, 2024

End-to-End Speech Processing Toolkit

Python 9,788 2,387 Updated Mar 27, 2026

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,726 798 Updated Mar 25, 2026

ImageBind: One Embedding Space to Bind Them All

Python 9,003 845 Updated Nov 21, 2025

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,926 605 Updated May 3, 2024

vits2 backbone with multilingual-bert

Python 8,715 1,271 Updated Mar 23, 2026

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 8,455 772 Updated May 31, 2024

Code for the paper "Jukebox: A Generative Model for Music"

Python 8,043 1,459 Updated Jun 19, 2024

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/

Python 7,955 778 Updated Feb 11, 2024

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python 7,842 1,387 Updated Dec 6, 2023

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,206 394 Updated Jul 11, 2024
Python 6,961 1,167 Updated Mar 22, 2026

Google AI 2018 BERT pytorch implementation

Python 6,521 1,326 Updated Sep 15, 2023

Official repo for consistency models.

Python 6,475 433 Updated Mar 22, 2024

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Python 6,083 521 Updated Jul 1, 2025

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Python 5,933 381 Updated Mar 14, 2024

A lightweight Python library for the Spotify Web API

Python 5,403 975 Updated Mar 11, 2026

Muzic: Music Understanding and Generation with Artificial Intelligence

Python 4,901 496 Updated Oct 12, 2024

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,752 798 Updated Mar 19, 2025

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Python 4,612 234 Updated Jun 14, 2024