Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python 10,060 1,271 Updated Mar 17, 2026

HY-Motion model for 3D human motion or 3D character animation generation.

Python 2,222 178 Updated Jan 29, 2026

Concise, consistent, and legible badges in SVG and raster format

JavaScript 26,330 5,588 Updated Mar 27, 2026

Native and Compact Structured Latents for 3D Generation

Python 4,588 519 Updated Jan 10, 2026

Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.

Python 975 86 Updated Feb 25, 2026

Open-Source Frontier Voice AI

Python 25,316 2,759 Updated Mar 28, 2026

Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

Jupyter Notebook 2,083 122 Updated Dec 3, 2025
Python 10,766 720 Updated Feb 9, 2026

https://hrl.boyuai.com/

Jupyter Notebook 4,625 802 Updated Nov 22, 2022

A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech.

Python 888 62 Updated Mar 16, 2026

Contexts Optical Compression

Python 22,764 2,092 Updated Jan 27, 2026

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 73,239 10,044 Updated Mar 26, 2026

ComfyUI Plugin of Nunchaku

Python 2,821 153 Updated Feb 19, 2026

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 57,444 4,755 Updated Mar 28, 2026

Multilingual Document Layout Parsing in a Single Vision-Language Model

Python 8,131 726 Updated Mar 24, 2026

💫 Toolkit to help you get started with Spec-Driven Development

Python 83,149 7,117 Updated Mar 27, 2026

[CVPR 2026] 🔥🔥 Official Repo of USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

Python 1,216 76 Updated Sep 12, 2025

Using system APIs directly with adb/root privileges from normal apps through a Java process started with app_process.

Kotlin 23,398 2,184 Updated Jun 18, 2025

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 9,948 789 Updated Mar 11, 2026

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,952 2,064 Updated Mar 27, 2026

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 7,639 465 Updated Feb 10, 2026

Text-audio foundation model from Boson AI

Python 7,996 615 Updated Jan 18, 2026

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 89,415 13,648 Updated Mar 26, 2026

Copilot Chat extension for VS Code

TypeScript 9,711 1,779 Updated Mar 28, 2026

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 99,348 12,694 Updated Mar 28, 2026

The official code repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment

Python 1,526 184 Updated Mar 12, 2026

Anthropic's educational courses

Jupyter Notebook 19,984 1,988 Updated Nov 13, 2025
Jupyter Notebook 1,345 163 Updated Mar 24, 2026

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 19,649 2,419 Updated Mar 16, 2026