Stars
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
HY-Motion model for 3D human motion or 3D character animation generation.
Concise, consistent, and legible badges in SVG and raster format
Native and Compact Structured Latents for 3D Generation
Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Transforms complex documents like PDFs into LLM-ready Markdown/JSON for your agentic workflows.
Multilingual Document Layout Parsing in a Single Vision-Language Model
💫 Toolkit to help you get started with Spec-Driven Development
[CVPR 2026] 🔥🔥 Official Repo of USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
Using system APIs directly with adb/root privileges from normal apps through a Java process started with app_process.
Reference PyTorch implementation and models for DINOv3
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Text-audio foundation model from Boson AI
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Copilot Chat extension for VS Code
An open-source AI agent that brings the power of Gemini directly into your terminal.
The official code repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment
Anthropic's educational courses
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System


