Starred repositories
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
[CVPR 2026] DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution
LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer, ICLR 2026
Krea Realtime 14B. An open-source realtime AI video model.
The most powerful local music generation model that outperforms most commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.
Official implementation of "OmniForcing: Unleashing Real-time Joint Audio-Visual Generation" [arXiv:2603.11647]. OmniForcing is the first framework to distill bidirectional audio-visual diffusion mo…
WiLoR: End-to-end 3D hand localization and reconstruction in-the-wild
Effortless data labeling with AI support from Segment Anything and other awesome models.
[ICLR 2026] This is the official PyTorch implementation of "QVGen: Pushing the Limit of Quantized Video Generative Models".
Open Multi-Agent Interactive Classroom — Get an immersive, multi-agent learning experience in just one click
Helios: Real Real-Time Long Video Generation Model
Run agents that work for you in the background based on what you do.
AI agents running research on single-GPU nanochat training automatically
Official repository for “PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss”
FiDeSR: High-Fidelity and Detail-Preserving One-Step Diffusion Super-Resolution
🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
The first industrial-grade, full-pipeline AI film and video production platform. Industry-first professional AI Agent platform for controllable film & video production. From shorts to live-action with Hollywood-standard workflows.
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.
Unified automatic quality assessment for speech, music, and sound.
An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of…
Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024)
