Stars
Python implementation of "A fully-automatic, temporal approach to single camera, glint-free 3D eye model fitting"
A lightweight and robust Python eye tracker
An elegant PyTorch deep reinforcement learning library.
🎬 Huobao Drama - An AI-Powered End-to-End Short Drama Generator "One Sentence to Complete Drama: Fully Automated from Script to Final Video"
[DEIMv2] Real Time Object Detection Meets DINOv3
[CVPR 2024] Code release for TransNeXt model
Code for the paper 'DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark'.
This project aims to detect video deepfakes using deep learning techniques like ResNeXt and LSTM. We have achieved deepfake detection by using transfer learning where the pretrained ResNeXt…
Implementation of my RAG system that won all categories in Enterprise RAG Challenge 2
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
[CVPR 2025 Best Paper Nomination] FoundationStereo: Zero-Shot Stereo Matching
Wan: Open and Advanced Large-Scale Video Generative Models
[CVPR2025 Highlight] Video Generation Foundation Models: https://saiyan-world.github.io/goku/
A curated list of articles and codes related to face forgery generation and detection.
PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
[CSUR 2026] A Survey on Deepfake Generation and Detection
Official repository for the next-generation deepfake detection dataset (DF40), comprising 40 distinct deepfake techniques, including recently released state-of-the-art methods. Our work has been accepted by NeurIPS 2024.
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
[CVPR2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference ima…
Video Copy Segment Localization (VCSL) dataset and benchmark [CVPR2022]