Stars
CLI-Anything: Making ALL Software Agent-Native
A framework for efficient model inference with omni-modality models
VideoSys: An easy and efficient system for video generation
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)
Janus-Series: Unified Multimodal Understanding and Generation Models
An XAI Framework to provide Contrastive Whole-output Explanation for Image Classification.
Implementation of Consistency Regularization for Domain Generalization with Logit Attribution Matching
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
A curated list of papers, code and resources pertaining to image composition/compositing or object insertion/addition/compositing, which aims to generate realistic composite images.
This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.
A collection of ECCV 2024 papers and open-source projects. Everyone is welcome to submit an issue to share ECCV 2024 papers and open-source projects.
A collection of diffusion model papers categorized by their subareas
A curated list of recent diffusion models for video generation, editing, and various other applications.
one for all, Optimal generator with No Exception
A toolbox of OCR models and algorithms based on MindSpore
Transfer learning / domain adaptation / domain generalization / multi-task learning, etc.: papers, code, datasets, applications, tutorials.
A toolbox of vision models and algorithms based on MindSpore
Learning Convolutional Neural Networks with Interactive Visualization.
A PyTorch reimplementation of the official EfficientDet with SOTA real-time performance and pretrained weights.


