A frontier collection and survey of vision-language model papers and model GitHub repositories. Continuously updated.
- Updated Mar 25, 2026
A repository for organizing papers, code, and other resources related to virtual try-on models
[IEEE TII 2025] Official Implementation for "Dual-Detector Reoptimization for Federated Weakly Supervised Video Anomaly Detection via Adaptive Dynamic Recursive Mapping"
Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)
[DEPRECATED] Very fast, large music transformer with 8k sequence length, efficient heptabit MIDI notes encoding, true full MIDI instruments range, chords counters and outro tokens
A list of researchers who published code, models (in some cases), and demo apps (in a few cases) along with their SOTA papers
[IEEE JSTARS 2026] Mamba-FCS: Joint Spatio-Frequency Feature Fusion, Change-Guided Attention, and SeK Inspired Loss for Enhanced Semantic Change Detection in Remote Sensing
[SOTA] MIDI Tempo Detection AI implementation and model (94% accuracy on any MIDI)
figsr — a frequency-domain (FFT-based) SISR architecture. Enhances detail reconstruction and inference speed, combining the strengths of CNNs and Transformers while mitigating their core limitations.
[DEPRECATED] [339M] [88% acc] Fast full-featured drums inpainting transformer with octo-velocity
SOTA pure drums transformer which is capable of drums track generation for any source composition
This repository includes solutions and tutorials for multiple deep learning and machine learning competitions
B.Sc. thesis: deep learning and NLP research on medical image captioning
SOTA quality fast music transformer with symmetrical quad MIDI notes encoding
Investigation of the capabilities of foundation models in the context of time series forecasting
A multi-agent real-time local discovery system with intent parsing, live place retrieval, review synthesis, transit-aware ranking, explainable recommendations, and SSE progress streaming.
Contributions to ML tasks in the form of tools, videos, notebooks, apps, and APIs
Models and examples built with TensorFlow.
Implementation of the MCNN-14 model for fashion image classification, achieving 93.08% accuracy on Fashion-MNIST. Based on our paper “An Efficient Multiple Convolutional Neural Network Model (MCNN-14) for Fashion Image Classification.”
Papers and a survey of the literature on the semantic similarity task