Starred repositories
Robust Speech Recognition via Large-Scale Weak Supervision
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
Unified web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Official inference framework for 1-bit LLMs
Official inference repo for FLUX.1 models
Code for the paper "Language Models are Unsupervised Multitask Learners"
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Resume builder for academics and engineers
LLM Council works together to answer your hardest questions
Official implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
End-to-End Object Detection with Transformers
Wan: Open and Advanced Large-Scale Video Generative Models
An open source implementation of CLIP.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Minimal reproduction of DeepSeek R1-Zero
Access large language models from the command-line
🐍 Geometric Computer Vision Library for Spatial AI
Refine high-quality datasets and visual AI models
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
[ICLR 2026] RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO, designed for fine-tuning.
This tool has been deprecated. Use Agentic Document Extraction instead.
🐢 Open-Source Evaluation & Testing library for LLM Agents
A PyTorch native platform for training generative AI models