- University of Science and Technology of China
- Hefei, Anhui, P.R.China
- UTC +08:00
- https://lethe.site
Stars
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Reinforcement Learning via Self-Distillation (SDPO)
Machine Learning Engineering Open Book
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
Open Overleaf/ShareLaTeX projects in VS Code, with full collaboration support.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
RLLaVA is a user-friendly framework for multi-modal RL research and optimized for resource-constrained teams.
A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems. Use when building, optimizing, or debugging agent systems that require e…
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
My learning notes for ML SYS.
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
Unofficial implementation of Tiny Recursive Model (TRM), improvement to HRM from Sapient AI, by Alexia Jolicoeur-Martineau
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Evergreen, contamination-free, real-world, domain-specific AI evaluation framework
Awesome things towards foundation agents. Papers / Repos / Blogs / ...
A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithms with minimal intrusion.
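AIInfra (AI infrastructure) refers to the AI system stack, from underlying hardware such as chips up to the software stack, that supports training and inference of large AI models.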
Implementation of FP8/INT8 rollout for RL training without performance drop.
GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers

