The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
Updated Mar 10, 2026 - TypeScript
This is the official website for TuriX Computer-use-Agent
MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
An open-source, end-to-end, VLM-based GUI agent
RLAnything & DemyAgent: General and scalable agentic RL algorithms across terminal, GUI, SWE, and tool-call settings
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
Official implementation of GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents
Official implementation of "SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience"
[AAAI 2026] Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning"
[AAAI 2026 Oral] Official repository for InfiGUI-G1. We introduce Adaptive Exploration Policy Optimization (AEPO) to overcome semantic alignment bottlenecks in GUI agents through efficient, guided exploration.
Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.
[NeurIPS 2025] Official repository of RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents
Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including Windows, Linux, macOS, iOS, Android, and Web.
Official repository of the paper "Generalist Virtual Agents: A Survey on Autonomous Agents Across Digital Platforms"
DART-GUI: Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation
[AAAI 2026] Release of code, datasets, and model for our work, TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for Generalized GUI Agents
The official code for "GUI-ReWalk: Massive Data Generation for GUI Agent via Stochastic Exploration and Intent-Aware Reasoning"
🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games"