Awesome-Remote-Sensing-Agents

Awesome License: CC BY-NC 4.0 PRs Welcome Papers 100+

🛰️ A curated collection of 100+ papers on Intelligent Remote Sensing Agents 🚀

Evolution of Remote Sensing Intelligence

Important

We welcome community contributions to keep this list up-to-date!

  • 📝 Add missing papers via Pull Request
  • 🏷️ Propose new or refined categories
  • 🔗 Report broken links or outdated entries
  • 💬 Reach out via Contact for any discussion

If you find this survey or repository useful in your research, please cite our paper:

@article{tang2025intelligent,
  title={Intelligent Remote Sensing Agents: A Survey},
  author={Tang, Jiaqi and Yan, Yingying and Wang, Qianzhou and Xia, Yuyang and Geng, Botong and Chen, Jianmin and Ma, Ke and Zhai, Youyang and He, Qingfeng and Shao, Weigeng and Sun, Yunjin and Dai, Junwei and Chen, Chuxi and Xu, Xiaogang and Yao, Kelu and Zhang, Lei and Wei, Wei and Chen, Qifeng and Plaza, Antonio and Zhang, Yanning},
  year={2026},
  url={https://github.com/PolyX-Research/Awesome-Remote-Sensing-Agents}
}

🔥 News

📚 Contents

| Badge | Meaning |
| --- | --- |
| arXiv | Preprint on arXiv |
| Published | Published at a conference or journal |
| GitHub | Code repository available |
| Application | Application domain |
| Category | Agent design category (planning, memory, tool use, etc.) |

Papers

Ecological Monitoring

| Title | Status | Links |
| --- | --- | --- |
| REMSA: An LLM Agent for Foundation Model Selection in Remote Sensing | arXiv | Paper, GitHub |
| ForestGPT and Beyond: A Trustworthy Domain-Specific Large Language Model Paving the Way to Forestry 5.0 | Published | Paper |
| GANDALF: A LLM-based Approach to Map Bark Beetle Outbreaks in Semantic Stories of Sentinel-2 Images | Published | Paper |
| CLEAR: Climate Policy Retrieval and Summarization Using LLMs | Published | Paper |
| DA4DTE: An Agentic System for Enhancing the Accessibility of Digital Twins of Earth | Published | Paper |
| EarthLink: A Self-Evolving AI Agent for Climate Science | arXiv | Paper |
| A Self-Evolving AI Agent System for Climate Science | arXiv | Paper |
| Towards LLM Agents for Earth Observation | arXiv | Paper, GitHub, Model |
| Accelerating Earth Science Discovery via Multi-Agent LLM Systems | arXiv | Paper |
| GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing | arXiv | Paper |
| CangLing-KnowFlow: A Unified Knowledge-and-Flow-fused Agent for Remote Sensing Applications | arXiv | Paper |
| Google Earth AI and Gemini for Climate and Environmental Analysis | Published | Paper |
| REO-VLM: Transforming VLM to Meet Regression Challenges in Earth Observation | arXiv | |
| H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model | arXiv | Paper, GitHub |
| LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model | Published | Paper, GitHub |
| RS-LLaVA: A Large Vision-Language Model for Joint Captioning and Question Answering in Remote Sensing Imagery | Published | Paper, GitHub |
| EarthGPT: A Universal Multimodal Large Language Model for Multisensor Image Comprehension in Remote Sensing Domain | Published | Paper, GitHub |
| Transfer Learning in Environmental Remote Sensing | Published | Paper |
| TREE-GPT: Modular Large Language Model Expert System for Forest Remote Sensing Image Understanding and Interactive Analysis | Published | Paper |
| An Agent-Based Model to Represent Space-Time Propagation of Forest-Fire Smoke | Published | Paper |
| High-Resolution Mapping of Global Surface Water and Its Long-Term Changes | Published | |

Emergency Response

| Title | Status | Links |
| --- | --- | --- |
| FIRE-VLM: A Vision-Language-Driven Reinforcement Learning Framework for UAV Wildfire Tracking | arXiv | Paper |
| UAV-CodeAgents: Scalable UAV Mission Planning via Multi-Agent ReAct and Vision-Language Reasoning | arXiv | Paper |
| Geospatial Artificial Intelligence for Satellite-based Flood Extent Mapping | arXiv | Paper |
| A RAG-Based Multi-Agent LLM System for Natural Hazard Resilience and Adaptation | arXiv | Paper, GitHub |
| Earth AI: Unlocking Geospatial Insights with Foundation Models and Cross-Modal Reasoning | arXiv | Paper, GitHub |
| Empowering LLM Agents with Geospatial Awareness: Toward Grounded Reasoning for Wildfire Response | arXiv | Paper |
| A Conceptual High Level Multiagent System for Wildfire Management | Published | Paper |
| RescueADI: Adaptive Disaster Interpretation in Remote Sensing Images With Autonomous Agents | Published | Paper |
| LLM-Enhanced Disaster Geolocalization Using Implicit Geoinformation from Multimodal Data: A Case Study of Hurricane Harvey | Published | Paper |
| Knowledge-Guided Large Language Models for Enhancing Agent-Based Wildfire Spatial Simulation | Published | Paper, Dataset |
| Large-Language-Model-Driven Agents for Fire Evacuation Simulation in a Cellular Automata Environment | Published | Paper |
| From Perceptions to Decisions: Wildfire Evacuation Decision Prediction with Behavioral Theory-informed LLMs | Published | Paper |
| ESCAPE: Evacuation Simulation Using Cognitive Agent-Based Modeling on Possible Earthquake in GAMA Platform for the Case of Kalayaan Residence Hall | Published | Paper |
| Description of Wildfires Spreading and Extinguishing with the Aid of Agent-Based Models | Published | Paper |

Geological Exploration

| Title | Status | Links |
| --- | --- | --- |
| PEACE: Empowering Geologic Map Holistic Understanding with MLLMs | Published | Paper, GitHub |
| STA-CoT: Structured Target-Centric Agentic Chain-of-Thought for Consistent Multi-Image Geological Reasoning | Published | Paper, Dataset |
| Automating Geospatial Vision Tasks with a Large Language Model Agent | Published | Paper, GitHub |
| A Vision-Language Foundation Model-Based Multi-Modal Retrieval-Augmented Generation Framework for Remote Sensing Lithological Recognition | Published | Paper |
| HI-MAFE: Hyperspectral Image Multi-Agent Deep Reinforcement Learning Feature Extraction | Published | Paper |
| MineAgent: Towards Remote-Sensing Mineral Exploration with Multimodal Large Language Models | arXiv | Paper |

Marine Supervision

| Title | Status | Links |
| --- | --- | --- |
| OceanAI: A Conversational Platform for Accurate, Transparent, Near-Real-Time Oceanographic Insights | arXiv | Paper |
| Large Language Model-Based Decision-Making for COLREGs and the Control of Autonomous Surface Vehicles | Published | Paper |
| Autonomous Vehicle Maneuvering Using Vision–LLM Models for Marine Surface Vehicles | Published | Paper |
| OceanGPT: A Large Language Model for Ocean Science Tasks | Published | Paper |
| WaterGPT: Training a Large Language Model to Become a Hydrology Expert | Published | Paper |

Precision Agriculture

| Title | Status | Links |
| --- | --- | --- |
| AgriGPT: A Large Language Model Ecosystem for Agriculture | arXiv | Paper |
| ChatLeafDisease: A Chain-of-Thought Prompting Approach for Crop Disease Classification Using Large Language Models | Published | Paper |
| RS-MoE: A Vision–Language Model With Mixture of Experts for Remote Sensing Image Captioning and Visual Question Answering | Published | |
| Identifying Potential Rural Residential Areas for Land Consolidation Using a Data-Driven Agent-Based Model | Published | Paper |
| Planning to ‘Hear the Farmer’s Voice’: an Agent-Based Modelling Approach to Agricultural Land Use Planning | Published | Paper |
| A Framework for Data-Driven Agent-Based Modelling of Agricultural Land Use | Published | Paper |
| An Agent-Based Model to Simulate the Cultivation Pattern Change of Farmer Households in the North China Plain | Published | Paper |

Urban Governance

| Title | Status | Links |
| --- | --- | --- |
| MMUEChange: A generalized LLM agent framework for intelligent multi-modal urban environment change analysis | Published | Paper |
| AgentSense: LLMs Empower Generalizable and Explainable Web-Based Participatory Urban Sensing | arXiv | Paper |
| SoPerModel: Leveraging Social Perception for Multi-Agent Trajectory Prediction | Published | Paper |
| AirSpatialBot: A Spatially Aware Aerial Agent for Fine-Grained Vehicle Attribute Recognition and Retrieval | Published | Paper, GitHub |
| LLM Agent Framework for Intelligent Change Analysis in Urban Environment Using Remote Sensing Imagery | Published | Paper |
| UrbanLLaVA: A Multi-Modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding | Published | Paper, GitHub |
| Automating Traffic Model Enhancement with AI Research Agent | Published | Paper, GitHub |
| Roads: Robust Prompt-Driven Multi-Class Anomaly Detection Under Domain Shift | Published | |
| A Spatiotemporal–Semantic Coupling Intelligent Q&A Method for Land Use Approval Based on Knowledge Graphs and Intelligent Agents | Published | Paper |
| NavAgent: Multi-scale Urban Street View Fusion For UAV Embodied Vision-and-Language Navigation | arXiv | Paper |
| GenAI-Powered Multi-Agent Paradigm for Smart Urban Mobility: Opportunities and Challenges for Integrating Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) with Intelligent Transportation Systems | arXiv | Paper |
| AgentMove: Predicting Human Mobility Anywhere Using Large Language Model Based Agentic Framework | arXiv | Paper |
| Perceive, Reflect, and Plan: Designing LLM Agent for Goal-Directed City Navigation Without Instructions | arXiv | Paper |
| UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction | arXiv | Paper, GitHub |
| Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation | arXiv | Paper |
| Guide-LLM: An Embodied LLM Agent and Text-Based Topological Map for Robotic Guidance of People with Visual Impairments | arXiv | Paper |
| OpenCity: A Scalable Platform to Simulate Urban Activities with Massive LLM Agents | arXiv | Paper, GitHub |
| TopoSense: Agent-Driven Topological Graph Extraction from Remote Sensing Image | Published | Paper |
| GeoChat: Grounded Large Vision-Language Model for Remote Sensing | Published | Paper, GitHub |
| 3D Question Answering for City Scene Understanding | Published | Paper, GitHub, Dataset |
| VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View | Published | Paper, GitHub |
| EmbodiedCity: Embodied Aerial Agent for City-Level Visual Language Navigation Using Large Language Model | Published | Paper |
| AirVista: Empowering UAVs With 3D Spatial Reasoning Abilities Through A Multimodal Large Language Model Agent | Published | Paper |
| LLMLight: Large Language Models as Traffic Signal Control Agents | arXiv | Paper |
| GeoGPT: Understanding and Processing Geospatial Tasks through An Autonomous GPT | arXiv | Paper |
| Optimum landfill site selection by a hybrid multi-criteria and multi-Agent decision-making method in a temperate and humid climate: BWM-GIS-FAHP-GT | Published | Paper, GitHub |
| Urban Change Detection for Multispectral Earth Observation Using Convolutional Neural Networks | Published | |

Others

| Title | Status | Links |
| --- | --- | --- |
| VICoT-Agent: A Vision-Interleaved Chain-of-Thought Framework for Interpretable Multimodal Reasoning and Scalable Remote Sensing Analysis | arXiv | Paper |
| Designing Domain-Specific Agents via Hierarchical Task Abstraction Mechanism | arXiv | Paper, GitHub |
| GeoFlow: Agentic Workflow Automation for Geospatial Tasks | arXiv | Paper, GitHub, Dataset |
| RingMo-Agent: A Unified Remote Sensing Foundation Model for Multi-Platform and Multi-Modal Reasoning | arXiv | Paper |
| An energy-efficient learning solution for the Agile Earth Observation Satellite Scheduling Problem | arXiv | Paper |
| Multi-Agent Geospatial Copilots for Remote Sensing Workflows | arXiv | Paper |
| Co-LLaVA: Efficient Remote Sensing Visual Question Answering via Model Collaboration | Published | Paper, GitHub |
| GIS Copilot: Towards an Autonomous GIS Agent for Spatial Analysis | Published | Paper, GitHub, Dataset |
| Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework | Published | Paper |
| Chain-of-Programming (CoP): Empowering Large Language Models for Geospatial Code Generation Task | arXiv | Paper |
| RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent | arXiv | Paper, GitHub |
| Change-Agent: Towards Interactive Comprehensive Remote Sensing Change Interpretation and Analysis | Published | Paper, GitHub |
| Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models | Published | Paper, GitHub |
| GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots | Published | Paper |

Datasets & Benchmarks

Training and evaluating remote sensing agents requires resources that go beyond static image-label pairs. Agents must integrate visual perception with reasoning, planning, and tool execution. We organize datasets and benchmarks into three tiers: Perception, Reasoning, and Decision-Making.

Datasets

| Category | Name | Size | Description |
| --- | --- | --- | --- |
| Perception | UC Merced Land Use | 2.1K images | Land-use classification |
| Perception | AID | 10K images | Standard scene classification |
| Perception | xView | ~1M instances | Object detection in overhead imagery |
| Perception | DOTA | 188K instances | Oriented object detection |
| Perception | iSAID | 655K instances | Instance segmentation |
| Perception | xBD | 850K instances | Building damage assessment |
| Perception | EuroSAT | 27K images | Land use/land cover classification |
| Perception | LEVIR-CD | 31K instances | Building change detection |
| Perception | Topo-boundary | 25K images | Road topology extraction |
| Perception | STAR | 210K instances | Scene graph generation |
| Perception | SSGD | 3.1K instances | Spatial scene graph retrieval |
| Perception | GID | 150 images | Land-cover classification (GF-2) |
| Perception | FBP | 5B pixels | Country-scale semantic segmentation |
| Perception | WUSU | 68K instances | Urban semantic understanding |
| Perception | RealScene-ISTD | 739 images | Infrared UAV small-target detection |
| Reasoning | ETH/UCY & nuScenes | 9K+1.4M | Trajectory prediction |
| Reasoning | AirSpatial | 206K instructions | Embodied spatial reasoning |
| Reasoning | LEVIR-MCI | 10K pairs | Semantic change understanding |
| Reasoning | GeoChat | 318K instances | Multimodal instruction following |
| Reasoning | SkyEye-968k | 968K samples | Multi-task instruction tuning |
| Reasoning | RSICap | 2.6K pairs | High-precision vision-language alignment |
| Reasoning | UData | 353K instances | Cross-modal urban reasoning |
| Reasoning | EarthVQA | 208K QA pairs | Relational VQA |
| Reasoning | RS-VL3M | 3M pairs | Vision-language pretraining |
| Decision-Making | RS-Agent | 18 tasks | Expert-guided tool invocation |
| Decision-Making | RescueADI | 13.4K interactions | Adaptive disaster response |
| Decision-Making | AEOS-Bench | 16.4K scenarios | Constellation scheduling |

Benchmarks

| Category | Name | Feature | Scale |
| --- | --- | --- | --- |
| Perception | AgMTR | Few-shot segmentation | 5 test classes |
| Perception | TopoSense | Graph extraction | 2,685 images |
| Perception | TREE-GPT | Interactive forest RS | 3 tiles |
| Perception | STAR | Scene graph generation | 1,273 images |
| Perception | Univ-1652 | Geo-localization | 1,652 buildings |
| Perception | SSGD | Relationship retrieval | 3,130 samples |
| Perception | EarthView | Large-scale pretraining | 15 terapixels |
| Reasoning | AirSpatial-Bench | Spatial retrieval tasks | 1,773 pairs |
| Reasoning | RSVQA | Remote sensing QA | 77K–1M questions |
| Reasoning | XLRS-Bench | Ultra-high-res reasoning | 45,942 questions |
| Reasoning | RescueADI | Disaster interpretation | 998 tasks |
| Reasoning | RSICap | Image description | 936 QA pairs |
| Reasoning | UrbanLLaVA | Spatial reasoning | N/A |
| Reasoning | EarthVQA | Relational VQA | 1,809 images |
| Reasoning | City-3DQA | 3D city understanding | 61K pairs |
| Decision-Making | AEOS-Bench | Constellation scheduling | 16,410 scenarios |
| Decision-Making | ThinkGeo | Tool-augmented tasks | 486 tasks (1,773 steps) |
| Decision-Making | RoadMind | Disaster response | 3 cities |
| Decision-Making | RS-Agent | Intent disambiguation & tool use | 18 tasks |
| Decision-Making | GeoBenchX | Geospatial reasoning & execution | 202 tasks |
| Decision-Making | ShapefileGPT | Geospatial workflow orchestration | 42 tasks |
| Decision-Making | GIS Copilot | Agent-assisted GIS decisions | 110 tasks |

📄 License

The curated list and associated code in this repository are licensed under CC BY-NC 4.0. You are free to share and adapt the material for non-commercial purposes with appropriate attribution.

The survey paper (paper/) is All Rights Reserved โ€” copyright belongs to the authors. You may read and cite the paper, but redistribution or modification of the paper itself is not permitted without explicit written permission.

✉️ Contact

If you have any questions, suggestions, or would like to collaborate, feel free to reach out:

About

🚀🚀🚀 Official Repository of Intelligent Remote Sensing Agents: A Survey
