Build AI Agents That Understand, Simulate, and Act in the Real World
Quick Start • Features • Arena • Docs • Contributing • Discord
Spatial Agents is an open-source framework for building AI agents with true spatial intelligence: agents that can understand the physical world, simulate environments, and take actions in reality.
```
┌───────────────────────────────────────────────────────────────────┐
│                         🧠 SPATIAL AGENTS                         │
│                                                                   │
│   ┌──────────────┐     ┌──────────────┐     ┌──────────────┐      │
│   │  UNDERSTAND  │ ──▶ │   SIMULATE   │ ──▶ │     ACT      │      │
│   │              │     │              │     │              │      │
│   │  3D Scenes   │     │  Physics     │     │  Navigation  │      │
│   │  Objects     │     │  Worlds      │     │  Manipulation│      │
│   │  Spatial     │     │  Scenarios   │     │  Real-world  │      │
│   │  Relations   │     │  What-ifs    │     │  Execution   │      │
│   └──────────────┘     └──────────────┘     └──────────────┘      │
│                                                                   │
│   ┌───────────────────────────────────────────────────────────┐   │
│   │                    🌍 QAI EARTH ENGINE                    │   │
│   │   Real-world geospatial data • Physics • 3D environments  │   │
│   └───────────────────────────────────────────────────────────┘   │
└───────────────────────────────────────────────────────────────────┘
```
Current AI is blind to the physical world. Spatial Agents gives AI the ability to:
- **See**: understand 3D scenes, object relationships, and spatial layouts
- **Think**: reason about physics, predict outcomes, plan paths
- **Simulate**: test actions in virtual environments before execution
- **Act**: navigate, manipulate, and operate in the real world
- **Understand**: perceive and comprehend spatial environments through 3D scene understanding, object recognition, spatial relationship extraction, and depth perception.
- **Simulate**: run physics-accurate simulations, test hypothetical scenarios, predict outcomes, and train agents in virtual worlds before real deployment.
- **Act**: execute real-world actions, including navigation, object manipulation, robotic control, and autonomous decision-making in physical environments.
- **Arena**: benchmark and compare AI models on spatial reasoning challenges across navigation, physics, and 3D understanding.
- **Any model**: plug in any LLM/VLM, whether OpenAI, Anthropic, Google, DeepSeek, Qwen, or your own custom model.
- **QAI Earth Engine**: connect to real-world geospatial data, maps, terrain, and environmental information.
```bash
pip install spatial-agents
```

```python
from spatial_agents import SpatialAgent, Environment
from spatial_agents.capabilities import Vision, Navigation, Manipulation

# Create an agent with spatial capabilities
agent = SpatialAgent(
    model="claude-sonnet-4-20250514",
    capabilities=[Vision, Navigation, Manipulation],
)

# Connect to an environment
env = Environment.from_location("san_francisco", radius_km=5)

# The agent understands, simulates, and acts
observation = agent.perceive(env)
plan = agent.reason("Navigate to the nearest coffee shop avoiding traffic")
result = agent.execute(plan, simulate_first=True)
```

```python
from spatial_agents import SpatialAgent

agent = SpatialAgent(model="gpt-4o")

# Understand a 3D scene from images
scene = agent.understand(
    images=["room_view_1.jpg", "room_view_2.jpg"],
    query="Where is the laptop relative to the window?",
)

print(scene.objects)        # Detected objects with 3D positions
print(scene.relationships)  # Spatial relationships between objects
print(scene.answer)         # "The laptop is 2m left of the window, on the desk"
```

```python
from spatial_agents import SpatialAgent, Simulation

agent = SpatialAgent(model="claude-opus-4-5-20250514")

# Plan a complex action
action = agent.plan("Move the robotic arm to pick up the red cube")

# Simulate first to verify safety
sim = Simulation(physics=True)
outcome = sim.run(action, steps=100)

if outcome.success and outcome.collision_free:
    agent.execute(action)  # Safe to execute in the real world
```

Test and compare spatial intelligence across different AI models.

```python
from spatial_agents import Arena, Agent

arena = Arena(challenge="urban_navigation")

results = arena.compete([
    Agent(model="claude-opus-4-5-20250514", name="Claude"),
    Agent(model="gpt-4o", name="GPT-4o"),
    Agent(model="gemini-ultra", name="Gemini"),
])

print(results.leaderboard())
```

| Rank | Model | Navigation | Object Reasoning | Physics | 3D Understanding | Overall |
|---|---|---|---|---|---|---|
| 🥇 | Claude Opus 4.5 | 94.2 | 91.8 | 89.5 | 92.1 | 91.9 |
| 🥈 | GPT-4o | 92.1 | 90.3 | 88.7 | 89.4 | 90.1 |
| 🥉 | Gemini Ultra | 89.8 | 88.9 | 90.2 | 87.6 | 89.1 |
| 4 | DeepSeek V3 | 87.2 | 86.5 | 85.8 | 84.9 | 86.1 |
| 5 | Qwen 2.5 | 85.4 | 84.2 | 83.1 | 82.8 | 83.9 |
```
┌───────────────────────────────────────────────────────────────┐
│                        SPATIAL AGENTS                         │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐  │
│  │   Perception    │ │   Simulation    │ │     Action      │  │
│  │                 │ │                 │ │                 │  │
│  │ • 3D Vision     │ │ • Physics       │ │ • Navigation    │  │
│  │ • Depth         │ │ • Scenarios     │ │ • Manipulation  │  │
│  │ • Object Det.   │ │ • Prediction    │ │ • Control       │  │
│  │ • Scene Graph   │ │ • Training      │ │ • Execution     │  │
│  └─────────────────┘ └─────────────────┘ └─────────────────┘  │
│                                                               │
│  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐  │
│  │     Models      │ │      Arena      │ │   Benchmarks    │  │
│  │                 │ │                 │ │                 │  │
│  │ • OpenAI        │ │ • Competitions  │ │ • SpatialIQ     │  │
│  │ • Anthropic     │ │ • Challenges    │ │ • NavScore      │  │
│  │ • Google        │ │ • Leaderboards  │ │ • PhysicsBench  │  │
│  │ • DeepSeek/Qwen │ │ • Evaluation    │ │ • 3D-Reason     │  │
│  └─────────────────┘ └─────────────────┘ └─────────────────┘  │
│                                                               │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │                    QAI EARTH ENGINE                     │  │
│  │  Real-world geospatial data • Physics • 3D environments │  │
│  │         Terrain • Buildings • Sensor simulation         │  │
│  └─────────────────────────────────────────────────────────┘  │
│                                                               │
└───────────────────────────────────────────────────────────────┘
```
**Perception**: 3D scene understanding, object detection, depth estimation, spatial relationship extraction, multi-view reconstruction.
**Navigation**: path planning, obstacle avoidance, SLAM, multi-destination routing, real-world wayfinding.
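To make the path-planning and obstacle-avoidance ideas concrete, here is a toy grid planner in plain Python. It is an illustrative sketch only, not the framework's planner: the grid encoding, 4-connectivity, and breadth-first search are all assumptions chosen for brevity.

```python
from collections import deque

def plan_path(grid, start, goal):
    """Shortest obstacle-free path on a 4-connected grid via BFS.

    grid[r][c] == 1 marks an obstacle; returns a list of (row, col)
    cells from start to goal, or None when the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:  # reconstruct the route by walking parents back
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and nxt not in came_from:
                came_from[nxt] = cell
                queue.append(nxt)
    return None  # fully blocked

# A wall across the middle row forces a detour along the right edge
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(plan_path(grid, (0, 0), (2, 2)))
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
```

A production planner would use A* with a distance heuristic and real costmaps, but the route-reconstruction pattern is the same.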
**Spatial reasoning**: spatial logic, physics prediction, object permanence, "what's behind X?" queries, relative positioning.
**Physics**: force understanding, trajectory prediction, stability analysis, collision detection, material properties.
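At its simplest, trajectory prediction is numerical integration of the equations of motion. The sketch below uses symplectic Euler stepping for a ballistic launch in plain Python; it is an illustration of the idea, not the framework's physics engine, and the time step and gravity constant are arbitrary choices.

```python
import math

def predict_trajectory(pos, vel, dt=0.01, g=9.81, max_steps=10_000):
    """Predict a 2D ballistic trajectory with symplectic Euler steps.

    Returns the list of (x, y) points up to and including ground contact.
    """
    (x, y), (vx, vy) = pos, vel
    points = [(x, y)]
    for _ in range(max_steps):
        vy -= g * dt  # update velocity first (symplectic Euler)
        x += vx * dt
        y += vy * dt
        points.append((x, y))
        if y <= 0.0:  # hit the ground
            break
    return points

# Launch at 10 m/s and 45 degrees; analytic range is v^2*sin(2*theta)/g ~ 10.19 m
v, theta = 10.0, math.pi / 4
traj = predict_trajectory((0.0, 0.0), (v * math.cos(theta), v * math.sin(theta)))
print(round(traj[-1][0], 2))  # lands close to the analytic 10.19 m
```

Comparing the numerical landing point against the closed-form range is a quick sanity check for any integrator before trusting it on scenarios with no analytic answer.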
**Manipulation**: robotic manipulation, motion planning, grasping, assembly, human-robot interaction.
**Geospatial**: real-world maps, satellite imagery, terrain analysis, urban environments, global positioning.
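Global positioning ultimately comes down to spherical geometry. As a flavor of the underlying math, here is a haversine great-circle distance in plain Python; the coordinates and mean Earth radius are illustrative, and this is not part of the QAI Earth Engine API.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 \
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# San Francisco to Los Angeles along the great circle
print(round(haversine_km(37.7749, -122.4194, 34.0522, -118.2437)))  # 559
```

Real geospatial stacks refine this with ellipsoidal models (e.g. WGS84), but haversine is usually accurate to within about 0.5%.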
```python
from spatial_agents.benchmarks import SpatialIQBenchmark

benchmark = SpatialIQBenchmark()
results = benchmark.evaluate(your_model, split="test")

print(results.to_latex())
print(results.statistical_analysis())
```

Cite our work:

```bibtex
@software{spatial_agents_2025,
  title={Spatial Agents: AI Agents That Understand, Simulate, and Act in the Real World},
  author={Qian, Dr. and QAI Lab},
  year={2025},
  url={https://github.com/qai-lab/spatial-agents}
}
```

We welcome contributions! Whether you're:
- 🔍 **Adding perception**: new vision models, depth estimators
- 🌍 **Building simulations**: physics engines, virtual environments
- 🎯 **Creating actions**: robotics integrations, control systems
- 🏗️ **Designing challenges**: new arena benchmarks
- 📚 **Improving docs**: help others build spatial agents
Check out our Contributing Guide to get started.
```bash
git clone https://github.com/qai-lab/spatial-agents.git
cd spatial-agents
pip install -e ".[dev]"
pytest
```

- 💬 Discord: discord.gg/qai-lab
- 💼 LinkedIn: QAI Lab
- 🐦 X/Twitter: @qai_lab
- 📧 Issues: GitHub Issues
Built with ❤️ by Dr. Qian and QAI Lab

⭐ Star us on GitHub: it helps the project grow!
AI agents that see, think, simulate, and act in the physical world

