VINHYU/OpenSpatial

OpenSpatial

arXiv | Hugging Face | License: Apache-2.0

OpenSpatial is an open-source data engine for 3D spatial understanding, designed for high data quality, scalability, task diversity, and efficiency.

By bridging the gap between massive 2D web data and complex 3D spatial reasoning, OpenSpatial provides a comprehensive suite for the next generation of Embodied AI and World Models.


[Figure: OpenSpatial Pipeline: From 2D Web Data to 3D Spatial Understanding]


🔥 News

  • [2026.04.15] 🎉 We have released the open-source subset of the OpenSpatial-3M dataset! Check it out on Hugging Face.
  • [2026.04.08] 🎉 The OpenSpatial 3D data engine is now officially open-sourced.

🚀 Key Features

  • Web Data 3D Lifting: Advanced pipelines to transform large-scale 2D web imagery into geometrically consistent 3D representations.
  • Diverse Data Generation: Automated engine for creating rich spatial understanding datasets, covering various environments and object-level details.
  • Multi-Task Integration: Support for a wide range of tasks including 3D grounding, spatial reasoning, and scene captioning.
  • Comprehensive Evaluation: Built-in benchmarking suite to evaluate spatial understanding capabilities across different model architectures.
  • High Efficiency: Optimized for large-scale data processing with scalable distributed computing support.
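
To make the idea of "3D lifting" concrete, here is a minimal, generic sketch of one common building block: back-projecting a per-pixel depth map into 3D camera-frame points with a pinhole camera model. This README does not describe how OpenSpatial implements lifting, so nothing below should be read as its actual pipeline or API. The `Intrinsics` class, the `backproject` function, and the toy depth values are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera intrinsics (hypothetical container, not an OpenSpatial type)."""
    fx: float  # focal length in pixels along x
    fy: float  # focal length in pixels along y
    cx: float  # principal point x (pixels)
    cy: float  # principal point y (pixels)


def backproject(depth: List[List[float]], k: Intrinsics) -> List[Tuple[float, float, float]]:
    """Lift every pixel with a positive depth to a 3D point in the camera frame."""
    points = []
    for v, row in enumerate(depth):      # v: pixel row index
        for u, z in enumerate(row):      # u: pixel column index, z: depth
            if z <= 0:
                continue                 # treat non-positive depth as invalid
            x = (u - k.cx) * z / k.fx
            y = (v - k.cy) * z / k.fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map (assumed values); the 0.0 entry marks a missing measurement.
depth_map = [[2.0, 2.0], [0.0, 4.0]]
cam = Intrinsics(fx=1.0, fy=1.0, cx=0.5, cy=0.5)
points = backproject(depth_map, cam)
```

In practice the depth map would typically come from a depth sensor or a monocular depth estimator, and the intrinsics from calibration or estimation; both are outside the scope of this sketch.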

📊 Dataset

An open-source subset of the OpenSpatial-3M dataset is available on Hugging Face. The full dataset contains 3 million high-fidelity samples designed to improve 3D spatial reasoning in large multimodal models.

📖 Documentation

Document | Description
Quick Start | Data preparation, config structure, annotation pipeline usage, and running tasks end-to-end
Development Guide | Adding new annotation tasks, pipeline stages, prompt templates, dataset preprocessors, and internal architecture reference

📅 Roadmap & To-Do List

  • 3D Data Engine: Open-source the core 3D spatial understanding data engine.
  • OpenSpatial-3M Dataset Release: Publicly release the large-scale 3M spatial understanding dataset. [HF Link]
  • Model Release: Release the trained spatial understanding model.
  • Evaluation Suite: Open-source the comprehensive evaluation code for spatial tasks.
  • 3D Lifting Module: Integrate the core engine for lifting 2D web data to 3D representations.
  • More Tasks: Extend support for more spatial understanding task types.

📄 Citation

If you find OpenSpatial useful for your research, please consider citing our paper:

@article{openspatial2025,
  title={OpenSpatial: An Open-Source 3D Spatial Understanding Data Engine},
  journal={arXiv preprint arXiv:2604.07296},
  year={2026}
}
