SpatialBench

Is your spatial foundation model an all-round player?

Benchmark Coverage

Dataset tags and density regimes. Fixed scene counts and consumed frames for the unified protocol.

Dataset	Environment	Dynamics	Viewpoint	Source	Scenes (Single)	Scenes (Sparse)	Scenes (Medium)	Scenes (Dense)	Frames

Findings: How to Train Your Best Spatial Foundation Models?

Finding 1

Full-Context Attention Sets the Accuracy Upper Bound on High-Memory GPUs.

At the same input budget, globally coupled feed-forward models occupy the strongest accuracy region.

Operating snapshot comparing memory, depth error, and inference time.

Finding 2

Bounded-Memory Modeling Enables Long-Sequence Reconstruction on Limited GPUs.

Streaming, chunk-wise, and TTT models trade some accuracy for dense long-horizon reconstruction.

Memory scaling curves across model paradigms.
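
To illustrate why bounded-memory designs scale to longer sequences, the following is a toy cost model, not a measurement and not the benchmark's profiling code. It counts query-key pairs and peak resident tokens for full-context attention versus independent fixed-size chunks. The function names, the `tokens_per_frame` parameter, and the assumption that chunks do not attend to each other are all hypothetical simplifications; real streaming, chunk-wise, and TTT models carry state between chunks and use memory-efficient attention kernels.

```python
def attention_pairs(num_frames, tokens_per_frame, chunk_size=None):
    """Count query-key pairs scored by attention (toy model).

    chunk_size=None models full-context attention over all frames.
    Otherwise frames are split into independent chunks (an assumption).
    """
    if chunk_size is None:
        n = num_frames * tokens_per_frame
        return n * n
    total = 0
    for start in range(0, num_frames, chunk_size):
        frames = min(chunk_size, num_frames - start)
        n = frames * tokens_per_frame
        total += n * n
    return total


def peak_resident_tokens(num_frames, tokens_per_frame, chunk_size=None):
    """Tokens that must be held at once under the same toy model."""
    frames = num_frames if chunk_size is None else min(chunk_size, num_frames)
    return frames * tokens_per_frame


if __name__ == "__main__":
    for n_frames in (8, 64, 512):
        full = attention_pairs(n_frames, 256)
        chunked = attention_pairs(n_frames, 256, chunk_size=8)
        print(n_frames, full, chunked)
```

Under this model, full-context cost grows quadratically with the number of frames while chunked cost grows linearly and peak resident tokens stay constant, which matches the qualitative trade-off described in Finding 2.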

Finding 3

Training Data Volume Correlates with Performance, but Data Quality Is the Critical Factor.

Curated pseudo-GT supervision outperforms larger but noisier training mixtures at comparable scale.

Training data coverage versus benchmark performance.

Finding 4

Egocentric-View and Wrist-View Remain the Dominant OOD Failure Modes.

Embodied viewpoints expose the largest field-level generalization drop; DA-Next targets this gap.

Domain-level AUC@30 by evaluation domain.

Leaderboard

Method	Paradigm	Single AbsRel↓	Sparse AbsRel↓	Sparse AUC@30↑	Medium AbsRel↓	Medium AUC@30↑	Medium ATE↓	Medium F-Score↑	Dense AbsRel↓	Dense AUC@30↑	Dense ATE↓	Dense F-Score↑	Average AbsRel↓	Average AUC@30↑	Average ATE↓	Average F-Score↑
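
For readers unfamiliar with the leaderboard columns, the sketch below shows commonly used definitions of three of them: AbsRel for depth, AUC@30 for pose error, and F-Score for point clouds. These follow widespread conventions and are an assumption here; the exact SpatialBench protocol (alignment, valid-pixel masks, distance threshold `tau`, threshold spacing) may differ. All function names are hypothetical.

```python
import numpy as np


def abs_rel(pred, gt, eps=1e-8):
    """Mean absolute relative depth error over pixels with valid ground truth."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > eps  # assumption: non-positive depth marks invalid pixels
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))


def auc_at(errors_deg, max_threshold=30):
    """Average accuracy over integer angular thresholds 1..max_threshold degrees."""
    errors = np.asarray(errors_deg, dtype=np.float64)
    thresholds = np.arange(1, max_threshold + 1)
    accuracy = [(errors < t).mean() for t in thresholds]
    return float(np.mean(accuracy))


def f_score(pred_pts, gt_pts, tau):
    """Harmonic mean of precision and recall at distance threshold tau.

    Brute-force nearest neighbours; suitable only for small point sets.
    """
    pred_pts = np.asarray(pred_pts, dtype=np.float64)
    gt_pts = np.asarray(gt_pts, dtype=np.float64)
    dist = np.linalg.norm(pred_pts[:, None, :] - gt_pts[None, :, :], axis=-1)
    precision = (dist.min(axis=1) < tau).mean()  # predicted points near GT
    recall = (dist.min(axis=0) < tau).mean()     # GT points near a prediction
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))
```

Lower is better for AbsRel and ATE, and higher is better for AUC@30 and F-Score, as the arrows in the table header indicate. ATE is omitted from the sketch because it additionally requires a trajectory alignment step.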

Citation

Reference

@misc{peng2026spatialbench,
      title={SpatialBench: Is Your Spatial Foundation Model an All-Round Player?}, 
      author={Haosong Peng and Hao Li and Jiaqi Chen and Yuhao Pan and Runmao Yao and Yalun Dai and Fushuo Huo and Fangzhou Hong and Zhaoxi Chen and Haozhao Wang and Dingwen Zhang and Ziwei Liu and Wenchao Xu},
      year={2026},
      eprint={2605.27367},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.27367}, 
}