Thinking with Spatial Code for Physical-World Video Reasoning
Jieneng Chen*, Wenxin Ma*, Ruisheng Yuan*, Yunzhi Zhang*, Jiajun Wu†, Alan Yuille†
Johns Hopkins University & Stanford University
Spatial Code — Transform visual scenes into structured, executable 3D representations for spatial reasoning.
Stay tuned!
- arXiv paper — released on March 5 → arXiv:2603.05591
- Codebase — releasing by March 17
- Reinforcement training details — releasing by March 22
- Reproducible models — releasing by March 31
If you find this work useful, please consider citing:
@article{chen2025spatialcode,
title={Thinking with Spatial Code for Physical-World Video Reasoning},
author={Chen, Jieneng and Ma, Wenxin and Yuan, Ruisheng and Zhang, Yunzhi and Wu, Jiajun and Yuille, Alan},
journal={arXiv preprint arXiv:2603.05591},
year={2026}
}