4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration Paper • 2506.22242 • Published Jun 27
UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding Paper • 2508.11952 • Published Aug 16