---
pipeline_tag: robotics
library_name: transformers
license: mit
---

This repository contains models for the VLN-PE Benchmark, as presented in the paper "Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities".

VLN-PE introduces a physically realistic Vision-and-Language Navigation platform supporting humanoid, quadruped, and wheeled robots, and systematically evaluates several ego-centric VLN methods in physical robotic settings.

For more details, visit the project page or the main GitHub repository.

VLN-PE Benchmark

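| Model | Dataset/Benchmark | Seen TL | Seen NE | Seen FR | Seen StR | Seen OS | Seen SR | Seen SPL | Unseen TL | Unseen NE | Unseen FR | Unseen StR | Unseen OS | Unseen SR | Unseen SPL | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| *Zero-shot transfer evaluation from VLN-CE* | | | | | | | | | | | | | | | | |
| Seq2Seq-Full | R2R VLN-PE | 7.80 | 7.62 | 20.21 | 3.04 | 19.3 | 15.2 | 12.79 | 7.73 | 7.18 | 18.04 | 3.04 | 22.42 | 16.48 | 14.11 | model |
| CMA-Full | R2R VLN-PE | 6.62 | 7.37 | 20.06 | 3.95 | 18.54 | 16.11 | 14.61 | 6.58 | 7.09 | 17.07 | 3.79 | 20.86 | 16.93 | 15.24 | model |
| *Train on VLN-PE* | | | | | | | | | | | | | | | | |
| Seq2Seq | R2R VLN-PE | 10.61 | 7.53 | 27.36 | 4.26 | 32.67 | 19.75 | 14.68 | 10.85 | 7.88 | 26.8 | 5.57 | 28.13 | 15.14 | 10.77 | model |
| CMA | R2R VLN-PE | 11.13 | 7.59 | 23.71 | 3.19 | 34.94 | 21.58 | 16.1 | 11.16 | 7.98 | 22.64 | 3.27 | 33.11 | 19.15 | 14.05 | model |
| RDP | R2R VLN-PE | 13.26 | 6.76 | 27.51 | 1.82 | 38.6 | 25.08 | 17.07 | 12.7 | 6.72 | 24.57 | 3.11 | 36.9 | 25.24 | 17.73 | model |
| Seq2Seq+ | R2R VLN-PE | 10.22 | 7.75 | 33.43 | 3.19 | 30.09 | 16.86 | 12.54 | 9.88 | 7.85 | 26.27 | 6.52 | 28.79 | 16.56 | 12.7 | model |
| CMA+ | R2R VLN-PE | 8.86 | 7.14 | 23.56 | 3.5 | 36.17 | 25.84 | 21.75 | 8.79 | 7.26 | 21.75 | 3.27 | 31.4 | 22.12 | 18.65 | model |

"Seen" and "Unseen" refer to the Val Seen and Val Unseen splits.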
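
For readers unfamiliar with the navigation metrics in the table, the sketch below shows how TL, NE, OS, SR, and SPL are conventionally computed from per-episode logs in the VLN literature. It is an illustration of the standard definitions, not the benchmark's own evaluation code. The `Episode` fields, the `summarize` helper, and the 3 m success radius (the usual R2R convention) are assumptions introduced here. FR and StR depend on simulator events and are left out.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class Episode:
    """Hypothetical per-episode log; field names are illustrative only."""
    shortest_path_m: float  # shortest-path distance from start to goal
    path_length_m: float    # distance the agent actually travelled
    final_dist_m: float     # distance to the goal when the agent stopped
    min_dist_m: float       # closest the agent ever got to the goal


def summarize(episodes: Sequence[Episode],
              success_radius_m: float = 3.0) -> Dict[str, float]:
    """Average the standard VLN metrics over a set of episodes.

    SR, OS, and SPL are returned as percentages; TL and NE are in the
    same length unit as the inputs. The success radius is an assumption.
    """
    n = len(episodes)
    if n == 0:
        raise ValueError("need at least one episode")

    tl = sum(e.path_length_m for e in episodes) / n
    ne = sum(e.final_dist_m for e in episodes) / n

    successes = [e.final_dist_m <= success_radius_m for e in episodes]
    oracle = [e.min_dist_m <= success_radius_m for e in episodes]

    # SPL: each success is weighted by shortest / max(travelled, shortest),
    # so detours reduce the credit an episode earns.
    spl_sum = 0.0
    for e, ok in zip(episodes, successes):
        denom = max(e.path_length_m, e.shortest_path_m)
        if ok and denom > 0:
            spl_sum += e.shortest_path_m / denom

    return {
        "TL": tl,
        "NE": ne,
        "OS": 100.0 * sum(oracle) / n,
        "SR": 100.0 * sum(successes) / n,
        "SPL": 100.0 * spl_sum / n,
    }


if __name__ == "__main__":
    demo = [
        Episode(10.0, 10.0, 1.0, 1.0),  # success along the shortest path
        Episode(10.0, 20.0, 2.0, 1.0),  # success with a 2x detour
        Episode(10.0, 5.0, 6.0, 2.5),   # passed near the goal, stopped far away
        Episode(10.0, 5.0, 7.0, 5.0),   # never got close
    ]
    print(summarize(demo))
```

Because SPL discounts each success by path efficiency, it can never exceed SR, which matches every row of the table.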

Citation

If you find our work helpful, please cite:

@inproceedings{vlnpe,
  title={Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities},
  author={Wang, Liuyi and Xia, Xinyuan and Zhao, Hui and Wang, Hanqing and Wang, Tai and Chen, Yilun and Liu, Chengju and Chen, Qijun and Pang, Jiangmiao},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
@misc{internnav2025,
    title = {{InternNav: InternRobotics'} open platform for building generalized navigation foundation models},
    author = {InternNav Contributors},
    howpublished={\url{https://github.com/InternRobotics/InternNav}},
    year = {2025}
}