PriorEye: Geospatial Visual Priors for End-to-End Autonomous Driving
Pith reviewed 2026-07-01 05:56 UTC · model grok-4.3
The pith
Route-anchored street-level images retrieved via dual memory improve end-to-end driving performance and sensor robustness.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Augmenting end-to-end driving policies with a dual-memory module that stores route-aligned geospatial visual priors in one bank and persistent fallback data in another, then uses an adaptive gate to blend them according to current-state compatibility, raises benchmark performance and confers robustness to sensor corruption.
What carries the argument
Dual-memory architecture paired with an adaptive memory gate that retrieves contextual priors and regulates their contribution based on compatibility with the current observation.
If this is right
- Performance gains appear consistently across diverse end-to-end baselines on NAVSIM-v2.
- Robustness to sensor corruption increases because priors do not rely on onboard sensors.
- Dual-memory fallback prevents unsafe behavior when priors become unreliable.
- The module can be plugged into existing end-to-end pipelines without redesigning the core policy.
Where Pith is reading between the lines
- Models could learn to anticipate road layout changes several hundred meters ahead using only the route priors.
- The same retrieval-plus-gate pattern might transfer to other sequential decision tasks that need spatial memory, such as robot navigation in warehouses.
- Reducing dependence on high-quality real-time sensors could lower hardware cost if the priors prove sufficiently reliable in open-loop tests.
- A natural next measurement would be how far ahead the priors must extend before additional gains saturate.
Load-bearing premise
High-quality route-aligned street-level visual priors can be reliably retrieved and the adaptive gate can correctly decide when to trust or ignore them without introducing new failure modes.
What would settle it
A controlled test in which the retrieved priors are deliberately replaced with mismatched or corrupted images and the gated model is compared against the unmodified baseline to check whether performance still improves or at least does not degrade.
Figures
read the original abstract
Most end-to-end autonomous driving methods rely solely on instantaneous sensor observations, limiting them to reactive behavior without the anticipatory foresight human drivers employ through prior experience. We introduce geospatial visual priors, street-level visual context anchored to the intended driving route, providing visual-spatial foresight independent of real-time sensors. We propose a memory augmentation module featuring a dual-memory architecture and an adaptive memory gate, which can be easily integrated into existing end-to-end approaches. This design pairs a contextual memory for retrieved priors with a persistent fallback memory, and dynamically regulates the influence of memories based on current state compatibility. Evaluated on the NAVSIM-v2 benchmark, our approach consistently improves performance across diverse end-to-end baselines. Furthermore, because these priors are independent of onboard sensors, our method inherently improves robustness against sensor corruption, while the dual-memory design ensures safe fallback when the retrieved priors themselves become unreliable. Our project page is available at https://ori-mrg.github.io/PriorEye.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces geospatial visual priors as street-level visual context anchored to the intended driving route to provide anticipatory foresight beyond the reactive behavior of standard end-to-end autonomous driving methods that rely solely on instantaneous sensor observations. It proposes a memory augmentation module with a dual-memory architecture (contextual memory for retrieved priors paired with persistent fallback memory) and an adaptive memory gate that dynamically regulates memory influence according to current state compatibility; the module is designed for easy integration into existing end-to-end baselines. On the NAVSIM-v2 benchmark the approach is claimed to deliver consistent performance gains across diverse baselines, inherent robustness to sensor corruption (due to prior independence from onboard sensors), and safe fallback when priors become unreliable.
Significance. If the empirical claims hold, the work could meaningfully advance end-to-end autonomous driving by supplying route-specific visual foresight that current reactive architectures lack. The dual-memory plus adaptive-gate design offers a modular route to robustness that does not require wholesale architectural replacement. However, the manuscript supplies no quantitative results, ablation studies, implementation details, or error analysis, so the actual significance and reliability of the claimed gains cannot be evaluated.
major comments (2)
- Abstract: the central claim that the method 'consistently improves performance across diverse end-to-end baselines' on NAVSIM-v2 is presented without any tables, numerical results, statistical tests, or ablation studies, rendering the efficacy assertion unverifiable and load-bearing for the paper's contribution.
- Abstract: the assertion that 'the dual-memory design ensures safe fallback when the retrieved priors themselves become unreliable' depends entirely on the adaptive memory gate correctly judging compatibility, yet no description of how compatibility is computed or trained, nor any experiments with deliberately mismatched or corrupted priors, is supplied.
minor comments (1)
- The manuscript would benefit from an explicit statement of code, data, and model availability beyond the project-page URL.
Simulated Author's Rebuttal
We thank the referee for the careful review and constructive feedback. We agree that the current version of the manuscript does not provide sufficient empirical evidence or implementation details to support the claims made in the abstract, and we will revise the paper to address these issues directly.
read point-by-point responses
-
Referee: [—] Abstract: the central claim that the method 'consistently improves performance across diverse end-to-end baselines' on NAVSIM-v2 is presented without any tables, numerical results, statistical tests, or ablation studies, rendering the efficacy assertion unverifiable and load-bearing for the paper's contribution.
Authors: We acknowledge that the abstract states performance improvements without accompanying quantitative evidence in the manuscript. In the revised version we will include tables reporting numerical results on NAVSIM-v2 across multiple baselines, ablation studies isolating the contribution of the dual-memory and gate components, and any applicable statistical tests to substantiate the claims. revision: yes
-
Referee: [—] Abstract: the assertion that 'the dual-memory design ensures safe fallback when the retrieved priors themselves become unreliable' depends entirely on the adaptive memory gate correctly judging compatibility, yet no description of how compatibility is computed or trained, nor any experiments with deliberately mismatched or corrupted priors, is supplied.
Authors: We agree that the manuscript lacks a description of the compatibility computation and training procedure for the adaptive memory gate, as well as targeted experiments. In the revision we will add a detailed explanation of how compatibility is calculated and optimized, together with experiments that deliberately introduce mismatched or corrupted priors to evaluate the fallback behavior. revision: yes
Circularity Check
No circularity: empirical architecture evaluated on external benchmark
full rationale
The paper introduces a memory augmentation module with dual-memory and adaptive gate for integrating geospatial visual priors into end-to-end driving models. It is framed entirely as an empirical addition, with performance claims resting on evaluation against the external NAVSIM-v2 benchmark rather than any derivation, fitted parameters, or self-referential equations. No load-bearing steps reduce to inputs by construction, and no mathematical chain or self-citation dependency is invoked for the core claims.
Axiom & Free-Parameter Ledger
invented entities (1)
-
geospatial visual priors
no independent evidence
Reference graph
Works this paper leans on
-
[1]
In: RSS (2019)
Bansal, M., Krizhevsky, A., Ogale, A.S.: Chauffeurnet: Learning to drive by imi- tating the best and synthesizing the worst. In: RSS (2019)
2019
-
[2]
In: NeurIPS (2025)
Behrouz, A., Zhong, P., Mirrokni, V.: Titans: Learning to memorize at test time. In: NeurIPS (2025)
2025
-
[3]
The International Journal of Robotics Research42(1-
Burnett, K., Yoon, D.J., Wu, Y., Li, A.Z., Zhang, H., Lu, S., Qian, J., Tseng, W.K., Lambert, A., Leung, K.Y., Schoellig, A.P., Barfoot, T.D.: Boreas: A multi-season autonomous driving dataset. The International Journal of Robotics Research42(1-
-
[4]
In: CoRL (2025)
Cao, W., Hallgarten, M., Li, T., Dauner, D., Gu, X., Wang, C., Miron, Y., Aiello, M., Li, H., Gilitschenski, I., Ivanovic, B., Pavone, M., Geiger, A., Chitta, K.: Pseudo-simulation for autonomous driving. In: CoRL (2025)
2025
-
[5]
In: CVPR (2021)
Casas, S., Sadat, A., Urtasun, R.: MP3: A unified model to map, perceive, predict and plan. In: CVPR (2021)
2021
-
[6]
IEEE Trans
Ceccarelli, A., Secci, F.: RGB cameras failures and their effects in autonomous driving applications. IEEE Trans. Dependable Secur. Comput.20(4) (2023)
2023
-
[7]
IEEE TPAMI46(12) (2024)
Chen, L., Wu, P., Chitta, K., Jaeger, B., Geiger, A., Li, H.: End-to-end autonomous driving: Challenges and frontiers. IEEE TPAMI46(12) (2024)
2024
-
[8]
In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers (2017)
Chen, Q., Zhu, X., Ling, Z., Wei, S., Jiang, H., Inkpen, D.: Enhanced LSTM for natural language inference. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers (2017)
2017
-
[9]
IEEE TPAMI 45(11) (2023)
Chitta, K., Prakash, A., Jaeger, B., Yu, Z., Renz, K., Geiger, A.: Transfuser: Imi- tation with transformer-based sensor fusion for autonomous driving. IEEE TPAMI 45(11) (2023)
2023
-
[10]
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9-11, 2017 (2017)
Conneau, A., Kiela, D., Schwenk, H., Barrault, L., Bordes, A.: Supervised learning of universal sentence representations from natural language inference data. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9-11, 2017 (2017)
2017
-
[11]
In: NeurIPS (2024)
Dauner, D., Hallgarten, M., Li, T., Weng, X., Huang, Z., Yang, Z., Li, H., Gilitschenski, I., Ivanovic, B., Pavone, M., Geiger, A., Chitta, K.: NAVSIM: data- driven non-reactive autonomous vehicle simulation and benchmarking. In: NeurIPS (2024)
2024
-
[12]
In: CoRL (2024)
Ding, K., Chen, B., Su, Y., Gao, H., Jin, B., Sima, C., Li, X., Zhang, W., Barsch, P., Li, H., Zhao, H.: Hint-ad: Holistically aligned interpretability in end-to-end autonomous driving. In: CoRL (2024)
2024
-
[13]
In: ICLR (2025)
Dong, X., Fu, Y., Diao, S., Byeon, W., Chen, Z., Mahabaleshwarkar, A.S., Liu, S., Keirsbilck, M.V., Chen, M., Suhara, Y., Lin, Y.C., Kautz, J., Molchanov, P.: Hymba: A hybrid-head architecture for small language models. In: ICLR (2025)
2025
-
[14]
Nature Neuroscience20(11) (2017)
Epstein, R.A., Patai, E.Z., Julian, J.B., Spiers, H.J.: The cognitive map in humans: Spatial navigation and beyond. Nature Neuroscience20(11) (2017)
2017
-
[15]
In: CVPR (2020)
Gao, J., Sun, C., Zhao, H., Shen, Y., Anguelov, D., Li, C., Schmid, C.: Vectornet: Encoding hd maps and agent dynamics from vectorized representation. In: CVPR (2020)
2020
-
[16]
In: NeurIPS (2024)
Gao, S., Yang, J., Chen, L., Chitta, K., Qiu, Y., Geiger, A., Zhang, J., Li, H.: Vista: A generalizable driving world model with high fidelity and versatile controllability. In: NeurIPS (2024)
2024
-
[17]
Neural Comput.12(10) (2000) PriorEye 17
Gers, F.A., Schmidhuber, J., Cummins, F.A.: Learning to forget: Continual pre- diction with LSTM. Neural Comput.12(10) (2000) PriorEye 17
2000
-
[18]
In: CVPR (2025)
Hassan, M., Stapf, S., Rahimi, A., Rezende, P.M.B., Haghighi, Y., Brüggemann, D., Katircioglu, I., Zhang, L., Chen, X., Saha, S., Cannici, M., Aljalbout, E., Ye, B., Wang, X., Davtyan, A., Salzmann, M., Scaramuzza, D., Pollefeys, M., Favaro, P., Alahi, A.: GEM: A generalizable ego-vision multimodal world model for fine- grained ego-motion, object dynamics...
2025
-
[19]
In: Similarity- Based Pattern Recognition - Third International Workshop, SIMBAD 2015, Copenhagen, Denmark, October 12-14, 2015, Proceedings (2015)
Hoffer, E., Ailon, N.: Deep metric learning using triplet network. In: Similarity- Based Pattern Recognition - Third International Workshop, SIMBAD 2015, Copenhagen, Denmark, October 12-14, 2015, Proceedings (2015)
2015
-
[20]
In: ECCV (2022)
Hu, S., Chen, L., Wu, P., Li, H., Yan, J., Tao, D.: ST-P3: end-to-end vision-based autonomous driving via spatial-temporal feature learning. In: ECCV (2022)
2022
-
[21]
In: CVPR (2023)
Hu, Y., Yang, J., Chen, L., Li, K., Sima, C., Zhu, X., Chai, S., Du, S., Lin, T., Wang, W., Lu, L., Jia, X., Liu, Q., Dai, J., Qiao, Y., Li, H.: Planning-oriented autonomous driving. In: CVPR (2023)
2023
-
[22]
TMLR (2025)
Hwang, J., Xu, R., Lin, H., Hung, W., Ji, J., Choi, K., Huang, D., He, T., Coving- ton, P., Sapp, B., Zhou, Y., Guo, J., Anguelov, D., Tan, M.: EMMA: end-to-end multimodal model for autonomous driving. TMLR (2025)
2025
-
[23]
In: NeurIPS (2024)
Jia, X., Yang, Z., Li, Q., Zhang, Z., Yan, J.: Bench2drive: Towards multi-ability benchmarking of closed-loop end-to-end autonomous driving. In: NeurIPS (2024)
2024
-
[24]
In: ICLR (2025)
Jia, X., You, J., Zhang, Z., Yan, J.: Drivetransformer: Unified transformer for scalable end-to-end autonomous driving. In: ICLR (2025)
2025
-
[25]
arXiv preprint arXiv:2512.06865 (2025)
Jia, X., Zhang, C., Jiang, Y., Wong, S., Zhang, Z., Chen, C., Zhang, S., Zhou, X., Yang, X., Yan, J., Jiang, Y.: Spatial retrieval augmented autonomous driving. arXiv preprint arXiv:2512.06865 (2025)
-
[26]
Diffvla: Vision-language guided diffusion planning for autonomous driving
Jiang, A., Gao, Y., Sun, Z., Wang, Y., Wang, J., Chai, J., Cao, Q., Heng, Y., Jiang, H., Zhang, Z., Guo, X., Sun, H., Zhao, H.: Diffvla: Vision-language guided diffusion planning for autonomous driving. arXiv preprint arXiv:2505.19381 (2025)
-
[27]
In: ICLR (2026)
Jiang, B., Chen, S., Gao, H., Liao, B., Zhang, Q., Liu, W., Wang, X.: VADv2: End-to-end autonomous driving via probabilistic planning. In: ICLR (2026)
2026
-
[28]
In: ICCV (2023)
Jiang,B.,Chen,S.,Xu,Q.,Liao,B.,Chen,J.,Zhou,H.,Zhang,Q.,Liu,W.,Huang, C., Wang, X.: VAD: vectorized scene representation for efficient autonomous driv- ing. In: ICCV (2023)
2023
-
[29]
arXiv preprint arXiv:2506.24044 (2025)
Jiang, S., Huang, Z., Qian, K., Luo, Z., Zhu, T., Zhong, Y., Tang, Y., Kong, M., Wang, Y., Jiao, S., Ye, H., Sheng, Z., Zhao, X., Wen, T., Fu, Z., Chen, S., Jiang, K., Yang, D., Choi, S., Sun, L.: A survey on vision-language-action models for autonomous driving. arXiv preprint arXiv:2506.24044 (2025)
-
[30]
IEEE Robotics Autom
Jiang, Z., Zhu, Z., Li, P., Gao, H., Yuan, T., Shi, Y., Zhao, H., Zhao, H.: P- mapnet: Far-seeing map generator enhanced by both sdmap and hdmap priors. IEEE Robotics Autom. Lett.9(10) (2024)
2024
-
[31]
In: ICRA (2024)
Karnchanachari, N., Geromichalos, D., Tan, K.S., Li, N., Eriksen, C., Yaghoubi, S., Mehdipour,N.,Bernasconi,G.,Fong,W.K.,Guo,Y.,Caesar,H.:Towardslearning- based planning: The nuplan benchmark for real-world autonomous driving. In: ICRA (2024)
2024
-
[32]
arXiv preprint arXiv:2601.05083 (2026)
Kirby, E., Boulch, A., Xu, Y., Yin, Y., Puy, G., Zablocki, É., Bursuc, A., Gidaris, S., Marlet, R., Bartoccioni, F., Cao, A., Samet, N., Vu, T., Cord, M.: Driving on registers. arXiv preprint arXiv:2601.05083 (2026)
-
[33]
arXiv preprint arXiv:2503.12820 (2025)
Li, K., Li, Z., Lan, S., Xie, Y., Zhang, Z., Liu, J., Wu, Z., Yu, Z., Álvarez, J.M.: Hydra-mdp++: Advancing end-to-end driving via expert-guided hydra-distillation. arXiv preprint arXiv:2503.12820 (2025) 18 K. Yeon et al
-
[34]
In: ECCV (2024)
Li, Q., Jia, X., Wang, S., Yan, J.: Think2drive: Efficient reinforcement learning by thinking with latent world model for autonomous driving (in CARLA-V2. In: ECCV (2024)
2024
- [35]
-
[36]
Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation
Li, Z., Li, K., Wang, S., Lan, S., Yu, Z., Ji, Y., Li, Z., Zhu, Z., Kautz, J., Wu, Z., Jiang, Y., Álvarez, J.M.: Hydra-mdp: End-to-end multimodal planning with multi-target hydra-distillation. arXiv preprint arXiv:2406.06978 (2024)
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[37]
arXiv preprint arXiv:2510.24108 (2025)
Li, Z., Yao, W., Wang, Z., Sun, X., Chen, J., Chang, N., Shen, M., Song, J., Wu, Z., Lan, S., Álvarez, J.M.: ZTRS: zero-imitation end-to-end autonomous driving with trajectory scoring. arXiv preprint arXiv:2510.24108 (2025)
-
[38]
Generalized trajectory scoring for end-to-end multimodal planning
Li, Z., Yao, W., Wang, Z., Sun, X., Chen, J., Chang, N., Shen, M., Wu, Z., Lan, S., Álvarez, J.M.: Generalized trajectory scoring for end-to-end multimodal planning. arXiv preprint arXiv:2506.06664 (2025)
-
[39]
In: CVPR (2025)
Liao, B., Chen, S., Yin, H., Jiang, B., Wang, C., Yan, S., Zhang, X., Li, X., Zhang, Y., Zhang, Q., Wang, X.: Diffusiondrive: Truncated diffusion model for end-to-end autonomous driving. In: CVPR (2025)
2025
-
[40]
Guideflow: Constraint-guided flow matching for planning in end-to-end autonomous driving
Liu, L., Jia, C., Yu, G., Song, Z., Li, J., Jia, F., Wu, P., Hao, X., Luo, Y.: Guideflow: Constraint-guided flow matching for planning in end-to-end autonomous driving. arXiv preprint arXiv:2511.18729 (2025)
-
[41]
arXiv preprint arXiv:2509.20843 (2025)
Luo, Z., Qian, K., Wang, J., Luo, Y., Miao, J., Fu, Z., Wang, Y., Jiang, S., Huang, Z., Hu, Y., Yang, Y., Ye, H., Yang, M., Dong, X., Jiang, K., Yang, D.: Mtrdrive: Memory-tool synergistic reasoning for robust autonomous driving in corner cases. arXiv preprint arXiv:2509.20843 (2025)
-
[42]
Proceedings of the National Academy of Sciences97(8) (2000)
Maguire, E.A., Gadian, D.G., Johnsrude, I.S., Good, C.D., Ashburner, J., Frack- owiak, R.S.J., Frith, C.D.: Navigation-related structural change in the hippocampi of taxi drivers. Proceedings of the National Academy of Sciences97(8) (2000)
2000
-
[43]
In: ECCV (2024)
Marcu, A., Chen, L., Hünermann, J., Karnsund, A., Hanotte, B., Chidananda, P., Nair, S., Badrinarayanan, V., Kendall, A., Shotton, J., Arani, E., Sinavski, O.: Lingoqa: Visual question answering for autonomous driving. In: ECCV (2024)
2024
-
[44]
arXiv preprint arXiv:2601.10512 (2026)
Mazumder, K., Flohr, F.B.: Satmap: Revisiting satellite maps as prior for online HD map construction. arXiv preprint arXiv:2601.10512 (2026)
-
[45]
NVIDIA: Alpamayo-r1: Bridging reasoning and action prediction for generalizable autonomous driving in the long tail. arXiv preprint arXiv:2511.00088 (2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[46]
TMLR (2024)
Oquab, M., Darcet, T., Moutakanni, T., Vo, H.V., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., El-Nouby, A., Assran, M., Ballas, N., Galuba, W., Howes, R., Huang, P., Li, S., Misra, I., Rabbat, M., Sharma, V., Synnaeve, G., Xu, H., Jégou, H., Mairal, J., Labatut, P., Joulin, A., Bojanowski, P.: Dinov2: Learning robust visual features...
2024
-
[47]
In: CVPR (2025)
Renz, K., Chen, L., Arani, E., Sinavski, O.: Simlingo: Vision-only closed-loop au- tonomous driving with language-action alignment. In: CVPR (2025)
2025
-
[48]
I have often walked down this street before
Rosenbaum, R.S., Ziegler, M., Winocur, G., Grady, C.L., Moscovitch, M.: “I have often walked down this street before”: fMRI Studies on the hippocampus and other structures during mental navigation of an old environment. Hippocampus14(7) (2004)
2004
-
[49]
In: CVPR (2024)
Shao, H., Hu, Y., Wang, L., Song, G., Waslander, S.L., Liu, Y., Li, H.: Lmdrive: Closed-loop end-to-end driving with large language models. In: CVPR (2024)
2024
-
[50]
In: CVPR (2025) PriorEye 19
Song, Z., Jia, C., Liu, L., Pan, H., Zhang, Y., Wang, J., Zhang, X., Xu, S., Yang, L., Luo, Y.: Don’t shake the wheel: Momentum-aware planning in end-to-end au- tonomous driving. In: CVPR (2025) PriorEye 19
2025
-
[51]
In: NeurIPS (2015)
Srivastava, R.K., Greff, K., Schmidhuber, J.: Training very deep networks. In: NeurIPS (2015)
2015
-
[52]
Augmenting Self-attention with Persistent Memory
Sukhbaatar, S., Grave, E., Lample, G., Jégou, H., Joulin, A.: Augmenting self- attention with persistent memory. arXiv preprint arXiv:1907.01470 (2019)
work page internal anchor Pith review Pith/arXiv arXiv 1907
-
[53]
In: ICRA (2025)
Sun, W., Lin, X., Shi, Y., Zhang, C., Wu, H., Zheng, S.: Sparsedrive: End-to-end autonomous driving via sparse scene representation. In: ICRA (2025)
2025
-
[54]
In: CVPR (2025)
Sun, Y., Ochiai, H., Wu, Z., Lin, S., Kanai, R.: Associative transformer. In: CVPR (2025)
2025
-
[55]
IEEE Trans
Tampuu, A., Matiisen, T., Semikin, M., Fishman, D., Naveed, M.: A survey of end- to-end driving: Architectures and training methods. IEEE Trans. Neural Networks Learn. Syst.33(4) (2022)
2022
-
[56]
SimScale: Learning to Drive via Real-World Simulation at Scale
Tian, H., Li, T., Liu, H., Yang, J., Qiu, Y., Li, G., Wang, J., Gao, Y., Zhang, Z., Wang, L., Ye, H., Tan, T., Chen, L., Li, H.: Simscale: Learning to drive via real-world simulation at scale. arXiv preprint arXiv:2511.23369 (2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[57]
Tschannen, M., Gritsenko, A.A., Wang, X., Naeem, M.F., Alabdulmohsin, I., Parthasarathy, N., Evans, T., Beyer, L., Xia, Y., Mustafa, B., Hénaff, O.J., Harm- sen, J., Steiner, A., Zhai, X.: Siglip 2: Multilingual vision-language encoders with improved semantic understanding, localization, and dense features. arXiv preprint arXiv:2502.14786 (2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[58]
In: NeurIPS (2023)
Wang, W., Dong, L., Cheng, H., Liu, X., Yan, X., Gao, J., Wei, F.: Augmenting language models with long-term memory. In: NeurIPS (2023)
2023
-
[59]
In: ECCV (2024)
Wang, X., Zhu, Z., Huang, G., Chen, X., Zhu, J., Lu, J.: Drivedreamer: Towards real-world-drive world models for autonomous driving. In: ECCV (2024)
2024
-
[60]
In: CVPR (2025)
Wang, Y., Liu, Q., Jiang, Z., Wang, T., Jiao, J., Chu, H., Gao, B., Chen, H.: RAD: retrieval-augmented decision-making of meta-actions with vision-language models in autonomous driving. In: CVPR (2025)
2025
-
[61]
In: NeurIPS (2022)
Wu, P., Jia, X., Chen, L., Yan, J., Li, H., Qiao, Y.: Trajectory-guided control prediction for end-to-end autonomous driving: A simple yet strong baseline. In: NeurIPS (2022)
2022
-
[62]
From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs
Wu, Y., Liang, S., Zhang, C., Wang, Y., Zhang, Y., Guo, H., Tang, R., Liu, Y.: From human memory to AI memory: A survey on memory mechanisms in the era of llms. arXiv preprint arXiv:2504.15965 (2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[63]
Wu,Y.,Rabe,M.N.,Hutchins,D.,Szegedy,C.:Memorizingtransformers.In:ICLR (2022)
2022
-
[64]
In: ICLR (2024)
Xiao, G., Tian, Y., Chen, B., Han, S., Lewis, M.: Efficient streaming language models with attention sinks. In: ICLR (2024)
2024
-
[65]
In: NeurIPS (2021)
Xie, E., Wang, W., Yu, Z., Anandkumar, A., Álvarez, J.M., Luo, P.: Segformer: Simple and efficient design for semantic segmentation with transformers. In: NeurIPS (2021)
2021
-
[66]
In: CVPR (2025)
Xing, Z., Zhang, X., Hu, Y., Jiang, B., He, T., Zhang, Q., Long, X., Yin, W.: Goalflow: Goal-driven flow matching for multimodal trajectories generation in end- to-end autonomous driving. In: CVPR (2025)
2025
-
[67]
In: CVPR (2023)
Xiong, X., Liu, Y., Yuan, T., Wang, Y., Wang, Y., Zhao, H.: Neural map prior for autonomous driving. In: CVPR (2023)
2023
-
[68]
IEEE Robotics Autom
Xu, Z., Zhang, Y., Xie, E., Zhao, Z., Guo, Y., Wong, K.K., Li, Z., Zhao, H.: Drivegpt4: Interpretable end-to-end autonomous driving via large language model. IEEE Robotics Autom. Lett.9(10) (2024)
2024
-
[69]
arXiv preprint arXiv:2506.06659 (2025)
Yao, W., Li, Z., Lan, S., Wang, Z., Sun, X., Álvarez, J.M., Wu, Z.: Drivesuprim: Towards precise trajectory selection for end-to-end planning. arXiv preprint arXiv:2506.06659 (2025) 20 K. Yeon et al
-
[70]
In: RSS (2024)
Yuan, J., Sun, S., Omeiza, D., Zhao, B., Newman, P., Kunze, L., Gadd, M.: Rag-driver:Generalisabledrivingexplanationswithretrieval-augmentedin-context multi-modal large language model learning. In: RSS (2024)
2024
-
[71]
In: NeurIPS (2025)
Zeng, S., Chang, X., Xie, M., Liu, X., Bai, Y., Pan, Z., Xu, M., Wei, X.: Future- SightDrive: Thinking visually with spatio-temporal CoT for autonomous driving. In: NeurIPS (2025)
2025
-
[72]
In: ECCV (2024)
Zheng, W., Song, R., Guo, X., Zhang, C., Chen, L.: Genad: Generative end-to-end autonomous driving. In: ECCV (2024)
2024
-
[73]
Zhong, S., Xu, M., Ao, T., Shi, G.: Understanding transformer from the perspective of associative memory. arXiv preprint arXiv:2505.19488 (2025) PriorEye 21 PriorEye: Geospatial Visual Priors for End-to-End Autonomous Driving Supplementary Material A EPDMS Metric Details We adopt the Extended Predictive Driver Model Score (EPDMS) from NAVSIM v2 [4,11]. EP...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.