FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views

Haonan An; Yihang Tao; Yuguang Fang; Yu Guo; Zhengru Fang

arxiv: 2605.29997 · v1 · pith:AEP723VOnew · submitted 2026-05-28 · 💻 cs.CV

FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views

Yihang Tao , Yu Guo , Zhengru Fang , Haonan An , Yuguang Fang This is my paper

Pith reviewed 2026-06-29 08:16 UTC · model grok-4.3

classification 💻 cs.CV

keywords 3D Gaussian splattingdynamic scene reconstructioncollaborative drivinguncalibrated multi-viewocclusion fieldresidual injectionfeedforward frameworkmulti-agent fusion

0 comments

The pith

FRUC performs one-shot dynamic scene reconstruction from uncalibrated multi-vehicle views by deriving ego-centric occlusion priors for residual fusion.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FRUC as a feed-forward 3D Gaussian splatting method for reconstructing dynamic scenes in collaborative driving from uncalibrated views of multiple vehicles. It reframes the problem as enhancing an ego vehicle's view with collaborative data without calibration or per-scene optimization. The approach models the system as an ego-centric multi-camera setup and uses spatio-temporal correlations to build an occlusion field as priors. These priors guide a residual denoising process with zero initialization to complete hidden areas safely. Evaluations on real datasets show it surpasses previous methods in quality and speed.

Core claim

FRUC is a feed-forward framework that builds an ego-centric causal occlusion field from uncalibrated cross-agent spatio-temporal correlations to obtain latent priors for occlusion evolution, then uses these to guide cross-agent integration as a deterministic residual denoising process through zero-initialized injection, enabling robust collaborative blind-spot completion while preserving the ego vehicle's geometry.

What carries the argument

Ego-centric causal occlusion field derived from agent-wise spatio-temporal correlations that provides latent priors for modeling occlusion evolution, which guides the zero-initialized residual injection for cross-agent fusion.

If this is right

Supports one-shot, calibration-free inference from a variable number of multi-vehicle views using a visual grounded geometric Transformer backbone.
Achieves non-destructive geometric supplementation for occluded regions in dynamic scenes.
Converts challenging cross-agent fusion into bounded residual learning for reliable blind-spot completion.
Delivers state-of-the-art rendering quality and efficiency on the V2XReal and UrbanIng-V2X datasets for dynamic collaborative driving environments.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

May allow autonomous vehicle fleets to share views for better perception without requiring synchronized calibration procedures.
Could apply the occlusion prior idea to other distributed camera networks facing misalignment issues.
Raises the prospect of testing the residual injection approach on synthetic data with controlled misalignment levels to isolate its contribution.

Load-bearing premise

The ego-centric causal occlusion field from uncalibrated cross-agent correlations supplies reliable latent priors that permit non-destructive blind-spot completion without harming the ego vehicle's accurately observed geometry.

What would settle it

Measuring if novel view synthesis quality on a test set of collaborative driving data drops when the occlusion field is removed or when cross-agent views have large uncalibrated errors compared to using only ego views.

read the original abstract

We present FRUC, a feed-forward 3D Gaussian splatting framework for dynamic scene reconstruction from uncalibrated collaborative driving views. Existing multi-agent reconstruction frameworks are often hindered by rigid prerequisites, demanding precise spatial calibration and slow per-scene optimization. In this paper, we rethink this task by conceptualizing a distributed multi-vehicle network as a spatio-temporally unstructured ego-centric multi-camera system, where the core challenge lies in enhancing ego-centric occluded geometry through collaboration without degrading the ego's accurately observed visible geometry, while preserving reconstruction efficiency. For efficient reconstruction, FRUC is built upon a visual grounded geometric Transformer backbone to enable one-shot, calibration-free inference from a flexible number of multi-vehicle views. To achieve non-destructive geometric supplementation under uncalibrated cross-agent misalignment, FRUC first introduces an ego-centric causal occlusion field that explicitly derives occlusion evolution as latent priors by modeling agent-wise spatio-temporal correlations. Guided by these occlusion priors, it further formulates cross-agent integration as a deterministic residual denoising process via zero-initialized injection, turning challenging cross-agent fusion into bounded residual learning for robust collaborative blind-spot completion. Through extensive evaluations on the real-world V2XReal and UrbanIng-V2X datasets, FRUC is shown to be a new state-of-the-art for the scene reconstruction of dynamic collaborative driving environments, significantly outperforming existing methods in both rendering quality and efficiency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

FRUC gives a feedforward 3DGS pipeline for uncalibrated multi-agent driving reconstruction that holds together internally but stays tied to two specific V2X datasets.

read the letter

The paper's core move is to treat a fleet of vehicles as an unstructured ego-centric multi-camera rig and solve dynamic reconstruction in one forward pass instead of per-scene optimization. It does this with a visual-grounded geometric Transformer backbone, an explicit ego-centric causal occlusion field built from cross-agent spatio-temporal correlations, and a zero-initialized residual path that turns fusion into bounded denoising.

The architecture is internally consistent. The stress-test note confirms the ablations directly test whether the occlusion-guided injection leaves ego-visible geometry untouched, and no contradictions appear in the argument. That addresses the main risk the abstract raised. The method also handles a variable number of agents without retraining, which is a practical plus for collaborative driving.

The soft spots are narrower. All quantitative claims rest on V2XReal and UrbanIng-V2X; both are real-world but narrow in scene type and calibration error distribution. We do not see how the occlusion field behaves under larger viewpoint gaps or non-driving multi-view data. Baseline descriptions and error bars are present in the full text, yet the gains still need checking against other recent uncalibrated or feedforward 3DGS variants that may not have been included.

This is for researchers working on multi-agent perception or feedforward 3D reconstruction in autonomous driving. A reader who needs a concrete starting point for calibration-free collaborative splatting will find usable components and ablation evidence here.

The work is grounded enough and the experiments address the central assumption, so it deserves a serious referee rather than a desk reject.

Referee Report

0 major / 3 minor

Summary. The paper presents FRUC, a feed-forward 3D Gaussian splatting framework for dynamic scene reconstruction from uncalibrated collaborative driving views. It models multi-vehicle setups as an ego-centric multi-camera system and introduces a visual grounded geometric Transformer backbone for one-shot calibration-free inference. The core technical contributions are an ego-centric causal occlusion field that derives latent priors from agent-wise spatio-temporal correlations and a deterministic residual denoising process using zero-initialized injection to complete blind spots without degrading ego-visible geometry. Extensive evaluations on the V2XReal and UrbanIng-V2X datasets are reported to establish state-of-the-art performance in rendering quality and efficiency over existing methods.

Significance. If the results hold, this represents a meaningful step toward practical collaborative 3D reconstruction in autonomous driving by replacing per-scene optimization with feed-forward inference while explicitly addressing cross-agent misalignment. The manuscript strengthens its central claim through ablations that directly test the non-degradation property on ego-visible regions, and the architectural description (Transformer backbone, causal occlusion modeling, zero-init residual path) is internally consistent.

minor comments (3)

Abstract: the claim of 'extensive evaluations' and 'significantly outperforming' would be more informative if a brief summary of key quantitative metrics (e.g., PSNR, SSIM, runtime) and the number of baselines were included, even if full tables appear later.
Method section (around the residual injection formulation): the description of how the zero-initialized injection interacts with the Transformer features could be expanded with a short pseudocode or explicit equation showing the bounded residual update to improve reproducibility.
Experiments: ensure all reported improvements include standard deviations across scenes or multiple runs, and clarify whether the same set of dynamic objects is used for both qualitative and quantitative comparisons.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and accurate summary of our work, the recognition of its potential significance for practical collaborative reconstruction in autonomous driving, and the recommendation for minor revision. The report correctly identifies the core technical elements (geometric Transformer, ego-centric causal occlusion field, zero-initialized residual denoising) and notes the strength of our ablations on the non-degradation property.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper presents FRUC as a new feed-forward framework built on a visual grounded geometric Transformer backbone, introducing an ego-centric causal occlusion field and zero-initialized residual injection for collaborative reconstruction. No equations, fitted parameters, or self-citations are shown that reduce the claimed outputs (rendering quality, efficiency, non-destructive supplementation) to the inputs by construction. The derivation chain consists of novel architectural choices tested via ablations on external datasets (V2XReal, UrbanIng-V2X), remaining self-contained without self-definitional loops, fitted-input predictions, or load-bearing self-citation chains.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The abstract relies on the unproven effectiveness of the newly introduced occlusion field and residual injection mechanism; no free parameters are explicitly named, but the Transformer backbone and occlusion modeling are treated as working without further justification.

axioms (2)

domain assumption A visual grounded geometric Transformer backbone enables one-shot, calibration-free inference from a flexible number of multi-vehicle views.
Invoked as the core efficiency mechanism without supporting derivation.
domain assumption Modeling agent-wise spatio-temporal correlations produces reliable latent priors for occlusion evolution.
Central to the non-destructive supplementation claim.

invented entities (2)

ego-centric causal occlusion field no independent evidence
purpose: Explicitly derives occlusion evolution as latent priors from uncalibrated views.
New modeling construct introduced to guide fusion.
zero-initialized injection no independent evidence
purpose: Turns cross-agent fusion into bounded residual learning for blind-spot completion.
New formulation for integration without harming visible geometry.

pith-pipeline@v0.9.1-grok · 5788 in / 1420 out tokens · 33791 ms · 2026-06-29T08:16:52.000901+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

46 extracted references · 17 canonical work pages · 4 internal anchors

[1]

pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction

David Charatan, Sizhe Li, Andrea Tagliasacchi, and Vincent Sitzmann. pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction. InCVPR, 2024

2024
[3]

URLhttps://arxiv.org/abs/2512.03004

work page arXiv
[4]

Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images

Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, and Jianfei Cai. Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images. In European Conference on Computer Vision, pages 370–386, Cham, 2025

2025
[5]

Freesim: Towardfree-viewpoint camera simulation in driving scenes

LueFan, HaoZhang, QitaiWang, HongshengLi, andZhaoxiangZhang. Freesim: Towardfree-viewpoint camera simulation in driving scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12004–12014, June 2025

2025
[6]

PACP: Priority-Aware Collaborative Perception for Connected and Autonomous Vehicles.IEEE Transactions on Mobile Computing, 23(12):15003–15018, 2024

Zhengru Fang, Senkang Hu, Haonan An, Yuang Zhang, Jingjing Wang, Hangcheng Cao, Xianhao Chen, and Yuguang Fang. PACP: Priority-Aware Collaborative Perception for Connected and Autonomous Vehicles.IEEE Transactions on Mobile Computing, 23(12):15003–15018, 2024

2024
[7]

R-acp: Real-time adaptive collaborative perception leveraging robust task-oriented communications

Zhengru Fang, Jingjing Wang, Yanan Ma, Yihang Tao, Yiqin Deng, Xianhao Chen, and Yuguang Fang. R-acp: Real-time adaptive collaborative perception leveraging robust task-oriented communications. IEEE Journal on Selected Areas in Communications, 43(12):4215–4230, 2025. doi: 10.1109/JSAC.2025. 3623179

work page doi:10.1109/jsac.2025 2025
[8]

Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations

Zhengru Fang, Yu Guo, Fei Liu, Yuang Zhang, Yihang Tao, Senkang Hu, Wenbo Ding, and Yuguang Fang. Agent-centric visual reinforcement learning under dynamic perturbations.arXiv preprint arXiv:2604.24661, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[9]

Onerestore: A universal restoration framework for composite degradation

Yu Guo, Yuan Gao, Yuxu Lu, Huilin Zhu, Ryan Wen Liu, and Shengfeng He. Onerestore: A universal restoration framework for composite degradation. InEuropean conference on computer vision, pages 255–272. Springer, 2024

2024
[10]

Neptune-x: Active x-to-maritime generation for universal maritime object detection.Advances in Neural Information Processing Systems, 38:146587–146614, 2026

Yu Guo, Shengfeng He, Yuxu Lu, Haonan An, Yihang Tao, Huilin Zhu, Jingxian Liu, and Yuguang Fang. Neptune-x: Active x-to-maritime generation for universal maritime object detection.Advances in Neural Information Processing Systems, 38:146587–146614, 2026

2026
[11]

Denoising Diffusion Probabilistic Models

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models.arXiv preprint arXiv:2006.11239, 2020. URLhttps://arxiv.org/abs/2006.11239

work page internal anchor Pith review Pith/arXiv arXiv 2006
[12]

Drivingscene: A multi-task online feed-forward 3d gaussian splatting method for dynamic driving scenes.arXiv preprint arXiv:2510.24734, 2025

Qirui Hou, Wenzhang Sun, Chang Zeng, Chunfeng Wang, Hao Li, and Jianxun Cui. Drivingscene: A multi-task online feed-forward 3d gaussian splatting method for dynamic driving scenes.arXiv preprint arXiv:2510.24734, 2025. URLhttps://arxiv.org/abs/2510.24734

work page arXiv 2025
[13]

Where2comm: communication- efficient collaborative perception via spatial confidence maps

Yue Hu, Shaoheng Fang, Zixing Lei, Yiqi Zhong, and Siheng Chen. Where2comm: communication- efficient collaborative perception via spatial confidence maps. InProceedings of the 36th International Conference on Neural Information Processing Systems (NeurIPS), pages 4874–4886, Red Hook, NY, USA, April 2024. 13/24 FRUC: Feedforward Dynamic Scene Reconstructio...

2024
[14]

V2x-gaussians: Gaussian splatting for multi-agent cooperative dynamic scene reconstruction

Abhishek Dinkar Jagtap, Rui Song, Sanath Tiptur Sadashivaiah, and Andreas Festag. V2x-gaussians: Gaussian splatting for multi-agent cooperative dynamic scene reconstruction. In2025 IEEE Intelligent Vehicles Symposium (IV), pages 1033–1039, 2025. doi: 10.1109/IV64158.2025.11097436

work page doi:10.1109/iv64158.2025.11097436 2025
[15]

Anysplat: Feed-forward 3d gaussian splatting from unconstrained views.ACM Transactions on Graphics (TOG), 44(6):1–16, 2025

Lihan Jiang, Yucheng Mao, Linning Xu, Tao Lu, Kerui Ren, Yichen Jin, Xudong Xu, Mulin Yu, Jiangmiao Pang, Feng Zhao, et al. Anysplat: Feed-forward 3d gaussian splatting from unconstrained views.ACM Transactions on Graphics (TOG), 44(6):1–16, 2025

2025
[16]

3d gaussian splatting for real-time radiance field rendering.ACM Trans

Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis, et al. 3d gaussian splatting for real-time radiance field rendering.ACM Trans. Graph., 42(4):139–1, 2023

2023
[17]

Learning Distilled Collaboration Graph for Multi-Agent Perception

Yiming Li, Shunli Ren, Pengxiang Wu, Siheng Chen, Chen Feng, and Wenjun Zhang. Learning Distilled Collaboration Graph for Multi-Agent Perception. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan, editors,Advances in Neural Information Processing Systems (NeurIPS), volume 34, pages 29541–29552, 2021

2021
[18]

Drivingrecon: Large 4d gaussian reconstruction model for au- tonomous driving

Hao LU, Tianshuo Xu, Wenzhao Zheng, Yunpeng Zhang, Wei Zhan, Dalong Du, Masayoshi TOMIZUKA, Kurt Keutzer, and Yingcong Chen. Drivingrecon: Large 4d gaussian reconstruction model for au- tonomous driving. In D. Belgrave, C. Zhang, H. Lin, R. Pascanu, P. Koniusz, M. Ghassemi, and N. Chen, editors,Advances in Neural Information Processing Systems, volume 38,...

2025
[19]

Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis

Jonathon Luiten, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis. In3DV, 2024

2024
[20]

Maxime Oquab, Timothée Darcet, Theo Moutakanni, Huy V. Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Russell Howes, Po-Yao Huang, Hu Xu, Vasu Sharma, Shang-Wen Li, Wojciech Galuba, Mike Rabbat, Mido Assran, Nicolas Ballas, Gabriel Syn- naeve, Ishan Misra, Herve Jegou, Julien Mairal, Patrick Laba...

work page internal anchor Pith review Pith/arXiv arXiv 2023
[21]

Semantic image synthesis with spatially-adaptive normalization

Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. Semantic image synthesis with spatially-adaptive normalization. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2019
[22]

Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer.IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020

René Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, and Vladlen Koltun. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer.IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020

2020
[23]

Urbaning-v2x: A large-scale multi-vehicle, multi-infrastructure dataset across multiple intersections for cooperative perception

Karthikeyan Chandra Sekaran, Markus Geisler, Dominik Rößle, Adithya Mohan, Daniel Cremers, Wolfgang Utschick, Michael Botsch, Werner Huber, and Torsten Schön. Urbaning-v2x: A large-scale multi-vehicle, multi-infrastructure dataset across multiple intersections for cooperative perception. In The Thirty-ninth Annual Conference on Neural Information Processi...

2025
[24]

Tensor4d: Efficientneural4ddecompositionforhigh-fidelitydynamicreconstructionandrendering

Ruizhi Shao, Zerong Zheng, Hanzhang Tu, Boning Liu, Hongwen Zhang, and Yebin Liu. Tensor4d: Efficientneural4ddecompositionforhigh-fidelitydynamicreconstructionandrendering. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2023. 14/24 FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views

2023
[25]

Splatter image: Ultra-fast single- view 3d reconstruction

Stanislaw Szymanowicz, Christian Rupprecht, and Andrea Vedaldi. Splatter image: Ultra-fast single- view 3d reconstruction. InThe IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

2024
[26]

Digital twin and drl-driven semantic dissemination for 6g autonomous driving service

Yihang Tao, Jun Wu, Xi Lin, Shahid Mumtaz, and Soumaya Cherkaoui. Digital twin and drl-driven semantic dissemination for 6g autonomous driving service. InGLOBECOM 2023 - 2023 IEEE Global Com- munications Conference, pages 2075–2080, 2023. doi: 10.1109/GLOBECOM54140.2023.10437455

work page doi:10.1109/globecom54140.2023.10437455 2023
[27]

Drl-driven digital twin function virtualization for adaptive service response in 6g networks.IEEE Networking Letters, 5(2):125–129, 2023

Yihang Tao, Jun Wu, Xi Lin, and Wu Yang. Drl-driven digital twin function virtualization for adaptive service response in 6g networks.IEEE Networking Letters, 5(2):125–129, 2023. doi: 10.1109/LNET. 2023.3269766

work page doi:10.1109/lnet 2023
[28]

Yihang Tao, Jun Wu, Qianqian Pan, Ali Kashif Bashir, and Marwan Omar. O-ran-based digital twin function virtualization for sustainable iov service response: An asynchronous hierarchical reinforcement learning approach.IEEE Transactions on Green Communications and Networking, 8(3):1049–1060,
[29]

doi: 10.1109/TGCN.2024.3435796

work page doi:10.1109/tgcn.2024.3435796 2024
[30]

Directed-cp: Directed collaborative perception for connected and autonomous vehicles via proactive attention

Yihang Tao, Senkang Hu, Zhengru Fang, and Yuguang Fang. Directed-cp: Directed collaborative perception for connected and autonomous vehicles via proactive attention. In2025 IEEE International Conference on Robotics and Automation (ICRA), pages 7004–7010, 2025. doi: 10.1109/ICRA55743. 2025.11127818

work page doi:10.1109/icra55743 2025
[31]

Learning mutual view information graph for adaptive adversarial collaborative perception.arXiv preprint arXiv:2602.19596, 2026

Yihang Tao, Senkang Hu, Haonan An, Zhengru Fang, Hangcheng Cao, and Yuguang Fang. Learning mutual view information graph for adaptive adversarial collaborative perception.arXiv preprint arXiv:2602.19596, 2026. URLhttps://arxiv.org/abs/2602.19596

work page arXiv 2026
[32]

Gcp: Guarded collaborative perception with spatial-temporal aware malicious agent detection.IEEE Transactions on Dependable and Secure Computing, pages 1–14, 2026

Yihang Tao, Senkang Hu, Yue Hu, Haonan An, Hangcheng Cao, and Yuguang Fang. Gcp: Guarded collaborative perception with spatial-temporal aware malicious agent detection.IEEE Transactions on Dependable and Secure Computing, pages 1–14, 2026. doi: 10.1109/TDSC.2026.3693684

work page doi:10.1109/tdsc.2026.3693684 2026
[33]

Drivingforward: Feed-forward 3d gaussian splatting for driving scene reconstruction from flexible surround-view input

Qijian Tian, Xin Tan, Yuan Xie, and Lizhuang Ma. Drivingforward: Feed-forward 3d gaussian splatting for driving scene reconstruction from flexible surround-view input. InProceedings of the AAAI Conference on Artificial Intelligence, 2025

2025
[34]

Vggt: Visual geometry grounded transformer

JianyuanWang, MinghaoChen, NikitaKaraev, AndreaVedaldi, ChristianRupprecht, andDavidNovotny. Vggt: Visual geometry grounded transformer. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2025
[35]

4d gaussian splatting for real-time dynamic scene rendering

Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, and Xinggang Wang. 4d gaussian splatting for real-time dynamic scene rendering. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 20310–20320, June 2024

2024
[36]

V2x-real: A largs-scale dataset for vehicle-to-everything cooperative perception

Hao Xiang, Zhaoliang Zheng, Xin Xia, Runsheng Xu, Letian Gao, Zewei Zhou, Xu Han, Xinkai Ji, Mingxi Li, Zonglin Meng, Li Jin, Mingyue Lei, Zhaoyang Ma, Zihang He, Haoxuan Ma, Yunshuang Yuan, Yingqian Zhao, and Jiaqi Ma. V2x-real: A largs-scale dataset for vehicle-to-everything cooperative perception. InEuropeanConferenceonComputerVision(ECCV)2024,pages455...

2024
[37]

Segformer: Simple and efficient design for semantic segmentation with transformers

Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, and Ping Luo. Segformer: Simple and efficient design for semantic segmentation with transformers. InNeural Information Processing Systems (NeurIPS), 2021. 15/24 FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views

2021
[38]

Sparsegs: Sparse view synthesis using 3d gaussian splatting

Haolin Xiong, Sairisheek Muttukuru, Hanyuan Xiao, Rishi Upadhyay, Pradyumna Chari, Yajie Zhao, and Achuta Kadambi. Sparsegs: Sparse view synthesis using 3d gaussian splatting. In2025 International Conference on 3D Vision (3DV), pages 1032–1041, 2025. doi: 10.1109/3DV66043.2025.00100

work page doi:10.1109/3dv66043.2025.00100 2025
[39]

Cruise: Cooperative reconstruction and editing in v2x scenarios using gaussian splatting

Haoran Xu, Saining Zhang, Peishuo Li, Baijun Ye, Xiaoxue Chen, Huan-Ang Gao, Jv Zheng, Xiaowei Song, Ziqiao Peng, Run Miao, Jinrang Jia, Yifeng Shi, Guangqi Yi, Hang Zhao, Hao Tang, Hongyang Li, Kaicheng Yu, and Hao Zhao. Cruise: Cooperative reconstruction and editing in v2x scenarios using gaussian splatting. In2025 IEEE/RSJ International Conference on I...

work page doi:10.1109/iros60139.2025.11246201 2025
[40]

Opv2v: Anopenbenchmarkdataset and fusion pipeline for perception with vehicle-to-vehicle communication

RunshengXu, HaoXiang, XinXia, XuHan, JinlongLi, andJiaqiMa. Opv2v: Anopenbenchmarkdataset and fusion pipeline for perception with vehicle-to-vehicle communication. In2022 IEEE International Conference on Robotics and Automation (ICRA), 2022

2022
[41]

EmerneRF: Emergent spatial-temporal scene decomposition via self-supervision

Jiawei Yang, Boris Ivanovic, Or Litany, Xinshuo Weng, Seung Wook Kim, Boyi Li, Tong Che, Danfei Xu, Sanja Fidler, Marco Pavone, and Yue Wang. EmerneRF: Emergent spatial-temporal scene decomposition via self-supervision. InThe Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=ycv2z8TYur

2024
[42]

STORM: Spatio- temporalreconstructionmodelforlarge-scaleoutdoorscenes

Jiawei Yang, Jiahui Huang, Boris Ivanovic, Yuxiao Chen, Yan Wang, Boyi Li, Yurong You, Apoorva Sharma, Maximilian Igl, Peter Karkus, Danfei Xu, Yue Wang, and Marco Pavone. STORM: Spatio- temporalreconstructionmodelforlarge-scaleoutdoorscenes. InTheThirteenthInternationalConference on Learning Representations, 2025. URLhttps://openreview.net/forum?id=M2NFWRPMUd

2025
[44]

URLhttps://arxiv.org/abs/2603.19552

work page arXiv
[45]

Gs-lrm: Large reconstruction model for 3d gaussian splatting.European Conference on Computer Vision, 2024

Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, and Zexiang Xu. Gs-lrm: Large reconstruction model for 3d gaussian splatting.European Conference on Computer Vision, 2024

2024
[46]

The unreasonable effectiveness of deep features as a perceptual metric

Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. InCVPR, 2018

2018
[47]

Driv- inggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes

Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan, Yongtao Wang, Deqing Sun, and Ming-Hsuan Yang. Driv- inggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21634–21643, 2024

2024
[49]

DVGT: Driving Visual Geometry Transformer

URLhttps://arxiv.org/abs/2512.16919. 16/24 FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views A. More Implementation Details A.1. Dataset Preparation and Utilization Data Preparation.We unify V2X-Real [34] and UrbanIng-V2X [22] into a common OPV2V [38] format benchmark. The processed benchmark preserves the full o...

work page internal anchor Pith review Pith/arXiv arXiv

[1] [1]

pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction

David Charatan, Sizhe Li, Andrea Tagliasacchi, and Vincent Sitzmann. pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction. InCVPR, 2024

2024

[2] [3]

URLhttps://arxiv.org/abs/2512.03004

work page arXiv

[3] [4]

Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images

Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, and Jianfei Cai. Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images. In European Conference on Computer Vision, pages 370–386, Cham, 2025

2025

[4] [5]

Freesim: Towardfree-viewpoint camera simulation in driving scenes

LueFan, HaoZhang, QitaiWang, HongshengLi, andZhaoxiangZhang. Freesim: Towardfree-viewpoint camera simulation in driving scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12004–12014, June 2025

2025

[5] [6]

PACP: Priority-Aware Collaborative Perception for Connected and Autonomous Vehicles.IEEE Transactions on Mobile Computing, 23(12):15003–15018, 2024

Zhengru Fang, Senkang Hu, Haonan An, Yuang Zhang, Jingjing Wang, Hangcheng Cao, Xianhao Chen, and Yuguang Fang. PACP: Priority-Aware Collaborative Perception for Connected and Autonomous Vehicles.IEEE Transactions on Mobile Computing, 23(12):15003–15018, 2024

2024

[6] [7]

R-acp: Real-time adaptive collaborative perception leveraging robust task-oriented communications

Zhengru Fang, Jingjing Wang, Yanan Ma, Yihang Tao, Yiqin Deng, Xianhao Chen, and Yuguang Fang. R-acp: Real-time adaptive collaborative perception leveraging robust task-oriented communications. IEEE Journal on Selected Areas in Communications, 43(12):4215–4230, 2025. doi: 10.1109/JSAC.2025. 3623179

work page doi:10.1109/jsac.2025 2025

[7] [8]

Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations

Zhengru Fang, Yu Guo, Fei Liu, Yuang Zhang, Yihang Tao, Senkang Hu, Wenbo Ding, and Yuguang Fang. Agent-centric visual reinforcement learning under dynamic perturbations.arXiv preprint arXiv:2604.24661, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[8] [9]

Onerestore: A universal restoration framework for composite degradation

Yu Guo, Yuan Gao, Yuxu Lu, Huilin Zhu, Ryan Wen Liu, and Shengfeng He. Onerestore: A universal restoration framework for composite degradation. InEuropean conference on computer vision, pages 255–272. Springer, 2024

2024

[9] [10]

Neptune-x: Active x-to-maritime generation for universal maritime object detection.Advances in Neural Information Processing Systems, 38:146587–146614, 2026

Yu Guo, Shengfeng He, Yuxu Lu, Haonan An, Yihang Tao, Huilin Zhu, Jingxian Liu, and Yuguang Fang. Neptune-x: Active x-to-maritime generation for universal maritime object detection.Advances in Neural Information Processing Systems, 38:146587–146614, 2026

2026

[10] [11]

Denoising Diffusion Probabilistic Models

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models.arXiv preprint arXiv:2006.11239, 2020. URLhttps://arxiv.org/abs/2006.11239

work page internal anchor Pith review Pith/arXiv arXiv 2006

[11] [12]

Drivingscene: A multi-task online feed-forward 3d gaussian splatting method for dynamic driving scenes.arXiv preprint arXiv:2510.24734, 2025

Qirui Hou, Wenzhang Sun, Chang Zeng, Chunfeng Wang, Hao Li, and Jianxun Cui. Drivingscene: A multi-task online feed-forward 3d gaussian splatting method for dynamic driving scenes.arXiv preprint arXiv:2510.24734, 2025. URLhttps://arxiv.org/abs/2510.24734

work page arXiv 2025

[12] [13]

Where2comm: communication- efficient collaborative perception via spatial confidence maps

Yue Hu, Shaoheng Fang, Zixing Lei, Yiqi Zhong, and Siheng Chen. Where2comm: communication- efficient collaborative perception via spatial confidence maps. InProceedings of the 36th International Conference on Neural Information Processing Systems (NeurIPS), pages 4874–4886, Red Hook, NY, USA, April 2024. 13/24 FRUC: Feedforward Dynamic Scene Reconstructio...

2024

[13] [14]

V2x-gaussians: Gaussian splatting for multi-agent cooperative dynamic scene reconstruction

Abhishek Dinkar Jagtap, Rui Song, Sanath Tiptur Sadashivaiah, and Andreas Festag. V2x-gaussians: Gaussian splatting for multi-agent cooperative dynamic scene reconstruction. In2025 IEEE Intelligent Vehicles Symposium (IV), pages 1033–1039, 2025. doi: 10.1109/IV64158.2025.11097436

work page doi:10.1109/iv64158.2025.11097436 2025

[14] [15]

Anysplat: Feed-forward 3d gaussian splatting from unconstrained views.ACM Transactions on Graphics (TOG), 44(6):1–16, 2025

Lihan Jiang, Yucheng Mao, Linning Xu, Tao Lu, Kerui Ren, Yichen Jin, Xudong Xu, Mulin Yu, Jiangmiao Pang, Feng Zhao, et al. Anysplat: Feed-forward 3d gaussian splatting from unconstrained views.ACM Transactions on Graphics (TOG), 44(6):1–16, 2025

2025

[15] [16]

3d gaussian splatting for real-time radiance field rendering.ACM Trans

Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis, et al. 3d gaussian splatting for real-time radiance field rendering.ACM Trans. Graph., 42(4):139–1, 2023

2023

[16] [17]

Learning Distilled Collaboration Graph for Multi-Agent Perception

Yiming Li, Shunli Ren, Pengxiang Wu, Siheng Chen, Chen Feng, and Wenjun Zhang. Learning Distilled Collaboration Graph for Multi-Agent Perception. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan, editors,Advances in Neural Information Processing Systems (NeurIPS), volume 34, pages 29541–29552, 2021

2021

[17] [18]

Drivingrecon: Large 4d gaussian reconstruction model for au- tonomous driving

Hao LU, Tianshuo Xu, Wenzhao Zheng, Yunpeng Zhang, Wei Zhan, Dalong Du, Masayoshi TOMIZUKA, Kurt Keutzer, and Yingcong Chen. Drivingrecon: Large 4d gaussian reconstruction model for au- tonomous driving. In D. Belgrave, C. Zhang, H. Lin, R. Pascanu, P. Koniusz, M. Ghassemi, and N. Chen, editors,Advances in Neural Information Processing Systems, volume 38,...

2025

[18] [19]

Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis

Jonathon Luiten, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis. In3DV, 2024

2024

[19] [20]

Maxime Oquab, Timothée Darcet, Theo Moutakanni, Huy V. Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Russell Howes, Po-Yao Huang, Hu Xu, Vasu Sharma, Shang-Wen Li, Wojciech Galuba, Mike Rabbat, Mido Assran, Nicolas Ballas, Gabriel Syn- naeve, Ishan Misra, Herve Jegou, Julien Mairal, Patrick Laba...

work page internal anchor Pith review Pith/arXiv arXiv 2023

[20] [21]

Semantic image synthesis with spatially-adaptive normalization

Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. Semantic image synthesis with spatially-adaptive normalization. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2019

[21] [22]

Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer.IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020

René Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, and Vladlen Koltun. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer.IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020

2020

[22] [23]

Urbaning-v2x: A large-scale multi-vehicle, multi-infrastructure dataset across multiple intersections for cooperative perception

Karthikeyan Chandra Sekaran, Markus Geisler, Dominik Rößle, Adithya Mohan, Daniel Cremers, Wolfgang Utschick, Michael Botsch, Werner Huber, and Torsten Schön. Urbaning-v2x: A large-scale multi-vehicle, multi-infrastructure dataset across multiple intersections for cooperative perception. In The Thirty-ninth Annual Conference on Neural Information Processi...

2025

[23] [24]

Tensor4d: Efficientneural4ddecompositionforhigh-fidelitydynamicreconstructionandrendering

Ruizhi Shao, Zerong Zheng, Hanzhang Tu, Boning Liu, Hongwen Zhang, and Yebin Liu. Tensor4d: Efficientneural4ddecompositionforhigh-fidelitydynamicreconstructionandrendering. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2023. 14/24 FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views

2023

[24] [25]

Splatter image: Ultra-fast single- view 3d reconstruction

Stanislaw Szymanowicz, Christian Rupprecht, and Andrea Vedaldi. Splatter image: Ultra-fast single- view 3d reconstruction. InThe IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

2024

[25] [26]

Digital twin and drl-driven semantic dissemination for 6g autonomous driving service

Yihang Tao, Jun Wu, Xi Lin, Shahid Mumtaz, and Soumaya Cherkaoui. Digital twin and drl-driven semantic dissemination for 6g autonomous driving service. InGLOBECOM 2023 - 2023 IEEE Global Com- munications Conference, pages 2075–2080, 2023. doi: 10.1109/GLOBECOM54140.2023.10437455

work page doi:10.1109/globecom54140.2023.10437455 2023

[26] [27]

Drl-driven digital twin function virtualization for adaptive service response in 6g networks.IEEE Networking Letters, 5(2):125–129, 2023

Yihang Tao, Jun Wu, Xi Lin, and Wu Yang. Drl-driven digital twin function virtualization for adaptive service response in 6g networks.IEEE Networking Letters, 5(2):125–129, 2023. doi: 10.1109/LNET. 2023.3269766

work page doi:10.1109/lnet 2023

[27] [28]

Yihang Tao, Jun Wu, Qianqian Pan, Ali Kashif Bashir, and Marwan Omar. O-ran-based digital twin function virtualization for sustainable iov service response: An asynchronous hierarchical reinforcement learning approach.IEEE Transactions on Green Communications and Networking, 8(3):1049–1060,

[28] [29]

doi: 10.1109/TGCN.2024.3435796

work page doi:10.1109/tgcn.2024.3435796 2024

[29] [30]

Directed-cp: Directed collaborative perception for connected and autonomous vehicles via proactive attention

Yihang Tao, Senkang Hu, Zhengru Fang, and Yuguang Fang. Directed-cp: Directed collaborative perception for connected and autonomous vehicles via proactive attention. In2025 IEEE International Conference on Robotics and Automation (ICRA), pages 7004–7010, 2025. doi: 10.1109/ICRA55743. 2025.11127818

work page doi:10.1109/icra55743 2025

[30] [31]

Learning mutual view information graph for adaptive adversarial collaborative perception.arXiv preprint arXiv:2602.19596, 2026

Yihang Tao, Senkang Hu, Haonan An, Zhengru Fang, Hangcheng Cao, and Yuguang Fang. Learning mutual view information graph for adaptive adversarial collaborative perception.arXiv preprint arXiv:2602.19596, 2026. URLhttps://arxiv.org/abs/2602.19596

work page arXiv 2026

[31] [32]

Gcp: Guarded collaborative perception with spatial-temporal aware malicious agent detection.IEEE Transactions on Dependable and Secure Computing, pages 1–14, 2026

Yihang Tao, Senkang Hu, Yue Hu, Haonan An, Hangcheng Cao, and Yuguang Fang. Gcp: Guarded collaborative perception with spatial-temporal aware malicious agent detection.IEEE Transactions on Dependable and Secure Computing, pages 1–14, 2026. doi: 10.1109/TDSC.2026.3693684

work page doi:10.1109/tdsc.2026.3693684 2026

[32] [33]

Drivingforward: Feed-forward 3d gaussian splatting for driving scene reconstruction from flexible surround-view input

Qijian Tian, Xin Tan, Yuan Xie, and Lizhuang Ma. Drivingforward: Feed-forward 3d gaussian splatting for driving scene reconstruction from flexible surround-view input. InProceedings of the AAAI Conference on Artificial Intelligence, 2025

2025

[33] [34]

Vggt: Visual geometry grounded transformer

JianyuanWang, MinghaoChen, NikitaKaraev, AndreaVedaldi, ChristianRupprecht, andDavidNovotny. Vggt: Visual geometry grounded transformer. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2025

[34] [35]

4d gaussian splatting for real-time dynamic scene rendering

Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, and Xinggang Wang. 4d gaussian splatting for real-time dynamic scene rendering. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 20310–20320, June 2024

2024

[35] [36]

V2x-real: A largs-scale dataset for vehicle-to-everything cooperative perception

Hao Xiang, Zhaoliang Zheng, Xin Xia, Runsheng Xu, Letian Gao, Zewei Zhou, Xu Han, Xinkai Ji, Mingxi Li, Zonglin Meng, Li Jin, Mingyue Lei, Zhaoyang Ma, Zihang He, Haoxuan Ma, Yunshuang Yuan, Yingqian Zhao, and Jiaqi Ma. V2x-real: A largs-scale dataset for vehicle-to-everything cooperative perception. InEuropeanConferenceonComputerVision(ECCV)2024,pages455...

2024

[36] [37]

Segformer: Simple and efficient design for semantic segmentation with transformers

Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, and Ping Luo. Segformer: Simple and efficient design for semantic segmentation with transformers. InNeural Information Processing Systems (NeurIPS), 2021. 15/24 FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views

2021

[37] [38]

Sparsegs: Sparse view synthesis using 3d gaussian splatting

Haolin Xiong, Sairisheek Muttukuru, Hanyuan Xiao, Rishi Upadhyay, Pradyumna Chari, Yajie Zhao, and Achuta Kadambi. Sparsegs: Sparse view synthesis using 3d gaussian splatting. In2025 International Conference on 3D Vision (3DV), pages 1032–1041, 2025. doi: 10.1109/3DV66043.2025.00100

work page doi:10.1109/3dv66043.2025.00100 2025

[38] [39]

Cruise: Cooperative reconstruction and editing in v2x scenarios using gaussian splatting

Haoran Xu, Saining Zhang, Peishuo Li, Baijun Ye, Xiaoxue Chen, Huan-Ang Gao, Jv Zheng, Xiaowei Song, Ziqiao Peng, Run Miao, Jinrang Jia, Yifeng Shi, Guangqi Yi, Hang Zhao, Hao Tang, Hongyang Li, Kaicheng Yu, and Hao Zhao. Cruise: Cooperative reconstruction and editing in v2x scenarios using gaussian splatting. In2025 IEEE/RSJ International Conference on I...

work page doi:10.1109/iros60139.2025.11246201 2025

[39] [40]

Opv2v: Anopenbenchmarkdataset and fusion pipeline for perception with vehicle-to-vehicle communication

RunshengXu, HaoXiang, XinXia, XuHan, JinlongLi, andJiaqiMa. Opv2v: Anopenbenchmarkdataset and fusion pipeline for perception with vehicle-to-vehicle communication. In2022 IEEE International Conference on Robotics and Automation (ICRA), 2022

2022

[40] [41]

EmerneRF: Emergent spatial-temporal scene decomposition via self-supervision

Jiawei Yang, Boris Ivanovic, Or Litany, Xinshuo Weng, Seung Wook Kim, Boyi Li, Tong Che, Danfei Xu, Sanja Fidler, Marco Pavone, and Yue Wang. EmerneRF: Emergent spatial-temporal scene decomposition via self-supervision. InThe Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=ycv2z8TYur

2024

[41] [42]

STORM: Spatio- temporalreconstructionmodelforlarge-scaleoutdoorscenes

Jiawei Yang, Jiahui Huang, Boris Ivanovic, Yuxiao Chen, Yan Wang, Boyi Li, Yurong You, Apoorva Sharma, Maximilian Igl, Peter Karkus, Danfei Xu, Yue Wang, and Marco Pavone. STORM: Spatio- temporalreconstructionmodelforlarge-scaleoutdoorscenes. InTheThirteenthInternationalConference on Learning Representations, 2025. URLhttps://openreview.net/forum?id=M2NFWRPMUd

2025

[42] [44]

URLhttps://arxiv.org/abs/2603.19552

work page arXiv

[43] [45]

Gs-lrm: Large reconstruction model for 3d gaussian splatting.European Conference on Computer Vision, 2024

Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, and Zexiang Xu. Gs-lrm: Large reconstruction model for 3d gaussian splatting.European Conference on Computer Vision, 2024

2024

[44] [46]

The unreasonable effectiveness of deep features as a perceptual metric

Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. InCVPR, 2018

2018

[45] [47]

Driv- inggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes

Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan, Yongtao Wang, Deqing Sun, and Ming-Hsuan Yang. Driv- inggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21634–21643, 2024

2024

[46] [49]

DVGT: Driving Visual Geometry Transformer

URLhttps://arxiv.org/abs/2512.16919. 16/24 FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views A. More Implementation Details A.1. Dataset Preparation and Utilization Data Preparation.We unify V2X-Real [34] and UrbanIng-V2X [22] into a common OPV2V [38] format benchmark. The processed benchmark preserves the full o...

work page internal anchor Pith review Pith/arXiv arXiv