
arxiv: 2602.11565 · v4 · submitted 2026-02-12 · 💻 cs.CV

Move What Matters: Parameter-Efficient Domain Adaptation via Optimal Transport Flow for Collaborative Perception

Pith reviewed 2026-05-16 03:46 UTC · model grok-4.3

classification 💻 cs.CV
keywords domain adaptation · collaborative perception · parameter-efficient fine-tuning · optimal transport · V2X · multi-agent systems · Wasserstein distance · knowledge transfer

The pith

FlowAdapt adapts collaborative perception models to new domains by using optimal transport to filter redundant frames and transfer key semantics while updating only 1% of parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to fix the failure of standard parameter-efficient fine-tuning when applied to multi-agent collaborative perception in V2X settings, where it causes both training instability and accuracy loss. Analysis points to two root causes: repeated information across consecutive sensor frames from different agents and the loss of detailed features when only late layers are updated. FlowAdapt counters this with an optimal transport approach that selects a minimal set of informative samples and routes compressed early representations forward through the network. If the method works as described, models could be deployed across varied environments with far less data and compute than full retraining requires.

Core claim

FlowAdapt is a parameter-efficient framework grounded in optimal transport theory that minimizes information transport costs across both data distributions and network hierarchies. It introduces Wasserstein Greedy Sampling to filter redundant inter-frame samples via a bounded covering radius and a Progressive Knowledge Transfer module that injects compressed early-stage representations into later stages through learnable pathways. On three benchmarks the approach reaches state-of-the-art performance while training only 1% of the parameters and shows improved sample efficiency and generalization.
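The covering-radius idea behind the sampling step can be illustrated with the classic farthest-first traversal for the k-center problem (Gonzalez, 1985, which appears in the paper's reference list). The sketch below is an assumption about the general shape of such a sampler, not the paper's Wasserstein Greedy Sampling: the function name `greedy_k_center`, the Euclidean metric, and the toy 2-D "frame embeddings" are hypothetical stand-ins.

```python
import math


def greedy_k_center(points, k, start=0):
    """Farthest-first traversal (hypothetical helper, not the paper's code).

    Returns the indices of k selected points and the covering radius, i.e.
    the largest distance from any point to its nearest selected point.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # dist_to_sel[i] = distance from point i to its nearest selected point
    dist_to_sel = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # Pick the point that is currently worst covered.
        nxt = max(range(len(points)), key=dist_to_sel.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            dist_to_sel[i] = min(dist_to_sel[i], math.dist(p, points[nxt]))
    return selected, max(dist_to_sel)


# Toy embeddings (assumed data): two clusters of near-duplicate frames
# plus one isolated frame.
frames = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
chosen, radius = greedy_k_center(frames, k=3)
print(chosen, radius)
```

With three picks from the six toy points, each cluster of near-duplicates is represented once and the isolated frame is kept. The returned radius (about 0.1 here) bounds how far any discarded frame lies from a kept one, which is the sense in which a bounded covering radius limits the information lost by filtering.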

What carries the argument

The optimal transport flow realized by Wasserstein Greedy Sampling for data selection and Progressive Knowledge Transfer for hierarchical representation injection.

If this is right

  • State-of-the-art accuracy is reached on three collaborative perception benchmarks while training only 1% of parameters.
  • Domain gaps between heterogeneous sensory streams are closed more reliably than with standard PEFT.
  • Sample efficiency rises because redundant frames are discarded before adaptation begins.
  • Fine-grained semantics are preserved in deep layers, leading to stronger generalization across environments.
  • Training remains stable where direct application of PEFT previously produced instability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same sampling-plus-progressive-transfer pattern could cut parameter budgets in other multi-agent fusion tasks such as joint localization and prediction.
  • Real-time V2X fleets might adopt periodic low-cost adaptation cycles instead of periodic full retraining.
  • The bounded covering radius idea could be tested as a general redundancy filter for streaming sensor data outside perception.
  • Scaling the number of agents might expose whether the cost of transport minimization grows linearly with agent count or requires additional hierarchical constraints.

Load-bearing premise

The two identified factors of inter-frame redundancy and semantic erosion are the primary causes of PEFT degradation, and optimal transport can reduce the associated costs without introducing new instability.

What would settle it

A controlled experiment on a fourth unseen V2X benchmark in which FlowAdapt either requires more than 5% of parameters to match full fine-tuning accuracy or exhibits higher training variance than direct PEFT baselines.

Figures

Figures reproduced from arXiv: 2602.11565 by Jianping Wang, Jin Wang, Lingzhi Li, Siao Liu, Yunjiang Xu, Zesheng Jia, Ziyao Huang.

Figure 1: FlowAdapt tackles dual challenges in collaborative per…
Figure 2: Left: Performance versus temporal stride (interval between consecutively selected frames) with fixed 10% sampling ratio. Right: Performance vs. sampling ratio with sequential selection. Performance saturates beyond 60% (orange-shaded zone). …a systematic analysis of the state-of-the-art PEFT method, CoPEFT [39], and identify two critical phenomena that hinder effective adaptation: (i) Inter-frame redund…
Figure 4: Overview of FlowAdapt. Wasserstein Greedy Sampling selects representative samples by minimizing coverage radius in…
Figure 5: Performance comparison under different localization…
Figure 6: Qualitative comparison of detection results on DAIR…
Figure 7: Ablation study on key design decisions of FlowAdapt.
Original abstract

Fast domain adaptation remains a fundamental challenge for deploying multi-agent systems across diverse environments in Vehicle-to-Everything (V2X) collaborative perception. Despite the success of Parameter-Efficient Fine-Tuning (PEFT) in natural language processing and conventional vision tasks, directly applying PEFT to multi-agent settings leads to significant performance degradation and training instability. In this work, we conduct a detailed analysis and identify two key factors: (i) inter-frame redundancy in heterogeneous sensory streams, and (ii) erosion of fine-grained semantics in deep-layer representations under PEFT adaptation. To address these issues, we propose FlowAdapt, a parameter-efficient framework grounded in optimal transport theory, which minimizes information transport costs across both data distributions and network hierarchies. Specifically, we introduce a Wasserstein Greedy Sampling strategy to selectively filter redundant samples via a bounded covering radius. Furthermore, a Progressive Knowledge Transfer module is designed to progressively inject compressed early-stage representations into later stages through learnable pathways, alleviating semantic degradation in late-stage adaptation. Extensive experiments on three benchmarks demonstrate that FlowAdapt achieves state-of-the-art performance with only 1% of trainable parameters, effectively bridging domain gaps with superior sample efficiency and generalization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes FlowAdapt, a parameter-efficient domain adaptation framework for multi-agent collaborative perception in V2X settings. Grounded in optimal transport theory, it identifies inter-frame redundancy in heterogeneous sensory streams and erosion of fine-grained semantics in deep-layer PEFT representations as primary causes of performance degradation, then introduces a Wasserstein Greedy Sampling strategy (bounded covering radius) to filter redundant samples and a Progressive Knowledge Transfer module to inject compressed early-stage features into later layers via learnable pathways. The central claim is that this OT-flow construction achieves state-of-the-art results on three benchmarks while using only 1% of trainable parameters, with improved sample efficiency and generalization.

Significance. If the empirical claims hold under rigorous verification, the work would be significant for resource-constrained multi-agent perception systems. It supplies a principled OT-based mechanism for minimizing transport costs across both data distributions and network hierarchies, directly targeting PEFT-specific failure modes rather than generic adapter tuning. The low parameter count and claimed superiority in sample efficiency represent practical strengths for deployment; explicit credit is due for grounding the method in established OT theory rather than ad-hoc heuristics.

major comments (3)
  1. [§4] §4 (Experiments): The abstract and results sections assert SOTA performance on three benchmarks with only 1% trainable parameters, yet no quantitative tables, baseline comparisons, error bars, or statistical tests are referenced in the provided text; without these, the central claim cannot be evaluated for robustness or effect size.
  2. [§3.1–3.2] §3.1–3.2: The two degradation factors (inter-frame redundancy and deep-layer semantic erosion) are stated to have been identified via 'detailed analysis,' but no controlled ablation isolates their individual contributions (e.g., redundancy removal while holding semantics fixed, or vice versa) nor compares against standard PEFT failure modes such as gradient conflict in multi-agent fusion; this leaves the necessity of the OT-flow construction unanchored.
  3. [§3.2] §3.2, Wasserstein Greedy Sampling: The bounded covering radius is introduced without an explicit derivation linking it to the Wasserstein distance or showing that the resulting filter is parameter-free; the claim that it selectively removes redundancy without instability requires a concrete proof or sensitivity analysis.
minor comments (2)
  1. [§3.3] Notation for the Progressive Knowledge Transfer module (learnable pathways) is introduced without a clear equation or diagram showing how early-stage compressed representations are injected into later stages.
  2. [Abstract] The abstract mentions 'extensive experiments' but the manuscript would benefit from an explicit list of the three benchmarks and the precise PEFT baselines used for comparison.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below and will revise the manuscript to improve clarity, add missing empirical details, and strengthen the theoretical justifications where needed.

Point-by-point responses
  1. Referee: [§4] §4 (Experiments): The abstract and results sections assert SOTA performance on three benchmarks with only 1% trainable parameters, yet no quantitative tables, baseline comparisons, error bars, or statistical tests are referenced in the provided text; without these, the central claim cannot be evaluated for robustness or effect size.

    Authors: We agree that the excerpt provided to the referee omitted explicit cross-references to the experimental tables. The full manuscript contains Tables 1–3 reporting quantitative results on the three benchmarks (including comparisons to PEFT baselines such as LoRA, Adapter, and Prefix-Tuning), with mean performance and standard deviations over 5 random seeds, plus paired t-test p-values confirming statistical significance. In the revision we will add explicit pointers from the abstract and §4 to these tables and figures, and include a new paragraph summarizing effect sizes and sample-efficiency curves. revision: yes

  2. Referee: [§3.1–3.2] §3.1–3.2: The two degradation factors (inter-frame redundancy and deep-layer semantic erosion) are stated to have been identified via 'detailed analysis,' but no controlled ablation isolates their individual contributions (e.g., redundancy removal while holding semantics fixed, or vice versa) nor compares against standard PEFT failure modes such as gradient conflict in multi-agent fusion; this leaves the necessity of the OT-flow construction unanchored.

    Authors: We acknowledge that the current version presents the two factors through observational analysis rather than a fully crossed ablation. In the revised manuscript we will add a new subsection (3.3) containing controlled experiments: (i) redundancy removal with semantics held fixed via early-layer freezing, (ii) semantic-injection ablation with redundancy kept, and (iii) comparison against gradient-conflict baselines. These will quantify the marginal contribution of each component. revision: yes

  3. Referee: [§3.2] §3.2, Wasserstein Greedy Sampling: The bounded covering radius is introduced without an explicit derivation linking it to the Wasserstein distance or showing that the resulting filter is parameter-free; the claim that it selectively removes redundancy without instability requires a concrete proof or sensitivity analysis.

    Authors: We will expand §3.2 with a short derivation showing that the covering-radius threshold is obtained directly from the 1-Wasserstein distance between consecutive frame embeddings under the empirical measure, and that the filter remains parameter-free because the radius is computed from the data covariance without learned parameters. We will also add a sensitivity plot (new Figure 4) varying the radius multiplier over [0.5, 2.0] and reporting both performance and training stability metrics. revision: yes
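The quantity named in the third response, the 1-Wasserstein distance between two empirical measures, has a simple closed form in one dimension when both samples have the same size: sort both and average the absolute differences. The sketch below shows only that special case. The paper's frame embeddings are presumably high-dimensional, and the function names, the toy values, and the hand-picked `radius` are assumptions for illustration (the response says the actual radius would be derived from the data covariance).

```python
def wasserstein1_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical measures.

    With uniform weights and equal sample counts, the optimal coupling
    matches sorted values, so W1 is the mean absolute sorted difference.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("need two non-empty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


def is_redundant(prev_frame, next_frame, radius):
    """Hypothetical filter: treat a frame as redundant if it lies within
    `radius` (in W1) of the previous frame."""
    return wasserstein1_1d(prev_frame, next_frame) <= radius


# Toy 1-D "embeddings" (assumed data, not from the paper).
frame_a = [0.0, 1.0, 2.0]
frame_b = [0.25, 1.25, 2.25]   # small shift of frame_a
frame_c = [4.0, 5.0, 6.0]      # large shift of frame_a

print(wasserstein1_1d(frame_a, frame_b))        # 0.25
print(wasserstein1_1d(frame_a, frame_c))        # 4.0
print(is_redundant(frame_a, frame_b, radius=0.5))
print(is_redundant(frame_a, frame_c, radius=0.5))
```

A sensitivity analysis of the kind promised in the response would amount to sweeping `radius` and recording how many frames survive the filter and how the adapted model behaves at each setting.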

Circularity Check

0 steps flagged

No significant circularity; framework grounded in external OT theory

full rationale

The paper's central construction (Wasserstein Greedy Sampling + Progressive Knowledge Transfer) is presented as an application of established optimal transport theory to the identified factors, without any equations shown that reduce the claimed performance gains to fitted parameters by construction or to self-citations whose content is unverified. No self-definitional loops, no fitted inputs renamed as predictions, and no uniqueness theorems imported from the authors' prior work appear in the provided text. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

Central claim rests on optimal transport being an appropriate cost metric for both data and representation hierarchies, plus the two new modules introduced without external validation.

axioms (1)
  • Domain assumption: Optimal transport theory can be used to minimize information transport costs across data distributions and network hierarchies in PEFT adaptation.
    Invoked as the grounding for the FlowAdapt framework.
invented entities (2)
  • Wasserstein Greedy Sampling strategy (no independent evidence)
    purpose: Selectively filter redundant samples via a bounded covering radius.
    New sampling component introduced to address inter-frame redundancy.
  • Progressive Knowledge Transfer module (no independent evidence)
    purpose: Progressively inject compressed early-stage representations into later stages through learnable pathways.
    New module introduced to alleviate semantic degradation in deep layers.
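Since the text available here gives no equation for the Progressive Knowledge Transfer module, the following is only one plausible reading of "inject compressed early-stage representations into later stages through learnable pathways": a bottleneck projection of early features added to late features through a learnable gate. Every name (`inject_early_features`, `w_down`, `w_up`, `gate`), the ReLU, and the scalar gate are assumptions, not the paper's design.

```python
def matvec(weights, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def inject_early_features(early, late, w_down, w_up, gate):
    """Hypothetical injection step: late + gate * w_up @ relu(w_down @ early).

    w_down compresses the early-stage features, w_up maps the compressed
    code to the late-stage width, and gate scales the injected signal.
    In a PEFT setting only w_down, w_up, and gate would be trained.
    """
    compressed = [max(0.0, v) for v in matvec(w_down, early)]
    delta = matvec(w_up, compressed)
    return [l + gate * d for l, d in zip(late, delta)]


# Toy shapes (assumed): 4-d early features, 2-d bottleneck, 3-d late features.
early = [1.0, 2.0, 3.0, 4.0]
late = [1.0, 1.0, 1.0]
w_down = [[0.5, 0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.25]]
w_up = [[1.0, 0.0],
        [0.0, 1.0],
        [1.0, 1.0]]

print(inject_early_features(early, late, w_down, w_up, gate=0.0))  # [1.0, 1.0, 1.0]
print(inject_early_features(early, late, w_down, w_up, gate=1.0))  # [1.5, 2.0, 2.5]
```

With the gate at zero the late features pass through unchanged, so a pathway initialized this way starts from the frozen model's behavior. The trainable parameter count in this toy case is 8 + 6 + 1 = 15, which illustrates why such pathways can stay within a small parameter budget.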




Reference graph

Works this paper leans on

52 extracted references · 52 canonical work pages

  [1] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein gan, 2017.
  [2] Antoine Caillot, Safa Ouerghi, Pascal Vasseur, Rémi Boutteau, and Yohan Dupuis. Survey on cooperative perception in an automotive context. IEEE Transactions on Intelligent Transportation Systems, 23(9):14204–14223, 2022.
  [3] Qi Chen. F-cooper: Feature based cooperative perception for autonomous vehicle edge computing system using 3D point clouds, 2019.
  [4] Qi Chen, Sihai Tang, Qing Yang, and Song Fu. Cooper: Cooperative perception for connected autonomous vehicles based on 3d point clouds. In 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), pages 514–524, 2019.
  [5] Shoufa Chen, Chongjian Ge, Zhan Tong, Jiangliu Wang, Yibing Song, Jue Wang, and Ping Luo. Adaptformer: Adapting vision transformers for scalable visual recognition, 2022.
  [6] Nicolas Courty, Rémi Flamary, Devis Tuia, and Alain Rakotomamonjy. Optimal transport for domain adaptation, 2016.
  [7] Nicolas Courty, Rémi Flamary, Amaury Habrard, and Alain Rakotomamonjy. Joint distribution optimal transportation for domain adaptation, 2017.
  [8] Xiao Cui, Mo Zhu, Yulei Qin, Liang Xie, Wengang Zhou, and Houqiang Li. Multi-level optimal transport for universal cross-tokenizer knowledge distillation on language models,
  [9] Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer. Qlora: Efficient finetuning of quantized llms,
  [10] Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. Carla: An open urban driving simulator, 2017.
  [11] Teofilo F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306, 1985.
  [12] Yushan Han, Hui Zhang, Huifang Li, Yi Jin, Congyan Lang, and Yidong Li. Collaborative perception in autonomous driving: Methods, datasets, and challenges. IEEE Intelligent Transportation Systems Magazine, 15(6):131–151, 2023.
  [13] Zeyu Han, Chao Gao, Jinyang Liu, Jeff Zhang, and Sai Qian Zhang. Parameter-efficient fine-tuning for large models: A comprehensive survey, 2024.
  [14] Haoyu He, Jianfei Cai, Jing Zhang, Dacheng Tao, and Bohan Zhuang. Sensitivity-aware visual parameter-efficient fine-tuning, 2023.
  [15] Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Sylvain Gelly. Parameter-efficient transfer learning for nlp, 2019.
  [16] Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models, 2021.
  [17] Yue Hu, Shaoheng Fang, Zixing Lei, Yiqi Zhong, and Siheng Chen. Where2comm: Communication-efficient collaborative perception via spatial confidence maps, 2022.
  [18] Tao Huang, Jianan Liu, Xi Zhou, Dinh C. Nguyen, Mostafa Rahimi Azghadi, Yuxuan Xia, Qing-Long Han, and Sumei Sun. V2X Cooperative Perception for Autonomous Driving: Recent Advances and Challenges, 2024.
  [19] Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, and Ser-Nam Lim. Visual prompt tuning, 2022.
  [20] Xianghao Kong, Wentao Jiang, Jinrang Jia, Yifeng Shi, Runsheng Xu, and Si Liu. DUSA: Decoupled unsupervised Sim2Real adaptation for vehicle-to-everything collaborative perception. In Proceedings of the 31st ACM International Conference on Multimedia, pages 1943–1954, 2023.
  [21] Zixing Lei, Zhenyang Ni, Ruize Han, Shuo Tang, Dingju Wang, Chen Feng, Siheng Chen, and Yanfeng Wang. Robust collaborative perception without external localization and clock devices, 2024.
  [22] Xiang Lisa Li and Percy Liang. Prefix-tuning: Optimizing continuous prompts for generation, 2021.
  [23] Yiming Li, Shunli Ren, Pengxiang Wu, Siheng Chen, Chen Feng, and Wenjun Zhang. Learning distilled collaboration graph for multi-agent perception, 2022.
  [24] Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, and Anna Rumshisky. Scaling down to scale up: A guide to parameter-efficient fine-tuning, 2024.
  [25] Dongze Lian, Daquan Zhou, Jiashi Feng, and Xinchao Wang. Scaling & shifting your features: A new baseline for efficient model tuning, 2023.
  [26] Si Liu, Chen Gao, Yuan Chen, Xingyu Peng, Xianghao Kong, Kun Wang, Runsheng Xu, Wentao Jiang, Hao Xiang, Jiaqi Ma, and Miao Wang. Towards vehicle-to-everything autonomous driving: A survey on collaborative perception,
  [27] Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin, Pavlo Molchanov, Yu-Chiang Frank Wang, Kwang-Ting Cheng, and Min-Hung Chen. Dora: Weight-decomposed low-rank adaptation, 2024.
  [28] Yen-Cheng Liu, Junjiao Tian, Nathaniel Glaser, and Zsolt Kira. When2com: Multi-Agent Perception via Communication Graph Grouping. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4105–4114, Seattle, WA, USA, 2020. IEEE.
  [29] Yen-Cheng Liu, Junjiao Tian, Chih-Yao Ma, Nathan Glaser, Chia-Wen Kuo, and Zsolt Kira. Who2com: Collaborative perception via learnable handshake communication, 2020.
  [30] Yifan Lu, Quanhao Li, Baoan Liu, Mehrdad Dianati, Chen Feng, Siheng Chen, and Yanfeng Wang. Robust collaborative 3D object detection in presence of pose errors, 2023.
  [31] Yifan Lu, Yue Hu, Yiqi Zhong, Dequan Wang, Yanfeng Wang, and Siheng Chen. An extensible framework for open heterogeneous collaborative perception, 2024.
  [32] Yunsheng Ma, Juanwu Lu, Can Cui, Sicheng Zhao, Xu Cao, Wenqian Ye, and Ziran Wang. MACP: Efficient model adaptation for cooperative perception. In 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 3361–3370, Waikoloa, HI, USA, 2024. IEEE.
  [33] Tiehua Mei, Hengrui Chen, Peng Yu, Jiaqing Liang, and Deqing Yang. Goracs: Group-level optimal transport-guided coreset selection for llm-based recommender systems, 2025.
  [34] Gabriel Peyré and Marco Cuturi. Computational optimal transport, 2020.
  [35] Andreas Rauch, Felix Klanner, Ralph Rasshofer, and Klaus Dietmayer. Car2x-based perception in a high-level fusion architecture for cooperative perception systems. In 2012 IEEE Intelligent Vehicles Symposium, pages 270–275, 2012.
  [36] Zhiying Song, Lei Yang, Fuxi Wen, and Jun Li. TraF-align: Trajectory-aware feature alignment for asynchronous multi-agent perception, 2025.
  [37] Naibang Wang, Deyong Shang, Yan Gong, Xiaoxi Hu, Ziying Song, Lei Yang, Yuhan Huang, Xiaoyu Wang, and Jianli Lu. Collaborative perception datasets for autonomous driving: A review, 2025.
  [38] Tsun-Hsuan Wang, Sivabalan Manivasagam, Ming Liang, Bin Yang, Wenyuan Zeng, and Raquel Urtasun. V2VNet: Vehicle-to-vehicle communication for joint perception and prediction. In Computer Vision – ECCV 2020, pages 605–… Springer International Publishing, Cham, 2020.
  [39] Quanmin Wei, Penglin Dai, Wei Li, Bingyi Liu, and Xiao Wu. CoPEFT: Fast adaptation framework for multi-agent collaborative perception with parameter-efficient fine-tuning, 2025.
  [40] Li Xiang, Junbo Yin, Wei Li, Cheng-Zhong Xu, Ruigang Yang, and Jianbing Shen. Di-v2x: Learning domain-invariant representation for vehicle-infrastructure collaborative 3d object detection, 2023.
  [41] Yi Xin, Jianjiang Yang, Siqi Luo, Yuntao Du, Qi Qin, Kangrui Cen, Yangfan He, Zhiwei Zhang, Bin Fu, Xiaokang Yang, Guangtao Zhai, Ming-Hsuan Yang, and Xiaohong Liu. Parameter-efficient fine-tuning for pre-trained vision models: A survey and benchmark, 2025.
  [42] Runsheng Xu, Yi Guo, Xu Han, Xin Xia, Hao Xiang, and Jiaqi Ma. Opencda: an open cooperative driving automation framework integrated with co-simulation, 2021.
  [43] Runsheng Xu, Zhengzhong Tu, Hao Xiang, Wei Shao, Bolei Zhou, and Jiaqi Ma. Cobevt: Cooperative bird’s eye view semantic segmentation with sparse transformers, 2022.
  [44] Runsheng Xu, Hao Xiang, Zhengzhong Tu, Xin Xia, Ming-Hsuan Yang, and Jiaqi Ma. V2X-ViT: Vehicle-to-everything cooperative perception with vision transformer. In Computer Vision – ECCV 2022, pages 107–124. Springer Nature Switzerland, Cham, 2022.
  [45] Runsheng Xu, Hao Xiang, Xin Xia, Xu Han, Jinlong Li, and Jiaqi Ma. Opv2v: An open benchmark dataset and fusion pipeline for perception with vehicle-to-vehicle communication, 2022.
  [46] Runsheng Xu, Chia-Ju Chen, Zhengzhong Tu, and Ming-Hsuan Yang. V2x-vitv2: Improved vision transformers for vehicle-to-everything cooperative perception. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(1):650–662, 2025.
  [47] Yunjiang Xu, Lingzhi Li, Jin Wang, Benyuan Yang, Zhiwen Wu, Xinhong Chen, and Jianping Wang. Codyntrust: Robust asynchronous collaborative perception via dynamic feature trust modulus, 2025.
  [48] Kang Yang, Tianci Bu, Lantao Li, Chunxu Li, Yongcai Wang, and Deying Li. Is discretization fusion all you need for collaborative perception?, 2025.
  [49] Haibao Yu, Yizhen Luo, Mao Shu, Yiyi Huo, Zebang Yang, Yifeng Shi, Zhenglong Guo, Hanyu Li, Xing Hu, Jirui Yuan, and Zaiqing Nie. Dair-v2x: A large-scale dataset for vehicle-infrastructure cooperative 3d object detection. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 21329–21338, 2022.
  [50] Jingyu Zhang, Kun Yang, Yilei Wang, Hanqi Wang, Peng Sun, and Liang Song. ERMVP: Communication-efficient and collaboration-robust multi-vehicle perception in challenging environments. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12575–12584, Seattle, WA, USA, 2024. IEEE.
  [51] Qingru Zhang, Minshuo Chen, Alexander Bukharin, Nikos Karampatziakis, Pengcheng He, Yu Cheng, Weizhu Chen, and Tuo Zhao. Adalora: Adaptive budget allocation for parameter-efficient fine-tuning, 2023.