GCP: Guarded Collaborative Perception with Spatial-Temporal Aware Malicious Agent Detection

Hangcheng Cao; Haonan An; Senkang Hu; Yihang Tao; Yue Hu; Yuguang Fang

arxiv: 2501.02450 · v2 · submitted 2025-01-05 · 💻 cs.CV

Yihang Tao, Senkang Hu, Yue Hu, Haonan An, Hangcheng Cao, Yuguang Fang

Pith reviewed 2026-05-23 06:18 UTC · model grok-4.3

classification 💻 cs.CV

keywords collaborative perception · malicious agent detection · adversarial attacks · autonomous driving · bird's eye view · spatial-temporal analysis · defense framework

The pith

GCP detects malicious agents in collaborative perception by combining spatial consistency checks with temporal motion flow reconstruction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that existing defenses for shared perception data among vehicles fail against a new blind area confusion attack that uses subtle changes to evade single-frame outlier checks. GCP counters this by enforcing spatial agreement through a scaled concordance loss on features and by rebuilding historical bird's eye view motion patterns in uncertain areas to expose time-based inconsistencies. These two signals are merged with a joint statistical test to flag malicious agents reliably. If correct, the method raises perception accuracy under attack without needing extra sensors or messages. The result matters because collaborative perception extends each vehicle's view but breaks if even one participant sends poisoned data.

Core claim

Single-shot outlier detection is vulnerable to a blind area confusion attack that perturbs inputs and outputs subtly; GCP counters this by maintaining spatial consistency via a confidence-scaled spatial concordance loss and detecting temporal anomalies through reconstruction of historical bird's eye view motion flows in low-confidence regions, then synthesizing both via a joint spatial-temporal Benjamini-Hochberg test for detection.

What carries the argument

The joint spatial-temporal Benjamini-Hochberg test that fuses a confidence-scaled spatial concordance loss with reconstruction of historical bird's eye view motion flows to identify anomalies.
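
As a rough illustration of the Benjamini-Hochberg step-up rule that the joint test builds on, the sketch below flags hypotheses from a list of p-values. It is a generic BH implementation, not GCP's code: the function name `benjamini_hochberg`, the `alpha` level, and the per-agent p-values are assumptions made for illustration, and the way GCP weights and merges spatial and temporal evidence is not reproduced here.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of hypotheses rejected by the BH step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    rejected = np.zeros(m, dtype=bool)
    if m == 0:
        return rejected
    order = np.argsort(p)                          # indices that sort p ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # alpha * rank / m
    below = p[order] <= thresholds
    if below.any():
        k = np.nonzero(below)[0].max()             # largest rank meeting its threshold
        rejected[order[: k + 1]] = True            # reject everything up to that rank
    return rejected


# Hypothetical p-values, one per collaborating agent (illustrative values only).
agent_p_values = [0.001, 0.20, 0.015, 0.74]
flags = benjamini_hochberg(agent_p_values, alpha=0.05)
print(flags)
```

With these made-up values, agents 0 and 2 would be flagged at `alpha = 0.05`.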

If this is right

Raises average precision at 0.5 IoU by up to 34.69 percent over prior defenses specifically under blind area confusion attacks.
Delivers steady 5 to 8 percent gains against other common attack types.
Keeps single-frame spatial checks intact while adding temporal analysis without extra communication overhead.
Enables detection that accounts for message correlations across time frames rather than isolated snapshots.
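
For readers unfamiliar with the metric, AP@0.5 counts a predicted box as a true positive only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below computes IoU for axis-aligned bird's eye view boxes. This is a simplification, since detection benchmarks typically use rotated boxes, and the function name and box coordinates are hypothetical.

```python
def bev_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction and ground-truth boxes in metres.
prediction = (0.0, 0.0, 4.0, 2.0)
ground_truth = (1.0, 0.0, 5.0, 2.0)
iou = bev_iou(prediction, ground_truth)
is_true_positive = iou >= 0.5
```

Here the overlap is 6 square metres and the union is 10, so the IoU is 0.6 and the prediction would count as a match at the 0.5 threshold.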

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same dual-domain reconstruction idea could apply to other multi-agent sensor fusion settings where historical state estimates exist.
Evaluating performance when the statistical threshold is tuned under stronger adaptive attackers would test robustness beyond the reported scenarios.
Real-time implementation on vehicle hardware would reveal whether the motion-flow reconstruction adds acceptable latency.

Load-bearing premise

The joint statistical test can combine spatial and temporal signals into reliable malicious-agent flags without missing attacks that mimic normal flows or producing too many false alarms.

What would settle it

A crafted attack that preserves both spatial concordance scores and plausible reconstructed motion flows while still degrading the final perception output would show the detection method does not catch all effective threats.

Figures

Figures reproduced from arXiv: 2501.02450 by Hangcheng Cao, Haonan An, Senkang Hu, Yihang Tao, Yue Hu, Yuguang Fang.

**Figure 1.** Illustration of security challenges and defense mechanisms in CP. While CP systems are vulnerable to adversarial messages from malicious agents, our proposed GCP framework provides comprehensive protection through joint spatial-temporal consistency verification, effectively safeguarding the system against various attack patterns.

**Figure 2.** Overview of the proposed blind area confusion (BAC) attack. The malicious agent first establishes communication with the victim ego CAV to obtain collaborative messages, then infers the victim’s blind regions through differential detection analysis and region segmentation. Finally, it generates adversarial perturbations guided by the inferred confidence mask to confuse the victim’s perception defense syste…

**Figure 3.** Overview of the proposed GCP framework. GCP performs joint spatial-temporal consistency verification through two key components: (1) a confidence-scaled spatial concordance loss that adaptively evaluates detection consistency, and (2) an LSTM-AE-based temporal BEV flow reconstruction that captures motion patterns in CP.

**Figure 4.** Architecture of LSTM-AE-based BEV flow reconstruction. The input BEV flow vector consists of 8-dimensional features representing corner points of detected objects. The encoded latent features are repeated K + 1 times before decoding, followed by a TimeDistributed layer for temporal-aware reconstruction of object motion patterns.

**Figure 5.** Comparative results of AP under different cached frame lengths and consecutive KF interpolation times on the V2X-Sim dataset. Attack settings: m = 2, λ = 0.25; ∆i = ∆o = 0.5. (a)-(d): AP@0.7 results under different attacks and interpolation budgets; (e)-(h): AP@0.7 results under different attacks and cached frame lengths.

**Figure 6.** Visualization of the BAC attack pipeline on the V2X-Sim dataset. (a) Detection results using only the victim vehicle’s local perception; (b) Enhanced detection results through CP; (c) Initial BAC seed map generated from differential detection results; (d) Refined BAC confidence mask obtained through blind region segmentation. Red boxes are predictions while the green ones are GT.

**Figure 7.** Visualization of 3D detection results on the V2X-Sim dataset. Attack settings: number of malicious agents = 2; attack ratio = 0.25; input/output perturbation budget = 0.5. Scene ID: 8, Frame ID: 81, 65, 30, 47 (from top to bottom). Red boxes are predictions while the green ones are GT.

**Figure 8.** BEV flow reconstruction loss distribution. Agents 0 and 2 are under BAC attack (∆o = 0.5) while agents 3 and 4 are normal. Scene ID: 8, Frame ID: 61.

Original abstract

Collaborative perception significantly enhances autonomous driving safety by extending each vehicle's perception range through message sharing among connected and autonomous vehicles. Unfortunately, it is also vulnerable to adversarial message attacks from malicious agents, resulting in severe performance degradation. While existing defenses employ hypothesis-and-verification frameworks to detect malicious agents based on single-shot outliers, they overlook temporal message correlations, which can be circumvented by subtle yet harmful perturbations in model input and output spaces. This paper reveals a novel blind area confusion (BAC) attack that compromises existing single-shot outlier-based detection methods. As a countermeasure, we propose GCP, a Guarded Collaborative Perception framework based on spatial-temporal aware malicious agent detection, which maintains single-shot spatial consistency through a confidence-scaled spatial concordance loss, while simultaneously examining temporal anomalies by reconstructing historical bird's eye view motion flows in low-confidence regions. We also employ a joint spatial-temporal Benjamini-Hochberg test to synthesize dual-domain anomaly results for reliable malicious agent detection. Extensive experiments demonstrate GCP's superior performance under diverse attack scenarios, achieving up to 34.69% improvements in AP@0.5 compared to the state-of-the-art CP defense strategies under BAC attacks, while maintaining consistent 5-8% improvements under other typical attacks. Code will be released at https://github.com/yihangtao/GCP.git.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces a BAC attack on single-shot collaborative perception detectors and a GCP defense using spatial concordance plus temporal motion reconstruction with a joint BH test, but experiments lack detail and the FDR control under dependence is unaddressed.

The core contribution is a new blind area confusion attack that targets single-shot outlier defenses in collaborative perception, plus GCP as a countermeasure that adds temporal checks. The spatial part uses a confidence-scaled concordance loss to keep consistency, while the temporal part reconstructs historical BEV motion flows only in low-confidence regions. They combine the two via a joint spatial-temporal Benjamini-Hochberg test and claim up to 34.69% AP@0.5 gains under BAC and smaller gains under other attacks.

That temporal angle is a reasonable observation; prior work really did focus on single-frame checks, so pointing out the gap is fair. The method itself is described clearly enough in the abstract to see the pieces.

The experimental claims are the main problem. The abstract reports performance numbers without describing attack construction, baseline re-implementations, dataset splits, or any statistical tests beyond the final AP figures. That makes it impossible to judge whether the improvements are robust or just from favorable setup choices.

The joint BH step is also a soft spot. The temporal p-values are computed only inside low-confidence spatial regions, so the two sets of p-values are structurally dependent. Standard BH requires independence or positive regression dependence for FDR control, and the abstract gives no argument or check that this holds. If the dependence is negative or arbitrary, the false-positive rate on malicious-agent flagging could be higher than claimed.

This work is aimed at researchers in connected autonomous vehicle security and collaborative perception. A reader already working on attack detection might pick up the spatial-temporal split as an idea to try, but the missing experimental transparency and the unexamined dependence mean it is not yet something to cite or extend directly. I would send it to peer review so the authors can supply the missing details and either fix or justify the statistical step.

Referee Report

1 major / 2 minor

Summary. The paper proposes GCP, a Guarded Collaborative Perception framework to defend collaborative perception in autonomous driving against malicious agent attacks. It introduces a confidence-scaled spatial concordance loss to enforce single-shot spatial consistency and reconstructs historical BEV motion flows to detect temporal anomalies in low-confidence regions; these are combined via a joint spatial-temporal Benjamini-Hochberg test for malicious agent detection. The work also introduces a novel blind area confusion (BAC) attack that evades prior single-shot defenses and reports up to 34.69% AP@0.5 gains over SOTA CP defenses under BAC and 5-8% gains under other attacks.

Significance. If the joint BH procedure can be shown to control FDR despite the structural dependence between the spatial and temporal p-values, the framework would provide a meaningful advance by addressing the temporal vulnerability that single-shot outlier detectors miss. The combination of a parameter-light spatial loss with explicit temporal reconstruction is a concrete technical contribution; releasing code further strengthens reproducibility.

major comments (1)

[Detection synthesis step] Detection synthesis step (abstract and § on malicious agent detection): the joint spatial-temporal Benjamini-Hochberg test is applied to p-values derived from the confidence-scaled concordance loss and from temporal motion-flow reconstruction performed only inside low-confidence spatial regions. Because the temporal test is conditioned on the spatial low-confidence mask, the two sets of p-values are structurally dependent. Standard BH guarantees require independence or positive regression dependence; neither is established nor is a dependence-robust alternative (e.g., dependence-adjusted BH or permutation-based FDR) provided. This directly affects the reliability of the malicious-agent flagging that underpins all reported gains.
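
One dependence-robust alternative of the kind this comment alludes to is the Benjamini-Yekutieli correction, which divides the BH thresholds by the harmonic sum c(m) = 1 + 1/2 + ... + 1/m and controls FDR under arbitrary dependence at the cost of power. The sketch below is illustrative only: the comment does not name this specific procedure, and the function name and p-values are hypothetical.

```python
import numpy as np


def benjamini_yekutieli(p_values, alpha=0.05):
    """BH step-up rule with thresholds shrunk by c(m) = sum_{i=1}^{m} 1/i."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    rejected = np.zeros(m, dtype=bool)
    if m == 0:
        return rejected
    c_m = np.sum(1.0 / np.arange(1, m + 1))        # harmonic correction factor
    order = np.argsort(p)                          # indices that sort p ascending
    thresholds = alpha * np.arange(1, m + 1) / (m * c_m)
    below = p[order] <= thresholds
    if below.any():
        k = np.nonzero(below)[0].max()             # largest rank meeting its threshold
        rejected[order[: k + 1]] = True
    return rejected


# Hypothetical p-values, one per collaborating agent (illustrative values only).
agent_p_values = [0.001, 0.20, 0.015, 0.74]
by_flags = benjamini_yekutieli(agent_p_values, alpha=0.05)
print(by_flags)
```

On these illustrative values only agent 0 is flagged. Plain BH at the same level would also reject the p-value 0.015, whose uncorrected threshold is 0.025, which shows the power cost of the correction.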

minor comments (2)

[Abstract] Abstract: performance numbers (34.69% AP@0.5, 5-8% gains) are stated without reference to the number of random seeds, statistical significance tests, or variance; the full experimental section should make these explicit.
[Method] Notation: the precise definition of the p-values fed into the joint BH procedure (how the concordance loss and reconstruction error are converted to p-values) should be stated in a single equation or algorithm box for clarity.
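
One common way to turn an anomaly score such as a loss or a reconstruction error into a p-value is an empirical (conformal-style) p-value computed against scores from benign calibration data. The sketch below assumes that construction purely for illustration; whether the paper uses it is not stated here, and the function name, calibration losses, and test scores are hypothetical.

```python
import numpy as np


def empirical_p_value(score, calibration_scores):
    """p = (1 + #{calibration >= score}) / (n + 1); larger scores are more anomalous."""
    cal = np.asarray(calibration_scores, dtype=float)
    return float((1.0 + np.sum(cal >= score)) / (cal.size + 1.0))


# Hypothetical losses collected from agents known to be benign.
benign_losses = [0.10, 0.12, 0.09, 0.15, 0.11, 0.13, 0.08, 0.14, 0.10]

p_normal = empirical_p_value(0.11, benign_losses)   # score typical of benign data
p_suspect = empirical_p_value(0.40, benign_losses)  # score above every benign loss
print(p_normal, p_suspect)
```

With nine calibration scores the smallest attainable p-value is 1/10, so the calibration set size bounds how strong the evidence against any single agent can be.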

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on the joint Benjamini-Hochberg procedure. The observation regarding structural dependence is valid and merits explicit treatment to strengthen the theoretical grounding of the malicious-agent detection. We address the point below and will revise the manuscript accordingly.

Referee: [Detection synthesis step] Detection synthesis step (abstract and § on malicious agent detection): the joint spatial-temporal Benjamini-Hochberg test is applied to p-values derived from the confidence-scaled concordance loss and from temporal motion-flow reconstruction performed only inside low-confidence spatial regions. Because the temporal test is conditioned on the spatial low-confidence mask, the two sets of p-values are structurally dependent. Standard BH guarantees require independence or positive regression dependence; neither is established nor is a dependence-robust alternative (e.g., dependence-adjusted BH or permutation-based FDR) provided. This directly affects the reliability of the malicious-agent flagging that underpins all reported gains.

Authors: We agree that conditioning the temporal reconstruction on the spatial low-confidence mask induces structural dependence between the two families of p-values, and that the manuscript does not formally establish PRDS or provide a dependence-robust procedure. To correct this, we will revise the detection-synthesis section to (i) explicitly acknowledge the dependence, (ii) replace the standard joint BH with a permutation-based FDR control that respects the conditioning (by permuting historical BEV flows within the masked regions while preserving the spatial p-values), and (iii) report the resulting empirical FDR on the BAC and other attack benchmarks. These changes will be accompanied by a short theoretical note on why the permutation approach guarantees FDR control under the observed dependence structure. The empirical gains remain unchanged, but the reliability claim will now rest on a dependence-aware procedure.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; method is self-contained with independent components

full rationale

The paper defines GCP via explicit components: a confidence-scaled spatial concordance loss for single-shot consistency, reconstruction of historical BEV motion flows for temporal anomalies in low-confidence regions, and a joint spatial-temporal Benjamini-Hochberg test for synthesizing detections. Performance improvements (e.g., AP@0.5 gains) are reported from experiments under attacks, not from any quantity that reduces by construction to fitted parameters or self-referential definitions. No self-definitional loops, fitted-input predictions, load-bearing self-citations, uniqueness theorems, or ansatz smuggling appear in the derivation chain. The approach relies on standard loss terms and statistical procedures applied to separately computed spatial and temporal p-values, making the central claims externally falsifiable via the released code and benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review based on abstract only; the method assumes standard statistical procedures and loss functions apply directly to adversarial detection without additional unstated parameters or entities.

axioms (1)

domain assumption: The Benjamini-Hochberg procedure can be directly applied to combine spatial and temporal anomaly scores for reliable malicious agent identification.
Invoked in the joint spatial-temporal test step described in the abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: confidence-scaled spatial concordance loss ... LSTM-AE-based temporal BEV flow reconstruction

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages · 1 internal anchor

[1] Q. Chen, S. Tang, Q. Yang, and S. Fu, “Cooper: Cooperative Perception for Connected Autonomous Vehicles Based on 3D Point Clouds,” in IEEE International Conference on Distributed Computing Systems (ICDCS), Jul. 2019, pp. 514–524.

[2] Y. Li, D. Ma, Z. An, Z. Wang, Y. Zhong, S. Chen, and C. Feng, “V2X-Sim: Multi-Agent Collaborative Perception Dataset and Benchmark for Autonomous Driving,” IEEE Robotics and Automation Letters, vol. 7, no. 4, pp. 10914–10921, 2022.

[3] S. Hu, Z. Fang, H. An, G. Xu, Y. Zhou, X. Chen, and Y. Fang, “Adaptive Communications in Collaborative Perception with Domain Alignment for Autonomous Driving,” arXiv:2310.00013, 2024.

[4] Y. Hu, S. Fang, Z. Lei, Y. Zhong, and S. Chen, “Where2comm: Communication-Efficient Collaborative Perception via Spatial Confidence Maps,” in Advances in Neural Information Processing Systems (NeurIPS), 2022.

[5] S. Hu, Z. Fang, Y. Deng, X. Chen, Y. Fang, and S. Kwong, “Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird’s Eye View Segmentation for Connected and Autonomous Driving,” arXiv:2311.16754, 2024.

[6] Z. Fang, S. Hu, H. An, Y. Zhang, J. Wang, H. Cao, X. Chen, and Y. Fang, “PACP: Priority-Aware Collaborative Perception for Connected and Autonomous Vehicles,” IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 15003–15018, 2024.

[7] Y. Tao, S. Hu, Z. Fang, and Y. Fang, “Direct-CP: Directed Collaborative Perception for Connected and Autonomous Vehicles via Proactive Attention,” arXiv:2409.08840, 2024.

[8] S. Hu, Z. Fang, Z. Fang, Y. Deng, X. Chen, and Y. Fang, “AgentsCo-Driver: Large Language Model Empowered Collaborative Driving with Lifelong Learning,” arXiv:2404.06345, Apr. 2024.

[9] S. Hu, Z. Fang, Z. Fang, Y. Deng, X. Chen, Y. Fang, and S. Kwong, “AgentsCoMerge: Large Language Model Empowered Collaborative Decision Making for Ramp Merging,” arXiv:2408.03624, Aug. 2024.

[10] S. Hu, Z. Fang, Y. Deng, X. Chen, Y. Fang, and S. Kwong, “Toward Full-Scene Domain Generalization in Multi-Agent Collaborative Bird’s Eye View Segmentation for Connected and Autonomous Driving,” IEEE Transactions on Intelligent Transportation Systems, pp. 1–14, 2024.
[11] S. Hu, Y. Tao, G. Xu, Y. Deng, X. Chen, Y. Fang, and S. Kwong, “CP-Guard: Malicious Agent Detection and Defense in Collaborative Bird’s Eye View Perception,” arXiv:2412.12000, Dec. 2024.

[12] Z. Fang, J. Wang, Y. Ma, Y. Tao, Y. Deng, X. Chen, and Y. Fang, “R-ACP: Real-Time Adaptive Collaborative Perception Leveraging Robust Task-Oriented Communications,” arXiv:2410.04168, 2024.

[13] Q. Zhang, S. Jin, R. Zhu, J. Sun, X. Zhang, Q. A. Chen, and Z. M. Mao, “On Data Fabrication in Collaborative Vehicular Perception: Attacks and Countermeasures,” in 33rd USENIX Security Symposium, Aug. 2024, pp. 6309–6326.

[14] J. Tu, T. Wang, J. Wang, S. Manivasagam, M. Ren, and R. Urtasun, “Adversarial Attacks On Multi-Agent Communication,” in IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 7748–7757.

[15] Y. Li, Q. Fang, J. Bai, S. Chen, F. Juefei-Xu, and C. Feng, “Among Us: Adversarially Robust Collaborative Perception by Consensus,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2023, pp. 186–195.

[16] Y. Zhao, Z. Xiang, S. Yin, X. Pang, S. Chen, and Y. Wang, “Malicious Agent Detection for Robust Multi-Agent Collaborative Perception,” arXiv:2310.11901, 2024.

[17] Anonymous, “CP-Guard+: A New Paradigm for Malicious Agent Detection and Defense in Collaborative Perception,” in Submitted to The Thirteenth International Conference on Learning Representations (ICLR), 2024, under review.

[18] Y. Han, H. Zhang, H. Li, Y. Jin, C. Lang, and Y. Li, “Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges,” IEEE Intelligent Transportation Systems Magazine, vol. 15, no. 6, pp. 131–151, Nov. 2023, arXiv:2301.06262 [cs].

[19] S. Hu, Z. Fang, Y. Deng, X. Chen, and Y. Fang, “Collaborative Perception for Connected and Autonomous Driving: Challenges, Possible Solutions and Opportunities,” Jan. 2024, arXiv:2401.01544.

[20] W. Zeng, S. Wang, R. Liao, Y. Chen, B. Yang, and R. Urtasun, “DSDNet: Deep Structured Self-driving Network,” in European Conference on Computer Vision (ECCV), 2020, pp. 156–172.
[21] Y. Li, S. Ren, P. Wu, S. Chen, C. Feng, and W. Zhang, “Learning Distilled Collaboration Graph for Multi-Agent Perception,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 34, 2021, pp. 29541–29552.

[22] T.-H. Wang, S. Manivasagam, M. Liang, B. Yang, W. Zeng, and R. Urtasun, “V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction,” in European Conference on Computer Vision (ECCV), 2020, pp. 605–621.

[23] S. Wei, Y. Wei, Y. Hu, Y. Lu, Y. Zhong, S. Chen, and Y. Zhang, “Asynchrony-Robust Collaborative Perception via Bird’s Eye View Flow,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 36, 2023, pp. 28462–28477.

[24] Y. Lu, Q. Li, B. Liu, M. Dianati, C. Feng, S. Chen, and Y. Wang, “Robust Collaborative 3d Object Detection in Presence of Pose Errors,” in IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 4812–4818.

[25] H. An, G. Hua, Z. Lin, and Y. Fang, “Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks,” arXiv:2405.09863, 2024.

[26] H. Cao, L. Yuan, G. Xu, Z. He, Z. Fang, and Y. Fang, “Secure Traffic Sign Recognition: An Attention-Enabled Universal Image Inpainting Mechanism against Light Patch Attacks,” arXiv:2409.04133, 2024.

[27] H. Cao, W. Huang, G. Xu, X. Chen, Z. He, J. Hu, H. Jiang, and Y. Fang, “Security Analysis of WiFi-based Sensing Systems: Threats from Perturbation Attacks,” arXiv:2404.15587, 2024.

[28] J. Yin, J. Shen, C. Guan, D. Zhou, and R. Yang, “LiDAR-Based Online 3D Video Object Detection With Graph-Based Message Passing and Spatiotemporal Transformer Attention,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 11492–11501.

[29] X. Gao, Z. Chen, J. Pan, F. Wu, and G. Chen, “Energy Efficient Scheduling Algorithms for Sweep Coverage in Mobile Sensor Networks,” IEEE Transactions on Mobile Computing, vol. 19, no. 6, pp. 1332–1345, 2020.

[30] J. Liu, Y. Zhang, X. Zhao, Z. He, W. Liu, and X. Lv, “Fast and Robust LiDAR-Inertial Odometry by Tightly-Coupled Iterated Kalman Smoother and Robocentric Voxels,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 10, pp. 14486–14496, 2024.
[31] Y. Wei, J. Jang-Jaccard, F. Sabrina, W. Xu, S. Camtepe, and A. Dunmore, “Reconstruction-based LSTM-Autoencoder for Anomaly-based DDoS Attack Detection over Multivariate Time-Series Data,” arXiv:2305.09475, 2023.

[32] Y. Benjamini and Y. Hochberg, “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 57, no. 1, pp. 289–300, 1995.

[33] A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, “CARLA: An open urban driving simulator,” in Proceedings of the 1st Annual Conference on Robot Learning, 2017, pp. 1–16.

[34] Y. Xiang, K. Li, and W. Zhou, “Low-Rate DDoS Attacks Detection and Traceback by Using New Information Metrics,” IEEE Transactions on Information Forensics and Security, vol. 6, no. 2, pp. 426–437, 2011.

[35] H. Ahn, J. Choi, and Y. H. Kim, “A Mathematical Modeling of Stuxnet-Style Autonomous Vehicle Malware,” IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 1, pp. 673–683, 2023.

[36] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards Deep Learning Models Resistant to Adversarial Attacks,” in International Conference on Learning Representations (ICLR), 2018.

[37] N. Carlini and D. Wagner, “Towards Evaluating the Robustness of Neural Networks,” in IEEE Symposium on Security and Privacy (SP), May 2017, pp. 39–57.

[38] W. Luo, B. Yang, and R. Urtasun, “Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2018, pp. 3569–3577.

[39] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” arXiv:1607.02533, 2017.


S. Hu, Z. Fang, Z. Fang, Y . Deng, X. Chen, and Y . Fang, “AgentsCo- Driver: Large Language Model Empowered Collaborative Driving with Lifelong Learning,” arXiv:2404.06345, Apr. 2024

work page arXiv 2024

[9] [9]

AgentsCoMerge: Large Language Model Empowered Collaborative Decision Making for Ramp Merging,

S. Hu, Z. Fang, Z. Fang, Y . Deng, X. Chen, Y . Fang, and S. Kwong, “AgentsCoMerge: Large Language Model Empowered Collaborative Decision Making for Ramp Merging,” arXiv:2408.03624, Aug. 2024

work page arXiv 2024

[10] [10]

Toward Full-Scene Domain Generalization in Multi-Agent Collaborative Bird’s Eye View Segmentation for Connected and Autonomous Driving,

S. Hu, Z. Fang, Y . Deng, X. Chen, Y . Fang, and S. Kwong, “Toward Full-Scene Domain Generalization in Multi-Agent Collaborative Bird’s Eye View Segmentation for Connected and Autonomous Driving,” IEEE Transactions on Intelligent Transportation Systems , pp. 1–14, 2024

work page 2024

[11] [11]

CP- Guard: Malicious Agent Detection and Defense in Collaborative Bird’s Eye View Perception,

S. Hu, Y . Tao, G. Xu, Y . Deng, X. Chen, Y . Fang, and S. Kwong, “CP- Guard: Malicious Agent Detection and Defense in Collaborative Bird’s Eye View Perception,” arXiv:2412.12000, Dec. 2024

work page arXiv 2024

[12] [12]

R- ACP: Real-Time Adaptive Collaborative Perception Leveraging Robust Task-Oriented Communications,

Z. Fang, J. Wang, Y . Ma, Y . Tao, Y . Deng, X. Chen, and Y . Fang, “R- ACP: Real-Time Adaptive Collaborative Perception Leveraging Robust Task-Oriented Communications,” arXiv:2410.04168, 2024

work page arXiv 2024

[13] [13]

On Data Fabrication in Collaborative Vehicular Perception: Attacks and Countermeasures,

Q. Zhang, S. Jin, R. Zhu, J. Sun, X. Zhang, Q. A. Chen, and Z. M. Mao, “On Data Fabrication in Collaborative Vehicular Perception: Attacks and Countermeasures,” in 33rd USENIX Security Symposium, Aug. 2024, pp. 6309–6326

work page 2024

[14] [14]

Adversarial Attacks On Multi-Agent Communication,

J. Tu, T. Wang, J. Wang, S. Manivasagam, M. Ren, and R. Urtasun, “Adversarial Attacks On Multi-Agent Communication,” in IEEE/CVF International Conference on Computer Vision (ICCV) , 2021, pp. 7748– 7757

work page 2021

[15] [15]

Among Us: Adversarially Robust Collaborative Perception by Consensus,

Y . Li, Q. Fang, J. Bai, S. Chen, F. Juefei-Xu, and C. Feng, “Among Us: Adversarially Robust Collaborative Perception by Consensus,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2023, pp. 186–195

work page 2023

[16] [16]

Malicious Agent Detection for Robust Multi-Agent Collaborative Perception,

Y . Zhao, Z. Xiang, S. Yin, X. Pang, S. Chen, and Y . Wang, “Malicious Agent Detection for Robust Multi-Agent Collaborative Perception,” arXiv:2310.11901, 2024

work page arXiv 2024

[17] [17]

CP-Guard+: A New Paradigm for Malicious Agent Detection and Defense in Collaborative Perception,

Anonymous, “CP-Guard+: A New Paradigm for Malicious Agent Detection and Defense in Collaborative Perception,” in Submitted to The Thirteenth International Conference on Learning Representations (ICLR), 2024, under review

work page 2024

[18] [18]

Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges,

Y . Han, H. Zhang, H. Li, Y . Jin, C. Lang, and Y . Li, “Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges,” IEEE Intelligent Transportation Systems Magazine , vol. 15, no. 6, pp. 131–151, Nov. 2023, arXiv:2301.06262 [cs]

work page arXiv 2023

[19] [19]

Collaborative Per- ception for Connected and Autonomous Driving: Challenges, Possible Solutions and Opportunities,

S. Hu, Z. Fang, Y . Deng, X. Chen, and Y . Fang, “Collaborative Per- ception for Connected and Autonomous Driving: Challenges, Possible Solutions and Opportunities,” Jan. 2024, arXiv:2401.01544

work page arXiv 2024

[20] [20]

DSDNet: Deep Structured Self-driving Network,

W. Zeng, S. Wang, R. Liao, Y . Chen, B. Yang, and R. Urtasun, “DSDNet: Deep Structured Self-driving Network,” in European Conference on Computer Vision (ECCV) , 2020, pp. 156–172

work page 2020

[21] [21]

Learning Distilled Collaboration Graph for Multi-Agent Perception,

Y . Li, S. Ren, P. Wu, S. Chen, C. Feng, and W. Zhang, “Learning Distilled Collaboration Graph for Multi-Agent Perception,” in Advances in Neural Information Processing Systems (NeurIPS) , vol. 34, 2021, pp. 29 541–29 552

work page 2021

[22] [22]

V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction,

T.-H. Wang, S. Manivasagam, M. Liang, B. Yang, W. Zeng, and R. Urta- sun, “V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction,” in European Conference on Computer Vision (ECCV) , 2020, pp. 605–621

work page 2020

[23] [23]

Asynchrony-Robust Collaborative Perception via Bird’s Eye View Flow,

S. Wei, Y . Wei, Y . Hu, Y . Lu, Y . Zhong, S. Chen, and Y . Zhang, “Asynchrony-Robust Collaborative Perception via Bird’s Eye View Flow,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 36, 2023, pp. 28 462–28 477

work page 2023

[24] [24]

Robust Collaborative 3d Object Detection in Presence of Pose Errors,

Y . Lu, Q. Li, B. Liu, M. Dianati, C. Feng, S. Chen, and Y . Wang, “Robust Collaborative 3d Object Detection in Presence of Pose Errors,” in IEEE International Conference on Robotics and Automation (ICRA) , 2023, pp. 4812–4818

work page 2023

[25] [25]

Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks,

H. An, G. Hua, Z. Lin, and Y . Fang, “Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks,” arXiv:2405.09863, 2024

work page arXiv 2024

[26] [26]

Secure Traffic Sign Recognition: An Attention-Enabled Universal Image Inpainting Mechanism against Light Patch Attacks,

H. Cao, L. Yuan, G. Xu, Z. He, Z. Fang, and Y . Fang, “Secure Traffic Sign Recognition: An Attention-Enabled Universal Image Inpainting Mechanism against Light Patch Attacks,” arXiv:2409.04133, 2024

work page arXiv 2024

[27] [27]

Security Analysis of WiFi-based Sensing Systems: Threats from Perturbation Attacks,

H. Cao, W. Huang, G. Xu, X. Chen, Z. He, J. Hu, H. Jiang, and Y . Fang, “Security Analysis of WiFi-based Sensing Systems: Threats from Perturbation Attacks,” arXiv:2404.15587, 2024

work page arXiv 2024

[28] [28]

LiDAR-Based Online 3D Video Object Detection With Graph-Based Message Passing and Spatiotemporal Transformer Attention,

J. Yin, J. Shen, C. Guan, D. Zhou, and R. Yang, “LiDAR-Based Online 3D Video Object Detection With Graph-Based Message Passing and Spatiotemporal Transformer Attention,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , 2020, pp. 11 492– 11 501

work page 2020

[29] [29]

Energy Efficient Schedul- ing Algorithms for Sweep Coverage in Mobile Sensor Networks,

X. Gao, Z. Chen, J. Pan, F. Wu, and G. Chen, “Energy Efficient Schedul- ing Algorithms for Sweep Coverage in Mobile Sensor Networks,” IEEE Transactions on Mobile Computing, vol. 19, no. 6, pp. 1332–1345, 2020

work page 2020

[30] [30]

Fast and Robust LiDAR-Inertial Odometry by Tightly-Coupled Iterated Kalman Smoother and Robocentric V oxels,

J. Liu, Y . Zhang, X. Zhao, Z. He, W. Liu, and X. Lv, “Fast and Robust LiDAR-Inertial Odometry by Tightly-Coupled Iterated Kalman Smoother and Robocentric V oxels,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 10, pp. 14 486–14 496, 2024

work page 2024

[31] [31]

Reconstruction-based LSTM-Autoencoder for Anomaly- based DDoS Attack Detection over Multivariate Time-Series Data,

Y . Wei, J. Jang-Jaccard, F. Sabrina, W. Xu, S. Camtepe, and A. Dunmore, “Reconstruction-based LSTM-Autoencoder for Anomaly- based DDoS Attack Detection over Multivariate Time-Series Data,” arXiv:2305.09475, 2023

work page arXiv 2023

[32] [32]

Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing,

Y . Benjamini and Y . Hochberg, “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing,” Journal of the Royal statistical society: series B (Methodological) , vol. 57, no. 1, pp. 289–300, 1995

work page 1995

[33] [33]

CARLA: An open urban driving simulator,

A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V . Koltun, “CARLA: An open urban driving simulator,” in Proceedings of the 1st Annual Conference on Robot Learning , 2017, pp. 1–16

work page 2017

[34] [34]

Low-Rate DDoS Attacks Detection and Traceback by Using New Information Metrics,

Y . Xiang, K. Li, and W. Zhou, “Low-Rate DDoS Attacks Detection and Traceback by Using New Information Metrics,” IEEE Transactions on Information Forensics and Security , vol. 6, no. 2, pp. 426–437, 2011

work page 2011

[35] [35]

A Mathematical Modeling of Stuxnet- Style Autonomous Vehicle Malware,

H. Ahn, J. Choi, and Y . H. Kim, “A Mathematical Modeling of Stuxnet- Style Autonomous Vehicle Malware,” IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 1, pp. 673–683, 2023

work page 2023

[36] [36]

Towards Deep Learning Models Resistant to Adversarial Attacks,

A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards Deep Learning Models Resistant to Adversarial Attacks,” in Interna- tional Conference on Learning Representations (ICLR) , 2018

work page 2018

[37] [37]

Towards Evaluating the Robustness of Neural Networks,

N. Carlini and D. Wagner, “Towards Evaluating the Robustness of Neural Networks,” in IEEE Symposium on Security and Privacy (SP) , May 2017, pp. 39–57

work page 2017

[38] [38]

Fast and Furious: Real Time End- to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net,

W. Luo, B. Yang, and R. Urtasun, “Fast and Furious: Real Time End- to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2018, pp. 3569–3577

work page 2018

[39] [39]

Adversarial examples in the physical world

A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” arXiv:1607.02533, 2017. APPENDIX A KF- BASED BEV F LOW INTERPOLATION Given the state transition equations for intermittent BEV flow, we can directly apply these to the Kalman filter (KF) framework for both prediction and state update, and thereby interpolate the missin...

work page internal anchor Pith review Pith/arXiv arXiv 2017