Fine-pruning: Defending against backdooring attacks on deep neural networks

Kang Liu, Brendan Dolan-Gavitt, Siddharth Garg · 2025 · arXiv 2505.19532

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Plan2Cleanse: Test-Time Backdoor Defense via Monte-Carlo Planning in Deep Reinforcement Learning

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

Plan2Cleanse frames RL backdoor detection as a Monte Carlo planning problem, reporting gains of over 61 percentage points in trigger detection and improved win rates in competitive environments.

BehaviorGuard: Online Backdoor Defense for Deep Reinforcement Learning

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

BehaviorGuard detects backdoor behaviors in DRL policies via behavioral drift in action distributions and suppresses suspicious actions at runtime, claimed as the first online defense for both single- and multi-agent settings.

citing papers explorer

Showing 2 of 2 citing papers.

Plan2Cleanse: Test-Time Backdoor Defense via Monte-Carlo Planning in Deep Reinforcement Learning · cs.LG · 2026-05-10 · unverdicted · none · ref 7
BehaviorGuard: Online Backdoor Defense for Deep Reinforcement Learning · cs.AI · 2026-05-07 · unverdicted · none · ref 20
