Fine-pruning: Defending against backdooring attacks on deep neural networks
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
BehaviorGuard detects backdoor behaviors in DRL policies via behavioral drift in action distributions and suppresses suspicious actions at runtime, claimed as the first online defense for both single- and multi-agent settings.
citing papers explorer
- Plan2Cleanse: Test-Time Backdoor Defense via Monte-Carlo Planning in Deep Reinforcement Learning
Plan2Cleanse frames RL backdoor detection as a Monte Carlo planning problem to achieve over 61 percentage point gains in trigger detection and improved win rates in competitive environments.
- BehaviorGuard: Online Backdoor Defense for Deep Reinforcement Learning
BehaviorGuard detects backdoor behaviors in DRL policies via behavioral drift in action distributions and suppresses suspicious actions at runtime, claimed as the first online defense for both single- and multi-agent settings.