Your agent can defend itself against backdoor attacks

2025 · arXiv 2506.08336

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

ATAAT: Adaptive Threat-Aware Adversarial Tuning Framework against Backdoor Attacks on Vision-Language-Action Models

cs.RO · 2026-05-09 · unverdicted · novelty 6.0

ATAAT is an adaptive adversarial tuning method that enables effective, stealthy backdoor attacks on VLA models by dynamically selecting gradient decoupling strategies based on attacker capabilities.

From Prompt to Physical Action: Structured Backdoor Attacks on LLM-Mediated Robotic Control Systems

cs.RO · 2026-04-04 · unverdicted · novelty 6.0

Backdoor attacks aligned with JSON command formats in LLM robot controllers achieve 83% attack success rate while preserving over 93% clean accuracy and sub-second latency.

citing papers explorer

Showing 2 of 2 citing papers.

ATAAT: Adaptive Threat-Aware Adversarial Tuning Framework against Backdoor Attacks on Vision-Language-Action Models cs.RO · 2026-05-09 · unverdicted · none · ref 26
From Prompt to Physical Action: Structured Backdoor Attacks on LLM-Mediated Robotic Control Systems cs.RO · 2026-04-04 · unverdicted · none · ref 23
