TIGS detects backdoor-induced attention collapse in LLMs and applies content-aware tail-risk screening plus intrinsic geometric smoothing to suppress attacks while preserving normal performance.
Sampdetox: Black-box backdoor defense via perturbation-based sample detoxification.NeurIPS, 37:121236–121264, 2024b
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
TCAP detects backdoor samples in MLLM fine-tuning via tri-component attention profiling, GMM-based head identification, and EM vote aggregation.
citing papers explorer
-
Defusing the Trigger: Plug-and-Play Defense for Backdoored LLMs via Tail-Risk Intrinsic Geometric Smoothing
TIGS detects backdoor-induced attention collapse in LLMs and applies content-aware tail-risk screening plus intrinsic geometric smoothing to suppress attacks while preserving normal performance.
-
TCAP: Tri-Component Attention Profiling for Unsupervised Backdoor Detection in MLLM Fine-Tuning
TCAP detects backdoor samples in MLLM fine-tuning via tri-component attention profiling, GMM-based head identification, and EM vote aggregation.