Pots: Proof-of-training-steps for backdoor detection in large language models

2025 · arXiv 2510.15106

1 Pith paper cites this work. Polarity classification is still being indexed.


representative citing papers

BackFlush: Knowledge-Free Backdoor Detection and Elimination with Watermark Preservation in Large Language Models

cs.CR · 2026-04-15 · unverdicted · novelty 6.0

BackFlush detects backdoors via susceptibility amplification and eliminates them with RoPE unlearning to reach 1% ASR and 99% clean accuracy while preserving watermarks.

citing papers explorer

Showing 1 of 1 citing paper.

BackFlush: Knowledge-Free Backdoor Detection and Elimination with Watermark Preservation in Large Language Models

cs.CR · 2026-04-15 · unverdicted · none · ref 26

BackFlush detects backdoors via susceptibility amplification and eliminates them with RoPE unlearning to reach 1% ASR and 99% clean accuracy while preserving watermarks.
