Baseline defenses including perplexity-based detection, input preprocessing, and adversarial training offer partial robustness to text adversarial attacks on LLMs, with challenges arising from weak discrete optimizers.
Adversarial attacks on machine learning systems for high-frequency trading
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2023 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
Baseline Defenses for Adversarial Attacks Against Aligned Language Models
Baseline defenses including perplexity-based detection, input preprocessing, and adversarial training offer partial robustness to text adversarial attacks on LLMs, with challenges arising from weak discrete optimizers.