CogManip is a benchmark that tests 13 LLMs on 15 manipulation risks in 1,000 multi-turn dialogues, finding heterogeneous risks and prompt sensitivity in models like DeepSeek-V3.2.
arXiv preprint arXiv:2602.14135, 2026
1 Pith paper cites this work.
CogManip: Benchmarking Manipulative Behavior in Multi-Turn Interactions with Large Language Model