A new backdoor technique called TSBH uses reverse tree search to create malicious chain-of-thought data and injects it in two stages to hijack LLM reasoning upon trigger activation.
Darkmind: Latent chain-of-thought backdoor in customized llms
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
ShadowCoT introduces a reasoning-level backdoor attack on LLMs achieving 94.4% attack success rate and 88.4% hijacking success rate with 0.15% parameter updates via internal state conditioning and reasoning chain pollution.
A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.
citing papers explorer
-
Unreal Thinking: Chain-of-Thought Hijacking via Two-stage Backdoor
A new backdoor technique called TSBH uses reverse tree search to create malicious chain-of-thought data and injects it in two stages to hijack LLM reasoning upon trigger activation.
-
ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs
ShadowCoT introduces a reasoning-level backdoor attack on LLMs achieving 94.4% attack success rate and 88.4% hijacking success rate with 0.15% parameter updates via internal state conditioning and reasoning chain pollution.
-
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.