Neural trojans,
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: UNVERDICTED 2
citing papers explorer
- Imitation Game for Adversarial Disillusion with Chain-of-Thought Reasoning in Generative AI
  A new defense framework called the disillusion paradigm uses an imitation game and chain-of-thought reasoning in generative agents to neutralize deductive and inductive adversarial illusions in white-box and black-box scenarios.
- Hypnopaedia-Aware Machine Unlearning via Psychometrics of Artificial Mental Imagery
  Proposes a self-aware unlearning method inspired by hypnopaedia that uses model inversion and hypothesis testing to detect and detach backdoor triggers from machine learning models.