DVAR turns video authenticity detection into an iterative debate between a generative hypothesis agent and a natural mechanism agent, resolved via minimum description length and a knowledge base for better generalization than supervised detectors.
Dire for diffusion-generated image detection
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 4roles
background 1polarities
background 1representative citing papers
GAPL learns a compact set of canonical forgery prototypes and applies two-stage LoRA training to build a low-variance feature space that improves generalization across GAN and diffusion generators.
The ITW-SM dataset and targeted optimization of detector design choices yield a 26.87% average AUC improvement for state-of-the-art AI-generated image detectors under real-world social media conditions.
BLINK benchmark shows multimodal LLMs reach only 45-51 percent accuracy on core visual perception tasks where humans achieve 95 percent, indicating these abilities have not emerged.
citing papers explorer
-
DVAR: Adversarial Multi-Agent Debate for Video Authenticity Detection
DVAR turns video authenticity detection into an iterative debate between a generative hypothesis agent and a natural mechanism agent, resolved via minimum description length and a knowledge base for better generalization than supervised detectors.
-
Scaling Up AI-Generated Image Detection with Generator-Aware Prototypes
GAPL learns a compact set of canonical forgery prototypes and applies two-stage LoRA training to build a low-variance feature space that improves generalization across GAN and diffusion generators.
-
Navigating the Challenges of AI-Generated Image Detection in the Wild: What Truly Matters?
The ITW-SM dataset and targeted optimization of detector design choices yield a 26.87% average AUC improvement for state-of-the-art AI-generated image detectors under real-world social media conditions.
-
BLINK: Multimodal Large Language Models Can See but Not Perceive
BLINK benchmark shows multimodal LLMs reach only 45-51 percent accuracy on core visual perception tasks where humans achieve 95 percent, indicating these abilities have not emerged.