Dissect and Prune: Enhancing Robustness in AI-Generated Image Detection

Dahye Kim; Donghun Lee; Hyun Seok Seong; Jaehyun Choi; Jang-Ho Choi; Seongho Kim; Sungwon Yi

arxiv: 2606.10309 · v1 · pith:CYQA2ICSnew · submitted 2026-06-09 · 💻 cs.CV


Pith reviewed 2026-06-27 14:15 UTC · model grok-4.3

classification 💻 cs.CV

keywords AI-generated image detection · robustness · post-processing · inpainting · feature pruning · prediction asymmetry · generative artifacts


The pith

Pruning features aligned with inpainted regions reduces bias toward real images in AI-generated content detectors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Current detectors achieve high accuracy mainly by favoring the real class, which reduces their ability to catch generated images once standard operations like compression or resizing are applied. The paper traces this to spurious features that distract from actual generative traces. DEAR identifies these by checking how strongly each channel's activations match inpaint masks, then removes the channels at both extremes of alignment. What remains are the activations that track true artifacts instead. Tests confirm the pruned models handle new generators and post-processed inputs with less bias toward calling everything real.

Core claim

By measuring channel alignment to inpaint masks and pruning activations that match either the inpainted or non-inpainted regions too closely, DEAR retains only those features that encode genuine generative artifacts; the resulting detectors exhibit reduced prediction asymmetry and greater robustness to unseen generators and post-processing operations.

What carries the argument

DEAR (Dissect and Prune), which removes channel activations whose alignment with inpaint masks falls at either extreme, leaving only those aligned with genuine generative artifacts.
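The dissect-then-prune idea can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the page does not give the formula for the alignment score, so the difference-of-means form in `rad_scores` is an assumption, as are the function names, the nested-list data layout, and the convention that `alpha` is the fraction removed from each tail.

```python
from statistics import mean

def rad_scores(activations, mask):
    """Per-channel alignment score: mean activation inside the mask minus mean outside.

    activations: list of channels, each a 2D list (H x W) of floats.
    mask: 2D list (H x W) of 0/1, where 1 marks an inpainted pixel.
    ASSUMPTION: this difference-of-means form is illustrative, not the paper's formula.
    Both regions must be non-empty, otherwise statistics.mean raises an error.
    """
    scores = []
    for channel in activations:
        inside, outside = [], []
        for act_row, mask_row in zip(channel, mask):
            for a, m in zip(act_row, mask_row):
                (inside if m else outside).append(a)
        scores.append(mean(inside) - mean(outside))
    return scores

def bilateral_keep(scores, alpha):
    """Return the sorted indices of channels kept after dropping both tails.

    ASSUMPTION: alpha is the fraction of channels removed from EACH tail.
    """
    n = len(scores)
    k = int(alpha * n)
    order = sorted(range(n), key=lambda i: scores[i])  # ascending by score
    return sorted(order[k:n - k])

# Toy data: the left column of a 2x2 map is "inpainted".
mask = [[1, 0], [1, 0]]
channels = [
    [[1.0, 0.0], [1.0, 0.0]],  # fires only inside the mask
    [[0.0, 1.0], [0.0, 1.0]],  # fires only outside the mask
    [[0.5, 0.5], [0.5, 0.5]],  # no regional preference
    [[0.6, 0.4], [0.6, 0.4]],  # mild preference for the mask
    [[0.4, 0.6], [0.4, 0.6]],  # mild preference for the background
]
kept = bilateral_keep(rad_scores(channels, mask), alpha=0.2)
```

With five channels and `alpha=0.2`, one channel is dropped from each end of the ranking, so the two channels that fire exclusively inside or outside the mask are removed and the three with weaker regional preference remain.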

If this is right

Detectors retain sensitivity to generated content even after compression and resizing.
Performance holds up when tested on generators not seen during training.
The bias that produces more real-class predictions is reduced.
Only channels that capture actual generative traces survive the pruning step.
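The bias mentioned in this list can be made concrete as the gap between real accuracy (R.Acc) and fake accuracy (F.Acc), the two quantities named in the Figure 1 caption. The sketch below is one hedged way to compute them; the function name, the 0.5 threshold, the label convention, and the toy scores are all assumptions for illustration and are not values from the paper.

```python
def class_accuracies(fake_scores, labels, threshold=0.5):
    """Return (real_acc, fake_acc) for a detector that outputs a 'fake' score.

    labels: 1 for a generated image, 0 for a real image (assumed convention).
    An image is predicted 'fake' when its score is at or above the threshold.
    """
    real_hits = [s < threshold for s, y in zip(fake_scores, labels) if y == 0]
    fake_hits = [s >= threshold for s, y in zip(fake_scores, labels) if y == 1]
    if not real_hits or not fake_hits:
        raise ValueError("need at least one real and one generated image")
    return sum(real_hits) / len(real_hits), sum(fake_hits) / len(fake_hits)

# Toy placeholder scores, not results from the paper.
toy_scores = [0.1, 0.2, 0.3, 0.9, 0.4, 0.2]
toy_labels = [0, 0, 0, 1, 1, 1]
r_acc, f_acc = class_accuracies(toy_scores, toy_labels)
asymmetry_gap = r_acc - f_acc  # large positive gap = bias toward the real class
```

A detector can report a respectable overall accuracy while `asymmetry_gap` is large, which is the situation the paper describes.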

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Inpainting could serve as a general probe for isolating spurious correlations in other forensic or classification tasks.
The same alignment measurement might be used to audit training data for hidden biases before model deployment.
Retraining from scratch with only the retained features might further stabilize performance across domains.

Load-bearing premise

Features that align strongly with inpainted or non-inpainted regions are the ones that become unreliable after post-processing and that hide the real generative signals.

What would settle it

Apply DEAR to an existing detector, then measure whether the false-negative rate on post-processed generated images from an unseen generator remains as high as the original model's rate.
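The check described above reduces to comparing two false-negative rates on the same set of post-processed generated images. A minimal sketch follows, assuming each detector outputs a "fake" score and a fixed 0.5 threshold; the function name and the toy scores are hypothetical placeholders, not measurements.

```python
def false_negative_rate(fake_scores, threshold=0.5):
    """Fraction of generated images scored below the threshold (i.e. called real)."""
    if not fake_scores:
        raise ValueError("need at least one generated image")
    misses = sum(1 for s in fake_scores if s < threshold)
    return misses / len(fake_scores)

# Toy placeholder scores for the SAME post-processed generated images,
# scored once by the original detector and once by the pruned one.
baseline_scores = [0.2, 0.4, 0.6, 0.3]
pruned_scores = [0.7, 0.4, 0.8, 0.6]

fnr_baseline = false_negative_rate(baseline_scores)
fnr_pruned = false_negative_rate(pruned_scores)
claim_supported = fnr_pruned < fnr_baseline
```

If the pruned detector's rate is not lower than the baseline's on an unseen generator, the robustness claim does not hold for that setting.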

Figures

Figures reproduced from arXiv: 2606.10309 by Dahye Kim, Donghun Lee, Hyun Seok Seong, Jaehyun Choi, Jang-Ho Choi, Seongho Kim, Sungwon Yi.

**Figure 1.** Prediction Asymmetry in AIGI Detection. Comparison of real accuracy (R.Acc) and fake accuracy (F.Acc) before and after post-processing on FLUX (Labs, 2024a) generated images. Existing detectors maintain high R.Acc regardless of post-processing, but F.Acc drops dramatically after post-processing is applied. This asymmetric degradation reveals that detectors rely on fragile spurious features for fake detec…

**Figure 2.** Overview of Dissect and Prune (DEAR). Our method operates in three stages: (1) Dissection via Feature Alignment, where we use inpainted images and masks as a diagnostic tool to measure how strongly each feature channel aligns with generated or real regions through Regional Activation Discrepancy (RAD); (2) Bilateral Pruning, where we identify and remove channels at both extremes of the RAD distribution, as…

**Figure 3.** Feature Alignment Visualization. From left to right: original real image, inpainting mask, inpainted image, and activation maps from three representative channels. The high RAD channel (fourth column) activates strongly within the inpainted region, the low RAD channel (sixth column) activates predominantly on the background, and the middle RAD channel (fifth column) shows no clear regional preference. we…

**Figure 4.** Figure 4 illustrates the relationship between RAD and robustness, revealing that channels at both extremes of the RAD distribution are more susceptible to degradation than those in the middle range. We attribute this trend to the spurious nature of features at these extremes: strongly negative values correspond to dataset-specific signatures like compression artifacts (Rajan & Lee, 2025; Grommelt et al., 2024), w…

**Figure 5.** Score distribution shift. Baseline detectors (left) exhibit severe distribution shift on unseen generators, while DEAR variants (right) maintain stable fake score distributions above the decision threshold. AEROBLADE (Ricker et al., 2024) and WaRPAD (Choi et al., 2025) offer deployment flexibility by analyzing reconstruction errors or feature stability without detector-specific optimization. Most relevan…

**Figure 6.** Examples from the diagnostic inpaint dataset. Each group of three rows shows: original real images (top), binary inpaint masks (middle), and resulting inpainted images with mask boundaries highlighted in white (bottom).

**Figure 7.** Extended Feature Alignment Visualization. Additional examples demonstrating the regional activation patterns across different images. Each row shows (from left to right): original image, inpainting mask, inpainted image, and activation maps from high, middle, and low RAD channels. Consistent with…

**Figure 8.** Average AUC comparison across generators. We compare baseline detectors (Corvi, Rajan), their Stay-Positive variants (Corvi+, Rajan+), inpaint-trained variants (Corvi-inpaint, Rajan-inpaint), and DEAR with varying pruning ratios (α ∈ {0.1, 0.2, 0.3}). Hatched bars indicate performance on original images; solid bars indicate post-processed images. DEAR consistently achieves the highest average AUC, particul…

**Figure 9.** Per-generator AUC comparison. Detailed breakdown of detection performance across nine generators (SD, MJ, KD, PG, PixArt, LCM, FLUX, Wuerstchen, aMUSEd). Top row shows results on original images; bottom row shows results on post-processed images. DEAR variants (blue for Corvi-based, orange for Rajan-based) demonstrate superior robustness.

Abstract

While existing AI-generated image detectors report high performance, we identify that this is largely driven by a critical prediction asymmetry: a bias toward the real class that severely limits sensitivity to generated content, especially under standard post-processing operations such as compression and resizing. We hypothesize that this stems from the model's reliance on spurious features, distracting signals that obscure true generative artifacts. To address this, we propose DEAR (Dissect and Prune), which leverages inpainted images to identify and prune these interfering components. Specifically, we find that features strongly aligned to either inpainted or non-inpainted regions are less robust to post-processing. By measuring the alignment between channel activations and inpaint masks, DEAR removes features at both extremes, retaining only those that capture genuine generative artifacts. Experimental results demonstrate that our approach significantly enhances robustness against unseen generators and post-processing, effectively mitigating the prediction asymmetry. Our code is available at https://github.com/dahyedahye/dear.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

DEAR's inpaint-based channel pruning targets a real asymmetry in detectors but rests on an unverified assumption that middle-alignment features are the genuine ones.


The main thing here is a pruning trick that uses inpaint masks to drop the most and least aligned channels in a detector, on the idea that those extremes pick up spurious signals while the middle ones keep the real generative traces. That produces claimed gains against post-processing and unseen generators.

What stands out is the concrete mechanism: they measure per-channel activation alignment with the inpaint mask on inpainted images, then prune both tails. The abstract flags the prediction asymmetry problem clearly and the code is released, which is useful for anyone trying to reproduce or extend this. The hypothesis that extreme-alignment channels are less robust is at least testable.

The soft spot is that nothing in the provided abstract shows the retained channels actually correspond to known artifact signatures like frequency patterns or upsampling traces from earlier work. If the inpainter itself injects correlated signals, the alignment score could just be selecting against inpainter-specific features rather than purifying the detector. No numbers, dataset sizes, or statistical checks appear in the abstract, and the choice of alignment thresholds is not explained. That leaves the robustness claim hanging on unspecified experiments.

This is for people already working on practical robustness fixes for image detectors who want a new pruning heuristic to try. It is not yet strong enough on its own for broad claims about artifact isolation. A serious editor should send it to review once the full experiments and controls are checked, because the asymmetry problem is worth addressing and the method is simple enough to evaluate directly.

Referee Report

3 major / 1 minor

Summary. The paper identifies a prediction asymmetry in existing AI-generated image detectors that biases toward the real class and reduces sensitivity to generated images, particularly under post-processing like compression and resizing. It hypothesizes this arises from spurious features and proposes DEAR (Dissect and Prune), which generates inpainted images, measures alignment between channel activations and inpaint masks, prunes channels at both high- and low-alignment extremes, and retains middle channels asserted to capture only genuine generative artifacts. The abstract states that experiments show this significantly enhances robustness to unseen generators and post-processing while mitigating the asymmetry; code is released.

Significance. If the central claims are substantiated with quantitative evidence, the approach could provide a practical, inpainting-based pruning technique to improve detector robustness by isolating features less tied to specific post-processing or inpainting signals. The release of code is a strength for reproducibility. However, without metrics or validation against established artifact signatures, the significance for the broader field of generative image detection remains unclear.

major comments (3)

[Abstract] Abstract: The central claim that 'experimental results demonstrate that our approach significantly enhances robustness' and 'effectively mitigating the prediction asymmetry' is presented without any quantitative metrics, dataset sizes, baseline comparisons, statistical tests, or details on alignment threshold selection. This information is load-bearing for assessing whether the robustness gains are real and general.
[DEAR description] Paragraph describing DEAR: The pruning rule rests on the assumption that 'features strongly aligned to either inpainted or non-inpainted regions are less robust to post-processing' and that retained middle channels capture 'genuine generative artifacts,' yet no independent verification is described that links retained channels to documented artifact types such as spectral anomalies, upsampling traces, or frequency-domain patterns from prior GAN/CNN detection literature.
[Experiments] Experiments section (implied by abstract claims): The post-processing robustness results and claims about unseen generators lack any reported details on experimental protocol, number of images, choice of inpainting method, or how thresholds are chosen, preventing assessment of whether gains are an artifact of the specific inpainter rather than a general purification of the detector.

minor comments (1)

[Abstract] The abstract could more precisely define 'prediction asymmetry' and briefly indicate the scale of the experiments even if full numbers appear later.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment point-by-point below, proposing revisions to strengthen the manuscript where the concerns are valid.


Referee: [Abstract] Abstract: The central claim that 'experimental results demonstrate that our approach significantly enhances robustness' and 'effectively mitigating the prediction asymmetry' is presented without any quantitative metrics, dataset sizes, baseline comparisons, statistical tests, or details on alignment threshold selection. This information is load-bearing for assessing whether the robustness gains are real and general.

Authors: We agree that the abstract would be strengthened by including key quantitative results. In the revised version, we will add specific metrics such as accuracy improvements under post-processing (e.g., compression and resizing), reduction in prediction asymmetry, dataset sizes, and baseline comparisons from our experiments. Threshold selection details will also be summarized. revision: yes
Referee: [DEAR description] Paragraph describing DEAR: The pruning rule rests on the assumption that 'features strongly aligned to either inpainted or non-inpainted regions are less robust to post-processing' and that retained middle channels capture 'genuine generative artifacts,' yet no independent verification is described that links retained channels to documented artifact types such as spectral anomalies, upsampling traces, or frequency-domain patterns from prior GAN/CNN detection literature.

Authors: The pruning criterion is motivated by empirical robustness tests showing that extreme-alignment channels degrade under post-processing. The manuscript does not include direct, independent verification mapping retained channels to specific prior artifact signatures. We will add a discussion section in the revision referencing established artifact literature (e.g., spectral and frequency patterns) to better contextualize our retained features, while noting that end-to-end gains on unseen generators serve as the primary validation. revision: partial
Referee: [Experiments] Experiments section (implied by abstract claims): The post-processing robustness results and claims about unseen generators lack any reported details on experimental protocol, number of images, choice of inpainting method, or how thresholds are chosen, preventing assessment of whether gains are an artifact of the specific inpainter rather than a general purification of the detector.

Authors: The full manuscript reports the experimental protocol, including dataset sizes, the specific inpainting method, threshold selection via alignment score distributions, and evaluation on multiple post-processing operations and unseen generators. We will expand the Experiments section in the revision to make these details more explicit and add a sensitivity analysis on inpainter choice to demonstrate generality. revision: yes

Circularity Check

0 steps flagged

No circularity: experimental pruning validated by external benchmarks


The paper presents DEAR as a heuristic channel-pruning procedure that measures activation-mask alignment on inpainted images and retains middle-ranked channels; robustness gains are reported via direct experiments on unseen generators and post-processing. No equations, fitted parameters, or self-citations are shown to reduce the central claim to its own inputs by construction. The method is self-contained against external test sets and does not invoke uniqueness theorems or rename known results.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The approach implicitly assumes inpainting produces a useful contrast signal for identifying spurious features.



Reference graph

Works this paper leans on

71 extracted references · 5 linked inside Pith

[1] Network dissection: Quantifying interpretability of deep visual representations. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[2] GAN dissection: Visualizing and understanding generative adversarial networks. arXiv preprint arXiv:1811.10597.
[3] On the detection of synthetic images generated by diffusion models. ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.
[4] Aligned Datasets Improve Detection of Latent Diffusion-Generated Images. The Thirteenth International Conference on Learning Representations.
[5] Rethinking the up-sampling operations in CNN-based generative network for generalizable deepfake detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[6] Improving synthetic image detection towards generalization: An image transformation perspective. Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1.
[7] A sanity check for AI-generated image detection. arXiv preprint arXiv:2406.19435.
[8] FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies. arXiv preprint arXiv:2509.20890.
[9] Real-time deepfake detection in the real-world. arXiv preprint arXiv:2406.09398.
[10] Towards universal fake image detectors that generalize across generative models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[11] C2P-CLIP: Injecting category common prompt in CLIP to enhance generalization in deepfake detection. Proceedings of the AAAI Conference on Artificial Intelligence.
[12] Raising the bar of AI-generated image detection with CLIP. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[13] Leveraging representations from intermediate encoder-blocks for synthetic image detection. European Conference on Computer Vision, 2024.
[14] Effort: Efficient orthogonal modeling for generalizable AI-generated image detection. International Conference on Machine Learning (ICML).
[15] A bias-free training paradigm for more general AI-generated image detection. Proceedings of the Computer Vision and Pattern Recognition Conference.
[16] Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable. arXiv preprint arXiv:2505.14359.
[17] DRCT: Diffusion reconstruction contrastive training towards universal detection of diffusion generated images. Forty-first International Conference on Machine Learning.
[18] Contrasting deepfakes diffusion via contrastive learning and global-local similarities. European Conference on Computer Vision, 2024.
[19] AEROBLADE: Training-free detection of latent diffusion images using autoencoder reconstruction error. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[20] Training-free Detection of AI-generated images via Cropping Robustness. arXiv preprint arXiv:2511.14030.
[21] LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365.
[22] Microsoft COCO: Common objects in context. European Conference on Computer Vision, 2014.
[23] High-resolution image synthesis with latent diffusion models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[24] RedCaps: Web-curated image-text data created by the people, for the people. arXiv preprint arXiv:2111.11431.
[25] Black Forest Labs. 2024.
[26] Kandinsky: an improved text-to-image synthesis with image prior and latent diffusion. arXiv preprint arXiv:2310.03502.
[27] Playground v2.5: Three insights towards enhancing aesthetic quality in text-to-image generation. arXiv preprint arXiv:2402.17245.
[28] Junsong Chen and Jincheng YU and Chongjian GE and Lewei Yao and Enze Xie and Zhongdao Wang and James Kwok and Ping Luo and Huchuan Lu and Zhenguo Li.
[29] Latent consistency models: Synthesizing high-resolution images with few-step inference. arXiv preprint arXiv:2310.04378.
[30] Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models. The Twelfth International Conference on Learning Representations.
[31] aMUSEd: An open MUSE reproduction. arXiv preprint arXiv:2401.01808.
[32] LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models. ICLR.
[33] Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems.
[34] Scalable diffusion models with transformers. Proceedings of the IEEE/CVF International Conference on Computer Vision.
[35] Flow matching for generative modeling. arXiv preprint arXiv:2210.02747.
[36] Frequency-aware deepfake detection: Improving generalizability through frequency space domain learning. Proceedings of the AAAI Conference on Artificial Intelligence.
[37] Is Artificial Intelligence Generated Image Detection a Solved Problem? arXiv preprint arXiv:2505.12335.
[38] Stay-Positive: A Case for Ignoring Real Image Features in Fake Image Detection. Forty-second International Conference on Machine Learning.
[39] Automatic correction of internal units in generative neural networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[40] CNN-generated images are surprisingly easy to spot... for now. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[41] Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[42] Active contours without edges. IEEE Transactions on Image Processing, 2001.
[43] Shortcut learning in deep neural networks. Nature Machine Intelligence, 2020.
[44] Fake or JPEG? Revealing Common Biases in Generated Image Detection Datasets. ECCV.
[45] FreqDebias: Towards Generalizable Deepfake Detection via Consistency-Driven Frequency Debiasing. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] From Specificity to Generality: Revisiting Generalizable Artifacts in Detecting Face Deepfakes. The Thirty-ninth Annual Conference on Neural Information Processing Systems.
[47] DIRE for diffusion-generated image detection. Proceedings of the IEEE/CVF International Conference on Computer Vision.
[48] Luo, Yunpeng and Du, Junlong and Yan, Ke and Ding, Shouhong. Lare\^
[49] FakeInversion: Learning to detect images from unseen text-to-image models by inverting stable diffusion. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[50] FIRE: Robust detection of diffusion-generated images via frequency-guided reconstruction error. Proceedings of the Computer Vision and Pattern Recognition Conference.
[51] Forgery-aware adaptive transformer for generalizable synthetic image detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[52] Semantic Discrepancy-aware Detector for Image Forgery Identification. Proceedings of the IEEE/CVF International Conference on Computer Vision.
[53] AIGI-Holmes: Towards explainable and generalizable AI-generated image detection via multimodal large language models. Proceedings of the IEEE/CVF International Conference on Computer Vision.
[54] Breaking semantic artifacts for generalized AI-generated image detection. Advances in Neural Information Processing Systems.
[55] Breaking latent prior bias in detectors for generalizable AIGC image detection. Advances in Neural Information Processing Systems.
[56] Towards universal AI-generated image detection by variational information bottleneck network. Proceedings of the Computer Vision and Pattern Recognition Conference.
[57] MLEP: Multi-granularity Local Entropy Patterns for Generalized AI-generated Image Detection. Advances in Neural Information Processing Systems.
[58] PiD: Generalized AI-Generated Images Detection with Pixelwise Decomposition Residuals. Forty-second International Conference on Machine Learning.
[59] Any-resolution AI-generated image detection by spectral learning. Proceedings of the Computer Vision and Pattern Recognition Conference.
[60] Learning on gradients: Generalized artifacts representation for GAN-generated images detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[61]

Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

LOTA: Bit-Planes Guided AI-Generated Image Detection , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
[62]

Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

Forensic self-descriptions are all you need for zero-shot detection, open-set source attribution, and clustering of ai-generated images , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
[63]

Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

Beyond Generation: A Diffusion-based Low-level Feature Extractor for Detecting AI-generated Images , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
[64]

Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

ForgeLens: Data-Efficient Forgery Focus for Generalizable Forgery Image Detection , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
[65]

Advances in neural information processing systems , volume=

Genimage: A million-scale benchmark for detecting ai-generated image , author=. Advances in neural information processing systems , volume=
[66]

Advances in neural information processing systems , volume=

Seeing is not always believing: Benchmarking human and model perception of ai-generated images , author=. Advances in neural information processing systems , volume=
[67]

Advances in Neural Information Processing Systems , volume=

Df40: Toward next-generation deepfake detection , author=. Advances in Neural Information Processing Systems , volume=
[68]

Advances in Neural Information Processing Systems , volume=

Semi-truths: A large-scale dataset of ai-augmented images for evaluating robustness of ai-generated image detectors , author=. Advances in Neural Information Processing Systems , volume=
[69]

Proceedings of the Computer Vision and Pattern Recognition Conference , pages=

Community forensics: Using thousands of generators to train fake image detectors , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
[70]

Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
[71]

Podell, Dustin and English, Zion and Lacey, Kyle and Blattmann, Andreas and Dockhorn, Tim and M. The Twelfth International Conference on Learning Representations (ICLR).
