SpliceRadar: A Learned Method For Blind Image Forensics

Aurobrata Ghosh; Maneesh Singh; Terrance E Boult; Zheng Zhong

arxiv: 1906.11663 · v1 · pith:7PXFRFU3new · submitted 2019-06-27 · 💻 cs.CV

Pith reviewed 2026-05-25 14:40 UTC · model grok-4.3

classification 💻 cs.CV

keywords splice localization · image forensics · blind detection · deep learning · camera model identification · Gaussian mixture model · manipulation detection

The pith

A deep learning method localizes image splices without knowing the camera model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a technique for detecting and localizing spliced regions in digital images using deep learning without any information about the camera that took the image. Instead of training directly on manipulated images, the model learns to identify camera models from a large set of untouched photos. The learned features are then used during testing to separate regions that appear to come from different cameras by fitting a Gaussian mixture model. This setup allows the method to work on new images and datasets where camera details are unavailable.

Core claim

We propose a deep learning based method for splice localization without prior knowledge of a test image's camera-model. It comprises a novel approach for learning rich filters and for suppressing image-edges. Additionally, we train our model on a surrogate task of camera model identification, which allows us to leverage large and widely available, unmanipulated, camera-tagged image databases. During inference, we assume that the spliced and host regions come from different camera-models and we segment these regions using a Gaussian-mixture model.

What carries the argument

Convolutional network trained on camera model identification as surrogate task, with learned rich filters and edge suppression, followed by Gaussian mixture model segmentation of feature maps at inference.
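
The segmentation step can be illustrated with a minimal sketch. It assumes each image patch has already been mapped to a feature vector by the trained network; here those vectors are synthetic 2-D points, and `fit_two_component_gmm` and `farthest` are hypothetical helper names. The diagonal covariance and the farthest-point initialisation are simplifications chosen for brevity, not details taken from the paper.

```python
import math
import random

def farthest(points, ref):
    """Return the point with the largest squared distance from ref."""
    return max(points, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, ref)))

def fit_two_component_gmm(features, n_iter=50):
    """Fit a 2-component diagonal-covariance GMM with EM.

    features: list of equal-length float lists, one vector per image patch.
    Returns the posterior probability of component 1 for every patch.
    """
    n, d = len(features), len(features[0])
    gmean = [sum(x[j] for x in features) / n for j in range(d)]
    gvar = [sum((x[j] - gmean[j]) ** 2 for x in features) / n + 1e-6
            for j in range(d)]
    # Deterministic initialisation (an assumption of this sketch): start from
    # the patch farthest from the global mean and the patch farthest from it.
    first = farthest(features, gmean)
    second = farthest(features, first)
    means = [list(first), list(second)]
    variances = [list(gvar), list(gvar)]
    weights = [0.5, 0.5]
    resp = [[0.5, 0.5] for _ in range(n)]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each patch.
        for i, x in enumerate(features):
            logp = []
            for k in range(2):
                ll = math.log(weights[k])
                for j in range(d):
                    v = variances[k][j]
                    ll -= 0.5 * (math.log(2 * math.pi * v)
                                 + (x[j] - means[k][j]) ** 2 / v)
                logp.append(ll)
            m = max(logp)
            e = [math.exp(l - m) for l in logp]
            s = sum(e)
            resp[i] = [e[0] / s, e[1] / s]
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            weights[k] = nk / n
            for j in range(d):
                mu = sum(r[k] * x[j] for r, x in zip(resp, features)) / nk
                var = sum(r[k] * (x[j] - mu) ** 2
                          for r, x in zip(resp, features)) / nk
                means[k][j] = mu
                variances[k][j] = var + 1e-6  # floor avoids zero variance
    return [r[1] for r in resp]

# Synthetic 8x8 patch grid: a 4x4 "spliced" block with shifted features.
rng = random.Random(0)
truth = [1 if (2 <= r < 6 and 2 <= c < 6) else 0
         for r in range(8) for c in range(8)]
features = [[rng.gauss(3.0 * t, 0.3), rng.gauss(3.0 * t, 0.3)] for t in truth]

probs = fit_two_component_gmm(features)
labels = [int(p > 0.5) for p in probs]
agreement = sum(l == t for l, t in zip(labels, truth)) / len(truth)
agreement = max(agreement, 1 - agreement)  # cluster labels are arbitrary
```

Reshaping `probs` back onto the patch grid gives a coarse heat map. Which component corresponds to the splice is not determined by the mixture model itself, which is why the agreement above is taken up to a label swap.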

If this is right

Enables splice localization on images from unknown cameras.
Uses abundant unmanipulated camera-tagged images for training instead of scarce manipulated examples.
Achieves results on par with or above the state-of-the-art on three test databases.
Generalizes to unknown datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Camera model features extracted this way could serve as a starting point for other blind forensic tasks.
The method would likely require a different segmentation step if more than two source cameras are present.
Success depends on the distinctiveness of camera signatures even after splicing operations.

Load-bearing premise

Spliced and host regions in a test image come from different camera models.

What would settle it

Performance collapse on a dataset of splices where both regions are taken from the same camera model.
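
One way to make "performance collapse" measurable is to score predicted masks against ground truth. The sketch below uses a pixel-level F1 score as an assumed metric (this page does not state which metric the paper reports), and the three masks are invented for illustration; they are not results. `f1_score` and `best_f1` are hypothetical helper names.

```python
def f1_score(pred, truth):
    """Pixel-level F1 for flat binary masks (1 = spliced)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_f1(pred, truth):
    """Score a two-cluster output whose labels are arbitrary.

    A clustering step cannot say which cluster is the splice, so both
    labelings are scored and the better one is kept. This uses the ground
    truth to pick the labeling and is therefore an optimistic score.
    """
    inverted = [1 - p for p in pred]
    return max(f1_score(pred, truth), f1_score(inverted, truth))

# Illustrative masks only (flattened, 8 pixels each).
truth     = [0, 0, 1, 1, 0, 0, 1, 1]
separable = [1, 1, 0, 0, 1, 1, 0, 0]  # correct split, labels swapped
collapsed = [0, 1, 0, 1, 0, 1, 0, 1]  # clusters unrelated to the splice

score_separable = best_f1(separable, truth)  # 1.0
score_collapsed = best_f1(collapsed, truth)  # 0.5
```

Comparing such scores between cross-camera splices and same-camera splices would show directly how far the method depends on the load-bearing premise.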

Figures

Figures reproduced from arXiv: 1906.11663 by Aurobrata Ghosh, Maneesh Singh, Terrance E Boult, Zheng Zhong.

**Figure 1.** SpliceRadar is able to learn low-level features while suppressing image-specific semantic information. This allows it to generalize well to new tampered datasets. Two examples: col-1: input image, col-2: sample of a learned rich filter (contains semantic edges), col-3: final features (semantic edges suppressed), col-4: output heat map indicating tampered region.

**Figure 2.** System architecture of SpliceRadar.

**Figure 3.** Qualitative results from SpliceRadar. Col-1: input image, col-2: ground-truth manipulation mask, col-3: predicted probability …

**Figure 4.** Qualitative comparison of SpliceRadar, SB and EXIF-SC. Col-1: input image, col-2: ground-truth manipulation mask, col-3: …

**Figure 5.** Hard examples where all three algorithms, SpliceRadar, SB and EXIF-SC, fail to detect the spliced regions. Col-1: input image, …

read the original abstract

Detection and localization of image manipulations like splices are gaining in importance with the easy accessibility of image editing software. While detection generates a verdict for an image, it provides no insight into the manipulation. Localization helps explain a positive detection by identifying the pixels of the image that have been tampered with. We propose a deep learning based method for splice localization without prior knowledge of a test image's camera-model. It comprises a novel approach for learning rich filters and for suppressing image-edges. Additionally, we train our model on a surrogate task of camera model identification, which allows us to leverage large and widely available, unmanipulated, camera-tagged image databases. During inference, we assume that the spliced and host regions come from different camera-models and we segment these regions using a Gaussian-mixture model. Experiments on three test databases demonstrate results on par with and above the state-of-the-art and a good generalization ability to unknown datasets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The method trains a CNN on camera-model identification as a surrogate task and then clusters the learned features with a GMM, but it localizes splices only under the untested assumption that spliced and host regions come from different camera models.

read the letter

The main idea is training a network on camera model identification to get features useful for splice localization, adding edge suppression, and then running GMM clustering at test time. This lets them skip needing camera knowledge for the test image and tap into large public camera-tagged datasets instead of scarce manipulated ones. That training choice is practical and avoids some data bottlenecks common in forensics work. The edge suppression step also looks like a sensible way to reduce false positives from natural boundaries.

The assumption during inference is the clear soft spot. The paper states that spliced and host regions come from different camera models and relies on the GMM to separate them, but nothing in the abstract or description shows tests where that does not hold. If a splice uses the same camera model, the features may not separate and localization collapses regardless of how good the filters are. Real splices do not always obey the different-camera rule, so the method's reliability depends on how often that assumption matches practice. The abstract claims results on par with or above SOTA plus good generalization on three databases, yet supplies no numbers, baselines, or ablations, which leaves the evidence hard to weigh.

This is aimed at computer vision groups working on blind forensics. It has a workable idea and honest use of surrogate training, so it deserves peer review to examine the full experiments and see whether the assumption is addressed or quantified there.

Referee Report

2 major / 0 minor

Summary. The paper proposes SpliceRadar, a CNN-based method for blind splice localization that requires no prior camera-model knowledge of the test image. A network is trained on the surrogate task of camera-model identification using large unmanipulated datasets; the architecture includes novel components for learning rich filters and suppressing image edges. At inference the learned features are clustered with a GMM under the explicit assumption that spliced and host regions originate from different camera models. Experiments on three test databases are claimed to match or exceed prior SOTA while showing good generalization to unknown data.

Significance. The surrogate-task strategy that exploits abundant camera-tagged data is a clear strength and could meaningfully advance blind forensics if the empirical claims are substantiated. However, the load-bearing inference assumption (different camera models) is unvalidated in the provided description, which limits the assessed significance until addressed.

major comments (2)

[Abstract] The localization pipeline rests on the assumption that 'the spliced and host regions come from different camera-models' followed by GMM segmentation, yet no experiment, ablation, or analysis is described that tests feature separability when this assumption is violated or quantifies how often real-world splices satisfy it.
[Abstract] The claim that experiments 'demonstrate results on par with and above the state-of-the-art' supplies no metrics, baselines, error bars, dataset sizes, or ablation results, preventing any assessment of the central empirical claim.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments highlighting the central assumption and the need for clearer empirical support in the abstract. We respond to each point below.

read point-by-point responses

Referee: [Abstract] The localization pipeline rests on the assumption that 'the spliced and host regions come from different camera-models' followed by GMM segmentation, yet no experiment, ablation, or analysis is described that tests feature separability when this assumption is violated or quantifies how often real-world splices satisfy it.

Authors: The assumption is stated explicitly as a design choice for blind localization. We agree that testing feature separability under violation (same-camera splices) is valuable and will add a controlled ablation on the test sets by artificially creating same-model splices to measure degradation. A full quantification of real-world splice statistics is difficult without a dedicated provenance dataset, but we will add discussion referencing prior forensics literature on cross-camera splicing prevalence. revision: yes

Referee: [Abstract] The claim that experiments 'demonstrate results on par with and above the state-of-the-art' supplies no metrics, baselines, error bars, dataset sizes, or ablation results, preventing any assessment of the central empirical claim.

Authors: Abstracts are space-limited and serve as summaries; the full Experiments section reports the metrics, baselines, dataset sizes (three test databases), and comparisons. We will revise the abstract to include key quantitative highlights (e.g., F1 scores and dataset names) while remaining within length limits. revision: yes
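
The same-model ablation mentioned in the first response could be set up along the following lines. This is only a sketch under stated assumptions: images are small 2-D lists of grey values, `splice` is a hypothetical helper, and for a same-model test the host and donor would be two different photos taken with the same camera model. The mask doubles as the ground-truth localization map.

```python
def splice(host, donor, mask):
    """Paste donor pixels into host wherever mask is 1.

    host, donor and mask are same-sized 2-D lists; the inputs are not
    modified and a new composite image is returned.
    """
    return [[d if m else h for h, d, m in zip(host_row, donor_row, mask_row)]
            for host_row, donor_row, mask_row in zip(host, donor, mask)]

# Toy 4x4 example with constant images so the pasted region is easy to see.
host = [[10] * 4 for _ in range(4)]
donor = [[200] * 4 for _ in range(4)]
mask = [[1 if (1 <= r < 3 and 1 <= c < 3) else 0 for c in range(4)]
        for r in range(4)]

composite = splice(host, donor, mask)
spliced_pixels = sum(sum(row) for row in mask)
```

Running the localization pipeline on composites built this way, once with cross-model pairs and once with same-model pairs, would give the degradation measurement the response promises.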

standing simulated objections not resolved

A rigorous quantification of how frequently real-world splices satisfy the different-camera-model assumption would require a large-scale study of verified manipulated images with camera metadata, which is not feasible within this work.

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external data and standard clustering

full rationale

The paper trains a CNN on the surrogate task of camera-model identification using large external camera-tagged databases of unmanipulated images. At inference it applies a standard Gaussian-mixture model to the learned features under an explicitly stated assumption that spliced and host regions originate from different camera models. No equations, fitted parameters, or predictions are shown to reduce by construction to the method's own inputs. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The central claims therefore remain independent of the paper's own outputs and rest on external benchmarks and conventional post-processing.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only input supplies no derivations, fitted constants, or new postulated entities; the approach rests on standard deep-learning assumptions and the explicit inference assumption of differing camera models for spliced regions.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages

[1] S. Agarwal and H. Farid. Photo forensics from JPEG dimples. In 2017 IEEE Workshop on Information Forensics and Security (WIFS), pages 1–6, 12 2017.
[2] J. H. Bappy, A. K. Roy-Chowdhury, J. Bunk, L. Nataraj, and B. S. Manjunath. Exploiting spatial structure for localizing manipulated image regions. In The IEEE International Conference on Computer Vision (ICCV), 10 2017.
[3] M. Barni, E. Nowroozi, and B. Tondi. Higher-order, adversary-aware, double JPEG-detection via selected training on attacked samples. In 25th European Signal Processing Conference (EUSIPCO), pages 281–285, 08 2017.
[4] B. Bayar and M. C. Stamm. Augmented convolutional feature maps for robust CNN-based camera model identification. In 2017 IEEE International Conference on Image Processing (ICIP), pages 4098–4102, 09 2017.
[5] B. Bayar and M. C. Stamm. Constrained convolutional neural networks: A new approach towards general purpose image manipulation detection. IEEE Transactions on Information Forensics and Security, 13(11):2691–2706, 11 2018.
[6] L. Bondi, L. Baroffio, D. Güera, P. Bestagini, E. J. Delp, and S. Tubaro. First steps toward camera model identification with convolutional neural networks. IEEE Signal Processing Letters, 24(3):259–263, 03 2017.
[7] L. Bondi, S. Lameri, D. Güera, P. Bestagini, E. Delp, and S. Tubaro. Tampering detection and localization through clustering of camera-based CNN features. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1855–1864, 07 2017.
[8] M. Chen, J. Fridrich, M. Goljan, and J. Lukáš. Determining image origin and integrity using sensor noise. IEEE Transactions on Information Forensics and Security, 3:74–90, 04 2008.
[9] D. Cozzolino, G. Poggi, and L. Verdoliva. Splicebuster: A new blind image splicing detector. In 2015 IEEE International Workshop on Information Forensics and Security (WIFS), pages 1–6, 11 2015.
[10] D. Cozzolino, G. Poggi, and L. Verdoliva. Recasting residual-based local descriptors as convolutional neural networks: An application to image forgery detection. In Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, pages 159–164, New York, NY, USA, 2017. ACM.
[11] D. Cozzolino, J. Thies, A. Rössler, C. Riess, M. Nießner, and L. Verdoliva. Forensictransfer: Weakly-supervised domain adaptation for forgery detection. arXiv, 2018.
[12] D. Cozzolino and L. Verdoliva. Noiseprint: a CNN-based camera model fingerprint. arXiv, 2018.
[13] T. J. d. Carvalho, C. Riess, E. Angelopoulou, H. Pedrini, and A. d. R. Rocha. Exposing digital image forgeries by illumination color classification. IEEE Transactions on Information Forensics and Security, 8(7):1182–1194, 07 2013.
[14] J. Fiscus, H. Guan, Y. Lee, A. Yates, A. Delgado, D. Zhou, D. Joy, and A. Pereira. The 2017 Nimble Challenge Evaluation: Results and Future Directions, 2017.
[15] J. Fridrich and J. Kodovsky. Rich models for steganalysis of digital images. IEEE Transactions on Information Forensics and Security, 7(3):868–882, 06 2012.
[16] T. Gloe and R. Böhme. The 'Dresden Image Database' for benchmarking digital image forensics. In Proceedings of the 25th Symposium On Applied Computing (ACM SAC 2010), volume 2, pages 1585–1591, 2010.
[17] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 06 2016.
[18] M. Huh, A. Liu, A. Owens, and A. A. Efros. Fighting fake news: Image splice detection via learned self-consistency. In V. Ferrari, M. Hebert, C. Sminchisescu, and Y. Weiss, editors, Computer Vision – ECCV, pages 106–124, Cham, 2018. Springer International Publishing.
[19] J. Lukáš, J. Fridrich, and M. Goljan. Digital camera identification from sensor pattern noise. IEEE Transactions on Information Forensics and Security, 1(2):205–214, 06 2006.
[20] F. Maes, D. Vandermeulen, and P. Suetens. Medical image registration using mutual information. Proceedings of the IEEE, 91(10):1699–1722, 10 2003.
[21] O. Mayer and M. C. Stamm. Learned forensic source similarity for unknown camera models. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE SigPort, 2018.
[22] A. C. Popescu and H. Farid. Exposing digital forgeries in color filter array interpolated images. IEEE Transactions on Signal Processing, 53(10):3948–3959, 10 2005.
[23] T. Qiao, F. Retraint, R. Cogranne, and T. H. Thai. Individual camera device identification from JPEG images. Signal Processing: Image Communication, 52:74–86, 2017.
[24] A. Rössler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner. Faceforensics++: Learning to detect manipulated facial images. arXiv, 2019.
[25] R. Salloum, Y. Ren, and C.-C. J. Kuo. Image splicing localization using a multi-task fully convolutional network (MFCN). Journal of Visual Communication and Image Representation, 51:201–209, 2018.
[26] K. San Choi, E. Lam, and K. Wong. Source camera identification by JPEG compression statistics for image forensics. In IEEE Region Conf. TENCON, pages 1–4, 12 2006.
[27] M. Zampoglou, S. Papadopoulos, and I. Kompatsiaris. Large-scale evaluation of splicing localization algorithms for web images. Multimedia Tools and Applications, 09 2016.
[28] P. Zhou, X. Han, V. I. Morariu, and L. S. Davis. Learning rich features for image manipulation detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1053–1061, 06 2018.
