Addressing Image Authenticity When Cameras Use Generative AI
Pith reviewed 2026-05-09 22:21 UTC · model grok-4.3
The pith
An image-specific MLP decoder with a modality encoder recovers the pre-hallucination version of camera images after generative AI processing in the ISP.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By optimizing an image-specific multi-layer perceptron (MLP) decoder together with a modality-specific encoder, the method recovers the image before hallucinated content was added from the final camera output. The encoder and MLP decoder are self-contained, require only 180 KB of storage, and can be embedded as metadata in standard formats such as JPEG and HEIC, allowing post-capture application without ISP access.
What carries the argument
An image-specific MLP decoder optimized jointly with a modality-specific encoder that inverts the effects of generative AI operations performed during image capture.
If this is right
- The recovered unhallucinated image helps avoid misinterpretation of semantic content introduced by AI enhancements.
- The approach operates post-capture on the final image alone, independent of the camera ISP.
- Minimal storage of 180 KB allows embedding the decoder and encoder in common image file formats.
- Applies to specific operations like AI-based digital zoom and low-light image enhancement.
Where Pith is reading between the lines
- This could lead to software tools that automatically restore original image details from enhanced camera photos.
- The method might extend to detecting and correcting other forms of generative modifications in imaging devices.
- Camera manufacturers could adopt this to provide optional authenticity recovery features in future models.
Load-bearing premise
The hallucinations added by generative operations in the camera ISP are sufficiently invertible from the final image alone through optimization of an image-specific MLP decoder with a modality encoder, without access to ground-truth pre-hallucination data or the ISP itself.
What would settle it
Comparing the MLP-recovered image against a known ground-truth pre-hallucination image (such as a non-AI zoomed capture) and finding that it does not reduce the difference metrics like perceptual loss compared to the original AI-processed image.
Figures
read the original abstract
The ability of generative AI (GenAI) methods to photorealistically alter camera images has raised awareness about the authenticity of images shared online. Interestingly, images captured directly by our cameras are considered authentic and faithful. However, with the increasing integration of deep-learning modules into cameras' capture-time hardware -- namely, the image signal processor (ISP) -- there is now a potential for hallucinated content in images directly output by our cameras. Hallucinated capture-time image content is typically benign, such as enhanced edges or texture, but in certain operations, such as AI-based digital zoom or low-light image enhancement, hallucinations can potentially alter the semantics and interpretation of the image content. As a result, users may not realize that the content in their camera images is not authentic. This paper addresses this issue by enabling users to recover the 'unhallucinated' version of the camera image to avoid misinterpretation of the image content. Our approach works by optimizing an image-specific multi-layer perceptron (MLP) decoder together with a modality-specific encoder so that, given the camera image, we can recover the image before hallucinated content was added. The encoder and MLP are self-contained and can be applied post-capture to the image without requiring access to the camera ISP. Moreover, the encoder and MLP decoder require only 180 KB of storage and can be readily saved as metadata within standard image formats such as JPEG and HEIC.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a post-capture method to recover the 'unhallucinated' version of images captured by cameras whose ISP incorporates generative AI modules. The approach optimizes an image-specific MLP decoder together with a modality-specific encoder using only the final camera image, enabling recovery of pre-hallucination content without direct ISP access. The encoder and decoder are claimed to require only 180 KB of storage and can be embedded as metadata in standard formats like JPEG and HEIC.
Significance. If the proposed optimization successfully inverts the effects of ISP-based generative AI operations such as AI zoom and low-light enhancement, the result would be significant for image forensics and authenticity verification in consumer devices. It offers a lightweight, self-contained solution that does not require manufacturer cooperation or access to proprietary ISP pipelines, potentially allowing widespread adoption for preserving original scene content.
major comments (2)
- [Abstract] Abstract: The central claim that an image-specific MLP decoder can recover the pre-hallucination image is presented without any quantitative results, validation experiments, error metrics, or comparisons to baselines. This leaves the invertibility assertion unsupported by evidence.
- [Abstract] Abstract: The optimization of the MLP decoder is described only at a high level; no loss function, objective, or training procedure is specified for the case where no ground-truth pre-hallucination image is available. This makes the mapping underconstrained, as many plausible pre-images are consistent with the observed output.
Simulated Author's Rebuttal
We thank the referee for their constructive comments and positive evaluation of the work's potential impact. We respond point-by-point to the major comments below and indicate the revisions we will incorporate.
read point-by-point responses
-
Referee: [Abstract] Abstract: The central claim that an image-specific MLP decoder can recover the pre-hallucination image is presented without any quantitative results, validation experiments, error metrics, or comparisons to baselines. This leaves the invertibility assertion unsupported by evidence.
Authors: The abstract is intentionally concise and does not include numerical results. The full manuscript contains quantitative validation experiments, including error metrics such as PSNR and SSIM, along with comparisons to relevant baselines that support the invertibility claim. We will revise the abstract to include a brief summary of these key quantitative results. revision: yes
-
Referee: [Abstract] Abstract: The optimization of the MLP decoder is described only at a high level; no loss function, objective, or training procedure is specified for the case where no ground-truth pre-hallucination image is available. This makes the mapping underconstrained, as many plausible pre-images are consistent with the observed output.
Authors: The abstract provides a high-level overview, but the full manuscript details the self-supervised optimization, including the specific loss function (a combination of reconstruction consistency with the observed image and regularization terms to constrain the solution space) and the training procedure. This addresses the underconstrained nature by enforcing fidelity and preventing arbitrary pre-images. We will revise the abstract to briefly specify the loss function and objective. revision: yes
Circularity Check
No significant circularity; method is an independent optimization procedure
full rationale
The paper presents a post-capture optimization of an image-specific MLP decoder and modality-specific encoder to recover a pre-hallucination image from the final camera output. No equations, derivations, or first-principles claims are described that reduce the recovery result to a fitted parameter, self-referential definition, or self-citation chain. The approach is framed as an empirical optimization applied to the observed image alone, without any load-bearing step that renames or tautologically equates the output to the input by construction. The central claim therefore remains self-contained as a proposed method rather than a circular restatement of its own assumptions.
Axiom & Free-Parameter Ledger
free parameters (1)
- image-specific MLP weights
axioms (2)
- domain assumption Hallucinations from AI-based ISP operations are invertible using a modality-specific encoder and image-specific decoder without access to the original capture pipeline.
- domain assumption A modality-specific encoder can be trained or provided to enable the decoder to undo hallucinations across similar camera modes.
Reference graph
Works this paper leans on
-
[1]
DALL·E 3.https://www.openai.com/index/ dall-e-3/. Accessed: 2026-02-26. 1
work page 2026
-
[2]
Make your own deepfakes.https://deepfakesweb. com/. Accessed: 2026-02-26. 1
work page 2026
-
[3]
EU Artificial Intelligence Act.https : / / artificialintelligenceact.eu/high-level- summary/. Accessed: 2026-02-26. 1
work page 2026
-
[4]
Faceapp: Face editor.https://www.faceapp.com/. Accessed: 2026-02-26. 1
work page 2026
-
[5]
AI generated images on FB.https://about.fb.com/ news / 2024 / 02 / labeling - ai - generated - images - on - facebook - instagram - and - threads/. Accessed: 2026-02-26. 2
work page 2024
-
[6]
Adobe Firefly.https : / / www . adobe . com / products/firefly.html. Accessed: 2026-02-26. 1
work page 2026
-
[7]
google / technology / ai / google - gen - ai - content - transparency - c2pa/
GenAI content transparency on Google.https : / / blog . google / technology / ai / google - gen - ai - content - transparency - c2pa/. Accessed: 2026-02-26. 2
work page 2026
- [8]
-
[9]
Protecting world leaders against deep fakes
Shruti Agarwal, Hany Farid, Yuming Gu, Mingming He, Koki Nagano, and Hao Li. Protecting world leaders against deep fakes. InCVPR workshops, 2019. 1
work page 2019
-
[10]
NTIRE 2017 chal- lenge on single image super-resolution: Dataset and study
Eirikur Agustsson and Radu Timofte. NTIRE 2017 chal- lenge on single image super-resolution: Dataset and study. InCVPR Workshops, 2017. 5, 6, 7, 8, 9, 10, 11
work page 2017
-
[11]
Michael S. Brown. Color processing for digital cameras. InFundamentals and Applications of Colour Engineering, chapter 4, pages 81–98. Wiley, 2023. 2
work page 2023
-
[12]
Simple baselines for image restoration
Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, and Jian Sun. Simple baselines for image restoration. InECCV, 2022. 5, 6, 9
work page 2022
-
[13]
Mauricio Delbracio, Damien Kelly, Michael S. Brown, and Peyman Milanfar. Mobile computational photography: A tour.Annual Review of Vision Science, 7(1):571–604, 2021. 2
work page 2021
-
[14]
Tom Dobber, Nadia Metoui, Damian Trilling, Natali Hel- berger, and Claes De Vreese. Do (microtargeted) deepfakes have real effects on political attitudes?The International Journal of Press/Politics, 26(1):69–91, 2021. 1
work page 2021
-
[15]
Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. Image super-resolution using deep convolutional net- works.IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2):295–307, 2016. 3
work page 2016
-
[16]
Brown, Radu Timofte, Karlo Ko ˇsˇcevi´c, Michael Freeman, Vasily Tesalin, Dmitry Bocharov, et al
Egor Ershov, Alex Savchik, Denis Shepelev, Nikola Bani ´c, Michael S. Brown, Radu Timofte, Karlo Ko ˇsˇcevi´c, Michael Freeman, Vasily Tesalin, Dmitry Bocharov, et al. NTIRE 2022 challenge on night photography rendering. InCVPR Workshops, 2022. 3
work page 2022
-
[17]
Image forgery detection.IEEE Signal Process- ing Magazine, 26(2):16–25, 2009
Hany Farid. Image forgery detection.IEEE Signal Process- ing Magazine, 26(2):16–25, 2009. 3
work page 2009
-
[18]
Creating, using, misusing, and detecting deep fakes.Journal of Online Trust and Safety, 1(4), 2022
Hany Farid. Creating, using, misusing, and detecting deep fakes.Journal of Online Trust and Safety, 1(4), 2022. 3
work page 2022
-
[19]
Implicit diffusion models for continuous super-resolution
Sicheng Gao, Xuhui Liu, Bohan Zeng, Sheng Xu, Yan- jing Li, Xiaoyan Luo, Jianzhuang Liu, Xiantong Zhen, and Baochang Zhang. Implicit diffusion models for continuous super-resolution. InCVPR, 2023. 2
work page 2023
-
[20]
Generative adversarial nets.NeurIPS, 2014
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets.NeurIPS, 2014. 3
work page 2014
-
[21]
TruFor: Leveraging all-round clues for trustworthy image forgery detection and localiza- tion
Fabrizio Guillaro, Davide Cozzolino, Avneesh Sud, Nicholas Dufour, and Luisa Verdoliva. TruFor: Leveraging all-round clues for trustworthy image forgery detection and localiza- tion. InCVPR, 2023. 3
work page 2023
-
[22]
Michael Hameleers, Toni GLA van der Meer, and Tom Dob- ber. You won’t believe what they just said! The effects of political deepfakes embedded as vox populi on social media. Social Media+ Society, 8(3), 2022. 1
work page 2022
-
[23]
DRAW: Defending camera-shooted RAW against image manipulation
Xiaoxiao Hu, Qichao Ying, Zhenxing Qian, Sheng Li, and Xinpeng Zhang. DRAW: Defending camera-shooted RAW against image manipulation. InICCV, 2023. 3
work page 2023
-
[24]
AIM 2019 chal- lenge on RAW to RGB mapping: Methods and results
Andrey Ignatov, Radu Timofte, Sung-Jea Ko, Seung-Wook Kim, Kwang-Hyun Uhm, Seo-Won Ji, Sung-Jin Cho, Jun- Pyo Hong, Kangfu Mei, Juncheng Li, et al. AIM 2019 chal- lenge on RAW to RGB mapping: Methods and results. In ICCV Workshops, 2019. 3
work page 2019
-
[25]
AIM 2020 chal- lenge on learned image signal processing pipeline
Andrey Ignatov, Radu Timofte, Zhilu Zhang, Ming Liu, Haolin Wang, Wangmeng Zuo, Jiawei Zhang, Ruimao Zhang, Zhanglin Peng, Sijie Ren, et al. AIM 2020 chal- lenge on learned image signal processing pipeline. InECCV Workshops, 2020
work page 2020
-
[26]
Learned smartphone ISP on mobile NPUs with deep learning, mobile AI 2021 chal- lenge: Report
Andrey Ignatov, Cheng-Ming Chiang, Hsien-Kai Kuo, Anas- tasia Sycheva, and Radu Timofte. Learned smartphone ISP on mobile NPUs with deep learning, mobile AI 2021 chal- lenge: Report. InCVPR Workshops, 2021. 3
work page 2021
-
[27]
Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, et al. Efficient and accurate quantized image super-resolution on mobile NPUs, mobile AI & AIM 2022 challenge: report. In ECCV Workshops, 2022. 3
work page 2022
-
[28]
Learned smartphone ISP on mobile GPUs with deep learning, mobile AI & AIM 2022 challenge: Report
Andrey Ignatov, Radu Timofte, Shuai Liu, Chaoyu Feng, Fu- rui Bai, Xiaotao Wang, Lei Lei, Ziyao Yi, Yan Xiang, Zibin Liu, et al. Learned smartphone ISP on mobile GPUs with deep learning, mobile AI & AIM 2022 challenge: Report. In ECCV Workshops, 2022. 3
work page 2022
-
[29]
AutoDIR: Automatic all-in-one image restoration with latent diffusion
Yitong Jiang, Zhaoyang Zhang, Tianfan Xue, and Jinwei Gu. AutoDIR: Automatic all-in-one image restoration with latent diffusion. InECCV, 2024. 6
work page 2024
-
[30]
Perceptual losses for real-time style transfer and super-resolution
Justin Johnson, Alexandre Alahi, and Li Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, 2016. 3
work page 2016
-
[31]
Adam: A method for stochastic optimization.arXiv, 2014
Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization.arXiv, 2014. 6
work page 2014
-
[32]
Paweł Korus. Digital image integrity – a survey of protection and verification techniques.Digital Signal Processing, 71:1– 26, 2017. 3
work page 2017
-
[33]
Pawel Korus and Nasir Memon. Content authentication for neural imaging pipelines: End-to-end optimization of photo provenance in complex distribution channels. InCVPR,
-
[34]
GamutMLP: A lightweight MLP for color loss re- covery
Hoang M Le, Brian Price, Scott Cohen, and Michael S Brown. GamutMLP: A lightweight MLP for color loss re- covery. InCVPR, 2023. 4
work page 2023
-
[35]
Photo- realistic single image super-resolution using a generative ad- versarial network
Christian Ledig, Lucas Theis, Ferenc Husz´ar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photo- realistic single image super-resolution using a generative ad- versarial network. InCVPR, 2017. 3
work page 2017
-
[36]
Metadata- based RAW reconstruction via implicit neural functions
Leyi Li, Huijie Qiao, Qi Ye, and Qinmin Yang. Metadata- based RAW reconstruction via implicit neural functions. In CVPR, 2023. 4
work page 2023
-
[37]
Learning generative structure prior for blind text image super-resolution
Xiaoming Li, Wangmeng Zuo, and Chen Change Loy. Learning generative structure prior for blind text image super-resolution. InCVPR, 2023. 3, 5, 7
work page 2023
-
[38]
Zhetong Liang, Jianrui Cai, Zisheng Cao, and Lei Zhang. CameraNet: A two-stage framework for effective camera ISP learning.IEEE Transactions on Image Processing, 30: 2248–2262, 2021. 3, 4
work page 2021
-
[39]
Deep-FlexISP: A three- stage framework for night photography rendering
Shuai Liu, Chaoyu Feng, Xiaotao Wang, Hao Wang, Ran Zhu, Yongqiang Li, and Lei Lei. Deep-FlexISP: A three- stage framework for night photography rendering. InCVPR Workshops, 2022. 3, 4
work page 2022
-
[40]
Srinivasan, Matthew Tancik, Jonathan T
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view syn- thesis. InECCV, 2020. 2, 3, 5, 6, 7
work page 2020
-
[41]
Thomas M ¨uller, Alex Evans, Christoph Schied, and Alexan- der Keller. Instant neural graphics primitives with a multires- olution hash encoding.ACM Transactions on Graphics, 41 (4):1–15, 2022. 3, 5, 6, 7, 9
work page 2022
-
[42]
Seonghyeon Nam, Abhijith Punnappurath, Marcus A. Brubaker, and Michael S. Brown. Learning sRGB-to-raw- RGB de-rendering with content-aware metadata. InCVPR,
-
[43]
Rang M. H. Nguyen and Michael S. Brown. RAW image re- construction using a self-contained sRGB-JPEG image with only 64 KB overhead. InCVPR, 2016. 4
work page 2016
-
[44]
Thanh Thi Nguyen, Quoc Viet Hung Nguyen, Dung Tien Nguyen, Duc Thanh Nguyen, Thien Huynh-The, Saeid Nahavandi, Thanh Tam Nguyen, Quoc-Viet Pham, and Cuong M Nguyen. Deep learning for deepfakes creation and detection: A survey.Computer Vision and Image Under- standing, 223:103525, 2022. 3
work page 2022
-
[45]
Huseyin Ozkaya and Veysel Aslantas. A triple self- embedding fragile watermarking scheme for image tamper detection and recovery.IEEE Access, 12:140082–140096,
-
[46]
Statistical tools for digi- tal forensics
Alin C Popescu and Hany Farid. Statistical tools for digi- tal forensics. InThe International Workshop on Information Hiding, 2004. 3
work page 2004
-
[47]
Abhijith Punnappurath, Luxi Zhao, Abdelrahman Abdel- hamed, and Michael S. Brown. Advocating pixel-level au- thentication of camera-captured images.IEEE Access, 12: 45839–45846, 2024. 2, 3, 4, 5, 9
work page 2024
-
[48]
NTIRE 2023 challenge on night photography rendering
Alina Shutova, Egor Ershov, Georgy Perevozchikov, Ivan Er- makov, Nikola Bani´c, Radu Timofte, Richard Collins, Maria Efimova, Arseniy Terekhin, Simone Zini, et al. NTIRE 2023 challenge on night photography rendering. InCVPR Work- shops, 2023. 3
work page 2023
-
[49]
Implicit neural representa- tions with periodic activation functions.NeurIPS, 2020
Vincent Sitzmann, Julien Martel, Alexander Bergman, David Lindell, and Gordon Wetzstein. Implicit neural representa- tions with periodic activation functions.NeurIPS, 2020. 2, 3, 5, 6, 7
work page 2020
-
[50]
Implicit neural representations for image compression
Yannick Str ¨umpler, Janis Postels, Ren Yang, Luc Van Gool, and Federico Tombari. Implicit neural representations for image compression. InECCV, 2022. 2, 4
work page 2022
-
[51]
Cristian Vaccari and Andrew Chadwick. Deepfakes and disinformation: Exploring the impact of synthetic political video on deception, uncertainty, and trust in news.Social media+ society, 6(1), 2020. 1
work page 2020
-
[52]
Luisa Verdoliva. Media forensics and deepfakes: an overview.IEEE Journal of Selected Topics in Signal Pro- cessing, 14(5):910–932, 2020. 3
work page 2020
-
[53]
Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data
Xintao Wang, Liangbin Xie, Chao Dong, and Ying Shan. Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data. InICCV Workshops, 2021. 3, 5, 7, 9
work page 2021
-
[54]
Raw image reconstruc- tion with learned compact metadata
Yufei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex C Kot, and Bihan Wen. Raw image reconstruc- tion with learned compact metadata. InCVPR, 2023. 4
work page 2023
-
[55]
SinSR: Diffusion-based image super- resolution in a single step
Yufei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C Kot, and Bihan Wen. SinSR: Diffusion-based image super- resolution in a single step. InCVPR, 2024. 3
work page 2024
-
[56]
Deep Retinex decomposition for low-light enhancement
Chen Wei, Wenjing Wang, Wenhan Yang, and Jiaying Liu. Deep Retinex decomposition for low-light enhancement. In BMVC, 2018. 5, 6, 7, 8
work page 2018
-
[57]
Qichao Ying, Hang Zhou, Zhenxing Qian, Sheng Li, and Xinpeng Zhang. Learning to immunize images for tamper localization and self-recovery.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023. 3
work page 2023
-
[58]
ResShift: Efficient diffusion model for image super- resolution by residual shifting.NeurIPS, 2024
Zongsheng Yue, Jianyi Wang, and Chen Change Loy. ResShift: Efficient diffusion model for image super- resolution by residual shifting.NeurIPS, 2024. 3
work page 2024
-
[59]
The unreasonable effectiveness of deep features as a perceptual metric
Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. InCVPR, 2018. 3
work page 2018
-
[60]
EditGuard: Versatile image watermarking for tamper localization and copyright protection
Xuanyu Zhang, Runyi Li, Jiwen Yu, Youmin Xu, Weiqi Li, and Jian Zhang. EditGuard: Versatile image watermarking for tamper localization and copyright protection. InCVPR,
-
[61]
Diffusion-based blind text image super-resolution
Yuzhe Zhang, Jiawei Zhang, Hao Li, Zhouxia Wang, Luwei Hou, Dongqing Zou, and Liheng Bian. Diffusion-based blind text image super-resolution. InCVPR, 2024. 3
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.