Breaking the Resolution Barrier: Arbitrary-resolution Deep Image Steganography Framework

Boyu Wang; Chi Wang; Xiang Zhang; Xinjue Hu; Zhangjie Fu; Zhenshan Tan

arxiv: 2601.15739 · v2 · submitted 2026-01-22 · 💻 cs.CV

Xinjue Hu, Chi Wang, Boyu Wang, Xiang Zhang, Zhenshan Tan, Zhangjie Fu

Pith reviewed 2026-05-16 12:10 UTC · model grok-4.3

classification 💻 cs.CV

keywords deep image steganography · arbitrary resolution · frequency decoupling · implicit neural representation · blind recovery · high-frequency latent · continuous reconstruction

The pith

ARDIS allows hiding a secret image in a cover of fixed resolution and recovering it at any original resolution by decoupling global structure from high-frequency details.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that current deep steganography requires secret and cover images to share the same size, so a secret of a different size must be resampled before hiding and loses detail. ARDIS instead splits the secret into a resolution-matched global basis plus a compact high-frequency latent code and hides both in the cover. At extraction, an implicit neural function uses the latent code to add back the missing high-frequency residuals at whatever resolution the user requests, and it also decodes the target resolution from the hidden data itself. This removes the need to know or match resolutions in advance and, according to the authors, yields better invisibility than prior fixed-resolution methods.
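The split described above can be sketched with fixed filters. This is only an illustration under stated assumptions: the paper's Frequency Decoupling Architecture is learned and compresses the detail into a latent code, whereas this sketch uses a box filter for the basis and keeps the raw residual. The function names `box_downsample`, `nearest_upsample`, `decouple`, and `recombine` are hypothetical.

```python
# Assumption: grayscale images as nested lists whose sides are divisible by `factor`.

def box_downsample(img, factor):
    """Average non-overlapping factor x factor blocks (a crude low-pass filter)."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [
        [
            sum(img[i * factor + di][j * factor + dj]
                for di in range(factor) for dj in range(factor)) / factor ** 2
            for j in range(w)
        ]
        for i in range(h)
    ]


def nearest_upsample(img, factor):
    """Repeat every pixel `factor` times along both axes."""
    return [
        [img[i // factor][j // factor] for j in range(len(img[0]) * factor)]
        for i in range(len(img) * factor)
    ]


def decouple(secret, factor):
    """Split `secret` into a low-resolution basis and a full-resolution residual."""
    basis = box_downsample(secret, factor)
    coarse = nearest_upsample(basis, factor)
    residual = [
        [s - c for s, c in zip(s_row, c_row)]
        for s_row, c_row in zip(secret, coarse)
    ]
    return basis, residual


def recombine(basis, residual, factor):
    """Add the residual back onto the upsampled basis."""
    coarse = nearest_upsample(basis, factor)
    return [
        [c + r for c, r in zip(c_row, r_row)]
        for c_row, r_row in zip(coarse, residual)
    ]


# An 8x8 synthetic "secret" and a 4x4 basis (factor 2).
secret = [[float((3 * i + 5 * j) % 7) for j in range(8)] for i in range(8)]
basis, residual = decouple(secret, factor=2)
restored = recombine(basis, residual, factor=2)
```

With the raw residual kept, recombination is exact. The open question for ARDIS is how much of that residual survives once it is compressed into a latent code and passed through the cover.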

Core claim

The ARDIS framework performs frequency decoupling in the hiding stage to separate a secret image into a global basis aligned with the cover resolution and a resolution-agnostic high-frequency latent code. These are embedded together in a fixed-resolution cover. Recovery employs a latent-guided implicit reconstructor in which the hidden latent modulates a continuous implicit function that queries and renders the high-frequency residuals onto the recovered global basis at any desired output resolution. An implicit resolution coding step further embeds the discrete resolution value as dense feature maps in redundant feature space, enabling fully blind decoding of both the secret content and its

What carries the argument

Frequency Decoupling Architecture paired with Latent-Guided Implicit Reconstructor that modulates a continuous implicit function with a resolution-agnostic high-frequency latent code.

If this is right

Secret images of any size can be hidden without forced downsampling or upsampling before embedding.
The receiver can output the secret at its native resolution even when that resolution is unknown at hiding time.
Cross-resolution recovery fidelity exceeds that of existing fixed-resolution deep steganography methods.
The same cover image can support multiple secret images whose resolutions differ from each other and from the cover.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The continuous reconstruction step could be extended to allow the receiver to request super-resolved versions of the secret beyond its original sampling.
The same frequency-decoupling plus implicit-modulation pattern might apply directly to video or 3-D data where frame or voxel resolutions vary.
Because resolution information travels in the redundant feature domain, the method could be combined with other capacity-enhancing techniques without changing the core architecture.

Load-bearing premise

The high-frequency latent code extracted from the steganographic image can be used by the implicit reconstructor to faithfully restore original details at arbitrary resolutions without significant information loss from the initial decoupling step.
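The premise can be written as a simple error decomposition. The notation is an assumption made for illustration and does not come from the paper: $S$ is the secret at its native resolution $r$, $B$ the global basis, $U_r$ an upsampling operator to resolution $r$, $R = S - U_r(B)$ the high-frequency residual, $z$ the latent code, $f_\theta(\cdot\,; z)$ the implicit function, and hats mark quantities recovered from the stego image.

```latex
\hat{S} = U_r(\hat{B}) + f_\theta(\cdot\,; \hat{z}),
\qquad
\lVert S - \hat{S} \rVert
  \le \underbrace{\lVert U_r(B) - U_r(\hat{B}) \rVert}_{\text{basis recovery error}}
    + \underbrace{\lVert R - f_\theta(\cdot\,; \hat{z}) \rVert}_{\text{detail rendering error}}
```

The inequality is just the triangle inequality. The premise amounts to the claim that the second term stays small for every resolution $r$, including resolutions not seen in training.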

What would settle it

Measure PSNR and SSIM on recovered secret images whose original resolution differs from the cover by a factor of four or more; if average fidelity falls below the levels reported for same-resolution baselines, the claim of faithful arbitrary-resolution recovery does not hold.
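For reference, PSNR can be computed as below. This is a minimal sketch that assumes 8-bit grayscale images stored as nested lists; SSIM is omitted, and the function name `psnr` and the sample arrays are illustrative only.

```python
import math


def psnr(reference, recovered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized grayscale images."""
    squared_error = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, recovered):
        for ref_px, rec_px in zip(ref_row, rec_row):
            squared_error += (ref_px - rec_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [[0.0, 255.0], [255.0, 0.0]]
off_by_one = [[1.0, 254.0], [254.0, 1.0]]
score = psnr(reference, off_by_one)  # mean squared error is exactly 1 here
```

The proposed test would compare this score, averaged over a dataset, between cross-resolution recovery and the same-resolution baselines.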

Figures

Figures reproduced from arXiv: 2601.15739 by Boyu Wang, Chi Wang, Xiang Zhang, Xinjue Hu, Zhangjie Fu, Zhenshan Tan.

**Figure 2.** Overview of the proposed ARDIS, which supports hiding and revealing secret images at arbitrary resolutions.

**Figure 3.** Visual comparisons of our ARDIS with leading deep image steganography methods for stego and recovering secret images in …

**Figure 4.** The visual comparison of ARDIS and the diffusion-based DIS method on the Stego260 dataset. The diffusion-based DIS method …

**Figure 6.** Impact of latent guidance on detail reconstruction. With …

**Figure 5.** Steganalysis accuracy by SRNet. The fact that the curve …

Original abstract

Deep image steganography (DIS) has achieved significant results in capacity and invisibility. However, current paradigms enforce the secret image to maintain the same resolution as the cover image during hiding and revealing. This leads to two challenges: secret images with inconsistent resolutions must undergo resampling beforehand which results in detail loss during recovery, and the secret image cannot be recovered to its original resolution when the resolution value is unknown. To address these, we propose ARDIS, the first Arbitrary Resolution DIS framework, which shifts the paradigm from discrete mapping to reference-guided continuous signal reconstruction. Specifically, to minimize the detail loss caused by resolution mismatch, we first design a Frequency Decoupling Architecture in hiding stage. It disentangles the secret into a resolution-aligned global basis and a resolution-agnostic high-frequency latent to hide in a fixed-resolution cover. Second, for recovery, we propose a Latent-Guided Implicit Reconstructor to perform deterministic restoration. The recovered detail latent code modulates a continuous implicit function to accurately query and render high-frequency residuals onto the recovered global basis, ensuring faithful restoration of original details. Furthermore, to achieve blind recovery, we introduce an Implicit Resolution Coding strategy. By transforming discrete resolution values into dense feature maps and hiding them in the redundant space of the feature domain, the reconstructor can correctly decode the secret's resolution directly from the steganographic representation. Experimental results demonstrate that ARDIS significantly outperforms state-of-the-art methods in both invisibility and cross-resolution recovery fidelity.
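The abstract says only that discrete resolution values are transformed into dense feature maps hidden in redundant feature space. One way to see why a dense encoding tolerates embedding noise is a repetition code, sketched below. The bit layout, the ±1 values, the 16-bit limit, and the names `encode_resolution` and `decode_resolution` are all assumptions for illustration, not the paper's Implicit Resolution Coding.

```python
import random

BITS = 16  # assumption: each side length fits in 16 bits


def encode_resolution(height, width, map_h, map_w):
    """Turn (height, width) into 2 * BITS constant feature maps valued +1 or -1."""
    bits = [(value >> k) & 1 for value in (height, width) for k in range(BITS)]
    return [
        [[1.0 if bit else -1.0] * map_w for _ in range(map_h)]
        for bit in bits
    ]


def decode_resolution(feature_maps):
    """Sum each map, threshold at zero, and reassemble the two integers."""
    bits = [1 if sum(sum(row) for row in fmap) > 0 else 0 for fmap in feature_maps]
    height = sum(bit << k for k, bit in enumerate(bits[:BITS]))
    width = sum(bit << k for k, bit in enumerate(bits[BITS:]))
    return height, width


# Simulate lossy recovery by adding Gaussian noise to every map entry.
rng = random.Random(0)
clean = encode_resolution(1080, 1920, map_h=8, map_w=8)
noisy = [[[v + rng.gauss(0.0, 0.5) for v in row] for row in fmap] for fmap in clean]
decoded = decode_resolution(noisy)
```

Each bit is repeated 64 times, so the per-map average is far less noisy than any single entry. A learned encoder would presumably trade this redundancy against capacity, which the abstract does not quantify.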

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

ARDIS gives a concrete way to hide and recover secret images at arbitrary resolutions by splitting frequencies and using an implicit reconstructor, but the high-frequency latent's sufficiency for unseen scales is the part that still needs evidence.

The paper's main move is to drop the fixed-resolution requirement in deep image steganography. Instead of resampling the secret to match the cover, the authors split it in the hiding stage into a resolution-aligned global basis and a resolution-agnostic high-frequency latent, hide both in the cover, then recover with a latent-guided implicit function that renders details at any queried resolution. They also hide the resolution value itself as a dense feature map so recovery can be blind. Those three pieces (Frequency Decoupling Architecture, Latent-Guided Implicit Reconstructor, and Implicit Resolution Coding) are not in the prior fixed-resolution DIS work the abstract cites, so the claim of being first on arbitrary resolution holds up as new on the surface.

Referee Report

3 major / 2 minor

Summary. The paper introduces ARDIS, the first arbitrary-resolution deep image steganography (DIS) framework. It replaces fixed-resolution discrete mapping with a Frequency Decoupling Architecture that splits the secret image into a resolution-aligned global basis and a resolution-agnostic high-frequency latent code hidden inside a fixed-resolution cover; a Latent-Guided Implicit Reconstructor then uses the recovered latent to modulate a continuous implicit function for detail restoration at any query resolution; an Implicit Resolution Coding scheme embeds the secret resolution as dense feature maps for blind recovery. The authors claim that ARDIS significantly outperforms prior SOTA methods in both steganographic invisibility and cross-resolution recovery fidelity.

Significance. If the central architectural claims are supported by rigorous quantitative results and analysis, the work would address a long-standing practical limitation in DIS by enabling secret images of arbitrary and unknown resolutions without resampling-induced loss. The use of implicit continuous reconstruction and frequency decoupling represents a genuine paradigm shift with potential impact on flexible steganography applications; however, the current manuscript provides only high-level architectural descriptions without equations, training details, metrics, or ablations, so the significance cannot yet be assessed.

major comments (3)

[Abstract / Frequency Decoupling Architecture description] The central claim that the Frequency Decoupling Architecture produces a resolution-agnostic high-frequency latent sufficient for faithful arbitrary-resolution recovery is load-bearing, yet no reconstruction-error bound, invertibility argument, or ablation is supplied to demonstrate that high-frequency residuals orthogonal to the latent code remain negligible (see skeptic note on information loss from the initial split).
[Abstract / Experimental results claim] The abstract asserts significant outperformance over SOTA in both invisibility and cross-resolution fidelity, but no quantitative metrics (PSNR, SSIM, bit-error rates), training protocols, dataset details, or results on resolutions outside the training distribution are provided, preventing verification that the Latent-Guided Implicit Reconstructor generalizes rather than overfitting to fixed test cases.
[Latent-Guided Implicit Reconstructor description] The Latent-Guided Implicit Reconstructor is described only at the level of 'modulates a continuous implicit function'; missing are the precise network architecture, modulation mechanism, loss functions, and any analysis showing that the recovered global basis plus latent code suffice for detail restoration at unseen resolutions.

minor comments (2)

[Abstract] The acronym DIS is used before being defined; ARDIS should be introduced with its full expansion on first use.
[Abstract] The phrase 'reference-guided continuous signal reconstruction' appears without citation to prior implicit neural representation literature or clarification of how the reference is obtained.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. These have identified key areas where the manuscript requires expanded technical detail, quantitative support, and analysis to substantiate the claims of ARDIS. We have prepared a major revision that incorporates the requested information while preserving the core contributions. Our point-by-point responses are provided below.

Referee: [Abstract / Frequency Decoupling Architecture description] The central claim that the Frequency Decoupling Architecture produces a resolution-agnostic high-frequency latent sufficient for faithful arbitrary-resolution recovery is load-bearing, yet no reconstruction-error bound, invertibility argument, or ablation is supplied to demonstrate that high-frequency residuals orthogonal to the latent code remain negligible (see skeptic note on information loss from the initial split).

Authors: We agree that the current high-level description leaves the central claim under-supported. In the revised manuscript we will add a formal mathematical formulation of the Frequency Decoupling Architecture, including the explicit decomposition into resolution-aligned global basis and resolution-agnostic high-frequency latent. We will supply an invertibility argument based on frequency orthogonality and report empirical reconstruction-error bounds obtained across multiple resolution pairs. An ablation study quantifying the contribution of the high-frequency latent (with and without it) will also be included to address potential information loss. revision: yes
Referee: [Abstract / Experimental results claim] The abstract asserts significant outperformance over SOTA in both invisibility and cross-resolution fidelity, but no quantitative metrics (PSNR, SSIM, bit-error rates), training protocols, dataset details, or results on resolutions outside the training distribution are provided, preventing verification that the Latent-Guided Implicit Reconstructor generalizes rather than overfitting to fixed test cases.

Authors: The full experimental section contains quantitative evaluations, yet we acknowledge that the presentation in the abstract and early sections is insufficiently detailed. The revision will explicitly report PSNR, SSIM, and bit-error rates for both invisibility and cross-resolution recovery, together with training protocols (optimizer, learning-rate schedule, batch size) and dataset specifications. We will add dedicated experiments on resolutions outside the training distribution to demonstrate generalization of the Latent-Guided Implicit Reconstructor. revision: yes
Referee: [Latent-Guided Implicit Reconstructor description] The Latent-Guided Implicit Reconstructor is described only at the level of 'modulates a continuous implicit function'; missing are the precise network architecture, modulation mechanism, loss functions, and any analysis showing that the recovered global basis plus latent code suffice for detail restoration at unseen resolutions.

Authors: We will expand the description of the Latent-Guided Implicit Reconstructor with the precise network architecture (MLP layers and hidden dimensions), the modulation mechanism (feature-wise linear modulation of the implicit function by the recovered latent code), and the complete loss functions (pixel-wise L1, perceptual, and adversarial terms). Supporting analysis and ablations will be added to show that the combination of recovered global basis and latent code enables faithful detail restoration at query resolutions unseen during training. revision: yes
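The modulation mechanism named in this simulated response, feature-wise linear modulation (FiLM) of an implicit function, can be sketched as follows. Everything concrete here is an assumption: the single hidden layer, the sine activation, the random untrained weights, and the names `FiLMImplicitFunction` and `make_grid`. It shows only how one latent code can be queried on grids of different sizes.

```python
import math
import random


def make_grid(height, width):
    """Pixel-centre coordinates in [-1, 1] for any requested output size."""
    ys = [(2 * i + 1) / height - 1 for i in range(height)]
    xs = [(2 * j + 1) / width - 1 for j in range(width)]
    return [[(x, y) for x in xs] for y in ys]


def linear(weights, bias, vec):
    """Dense layer: weights @ vec + bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def random_layer(rng, n_out, n_in):
    """Random weights and zero bias (untrained, for illustration only)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class FiLMImplicitFunction:
    """Maps a coordinate (x, y) to a scalar residual, modulated by a latent code."""

    def __init__(self, latent_dim, hidden=8, seed=0):
        rng = random.Random(seed)
        self.coord = random_layer(rng, hidden, 2)
        self.to_gamma = random_layer(rng, hidden, latent_dim)
        self.to_beta = random_layer(rng, hidden, latent_dim)
        self.out = random_layer(rng, 1, hidden)

    def __call__(self, xy, latent):
        gamma = linear(*self.to_gamma, latent)  # per-feature scale from the latent
        beta = linear(*self.to_beta, latent)    # per-feature shift from the latent
        pre = linear(*self.coord, xy)
        hidden = [math.sin(g * p + b) for g, p, b in zip(gamma, pre, beta)]
        return linear(*self.out, hidden)[0]

    def render(self, latent, height, width):
        """Query the same continuous function on a grid of any size."""
        return [[self(xy, latent) for xy in row] for row in make_grid(height, width)]


f = FiLMImplicitFunction(latent_dim=4)
z = [0.3, -0.1, 0.8, 0.5]
small = f.render(z, 4, 6)
large = f.render(z, 9, 13)
```

Because the function is defined on continuous coordinates, output size is a property of the query grid rather than of the network, which is what makes arbitrary-resolution rendering possible in principle.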

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper presents ARDIS as a new architectural framework consisting of independently motivated components (Frequency Decoupling Architecture, Latent-Guided Implicit Reconstructor, Implicit Resolution Coding) that are described via design choices and empirical results rather than any closed mathematical derivation. No load-bearing step reduces a claimed prediction or uniqueness result to a fitted parameter, self-citation, or input by construction; the central claims rest on the proposed network structures and reported performance metrics, which are externally falsifiable. This is the normal case of an engineering paper whose novelty lies in the architecture itself.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claim rests on the unproven effectiveness of frequency decoupling and implicit continuous reconstruction for lossless detail recovery; these are presented as design choices rather than derived results.

axioms (2)

domain assumption: Secret images can be disentangled into a resolution-aligned global basis and a resolution-agnostic high-frequency latent without irreversible information loss.
Invoked in the description of the Frequency Decoupling Architecture in the hiding stage.
domain assumption: A continuous implicit function modulated by the recovered latent code can accurately render high-frequency residuals at arbitrary resolutions.
Central to the Latent-Guided Implicit Reconstructor for recovery.

invented entities (2)

Frequency Decoupling Architecture (no independent evidence)
purpose: Disentangle secret into global basis and high-frequency latent for hiding.
New component introduced to handle resolution mismatch.
Latent-Guided Implicit Reconstructor (no independent evidence)
purpose: Use latent code to guide continuous reconstruction of details.
New component for arbitrary-resolution recovery.

pith-pipeline@v0.9.0 · 5576 in / 1515 out tokens · 45381 ms · 2026-05-16T12:10:38.173080+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "Frequency Decoupling Architecture ... disentangles the secret into a resolution-aligned global basis and a resolution-agnostic high-frequency latent"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 3 internal anchors

[1] Eirikur Agustsson and Radu Timofte. Ntire 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pages 126–135, 2017.

[2] AISegment.cn. Matting human datasets. https://github.com/aisegmentcn/matting_human_datasets, 2019.

[3] Shumeet Baluja. Hiding images in plain sight: Deep steganography. Advances in neural information processing systems, 30, 2017.

[4] Shumeet Baluja. Hiding images within images. IEEE transactions on pattern analysis and machine intelligence, 42(7):1685–1697, 2019.

[5] Sourav Banerjee. Animal image dataset (90 different animals). https://www.kaggle.com/datasets/iamsouravbanerjee/animal-image-dataset-90-different-animals, 2022.

[6] Mehdi Boroumand, Mo Chen, and Jessica Fridrich. Deep residual network for steganalysis of digital images. IEEE Transactions on Information Forensics and Security, 14(5):1181–1193, 2018.

[7] Yinbo Chen, Sifei Liu, and Xiaolong Wang. Learning continuous image representation with local implicit image function. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8628–8638, 2021.

[8] Laurent Dinh, David Krueger, and Yoshua Bengio. NICE: Non-linear Independent Components Estimation. arXiv preprint arXiv:1410.8516, 2014.

[9] Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using Real NVP. arXiv preprint arXiv:1605.08803, 2016.

[10] Xintao Duan, Kai Jia, Baoxia Li, Daidou Guo, En Zhang, and Chuan Qin. Reversible image steganography scheme based on a u-net structure. IEEE Access, 7:9314–9323, 2019.

[11] Delin Duan, Shuyuan Shen, Songsen Yu, Yibo Yuan, Qidong Zhou, Haojie Lv, and Huanjie Lin. Densejin: Dense depth image steganography model with joint invertible and noninvertible mechanisms. IEEE Transactions on Circuits and Systems for Video Technology, 2024.

[12] Jamie Hayes and George Danezis. Generating steganographic images via adversarial training. Advances in neural information processing systems, 30, 2017.

[13] Junpeng Jing, Xin Deng, Mai Xu, Jianyi Wang, and Zhenyu Guan. Hinet: Deep image hiding by invertible network. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4733–4742, 2021.

[14] Xiao Ke, Huanqi Wu, and Wenzhong Guo. Stegformer: Rebuilding the glory of autoencoder-based steganography. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 2723–2731, 2024.

[15] Lei Li, Jingzhu Tang, Ming Chen, Shijie Zhao, Junlin Li, and Li Zhang. Multi-patch learning: looking more pixels in the training phase. In European Conference on Computer Vision, pages 549–560. Springer, 2022.

[16] Fengyong Li, Yang Sheng, Kui Wu, Chuan Qin, and Xinpeng Zhang. Lidinet: A lightweight deep invertible network for image-in-image steganography. IEEE Transactions on Information Forensics and Security, 2024.

[17] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer, 2014.

[18] Hao Liu, Fengyong Li, Chuan Qin, and Xinpeng Zhang. Fearless of noise: Robust image-in-image hiding using dual-tree complex wavelet transform and state space model. IEEE Transactions on Circuits and Systems for Video Technology, 2025.

[19] Shao-Ping Lu, Rong Wang, Tao Zhong, and Paul L Rosin. Large-capacity image steganography based on invertible neural networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10816–10825, 2021.

[20] Ting Luo, Yuhang Zhou, Zhouyan He, Gangyi Jiang, Haiyong Xu, Shuren Qi, and Yushu Zhang. Stegmamba: Distortion-free immune-cover for multi-image steganography with state space model. IEEE Transactions on Circuits and Systems for Video Technology, 2024.

[21] Rafia Rahim, Shahroz Nadeem, et al. End-to-end trained cnn encoder-decoder networks for image steganography. In Proceedings of the European conference on computer vision (ECCV) workshops, pages 0–0, 2018.

[22] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising Diffusion Implicit Models. arXiv preprint arXiv:2010.02502, 2020.

[23] Tycho FA van der Ouderaa and Daniel E Worrall. Reversible gans for memory-efficient image-to-image translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4720–4728, 2019.

[24] Jiannian Wang, Yao Lu, and Guangming Lu. Sshr: More secure generative steganography with high-quality revealed secret images. In Forty-second International Conference on Machine Learning, 2025.

[25] Pratik Wani, Anuja Nanaware, Sneha Shirode, Aishwarya Suram, and Archana Jadhav. Secret communication using multi-image steganography for military purposes. International Journal of Advanced Research in Science, Communication and Technology, 2, 2022.

[26] Xinyu Weng, Yongzhi Li, Lu Chi, and Yadong Mu. High-capacity convolutional video steganography with temporal residual modeling. In Proceedings of the 2019 on international conference on multimedia retrieval, pages 87–95, 2019.

[27] Youmin Xu, Chong Mou, Yujie Hu, Jingfen Xie, and Jian Zhang. Robust invertible image steganography. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7875–7884, 2022.

[28] Yiwei Yang, Zheyuan Liu, Jun Jia, Zhongpai Gao, Yunhao Li, Wei Sun, Xiaohong Liu, and Guangtao Zhai. Diffstega: towards universal training-free coverless image steganography with diffusion models. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pages 1579–1587, 2024.

[29] Jiwen Yu, Xuanyu Zhang, Youmin Xu, and Jian Zhang. Cross: Diffusion model makes controllable, robust and secure image steganography. Advances in Neural Information Processing Systems, 36:80730–80743, 2023.

[30] Chong Yu. Attention based data hiding with generative adversarial networks. In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 1120–1128, 2020.

[31] Chaoning Zhang, Philipp Benz, Adil Karjauv, Geng Sun, and In So Kweon. Udh: Universal deep hiding for steganography, watermarking, and light field messaging. Advances in Neural Information Processing Systems, 33:10223–10234, 2020.

[32] Xuanyu Zhang, Zecheng Tang, Zhipei Xu, Runyi Li, Youmin Xu, Bin Chen, Feng Gao, and Jian Zhang. Omniguard: Hybrid manipulation localization via augmented versatile deep image watermarking. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 3008–3018, 2025.

[33] Junchao Zhou, Yao Lu, Jie Wen, and Guangming Lu. Efficient and separate authentication image steganography network. In Forty-second International Conference on Machine Learning, 2025.

[1] [1]

Ntire 2017 challenge on single image super- resolution: Dataset and study

[Agustsson and Timofte, 2017] Eirikur Agustsson and Radu Timofte. Ntire 2017 challenge on single image super- resolution: Dataset and study. InProceedings of the IEEE conference on computer vision and pattern recognition workshops, pages 126–135,

work page 2017

[2] [2]

Matting human datasets

[AISegment.cn, 2019] AISegment.cn. Matting human datasets. https://github.com/aisegmentcn/matting human datasets,

work page 2019

[3] [3]

Hiding images in plain sight: Deep steganography.Advances in neural informa- tion processing systems, 30,

[Baluja, 2017] Shumeet Baluja. Hiding images in plain sight: Deep steganography.Advances in neural informa- tion processing systems, 30,

work page 2017

[4] [4]

Hiding images within im- ages.IEEE transactions on pattern analysis and machine intelligence, 42(7):1685–1697,

[Baluja, 2019] Shumeet Baluja. Hiding images within im- ages.IEEE transactions on pattern analysis and machine intelligence, 42(7):1685–1697,

work page 2019

[5] [5]

Animal im- age dataset (90 different animals)

[Banerjee, 2022] Sourav Banerjee. Animal im- age dataset (90 different animals). https: //www.kaggle.com/datasets/iamsouravbanerjee/ animal-image-dataset-90-different-animals,

work page 2022

[6] [6]

Deep residual network for steganalysis of digital images.IEEE Transactions on Information Foren- sics and Security, 14(5):1181–1193,

[Boroumandet al., 2018 ] Mehdi Boroumand, Mo Chen, and Jessica Fridrich. Deep residual network for steganalysis of digital images.IEEE Transactions on Information Foren- sics and Security, 14(5):1181–1193,

work page 2018

[7] [7]

Learning continuous image representation with local implicit image function

[Chenet al., 2021 ] Yinbo Chen, Sifei Liu, and Xiaolong Wang. Learning continuous image representation with local implicit image function. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8628–8638,

work page 2021

[8] [8]

NICE: Non-linear Independent Components Estimation

[Dinhet al., 2014 ] Laurent Dinh, David Krueger, and Yoshua Bengio. Nice: Non-linear independent compo- nents estimation.arXiv preprint arXiv:1410.8516,

work page internal anchor Pith review Pith/arXiv arXiv 2014

[9] [Dinh et al., 2016] Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using Real NVP. arXiv preprint arXiv:1605.08803, 2016.

[10] [Duan et al., 2019] Xintao Duan, Kai Jia, Baoxia Li, Daidou Guo, En Zhang, and Chuan Qin. Reversible image steganography scheme based on a U-Net structure. IEEE Access, 7:9314–9323, 2019.

[11] [Duan et al., 2024] Delin Duan, Shuyuan Shen, Songsen Yu, Yibo Yuan, Qidong Zhou, Haojie Lv, and Huanjie Lin. DenseJIN: Dense depth image steganography model with joint invertible and noninvertible mechanisms. IEEE Transactions on Circuits and Systems for Video Technology, 2024.

[12] [Hayes and Danezis, 2017] Jamie Hayes and George Danezis. Generating steganographic images via adversarial training. Advances in Neural Information Processing Systems, 30, 2017.

[13] [Jing et al., 2021] Junpeng Jing, Xin Deng, Mai Xu, Jianyi Wang, and Zhenyu Guan. HiNet: Deep image hiding by invertible network. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4733–4742, 2021.

[14] [Ke et al., 2024] Xiao Ke, Huanqi Wu, and Wenzhong Guo. StegFormer: Rebuilding the glory of autoencoder-based steganography. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 2723–2731, 2024.

[15] [Li et al., 2022] Lei Li, Jingzhu Tang, Ming Chen, Shijie Zhao, Junlin Li, and Li Zhang. Multi-patch learning: looking more pixels in the training phase. In European Conference on Computer Vision, pages 549–560. Springer, 2022.

[16] [Li et al., 2024] Fengyong Li, Yang Sheng, Kui Wu, Chuan Qin, and Xinpeng Zhang. LiDiNet: A lightweight deep invertible network for image-in-image steganography. IEEE Transactions on Information Forensics and Security, 2024.

[17] [Lin et al., 2014] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740–755. Springer, 2014.

[18] [Liu et al., 2025] Hao Liu, Fengyong Li, Chuan Qin, and Xinpeng Zhang. Fearless of noise: Robust image-in-image hiding using dual-tree complex wavelet transform and state space model. IEEE Transactions on Circuits and Systems for Video Technology, 2025.

[19] [Lu et al., 2021] Shao-Ping Lu, Rong Wang, Tao Zhong, and Paul L Rosin. Large-capacity image steganography based on invertible neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10816–10825, 2021.

[20] [Luo et al., 2024] Ting Luo, Yuhang Zhou, Zhouyan He, Gangyi Jiang, Haiyong Xu, Shuren Qi, and Yushu Zhang. StegMamba: Distortion-free immune-cover for multi-image steganography with state space model. IEEE Transactions on Circuits and Systems for Video Technology, 2024.

[21] [Rahim et al., 2018] Rafia Rahim, Shahroz Nadeem, et al. End-to-end trained CNN encoder-decoder networks for image steganography. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, pages 0–0, 2018.

[22] [Song et al., 2020] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502, 2020.

[23] [van der Ouderaa and Worrall, 2019] Tycho FA van der Ouderaa and Daniel E Worrall. Reversible GANs for memory-efficient image-to-image translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4720–4728, 2019.

[24] [Wang et al., 2025] Jiannian Wang, Yao Lu, and Guangming Lu. SSHR: More secure generative steganography with high-quality revealed secret images. In Forty-second International Conference on Machine Learning, 2025.

[25] [Wani et al., 2022] Pratik Wani, Anuja Nanaware, Sneha Shirode, Aishwarya Suram, and Archana Jadhav. Secret communication using multi-image steganography for military purposes. International Journal of Advanced Research in Science, Communication and Technology, 2, 2022.

[26] [Weng et al., 2019] Xinyu Weng, Yongzhi Li, Lu Chi, and Yadong Mu. High-capacity convolutional video steganography with temporal residual modeling. In Proceedings of the 2019 on International Conference on Multimedia Retrieval, pages 87–95, 2019.

[27] [Xu et al., 2022] Youmin Xu, Chong Mou, Yujie Hu, Jingfen Xie, and Jian Zhang. Robust invertible image steganography. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7875–7884, 2022.

[28] [Yang et al., 2024] Yiwei Yang, Zheyuan Liu, Jun Jia, Zhongpai Gao, Yunhao Li, Wei Sun, Xiaohong Liu, and Guangtao Zhai. DiffStega: Towards universal training-free coverless image steganography with diffusion models. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pages 1579–1587, 2024.

[29] [Yu et al., 2023] Jiwen Yu, Xuanyu Zhang, Youmin Xu, and Jian Zhang. CRoSS: Diffusion model makes controllable, robust and secure image steganography. Advances in Neural Information Processing Systems, 36:80730–80743, 2023.

[30] [Yu, 2020] Chong Yu. Attention based data hiding with generative adversarial networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 1120–1128, 2020.

[31] [Zhang et al., 2020] Chaoning Zhang, Philipp Benz, Adil Karjauv, Geng Sun, and In So Kweon. UDH: Universal deep hiding for steganography, watermarking, and light field messaging. Advances in Neural Information Processing Systems, 33:10223–10234, 2020.

[32] [Zhang et al., 2025] Xuanyu Zhang, Zecheng Tang, Zhipei Xu, Runyi Li, Youmin Xu, Bin Chen, Feng Gao, and Jian Zhang. OmniGuard: Hybrid manipulation localization via augmented versatile deep image watermarking. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 3008–3018, 2025.

[33] [Zhou et al., 2025] Junchao Zhou, Yao Lu, Jie Wen, and Guangming Lu. Efficient and separate authentication image steganography network. In Forty-second International Conference on Machine Learning, 2025.