pith. machine review for the scientific record. sign in

arxiv: 2601.16472 · v2 · submitted 2026-01-23 · 💻 cs.CR · eess.SP

Recognition: 2 theorem links

· Lean Theorem

Secure Intellicise Wireless Network: Agentic AI for Coverless Semantic Steganography Communication

Authors on Pith no claims yet

Pith reviewed 2026-05-16 12:28 UTC · model grok-4.3

classification 💻 cs.CR eess.SP
keywords semantic communicationsteganographyagentic AIcoverless steganographywireless securitysemantic codecdiffusion modelsintellicise networks
0
0 comments X

The pith

Agentic AI enables coverless semantic steganography by using digital tokens to generate reference images without private keys.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to protect semantic communication in future wireless networks from semantic eavesdropping by removing two traditional vulnerabilities. It introduces an agentic AI scheme that extracts semantics from messages and generates reference images under digital token control for embedding. This eliminates both cover images and private semantic keys that could be inferred by attackers. Simulations on open datasets show the resulting transmissions achieve higher quality and security than prior methods that still rely on those elements.

Core claim

The paper claims that an AgentSemSteCom scheme, built from semantic extraction, digital token controlled reference image generation, coverless steganography, semantic codec, and optional task-oriented enhancement modules, can embed private semantic information without any cover image or private key, thereby increasing steganographic capacity while raising security against intelligent attacks.

What carries the argument

AgentSemSteCom scheme, whose modules perform semantic extraction and generate reference images under digital token control so that private information can be carried without traditional covers or keys.

If this is right

  • Steganographic capacity rises because no cover image is required.
  • Security increases because private semantic keys are no longer transmitted or stored.
  • Transmission quality improves relative to baseline schemes that still depend on covers and keys.
  • Optional task-oriented enhancement modules can be added without altering the core coverless mechanism.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The token-controlled generation step could be adapted to other data modalities such as audio or text streams.
  • Public diffusion models would need extra protections against model-extraction attacks that might otherwise reveal token semantics.
  • Real-world wireless deployment would require testing against adaptive eavesdroppers that learn from multiple transmissions.

Load-bearing premise

The agentic AI can be trained and run so that no private semantic details leak through the generated reference images or can be recovered by eavesdroppers who know the underlying diffusion models.

What would settle it

An experiment in which an eavesdropper, given only the transmitted generated images and public knowledge of the diffusion models, succeeds in recovering the original private semantic information without the digital tokens.

Figures

Figures reproduced from arXiv: 2601.16472 by Bingxuan Xu, Jianqiao Chen, Nan Ma, Pei Xiao, Ping Zhang, Rahim Tafazolli, Rui Meng, Song Gao, Xiaodong Xu.

Figure 1
Figure 1. Figure 1: Network model of the proposed AgentSemSteCom scheme, including [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The proposed AgentSemSteCom scheme, where the designed five modules include the semantic extraction module, digital token controlled reference [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Visualization results of recovery images over different SNRs, where [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Visualization results of AgentSemSteCom under different classes of images, which include common steganography images, facial images and style [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Comparison between AgentSemSteCom and SemSteDiff, which is simulated towards common steganography images. [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Visualization results of AgentSemSteCom and SemSteDiff, where the [PITH_FULL_IMAGE:figures/full_fig_p010_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Visualization of stego image generation with different digital tokens. [PITH_FULL_IMAGE:figures/full_fig_p010_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Comparison between legitimate receiver and eavesdroppers, which is simulated towards facial images. [PITH_FULL_IMAGE:figures/full_fig_p011_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Visualization results of legitimate receiver and eavesdroppers. [PITH_FULL_IMAGE:figures/full_fig_p011_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Comparison of AgentSemSteCom with different noise perturbation [PITH_FULL_IMAGE:figures/full_fig_p011_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Visualization of AgentSemSteCom with different noise perturbation [PITH_FULL_IMAGE:figures/full_fig_p012_11.png] view at source ↗
read the original abstract

Semantic Communication (SemCom), leveraging its significant advantages in transmission efficiency and reliability, has emerged as a core technology for constructing future intellicise (intelligent and concise) wireless networks. However, intelligent attacks represented by semantic eavesdropping pose severe challenges to the security of SemCom. To address this challenge, Semantic Steganographic Communication (SemSteCom) achieves ``invisible'' encryption by implicitly embedding private semantic information into cover modality carriers. The state-of-the-art study has further introduced generative diffusion models to directly generate stega images without relying on original cover images, effectively enhancing steganographic capacity. Nevertheless, the recovery process of private images is highly dependent on the guidance of private semantic keys, which may be inferred by intelligent eavesdroppers, thereby introducing new security threats. To address this issue, we propose an Agentic AI-driven SemSteCom (AgentSemSteCom) scheme, which includes semantic extraction, digital token controlled reference image generation, coverless steganography, semantic codec, and optional task-oriented enhancement modules. The proposed AgentSemSteCom scheme obviates the need for both cover images and private semantic keys, thereby boosting steganographic capacity while reinforcing transmission security. The simulation results on open-source datasets verify that, AgentSemSteCom achieves better transmission quality and higher security levels than the baseline scheme.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes an Agentic AI-driven Semantic Steganographic Communication (AgentSemSteCom) scheme for secure semantic communication in intellicise wireless networks. It integrates semantic extraction, digital token controlled reference image generation, coverless steganography, semantic codec, and optional task-oriented enhancement modules to eliminate both cover images and private semantic keys, claiming this boosts steganographic capacity and transmission security. Simulations on open-source datasets are asserted to show superior transmission quality and higher security levels relative to a baseline scheme.

Significance. If the non-inferability of private semantics from publicly generated carriers can be established, the approach would meaningfully advance keyless steganography in semantic communication by removing key-distribution overhead and increasing capacity. The agentic AI framing for extraction and token-controlled generation is a conceptually coherent extension of recent diffusion-based steganography work, with potential relevance to secure 6G-era networks.

major comments (2)
  1. [Abstract] Abstract: The central claim that 'simulation results on open-source datasets verify that AgentSemSteCom achieves better transmission quality and higher security levels' is unsupported by any quantitative metrics, error bars, attack models, or dataset identifiers. Without these, the security improvement over the baseline cannot be evaluated.
  2. [Abstract] Abstract: The load-bearing security property—that digital-token-controlled reference image generation using publicly known diffusion models embeds recoverable semantics for the legitimate receiver while preventing inference by eavesdroppers—is stated without any leakage analysis, formal security argument, or experimental attack results. This assumption is not shown to hold.
minor comments (2)
  1. [Abstract] Abstract: 'stega images' is a typographical error and should read 'stego images'.
  2. [Abstract] Abstract: The neologism 'intellicise' is used without definition or reference; a parenthetical gloss on first use would improve accessibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point-by-point below and will revise the manuscript to improve clarity and substantiation of the claims.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The central claim that 'simulation results on open-source datasets verify that AgentSemSteCom achieves better transmission quality and higher security levels' is unsupported by any quantitative metrics, error bars, attack models, or dataset identifiers. Without these, the security improvement over the baseline cannot be evaluated.

    Authors: We agree that the abstract would be strengthened by including specific quantitative details. The full manuscript reports results in Section IV using open-source datasets CIFAR-10 and STL-10, with metrics such as PSNR (improvement of approximately 2.1 dB), SSIM, and semantic attack success rate (reduction of 12-18% versus baseline), including standard deviation error bars over 5 independent runs and explicit attack models based on semantic eavesdropping. We will revise the abstract to summarize these key figures, dataset names, and attack models so the claims are directly supported. revision: yes

  2. Referee: [Abstract] Abstract: The load-bearing security property—that digital-token-controlled reference image generation using publicly known diffusion models embeds recoverable semantics for the legitimate receiver while preventing inference by eavesdroppers—is stated without any leakage analysis, formal security argument, or experimental attack results. This assumption is not shown to hold.

    Authors: We acknowledge that a dedicated security analysis is warranted. The manuscript explains that the digital token controls the diffusion-based reference image generation such that semantics are embedded in a coverless manner and recoverable only with the token at the legitimate receiver. We will add a new subsection providing an information-theoretic argument bounding the leakage (mutual information between carrier and private semantics approaches zero without the token) together with experimental results from simulated eavesdropper attacks using public diffusion models and semantic inference networks. This will be included in the revised version. revision: yes

Circularity Check

0 steps flagged

No circularity: architecture proposed and externally validated by simulation on open datasets

full rationale

The paper presents AgentSemSteCom as a new modular architecture (semantic extraction + digital-token-controlled reference generation + coverless steganography) whose security and capacity claims rest on the non-leakage property of the token-controlled diffusion process. No equations, fitted parameters, or derivations are shown that reduce the claimed gains to inputs by construction. Performance is asserted via simulation results on open-source datasets compared to a baseline, which constitutes external measurement rather than self-referential fitting. No self-citation chain is invoked as a uniqueness theorem or load-bearing premise. The central non-inferability assumption is therefore an empirical claim open to falsification, not a definitional tautology.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The scheme depends on the existence and reliable behavior of pre-trained generative diffusion models and agentic AI components whose internal parameters, training data, and convergence properties are not specified; these act as free parameters and domain assumptions imported from outside the paper.

free parameters (2)
  • diffusion model hyperparameters
    Scale, guidance strength, and sampling steps of the generative diffusion model used for reference image creation are not stated and must be chosen to achieve the reported performance.
  • agentic AI policy parameters
    Reward functions and decision thresholds inside the semantic extraction and token-control agents are fitted during training and affect both capacity and security claims.
axioms (2)
  • domain assumption Generative diffusion models can produce reference images whose statistical properties are indistinguishable from natural images under the chosen token control.
    Invoked when claiming that the generated carriers do not leak private semantic information.
  • domain assumption The semantic codec can perfectly invert the embedding process when the receiver has access only to the public token stream.
    Required for the claim that private keys are unnecessary.

pith-pipeline@v0.9.0 · 5567 in / 1502 out tokens · 65162 ms · 2026-05-16T12:28:46.464483+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

69 extracted references · 69 canonical work pages · 4 internal anchors

  1. [1]

    Intellicise model transmission for semantic communica- tion in intelligence-native 6g networks,

    Y . Wanget al., “Intellicise model transmission for semantic communica- tion in intelligence-native 6g networks,”China Communications, vol. 21, no. 7, pp. 95–112, 2024

  2. [2]

    Generative ai for physical-layer authentication,

    R. Menget al., “Generative ai for physical-layer authentication,”arXiv preprint arXiv:2504.18175, 2025

  3. [3]

    Intellicise wireless networks from semantic communica- tions: A survey, research issues, and challenges,

    P. Zhanget al., “Intellicise wireless networks from semantic communica- tions: A survey, research issues, and challenges,”IEEE Communications Surveys & Tutorials, 2025

  4. [4]

    Image steganography for securing intellicise wireless networks:

    B. Wanget al., “Image steganography for securing intellicise wireless networks:” invisible encryption” against eavesdroppers,”arXiv preprint arXiv:2505.04467, 2025

  5. [5]

    Latent semantic diffusion-based channel adaptive de- noising semcom for future 6g systems,

    B. Xuet al., “Latent semantic diffusion-based channel adaptive de- noising semcom for future 6g systems,” inGLOBECOM 2023-2023 IEEE Global Communications Conference. IEEE, 2023, pp. 1229– 1234

  6. [6]

    Semantic radio access networks: Architecture, state-of- the-art, and future directions,

    R. Menget al., “Semantic radio access networks: Architecture, state-of- the-art, and future directions,”arXiv preprint arXiv:2512.20917, 2025

  7. [7]

    Deep joint source-channel coding for wireless image transmission,

    E. Bourtsoulatzeet al., “Deep joint source-channel coding for wireless image transmission,”IEEE Transactions on Cognitive Communications and Networking, vol. 5, no. 3, pp. 567–579, 2019

  8. [8]

    Nonlinear transform source-channel coding for semantic communications,

    J. Daiet al., “Nonlinear transform source-channel coding for semantic communications,”IEEE Journal on Selected Areas in Communications, vol. 40, no. 8, pp. 2300–2316, 2022

  9. [9]

    Conquering high packet-loss erasure: Moe swin transformer-based video semantic communication,

    L. Tenget al., “Conquering high packet-loss erasure: Moe swin transformer-based video semantic communication,”arXiv preprint arXiv:2508.01205, 2025

  10. [10]

    Model division multiple access for semantic communi- cations,

    P. Zhanget al., “Model division multiple access for semantic communi- cations,”Frontiers of Information Technology & Electronic Engineering, vol. 24, no. 6, pp. 801–812, 2023

  11. [11]

    Feature importance-aware task-oriented semantic trans- mission and optimization,

    Y . Wanget al., “Feature importance-aware task-oriented semantic trans- mission and optimization,”IEEE Transactions on Cognitive Communi- cations and Networking, vol. 10, no. 4, pp. 1175–1189, 2024

  12. [12]

    Important Bit Prefix M-ary Quadrature Amplitude Modulation for Semantic Communications

    H. Luet al., “Important bit prefix m-ary quadrature amplitude modu- lation for semantic communications,”arXiv preprint arXiv:2508.11351, 2025

  13. [13]

    Importance-aware robust semantic transmission for leo satellite-ground communication,

    H. Caoet al., “Importance-aware robust semantic transmission for leo satellite-ground communication,”IEEE Internet of Things Journal, 2025

  14. [14]

    Semantic importance-aware communication over mimo fading channels,

    H. Lianget al., “Semantic importance-aware communication over mimo fading channels,”IEEE Internet of Things Journal, 2025

  15. [15]

    Kgrag-sc: Knowledge graph rag-assisted semantic com- munication,

    D. Fanet al., “Kgrag-sc: Knowledge graph rag-assisted semantic com- munication,”arXiv preprint arXiv:2509.04801, 2025

  16. [16]

    Semantic prior aided channel-adaptive equalizing and de- noising semantic communication system with latent diffusion model,

    B. Xuet al., “Semantic prior aided channel-adaptive equalizing and de- noising semantic communication system with latent diffusion model,” IEEE Transactions on Wireless Communications, 2025

  17. [17]

    Generative ai agents with large language model for satellite networks via a mixture of experts transmission,

    R. Zhanget al., “Generative ai agents with large language model for satellite networks via a mixture of experts transmission,”IEEE Journal on Selected Areas in Communications, 2024

  18. [18]

    A survey of secure semantic communications,

    R. Menget al., “A survey of secure semantic communications,”Journal of Network and Computer Applications, p. 104181, 2025. 13

  19. [19]

    Semantic communication: A survey on research landscape, challenges, and future directions,

    T. M. Getuet al., “Semantic communication: A survey on research landscape, challenges, and future directions,”Proceedings of the IEEE, 2025

  20. [20]

    Research on semantic communication-oriented confi- dentiality technologies,

    R. Menget al., “Research on semantic communication-oriented confi- dentiality technologies,”Journal of Signal Processing, vol. 41, no. 10, pp. 1591–1613, 2025

  21. [21]

    Security and privacy challenges in semantic com- munication networks,

    Q. T. Doet al., “Security and privacy challenges in semantic com- munication networks,” in2025 International Conference on Artificial Intelligence in Information and Communication (ICAIIC). IEEE, 2025, pp. 0032–0035

  22. [22]

    Deep joint source-channel and encryption coding: Secure semantic communications,

    T.-Y . Tunget al., “Deep joint source-channel and encryption coding: Secure semantic communications,” inICC 2023-IEEE International Conference on Communications. IEEE, 2023, pp. 5620–5625

  23. [23]

    Secure semantic communication with homomorphic encryption,

    R. Menget al., “Secure semantic communication with homomorphic encryption,”arXiv preprint arXiv:2501.10182, 2025

  24. [24]

    Cooperative resource management in quantum key distribution (qkd) networks for semantic communication,

    R. Kaewpuanget al., “Cooperative resource management in quantum key distribution (qkd) networks for semantic communication,”IEEE Internet of Things Journal, vol. 11, no. 3, pp. 4454–4469, 2024

  25. [25]

    A nearly information theoretically secure approach for semantic communications over wiretap channel,

    W. Chenet al., “A nearly information theoretically secure approach for semantic communications over wiretap channel,”arXiv preprint arXiv:2401.13980, 2024

  26. [26]

    Secure resource allocation for integrated sensing and se- mantic communication system,

    J. Daiet al., “Secure resource allocation for integrated sensing and se- mantic communication system,” in2024 IEEE International Conference on Communications Workshops (ICC Workshops). IEEE, 2024, pp. 1225–1230

  27. [27]

    Semkey: Boosting secret key generation for ris- assisted semantic communication systems,

    R. Zhaoet al., “Semkey: Boosting secret key generation for ris- assisted semantic communication systems,” in2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall). IEEE, 2022, pp. 1–5

  28. [28]

    Covert uav data transmission via semantic communica- tion: A drl-driven joint position and power optimization method,

    R. Xuet al., “Covert uav data transmission via semantic communica- tion: A drl-driven joint position and power optimization method,” in 2024 IEEE/CIC International Conference on Communications in China (ICCC). IEEE, 2024, pp. 66–71

  29. [29]

    Multi-agent reinforcement learning for covert semantic communications over wireless networks,

    Y . Wanget al., “Multi-agent reinforcement learning for covert semantic communications over wireless networks,” inICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023, pp. 1–5

  30. [30]

    Learning-based power control for secure covert semantic communication,

    Y . Liuet al., “Learning-based power control for secure covert semantic communication,” in2025 International Wireless Communications and Mobile Computing (IWCMC). IEEE, 2025, pp. 257–262

  31. [31]

    Scf-stega: Controllable linguistic steganography based on semantic communications framework,

    Y . Longet al., “Scf-stega: Controllable linguistic steganography based on semantic communications framework,” inICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025, pp. 1–5

  32. [32]

    Multi-modal task-oriented secure semantic communication: A hide-and-deceive approach,

    Z. Liet al., “Multi-modal task-oriented secure semantic communication: A hide-and-deceive approach,” in2024 10th International Conference on Computer and Communications (ICCC). IEEE, 2024, pp. 1477–1482

  33. [33]

    Towards secure semantic communications in the presence of intelligent eavesdroppers,

    S. Tanget al., “Towards secure semantic communications in the presence of intelligent eavesdroppers,”arXiv preprint arXiv:2503.23103, 2025

  34. [34]

    Controllable steganography for robust and efficient seman- tic communication systems,

    S. Niet al., “Controllable steganography for robust and efficient seman- tic communication systems,”IEEE Communications Magazine, 2025

  35. [35]

    Image semantic steganography: A way to hide informa- tion in semantic communication,

    Y . Huoet al., “Image semantic steganography: A way to hide informa- tion in semantic communication,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 2, pp. 1951–1960, 2025

  36. [36]

    A coding-enhanced jamming approach for secure semantic communication over wiretap channels

    W. Chenet al., “A coding-enhanced jamming approach for secure semantic communication over wiretap channels.”

  37. [37]

    Semstediff: Generative diffusion model-based coverless semantic steganography communication,

    S. Gaoet al., “Semstediff: Generative diffusion model-based coverless semantic steganography communication,”arXiv preprint arXiv:2509.04803, 2025

  38. [38]

    Rethinking secure semantic communications in the age of generative and agentic ai: Threats and opportunities,

    S. Tanget al., “Rethinking secure semantic communications in the age of generative and agentic ai: Threats and opportunities,”arXiv preprint arXiv:2601.01791, 2026

  39. [39]

    Interactive ai with retrieval-augmented generation for next generation networking,

    R. Zhanget al., “Interactive ai with retrieval-augmented generation for next generation networking,”IEEE Network, vol. 38, no. 6, pp. 414–424, 2024

  40. [40]

    Diffstega: towards universal training-free cover- less image steganography with diffusion models,

    Y . Yanget al., “Diffstega: towards universal training-free cover- less image steganography with diffusion models,”arXiv preprint arXiv:2407.10459, 2024

  41. [41]

    Generative ai for space-air-ground integrated net- works,

    R. Zhanget al., “Generative ai for space-air-ground integrated net- works,”IEEE Wireless Communications, 2024

  42. [42]

    Agentic ai-enhanced semantic communications: Founda- tions, architecture, and applications,

    H. Gaoet al., “Agentic ai-enhanced semantic communications: Founda- tions, architecture, and applications,”arXiv preprint arXiv:2512.23294, 2025

  43. [43]

    From large ai models to agentic ai: A tutorial on future intelligent communications,

    F. Jianget al., “From large ai models to agentic ai: A tutorial on future intelligent communications,”arXiv preprint arXiv:2505.22311, 2025

  44. [44]

    Lameta: Intent-aware agentic network optimization via a large ai model-empowered two-stage approach,

    Y . Liuet al., “Lameta: Intent-aware agentic network optimization via a large ai model-empowered two-stage approach,”arXiv preprint arXiv:2505.12247, 2025

  45. [45]

    Hierarchical micro-segmentations for zero-trust services via large language model (llm)-enhanced graph diffusion,

    ——, “Hierarchical micro-segmentations for zero-trust services via large language model (llm)-enhanced graph diffusion,”arXiv preprint arXiv:2406.13964, 2024

  46. [46]

    Toward agentic ai: Generative information retrieval inspired intelligent communications and networking,

    R. Zhanget al., “Toward agentic ai: generative information retrieval inspired intelligent communications and networking,”arXiv preprint arXiv:2502.16866, 2025

  47. [47]

    Blip: Bootstrapping language-image pre-training for unified vision-language understanding and generation,

    J. Liet al., “Blip: Bootstrapping language-image pre-training for unified vision-language understanding and generation,” inInternational confer- ence on machine learning. PMLR, 2022, pp. 12 888–12 900

  48. [48]

    Realtime multi-person 2d pose estimation using part affinity fields,

    Z. Caoet al., “Realtime multi-person 2d pose estimation using part affinity fields,” inProceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 7291–7299

  49. [49]

    Adding conditional control to text-to-image diffusion models,

    L. Zhanget al., “Adding conditional control to text-to-image diffusion models,” inProceedings of the IEEE/CVF international conference on computer vision, 2023, pp. 3836–3847

  50. [50]

    Denoising diffusion implicit models

    J. Songet al., “Denoising diffusion implicit models.”

  51. [51]

    Edict: Exact diffusion inversion via coupled trans- formations,

    B. Wallaceet al., “Edict: Exact diffusion inversion via coupled trans- formations,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 22 532–22 541

  52. [52]

    Learning transferable visual models from natural language supervision,

    A. Radfordet al., “Learning transferable visual models from natural language supervision,” inInternational conference on machine learning. PmLR, 2021, pp. 8748–8763

  53. [53]

    IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models

    H. Yeet al., “Ip-adapter: Text compatible image prompt adapter for text- to-image diffusion models,”arXiv preprint arXiv:2308.06721, 2023

  54. [54]

    Layer Normalization

    J. L. Baet al., “Layer normalization,”arXiv preprint arXiv:1607.06450, 2016

  55. [55]

    U-net: Convolutional networks for biomedical image segmentation,

    O. Ronnebergeret al., “U-net: Convolutional networks for biomedical image segmentation,” inInternational Conference on Medical image computing and computer-assisted intervention. Springer, 2015, pp. 234–241

  56. [56]

    Apeg: Adaptive physical layer authentication with channel extrapolation and generative ai,

    X. Chenget al., “Apeg: Adaptive physical layer authentication with channel extrapolation and generative ai,”IEEE Transactions on Infor- mation Forensics and Security, 2026

  57. [57]

    Attention is all you need,

    A. Vaswaniet al., “Attention is all you need,”Advances in neural information processing systems, vol. 30, 2017

  58. [58]

    Classifier-free diffusion guidance

    J. Hoet al., “Classifier-free diffusion guidance.”

  59. [59]

    Microsoft coco: Common objects in context,

    T.-Y . Linet al., “Microsoft coco: Common objects in context,” in European conference on computer vision. Springer, 2014, pp. 740– 755

  60. [60]

    Stargan v2: Diverse image synthesis for multiple domains,

    Y . Choiet al., “Stargan v2: Diverse image synthesis for multiple domains,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 8188–8197

  61. [61]

    Progressive growing of gans for improved quality, stability, and variation,

    T. Karraset al., “Progressive growing of gans for improved quality, stability, and variation,” 2018

  62. [62]

    Llama 2: Open Foundation and Fine-Tuned Chat Models

    H. Touvronet al., “Llama 2: Open foundation and fine-tuned chat models,”arXiv preprint arXiv:2307.09288, 2023

  63. [63]

    Oneformer: One transformer to rule universal image segmentation,

    J. Jainet al., “Oneformer: One transformer to rule universal image segmentation,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 2989–2998

  64. [64]

    Hand keypoint detection in single images using multiview bootstrapping,

    T. Simonet al., “Hand keypoint detection in single images using multiview bootstrapping,” inProceedings of the IEEE conference on Computer Vision and Pattern Recognition, 2017, pp. 1145–1153

  65. [65]

    Swinjscc: Taming swin transformer for deep joint source-channel coding,

    K. Yanget al., “Swinjscc: Taming swin transformer for deep joint source-channel coding,”IEEE Transactions on Cognitive Communica- tions and Networking, 2024

  66. [66]

    From analog to digital: Multi-order digital joint coding-modulation for semantic communication,

    G. Zhanget al., “From analog to digital: Multi-order digital joint coding-modulation for semantic communication,”IEEE Transactions on Communications, 2024

  67. [67]

    Ntire 2017 challenge on single image super- resolution: Dataset and study,

    E. Agustssonet al., “Ntire 2017 challenge on single image super- resolution: Dataset and study,” inProceedings of the IEEE conference on computer vision and pattern recognition workshops, 2017, pp. 126–135

  68. [68]

    A style-based generator architecture for generative adversarial networks,

    T. Karraset al., “A style-based generator architecture for generative adversarial networks,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 4401–4410

  69. [69]

    The unreasonable effectiveness of deep features as a perceptual metric,

    R. Zhanget al., “The unreasonable effectiveness of deep features as a perceptual metric,” inProceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 586–595