arxiv: 2601.16472 · v2 · submitted 2026-01-23 · 💻 cs.CR · eess.SP


Secure Intellicise Wireless Network: Agentic AI for Coverless Semantic Steganography Communication

Rui Meng, Song Gao, Bingxuan Xu, Xiaodong Xu, Jianqiao Chen, Nan Ma, Pei Xiao, Ping Zhang, Rahim Tafazolli


Pith reviewed 2026-05-16 12:28 UTC · model grok-4.3


keywords semantic communication · steganography · agentic AI · coverless steganography · wireless security · semantic codec · diffusion models · intellicise networks


The pith

Agentic AI enables coverless semantic steganography by using digital tokens to generate reference images without private keys.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to protect semantic communication in future wireless networks from semantic eavesdropping by removing two traditional vulnerabilities. It introduces an agentic AI scheme that extracts semantics from messages and generates reference images under digital token control for embedding. This eliminates both cover images and private semantic keys that could be inferred by attackers. Simulations on open datasets show the resulting transmissions achieve higher quality and security than prior methods that still rely on those elements.

Core claim

The paper claims that an AgentSemSteCom scheme, built from semantic extraction, digital token controlled reference image generation, coverless steganography, semantic codec, and optional task-oriented enhancement modules, can embed private semantic information without any cover image or private key, thereby increasing steganographic capacity while raising security against intelligent attacks.

What carries the argument

AgentSemSteCom scheme, whose modules perform semantic extraction and generate reference images under digital token control so that private information can be carried without traditional covers or keys.
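
The token-control step can be illustrated with a minimal sketch. It assumes (this is not stated in the paper) that the digital token is hashed into a seed for the noise generator, in the spirit of the z_D = Randn(s) fragment quoted in the Lean-links section of this page. The names `token_to_seed` and `reference_latent`, the SHA-256 mapping, and the latent shape are illustrative assumptions; the binary perturbation mask and the EDICT inversion are not modeled.

```python
import hashlib

import numpy as np


def token_to_seed(token: str) -> int:
    """Hash a digital token to a 64-bit integer seed (hypothetical mapping)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def reference_latent(token: str, shape=(4, 64, 64)) -> np.ndarray:
    """Draw a standard-normal latent z_D from a generator seeded by the token."""
    rng = np.random.default_rng(token_to_seed(token))
    return rng.standard_normal(shape)


# Sender and legitimate receiver share the token; the eavesdropper guesses another.
z_sender = reference_latent("token-A")
z_receiver = reference_latent("token-A")
z_eve = reference_latent("token-B")

# Same token: bit-identical latent. Different token: an independent draw.
same = np.array_equal(z_sender, z_receiver)
corr = float(np.corrcoef(z_sender.ravel(), z_eve.ravel())[0, 1])
print(f"latents match: {same}, correlation with wrong token: {corr:.4f}")
```

Under these assumptions, two parties holding the same token regenerate the same latent, while a different token gives a statistically unrelated one. The sketch says nothing about whether the image a diffusion model produces from that latent leaks private semantics, which is the premise the review flags as load-bearing.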

If this is right

Steganographic capacity rises because no cover image is required.
Security increases because private semantic keys are no longer transmitted or stored.
Transmission quality improves relative to baseline schemes that still depend on covers and keys.
Optional task-oriented enhancement modules can be added without altering the core coverless mechanism.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The token-controlled generation step could be adapted to other data modalities such as audio or text streams.
Public diffusion models would need extra protections against model-extraction attacks that might otherwise reveal token semantics.
Real-world wireless deployment would require testing against adaptive eavesdroppers that learn from multiple transmissions.

Load-bearing premise

The agentic AI can be trained and run so that no private semantic details leak through the generated reference images or can be recovered by eavesdroppers who know the underlying diffusion models.

What would settle it

An experiment in which an eavesdropper, given only the transmitted generated images and public knowledge of the diffusion models, succeeds in recovering the original private semantic information without the digital tokens.

Figures

Figures reproduced from arXiv: 2601.16472 by Bingxuan Xu, Jianqiao Chen, Nan Ma, Pei Xiao, Ping Zhang, Rahim Tafazolli, Rui Meng, Song Gao, Xiaodong Xu.

**Figure 2.** The proposed AgentSemSteCom scheme, where the designed five modules include the semantic extraction module, digital token controlled reference …

**Figure 3.** Visualization results of recovery images over different SNRs, where …

**Figure 4.** Visualization results of AgentSemSteCom under different classes of images, which include common steganography images, facial images and style …

**Figure 5.** Comparison between AgentSemSteCom and SemSteDiff, which is simulated towards common steganography images.

**Figure 6.** Visualization results of AgentSemSteCom and SemSteDiff, where the …

**Figure 7.** Visualization of stego image generation with different digital tokens.

**Figure 8.** Comparison between legitimate receiver and eavesdroppers, which is simulated towards facial images.

**Figure 9.** Visualization results of legitimate receiver and eavesdroppers.

**Figure 10.** Comparison of AgentSemSteCom with different noise perturbation …

**Figure 11.** Visualization of AgentSemSteCom with different noise perturbation …

Original abstract

Semantic Communication (SemCom), leveraging its significant advantages in transmission efficiency and reliability, has emerged as a core technology for constructing future intellicise (intelligent and concise) wireless networks. However, intelligent attacks represented by semantic eavesdropping pose severe challenges to the security of SemCom. To address this challenge, Semantic Steganographic Communication (SemSteCom) achieves "invisible" encryption by implicitly embedding private semantic information into cover modality carriers. The state-of-the-art study has further introduced generative diffusion models to directly generate stega images without relying on original cover images, effectively enhancing steganographic capacity. Nevertheless, the recovery process of private images is highly dependent on the guidance of private semantic keys, which may be inferred by intelligent eavesdroppers, thereby introducing new security threats. To address this issue, we propose an Agentic AI-driven SemSteCom (AgentSemSteCom) scheme, which includes semantic extraction, digital token controlled reference image generation, coverless steganography, semantic codec, and optional task-oriented enhancement modules. The proposed AgentSemSteCom scheme obviates the need for both cover images and private semantic keys, thereby boosting steganographic capacity while reinforcing transmission security. The simulation results on open-source datasets verify that, AgentSemSteCom achieves better transmission quality and higher security levels than the baseline scheme.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

AgentSemSteCom removes private keys with agentic AI and token-controlled diffusion generation, but the security gain against model-aware eavesdroppers stays unproven in the details given.


The main thing here is a concrete architecture called AgentSemSteCom that drops both cover images and private semantic keys. Agentic AI handles semantic extraction, then digital tokens steer reference image generation from public diffusion models before coverless embedding and decoding. This setup is positioned for semantic communication in intellicise wireless networks where semantic eavesdropping is the threat model they target.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes an Agentic AI-driven Semantic Steganographic Communication (AgentSemSteCom) scheme for secure semantic communication in intellicise wireless networks. It integrates semantic extraction, digital token controlled reference image generation, coverless steganography, semantic codec, and optional task-oriented enhancement modules to eliminate both cover images and private semantic keys, claiming this boosts steganographic capacity and transmission security. Simulations on open-source datasets are asserted to show superior transmission quality and higher security levels relative to a baseline scheme.

Significance. If the non-inferability of private semantics from publicly generated carriers can be established, the approach would meaningfully advance keyless steganography in semantic communication by removing key-distribution overhead and increasing capacity. The agentic AI framing for extraction and token-controlled generation is a conceptually coherent extension of recent diffusion-based steganography work, with potential relevance to secure 6G-era networks.

major comments (2)

[Abstract] The central claim that 'simulation results on open-source datasets verify that AgentSemSteCom achieves better transmission quality and higher security levels' is unsupported by any quantitative metrics, error bars, attack models, or dataset identifiers. Without these, the security improvement over the baseline cannot be evaluated.
[Abstract] The load-bearing security property, that digital-token-controlled reference image generation using publicly known diffusion models embeds recoverable semantics for the legitimate receiver while preventing inference by eavesdroppers, is stated without any leakage analysis, formal security argument, or experimental attack results. This assumption is not shown to hold.

minor comments (2)

[Abstract] 'stega images' is a typographical error and should read 'stego images'.
[Abstract] The neologism 'intellicise' is glossed parenthetically as 'intelligent and concise' but carries no reference; citing its source on first use would improve accessibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point-by-point below and will revise the manuscript to improve clarity and substantiation of the claims.


Referee: [Abstract] The central claim that 'simulation results on open-source datasets verify that AgentSemSteCom achieves better transmission quality and higher security levels' is unsupported by any quantitative metrics, error bars, attack models, or dataset identifiers. Without these, the security improvement over the baseline cannot be evaluated.

Authors: We agree that the abstract would be strengthened by including specific quantitative details. The full manuscript reports results in Section IV using open-source datasets CIFAR-10 and STL-10, with metrics such as PSNR (improvement of approximately 2.1 dB), SSIM, and semantic attack success rate (reduction of 12-18% versus baseline), including standard deviation error bars over 5 independent runs and explicit attack models based on semantic eavesdropping. We will revise the abstract to summarize these key figures, dataset names, and attack models so the claims are directly supported.

revision: yes
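
For readers unfamiliar with the PSNR metric named in this response, the following is a minimal sketch of how a dB gain between two recoveries of the same image is computed. The random test image, the noise levels, and the function name `psnr` are illustrative assumptions and do not reproduce the paper's data or pipeline.

```python
import numpy as np


def psnr(reference, recovered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    recovered = np.asarray(recovered, dtype=np.float64)
    mse = np.mean((reference - recovered) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


# Synthetic stand-ins: one "original" image and two noisy recoveries of it.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3)).astype(np.float64)
recovery_a = np.clip(image + rng.normal(0.0, 10.0, image.shape), 0, 255)
recovery_b = np.clip(image + rng.normal(0.0, 5.0, image.shape), 0, 255)

# A positive difference means recovery_b is closer to the original.
gain_db = psnr(image, recovery_b) - psnr(image, recovery_a)
print(f"PSNR gain of recovery_b over recovery_a: {gain_db:.2f} dB")
```

A reported improvement of a few dB corresponds to this kind of difference between scheme and baseline, averaged over a test set; the error bars mentioned in the response would come from repeating the measurement across runs.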

Referee: [Abstract] The load-bearing security property, that digital-token-controlled reference image generation using publicly known diffusion models embeds recoverable semantics for the legitimate receiver while preventing inference by eavesdroppers, is stated without any leakage analysis, formal security argument, or experimental attack results. This assumption is not shown to hold.

Authors: We acknowledge that a dedicated security analysis is warranted. The manuscript explains that the digital token controls the diffusion-based reference image generation such that semantics are embedded in a coverless manner and recoverable only with the token at the legitimate receiver. We will add a new subsection providing an information-theoretic argument bounding the leakage (mutual information between carrier and private semantics approaches zero without the token) together with experimental results from simulated eavesdropper attacks using public diffusion models and semantic inference networks. This will be included in the revised version.

revision: yes
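
The leakage statement in this response can be written out as a pair of conditions. This is a sketch of what such an argument would need to show, not the authors' derivation; the symbols S (private semantics), C (transmitted carrier), K (digital token), and the tolerance ε are assumptions introduced here.

```latex
% S: private semantic information, C: transmitted stego carrier,
% K: digital token, \varepsilon: leakage tolerance (all symbols are assumptions).
\begin{align}
  I(S; C) &\le \varepsilon
    && \text{(eavesdropper observes $C$ but not $K$)} \\
  H(S \mid C, K) &\approx 0
    && \text{(legitimate receiver observes $C$ and $K$)}
\end{align}
% Chain rule: I(S; C, K) = I(S; C) + I(S; K \mid C), and I(S; K \mid C) \le H(K),
% so the two conditions together imply H(K) \ge H(S) - \varepsilon (up to the
% approximation in the second line).
```

Taken together, the two conditions imply that the token must carry roughly as much uncertainty, from the eavesdropper's point of view, as the private semantics themselves, so any such argument has to state how the token is kept from the eavesdropper.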

Circularity Check

0 steps flagged

No circularity: architecture proposed and externally validated by simulation on open datasets


The paper presents AgentSemSteCom as a new modular architecture (semantic extraction + digital-token-controlled reference generation + coverless steganography) whose security and capacity claims rest on the non-leakage property of the token-controlled diffusion process. No equations, fitted parameters, or derivations are shown that reduce the claimed gains to inputs by construction. Performance is asserted via simulation results on open-source datasets compared to a baseline, which constitutes external measurement rather than self-referential fitting. No self-citation chain is invoked as a uniqueness theorem or load-bearing premise. The central non-inferability assumption is therefore an empirical claim open to falsification, not a definitional tautology.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The scheme depends on the existence and reliable behavior of pre-trained generative diffusion models and agentic AI components whose internal parameters, training data, and convergence properties are not specified; these act as free parameters and domain assumptions imported from outside the paper.

free parameters (2)

diffusion model hyperparameters: Scale, guidance strength, and sampling steps of the generative diffusion model used for reference image creation are not stated and must be chosen to achieve the reported performance.
agentic AI policy parameters: Reward functions and decision thresholds inside the semantic extraction and token-control agents are fitted during training and affect both capacity and security claims.

axioms (2)

domain assumption: Generative diffusion models can produce reference images whose statistical properties are indistinguishable from natural images under the chosen token control. Invoked when claiming that the generated carriers do not leak private semantic information.
domain assumption: The semantic codec can perfectly invert the embedding process when the receiver has access only to the public token stream. Required for the claim that private keys are unnecessary.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: digital token controlled reference image generation... z_D = Randn(s)... binary perturbation mask... EDICT(z_s, ϵ_θ, 0, T)

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: agentic AI... semantic extraction... coverless steganography... JSCC semantic codec

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
