In-Context Watermarks for Large Language Models

· 2025 · cs.CL · arXiv 2505.16934

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

open full Pith review browse 2 citing papers arXiv PDF

abstract

The growing use of large language models (LLMs) for sensitive applications has highlighted the need for effective watermarking techniques to ensure the provenance and accountability of AI-generated text. However, most existing watermarking methods require access to the decoding process, limiting their applicability in real-world settings. One illustrative example is the use of LLMs by dishonest reviewers in the context of academic peer review, where conference organizers have no access to the model used but still need to detect AI-generated reviews. Motivated by this gap, we introduce In-Context Watermarking (ICW), which embeds watermarks into generated text solely through prompt engineering, leveraging LLMs' in-context learning and instruction-following abilities. We investigate four ICW strategies at different levels of granularity, each paired with a tailored detection method. We further examine the Indirect Prompt Injection (IPI) setting as a specific case study, in which watermarking is covertly triggered by modifying input documents such as academic manuscripts. Our experiments validate the feasibility of ICW as a model-agnostic, practical watermarking approach. Moreover, our findings suggest that as LLMs become more capable, ICW offers a promising direction for scalable and accessible content attribution. Our code is available at https://github.com/yepengliu/In-Context-Watermarks.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Fundamental Trade-Offs in Multi-Bit Watermarking of Stochastic Processes

cs.IT · 2026-05-09 · unverdicted · novelty 5.0

Derives matched converse and achievability bounds that characterize optimal trade-offs among false-alarm probability, detection error probability, distortion, and information rate for multi-bit watermarking of stationary ergodic stochastic processes.

Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption

cs.CR · 2025-10-21 · unverdicted · novelty 4.0

LLM watermarking adoption is limited by misaligned stakeholder incentives; incentive-aligned approaches such as in-context watermarking can enable practical use in targeted domains like education and peer review.

citing papers explorer

Showing 2 of 2 citing papers.

Fundamental Trade-Offs in Multi-Bit Watermarking of Stochastic Processes cs.IT · 2026-05-09 · unverdicted · none · ref 33 · internal anchor
Derives matched converse and achievability bounds that characterize optimal trade-offs among false-alarm probability, detection error probability, distortion, and information rate for multi-bit watermarking of stationary ergodic stochastic processes.
Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption cs.CR · 2025-10-21 · unverdicted · none · ref 41 · internal anchor
LLM watermarking adoption is limited by misaligned stakeholder incentives; incentive-aligned approaches such as in-context watermarking can enable practical use in targeted domains like education and peer review.

In-Context Watermarks for Large Language Models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer