A semantic invariant robust watermark for large language models. arXiv preprint arXiv:2310.06356

Aiwei Liu, Leyi Pan, Xuming Hu, Shiao Meng, Lijie Wen · 2024 · arXiv 2310.06356

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

SLAM: Structural Linguistic Activation Marking for Language Models

cs.CL · 2026-05-06 · unverdicted · novelty 8.0

SLAM achieves 100% detection on Gemma-2 models with only 1-2 point quality cost by causally steering SAE-identified residual-stream directions for linguistic structure.

RLCracker: Evaluating the Worst-Case Vulnerability of LLM Watermarks with Adaptive RL Attacks

cs.CR · 2025-09-25 · conditional · novelty 8.0

RLCracker is a reinforcement learning attack that erases LLM watermarks at 98.5% success rate with minimal data and generalizes across ten schemes and multiple model sizes.

SWAN: Semantic Watermarking with Abstract Meaning Representation

cs.CL · 2026-05-05 · unverdicted · novelty 7.0

SWAN uses AMR to embed semantic watermarks that persist through paraphrases, matching SOTA detection on original text and improving AUC by 13.9 points on paraphrased RealNews data.

Context-Fidelity Boosting: Enhancing Faithful Generation through Watermark-Inspired Decoding

cs.CL · 2026-04-24 · unverdicted · novelty 7.0

Context-Fidelity Boosting reduces faithfulness hallucinations by applying context-based logit boosts to source-supported tokens during LLM decoding.

Topic-Based Watermarks for Large Language Models

cs.CR · 2024-04-02 · unverdicted · novelty 7.0

A topic-guided watermarking scheme partitions the LLM vocabulary into topic-aligned token subsets and green-lists relevant tokens based on the input prompt to embed detectable marks while preserving text quality and improving robustness to attacks.

TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection

cs.CR · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

TextSeal provides a localized, distortion-free LLM watermark that outperforms baselines in detection strength, remains effective in mixed human-AI text, preserves model performance, and transfers through distillation for provenance tracking.

citing papers explorer

SLAM: Structural Linguistic Activation Marking for Language Models · cs.CL · 2026-05-06 · unverdicted · none · ref 15
RLCracker: Evaluating the Worst-Case Vulnerability of LLM Watermarks with Adaptive RL Attacks · cs.CR · 2025-09-25 · conditional · none · ref 20
SWAN: Semantic Watermarking with Abstract Meaning Representation · cs.CL · 2026-05-05 · unverdicted · none · ref 55
Context-Fidelity Boosting: Enhancing Faithful Generation through Watermark-Inspired Decoding · cs.CL · 2026-04-24 · unverdicted · none · ref 1
Topic-Based Watermarks for Large Language Models · cs.CR · 2024-04-02 · unverdicted · none · ref 30
TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection · cs.CR · 2026-05-12 · unverdicted · none · ref 15 · 2 links
