CLASP: Training-Free LLM-Assisted Source Code Watermarking via Semantic-Preserving Transformations
Pith reviewed 2026-05-18 07:55 UTC · model grok-4.3
The pith
CLASP encodes watermarks into source code by applying a fixed set of meaning-preserving transformations generated automatically by large language models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CLASP embeds watermark bits within a fixed space of semantics-preserving transformations, enabling automated watermark insertion with higher capacity while remaining reusable across programming languages and less dependent on brittle lexical features. To recover the watermark, CLASP uses reference-code retrieval and differential comparison to identify transformation traces, avoiding task-specific model training while improving robustness to structural edits and adaptive attacks.
What carries the argument
A fixed collection of semantics-preserving code transformations whose selection pattern encodes the watermark bits, generated by an LLM for insertion and recovered through differential comparison against a reference program.
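As a minimal sketch of how a selection pattern over transformations could carry bits, the toy example below uses three hand-written rewrite pairs in place of LLM-generated edits. The rewrite pairs, the names `TRANSFORMS`, `embed`, and `extract`, and the substring-based comparison are all illustrative assumptions; the source does not specify CLASP's actual transformation set or comparison procedure.

```python
# Toy transformation space: (original form, rewritten form) pairs.
# Hypothetical examples chosen for illustration; not CLASP's actual set.
TRANSFORMS = [
    ("count += 1", "count = count + 1"),
    ("if a != b:", "if not a == b:"),
    ("for i in range(n):", "for i in range(0, n):"),
]

# Hypothetical reference program kept by the code owner.
REFERENCE = '''def count_diff(a, b, n):
    count = 0
    for i in range(n):
        if a != b:
            count += 1
    return count
'''


def embed(reference, bits):
    """Apply transformation i when bit i is 1; leave that site untouched otherwise."""
    code = reference
    for (before, after), bit in zip(TRANSFORMS, bits):
        if bit:
            code = code.replace(before, after)
    return code


def extract(reference, suspect):
    """Recover bits by checking, per transformation, whether the reference
    holds the original form and the suspect code holds the rewritten form."""
    bits = []
    for before, after in TRANSFORMS:
        applied = before in reference and after in suspect
        bits.append(1 if applied else 0)
    return bits


watermarked = embed(REFERENCE, [1, 0, 1])
recovered = extract(REFERENCE, watermarked)
```

In this toy setting `recovered` matches the embedded bits, and both versions of `count_diff` return the same values for integer inputs. A plain substring check like this would fail after simple reformatting, which is consistent with the review's note that recovery is meant to rely on differential comparison rather than exact lexical matching.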
If this is right
- Watermark capacity increases because multiple transformation choices can represent different bit combinations rather than single identifier swaps.
- The same watermarking process applies to code in Java, Python, C++ and other languages without language-specific retraining.
- Detection remains possible after structural edits because recovery relies on differential comparison rather than matching exact lexical features.
- No task-specific model training is needed at insertion or extraction time, allowing plug-and-play use.
- Code functionality and readability stay intact because every applied transformation is chosen to preserve semantics.
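The capacity point in the first bullet can be made concrete with a small calculation. The sketch below assumes that transformation sites are independent and that every choice at a site is reliably distinguishable at extraction time; neither assumption is established by the source, so the result is an upper bound on payload size, not a reported figure. The function name `capacity_bits` is hypothetical.

```python
import math


def capacity_bits(choices_per_site):
    """Upper bound on payload in bits: the sum of log2(choices) over sites.

    Assumes independent sites and perfectly distinguishable choices.
    """
    if any(c < 1 for c in choices_per_site):
        raise ValueError("each site needs at least one choice")
    return sum(math.log2(c) for c in choices_per_site)


# Three sites with an apply/skip decision each carry at most 3 bits.
binary_sites = capacity_bits([2, 2, 2])
# Two sites with four alternative rewrites each carry at most 4 bits.
richer_sites = capacity_bits([4, 4])
```

Under these assumptions, two sites with four alternatives each carry more than three binary sites, which illustrates why a larger choice set per site raises capacity.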
Where Pith is reading between the lines
- If the transformation space proves broad enough, similar techniques might protect other structured artifacts such as configuration files or data schemas.
- Routine use would encourage developers to retain original reference versions alongside watermarked releases for later verification.
- Attackers might shift focus to removing or randomizing the reference itself rather than the transformations, changing the threat model for code repositories.
Load-bearing premise
The method requires that large language models can produce a stable, detectable collection of meaning-preserving edits whose traces survive common code changes and can be matched back to the original reference without special training.
What would settle it
Apply CLASP to a program, then subject the output to standard refactoring tools or adaptive de-watermarking attacks that alter structure while preserving behavior, and measure whether watermark extraction accuracy falls below reliable thresholds.
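A minimal sketch of the measurement step, assuming the watermark is a fixed-length bit string and that an attack pipeline has already produced extracted bits for each sample. The function names and the bit strings are hypothetical, and the source does not say what threshold counts as reliable, so none is hard-coded here.

```python
def bit_accuracy(embedded, extracted):
    """Fraction of positions where the extracted bit equals the embedded bit."""
    if len(embedded) != len(extracted):
        raise ValueError("bit strings must have the same length")
    if not embedded:
        raise ValueError("bit strings must be non-empty")
    matches = sum(1 for e, x in zip(embedded, extracted) if e == x)
    return matches / len(embedded)


def mean_accuracy(pairs):
    """Average bit accuracy over (embedded, extracted) pairs."""
    scores = [bit_accuracy(e, x) for e, x in pairs]
    return sum(scores) / len(scores)


# Made-up bit strings standing in for extraction results after an attack.
pairs = [
    ([1, 0, 1, 1], [1, 0, 1, 1]),  # watermark fully recovered
    ([0, 1, 1, 0], [0, 1, 0, 0]),  # one bit flipped by the attack
]
score = mean_accuracy(pairs)
```

Comparing `score` before and after each attack type, against whatever threshold the evaluator fixes in advance, is one way to run the test described above.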
Original abstract
The proliferation of open-source code and large language models (LLMs) for code generation has amplified the risks of unauthorized reuse and intellectual property infringement. Source code watermarking offers a potential solution, yet existing methods typically encode watermarks through identifiers, local code patterns, or limited handcrafted edits, leaving them vulnerable to renaming, refactoring, and adaptive watermark removal. These limitations hinder the joint achievement of robustness, capacity, generalization, and deployment efficiency. We propose CLASP, a Code LLM-Assisted Semantic-Preserving watermarking framework that enables training-free, plug-and-play watermarking for source code. CLASP embeds watermark bits within a fixed space of semantics-preserving transformations, enabling automated watermark insertion with higher capacity while remaining reusable across programming languages and less dependent on brittle lexical features. To recover the watermark, CLASP uses reference-code retrieval and differential comparison to identify transformation traces, avoiding task-specific model training while improving robustness to structural edits and adaptive attacks. Experiments across multiple programming languages show that CLASP consistently outperforms existing baselines in watermark extraction accuracy and robustness, while maintaining code quality under both random removal and adaptive de-watermarking attacks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CLASP, a training-free LLM-assisted framework for source code watermarking. It embeds watermark bits by applying transformations drawn from a fixed space of semantics-preserving edits, using LLMs for automated insertion. Recovery relies on reference-code retrieval followed by differential comparison to detect the applied transformations. The authors claim that the method delivers higher capacity, cross-language reusability, robustness to random removal and adaptive de-watermarking attacks, and preserved code quality, outperforming prior baselines across multiple programming languages.
Significance. If the central claims are substantiated, the work would be significant for software IP protection. The training-free, plug-and-play design and reduced reliance on brittle lexical features address practical limitations of existing watermarking schemes. The emphasis on a reusable transformation space and differential recovery offers a potentially generalizable approach, though its value hinges on empirical verification of capacity, uniqueness of traces, and attack resistance.
major comments (3)
- [§4.2] §4.2 (Recovery via differential comparison): The load-bearing assumption that LLM-generated transformations produce unique, reliably recoverable differential traces is not sufficiently justified. LLM stochasticity and the possibility that distinct transformation sequences yield similar diffs, or that reference retrieval returns inexact matches for real-world variants, could produce ambiguous bit extraction and undermine the claimed robustness to adaptive attacks.
- [§3] §3 (Fixed space of semantics-preserving transformations): The central claim of higher capacity and cross-language reuse rests on the existence of a fixed, automatically generatable, and uniquely identifiable set of transformations. The manuscript provides no formal enumeration, cardinality bound, or proof of uniqueness for this space, leaving the capacity and generalization assertions difficult to evaluate.
- [§5] §5 (Experimental results): The abstract asserts consistent outperformance and robustness across languages and attack types, yet the evaluation lacks reported quantitative metrics (e.g., extraction accuracy with error bars), dataset sizes, ablation studies on transformation-space size, or statistical tests. Without these, the superiority claims cannot be verified as load-bearing evidence.
minor comments (2)
- [§2] Notation for the transformation space and bit-encoding mapping could be introduced earlier and used consistently to improve readability.
- [Figures] Figure captions should explicitly state the number of runs or seeds used for LLM sampling to clarify reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We have addressed each major comment point by point below, providing clarifications and committing to revisions where appropriate to strengthen the paper.
- Referee: [§4.2] §4.2 (Recovery via differential comparison): The load-bearing assumption that LLM-generated transformations produce unique, reliably recoverable differential traces is not sufficiently justified. LLM stochasticity and the possibility that distinct transformation sequences yield similar diffs, or that reference retrieval returns inexact matches for real-world variants, could produce ambiguous bit extraction and undermine the claimed robustness to adaptive attacks.
  Authors: We agree that the recoverability of unique differential traces is central to the method. Section 4.2 explains the reference retrieval and differential comparison process, and Section 5 reports high extraction accuracy under adaptive attacks, indicating that traces are distinguishable in practice despite LLM stochasticity. We will revise the manuscript to add a dedicated discussion paragraph on empirical observations of trace uniqueness across LLM runs, the role of reference retrieval in handling variants, and potential limitations in edge cases. This addresses the concern through expanded justification and empirical support.
  revision: partial
- Referee: [§3] §3 (Fixed space of semantics-preserving transformations): The central claim of higher capacity and cross-language reuse rests on the existence of a fixed, automatically generatable, and uniquely identifiable set of transformations. The manuscript provides no formal enumeration, cardinality bound, or proof of uniqueness for this space, leaving the capacity and generalization assertions difficult to evaluate.
  Authors: Section 3 defines the transformation space operationally as the set of semantics-preserving edits generatable by LLMs, with concrete examples across languages to illustrate reusability and capacity. We do not provide a formal mathematical enumeration or uniqueness proof, as the space is practically defined by semantic validity rather than a closed theoretical set. We will revise Section 3 to include a clearer operational description of space construction, an empirical estimate of effective cardinality based on observed transformations, and additional cross-language examples to better support the claims.
  revision: yes
- Referee: [§5] §5 (Experimental results): The abstract asserts consistent outperformance and robustness across languages and attack types, yet the evaluation lacks reported quantitative metrics (e.g., extraction accuracy with error bars), dataset sizes, ablation studies on transformation-space size, or statistical tests. Without these, the superiority claims cannot be verified as load-bearing evidence.
  Authors: We acknowledge that the experimental section would benefit from more rigorous reporting. While Section 5 presents extraction accuracies and robustness results across languages and attacks, we will update the revised manuscript to include error bars on all accuracy metrics, explicitly state the sizes and sources of all datasets, add ablation studies varying the transformation space size, and incorporate statistical significance tests (such as paired t-tests) comparing against baselines. These additions will provide stronger quantitative support for the outperformance claims.
  revision: yes
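The reporting promised in the last response (error bars and paired t-tests) can be sketched with the standard library alone. The accuracy values below are placeholders invented for illustration and are not results from the paper; the function names are hypothetical.

```python
import math
import statistics


def mean_and_std(values):
    """Mean and sample standard deviation, usable as an error bar."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t_statistic(method_a, method_b):
    """t statistic for paired samples: mean(d) / (stdev(d) / sqrt(n)).

    Raises ZeroDivisionError if all paired differences are identical.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    std_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_error


# Placeholder per-run accuracies, invented for illustration only.
proposed = [0.90, 0.80, 0.70, 0.95]
baseline = [0.80, 0.75, 0.60, 0.90]

proposed_mean, proposed_std = mean_and_std(proposed)
t_value = paired_t_statistic(proposed, baseline)
degrees_of_freedom = len(proposed) - 1
```

Turning `t_value` into a p-value requires the Student t distribution with `degrees_of_freedom` degrees of freedom; `scipy.stats.ttest_rel` computes both in one call if SciPy is available.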
Circularity Check
No significant circularity detected in derivation chain
The paper presents CLASP as a training-free framework that applies LLM-generated semantics-preserving transformations for watermark embedding and uses reference retrieval plus differential comparison for extraction. No equations or steps reduce a claimed prediction or uniqueness result to a fitted parameter or self-citation by construction. The central claims rest on the external capabilities of LLMs and standard retrieval methods rather than tautological re-use of the paper's own outputs or prior self-citations as load-bearing proofs. Experiments are offered as empirical validation, not as inputs that are then re-predicted. The derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: A sufficient set of semantics-preserving transformations exists that can encode watermark bits and be automatically generated by LLMs.