A Watermark for Vision-Language-Action and World Action Models
Pith reviewed 2026-06-26 07:39 UTC · model grok-4.3
The pith
A secret noise seed lets model owners recover proof of ownership from a robot's actions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The keyed latent-provenance verification method fingerprints the policy through the seed of the Gaussian noise vector drawn before generation. The owner substitutes a keyed seed with identical distribution so that fingerprinted actions remain statistically indistinguishable from ordinary runs. Verification records the action channels from the suspect model under authorized access, recovers the seed via gradient-based maximum a posteriori optimization, scores each rollout against the secret key, and aggregates the scores to decide ownership. Across two representative models the fingerprint is detected reliably, task performance changes little, and detectability holds under output-side removal attacks and weight-level edits.
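The seed-substitution step can be illustrated with a minimal sketch. It assumes the keyed noise is derived from a secret key by HMAC-SHA256 and then drawn from a standard normal generator; the function `keyed_noise`, the `rollout_id` argument, and the key derivation itself are hypothetical choices made for illustration, not the construction used in the paper.

```python
import hashlib
import hmac

import numpy as np


def keyed_noise(secret_key: bytes, rollout_id: int, dim: int) -> np.ndarray:
    """Draw a standard-normal vector whose seed depends on a secret key.

    Hypothetical derivation: HMAC-SHA256(key, rollout_id) -> 64-bit seed.
    """
    digest = hmac.new(secret_key, str(rollout_id).encode(), hashlib.sha256).digest()
    seed = int.from_bytes(digest[:8], "big")
    return np.random.default_rng(seed).standard_normal(dim)


# Keyed draw versus an ordinary draw: both are N(0, I) samples, so their
# summary statistics should look alike to an observer without the key.
z_keyed = keyed_noise(b"owner-secret", rollout_id=0, dim=4096)
z_plain = np.random.default_rng(12345).standard_normal(4096)

print("keyed mean/std:", z_keyed.mean(), z_keyed.std())
print("plain mean/std:", z_plain.mean(), z_plain.std())
```

Because the keyed vector is an ordinary pseudo-random normal draw, only someone holding the key can regenerate it, which is the property the indistinguishability claim relies on.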
What carries the argument
The keyed latent-provenance verification method, which replaces the Gaussian noise seed with a keyed equivalent and recovers it by gradient-based MAP optimization from partial action records.
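The recovery step can be sketched on a toy problem. The example below assumes a linear stand-in for the policy (actions are `W @ z`), a verifier who sees only a subset of action channels, Gaussian observation noise of scale `sigma`, and a standard-normal prior on the latent. All of these, including the names `map_recover` and `cosine`, are illustrative assumptions; a real VLA or WAM decoder is nonlinear and would need automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, action_dim, n_observed = 64, 48, 32
sigma = 0.05  # assumed observation-noise scale

# Toy linear "policy": actions = W @ z (an assumption, not the real model).
W = rng.standard_normal((action_dim, latent_dim)) / np.sqrt(latent_dim)
observed = rng.choice(action_dim, size=n_observed, replace=False)
A = W[observed]  # the verifier only records these action channels

z_key = rng.standard_normal(latent_dim)    # keyed latent used at generation
z_other = rng.standard_normal(latent_dim)  # latent belonging to an unrelated key
y = A @ z_key + sigma * rng.standard_normal(n_observed)


def map_recover(A, y, sigma, n_steps=5000):
    """Gradient descent on ||A z - y||^2 / (2 sigma^2) + ||z||^2 / 2."""
    # Step size 1/L, where L bounds the curvature of the quadratic objective.
    lipschitz = np.linalg.norm(A, 2) ** 2 / sigma**2 + 1.0
    z = np.zeros(A.shape[1])
    for _ in range(n_steps):
        grad = A.T @ (A @ z - y) / sigma**2 + z
        z -= grad / lipschitz
    return z


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


z_hat = map_recover(A, y, sigma)
score_key = cosine(z_hat, z_key)      # similarity to the true keyed latent
score_other = cosine(z_hat, z_other)  # similarity to an unrelated latent
print(score_key, score_other)
```

With fewer observed channels than latent dimensions, the estimate can only match the component of the latent that the observed channels constrain, so the similarity to the true key is partial rather than exact. This is the sense in which the partial view limits recovery.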
If this is right
- The fingerprint remains detectable across both VLA and WAM models.
- Task performance stays nearly unchanged after the seed substitution.
- Detection succeeds even after output-side removal attacks.
- Detection succeeds even after weight-level edits.
- The method can identify which of several possible keys a suspect model carries.
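The key-identification claim in the list above can be pictured as a nearest-key lookup. This is a minimal sketch, assuming each candidate key corresponds to a known latent vector and that a recovered latent is compared to every candidate by cosine similarity; `identify_key`, the candidate names, and the noise level are hypothetical and not taken from the paper.

```python
import numpy as np


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def identify_key(z_hat, candidates):
    """Return the candidate name whose latent is most similar to z_hat."""
    scores = {name: cosine(z_hat, z) for name, z in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


rng = np.random.default_rng(0)
candidates = {name: rng.standard_normal(256) for name in ("key_A", "key_B", "key_C")}

# Simulate an imperfect recovery of key_B's latent (assumed noise level 0.5).
z_hat = candidates["key_B"] + 0.5 * rng.standard_normal(256)

best, scores = identify_key(z_hat, candidates)
print(best, scores)
```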
Where Pith is reading between the lines
- The same seed-substitution idea might transfer to other generative policies that sample from Gaussian noise at inference time.
- Recovery accuracy could be measured directly on models with altered noise schedules to test how general the optimization step is.
- The approach assumes the verifier can run the suspect model under controlled conditions, which limits its use against fully closed services.
Load-bearing premise
That the partial and possibly post-processed view of the policy's output contains sufficient information for the gradient-based MAP optimization to recover the keyed seed accurately.
What would settle it
If the MAP optimization fails to recover the correct keyed seed on a substantial fraction of rollouts, or if any tested removal attack consistently drives detection rates to chance level, the reliability claim would be refuted.
Original abstract
Vision-language-action (VLA) models and world-action models (WAM) are the generative models now driving general-purpose robot control, turning raw camera input directly into motor commands. They are increasingly deployed as black-box services, where a partner runs the policy through an interface while the owner keeps the weights private. Training such a model takes proprietary data and heavy computational power, making the deployed model itself a valuable intellectual property. To address this, we propose the keyed latent-provenance verification method, which fingerprints the policy through the seed of the Gaussian noise vector that the models draw before generation. At the injection stage, the owner swaps this seed for a keyed one with the same distribution as ordinary noise, so the fingerprinted actions are statistically identical to those of an ordinary run and an adversary watching the output finds no signal to detect or remove. At the verification stage, the owner runs the suspect model under authorized access and records the action channels the robot executes, a partial and possibly post-processed view of the policy's output. From this view, the verifier recovers the seed by gradient-based maximum a posteriori (MAP) optimization, tests it for the secret key to score each rollout, and aggregates these scores into a single decision on whether the suspect model belongs to the owner. We evaluate the method on two representative models across two robot suites. The experiments cover detection of the fingerprint, identification of which of several keys a suspect carries, robustness to a range of attacks, and an analysis of why the design works. Across both models, the fingerprint can be detected reliably with little change to task performance, and it remains detectable under output-side removal attacks and weight-level edits.
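The final aggregation step described in the abstract can be sketched as a one-sided test on the per-rollout scores. The sketch assumes each score is a cosine similarity between a recovered latent and the keyed latent, and that for an unrelated model these scores are roughly independent with mean 0 and variance `1 / dim`. The statistic, the function name `aggregate_decision`, and the threshold `alpha` are illustrative assumptions; the abstract does not specify the paper's actual test.

```python
import math


def aggregate_decision(scores, dim, alpha=1e-3):
    """Combine per-rollout cosine scores into one ownership decision.

    Assumed null model: scores are i.i.d. with mean 0 and variance 1/dim,
    so sum(scores) * sqrt(dim / n) is approximately standard normal.
    """
    n = len(scores)
    z_stat = sum(scores) * math.sqrt(dim / n)
    # One-sided upper-tail p-value of a standard normal.
    p_value = 0.5 * math.erfc(z_stat / math.sqrt(2.0))
    return z_stat, p_value, p_value < alpha


# Ten rollouts with consistently high scores versus ten with scores at zero.
print(aggregate_decision([0.7] * 10, dim=64))
print(aggregate_decision([0.0] * 10, dim=64))
```

Pooling many weak per-rollout scores this way is what lets a modest per-rollout signal become a confident overall decision, under the stated independence assumption.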
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes keyed latent-provenance verification for fingerprinting VLA and WAM models: the owner substitutes a keyed Gaussian seed (statistically identical to standard noise) at generation time; verification recovers the seed via gradient-based MAP optimization over observed (partial, possibly post-processed) action channels, scores each rollout for the secret key, and aggregates into a detection decision. Experiments on two models across two robot suites report reliable detection, negligible task-performance impact, and robustness to output-side removal attacks and weight-level edits, with an accompanying analysis of why the design succeeds.
Significance. If the recovery step holds, the method supplies a practical, output-indistinguishable IP-protection mechanism for black-box-deployed robot policies—an increasingly relevant need. The approach is credited for using an independent optimization (avoiding circularity), evaluating on two representative models, and including an analysis of the design's properties.
major comments (1)
- [§4 and §5] §4 (Verification procedure) and §5 (Experiments): the central claim of reliable detection (including under output-side removal attacks) rests on the gradient-based MAP optimization recovering the keyed seed from a partial/post-processed view of the action channels. The manuscript must report quantitative recovery metrics—e.g., per-rollout seed-bit accuracy, optimization success rate, or loss-surface statistics—across the evaluated conditions; without these, the downstream detection statistic's reliability cannot be assessed.
minor comments (2)
- [Abstract] Abstract: the phrase 'two robot suites' should name the suites (e.g., in parentheses) so readers can immediately contextualize the evaluation scope.
- [§4] Notation: the mapping from observed action channels to the optimization objective should be stated explicitly (e.g., as an equation) rather than described only in prose.
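For orientation, a generic MAP objective of the kind the second minor comment asks for might look as follows. This is an illustrative form only, not the paper's equation: the symbols $\pi_\theta$ (policy), $o$ (observation), $g$ (post-processing), $M$ (selection of observed action channels), $y$ (recorded actions), and $\sigma$ (assumed observation-noise scale) are all assumptions introduced here.

```latex
\hat{z}
  = \arg\min_{z}\;
    \frac{1}{2\sigma^{2}}
    \bigl\lVert M\, g\bigl(\pi_\theta(o, z)\bigr) - y \bigr\rVert_2^{2}
    + \frac{1}{2}\lVert z \rVert_2^{2}
```

The first term measures how well a candidate latent $z$ explains the recorded channels, and the second is the negative log-density of a standard Gaussian prior on $z$, up to a constant.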
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The suggestion to include quantitative recovery metrics for the MAP optimization is well-taken and will improve the clarity of the verification claims. We address the major comment below and will revise the manuscript to incorporate the requested metrics.
Point-by-point responses
Referee: [§4 and §5] §4 (Verification procedure) and §5 (Experiments): the central claim of reliable detection (including under output-side removal attacks) rests on the gradient-based MAP optimization recovering the keyed seed from a partial/post-processed view of the action channels. The manuscript must report quantitative recovery metrics—e.g., per-rollout seed-bit accuracy, optimization success rate, or loss-surface statistics—across the evaluated conditions; without these, the downstream detection statistic's reliability cannot be assessed.
Authors: We agree that quantitative metrics on seed recovery would strengthen the presentation. In the revised manuscript we will add, in §4 and §5, tables reporting per-rollout seed-bit accuracy, optimization success rate (fraction of rollouts where MAP converges to the correct key within a tolerance), and summary loss-surface statistics (e.g., final MAP loss and gradient norm) for both models, both robot suites, and all attack conditions. These will be computed from the same rollouts already used for detection-rate experiments, allowing direct assessment of the recovery step that underpins the detection statistic.
Revision: yes
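The recovery metrics promised in the response could be computed along these lines. The sketch assumes that "seed bits" means the signs of the latent entries and that a rollout counts as a success when its bit accuracy reaches a chosen threshold; both definitions, along with the name `recovery_metrics` and the default threshold, are assumptions made for illustration, since the rebuttal does not define them.

```python
import numpy as np


def recovery_metrics(z_hat, z_key, success_threshold=0.75):
    """Per-rollout sign agreement and the fraction of successful rollouts.

    z_hat, z_key: arrays of shape (n_rollouts, latent_dim).
    """
    bit_accuracy = (np.sign(z_hat) == np.sign(z_key)).mean(axis=1)
    success_rate = float((bit_accuracy >= success_threshold).mean())
    return {"bit_accuracy": bit_accuracy, "success_rate": success_rate}


rng = np.random.default_rng(0)
z_key = rng.standard_normal((4, 16))
z_hat = z_key.copy()
z_hat[0] *= -1.0  # simulate one rollout where recovery fails completely

metrics = recovery_metrics(z_hat, z_key)
print(metrics["bit_accuracy"], metrics["success_rate"])
```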
Circularity Check
No circularity; method is a self-contained proposal with independent verification
full rationale
The paper introduces a keyed latent-provenance verification technique that injects a secret Gaussian seed at generation time and recovers it at verification time via gradient-based MAP optimization on observed actions. This recovery step is an independent computational procedure whose success is evaluated empirically rather than defined into the method itself. No equations, self-citations, or ansatzes are shown that reduce the central claim to a tautology or fitted input. The derivation chain therefore remains non-circular and externally falsifiable through the reported detection rates and attack robustness experiments.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The Gaussian noise seed can be substituted with a keyed seed having an identical distribution.
- domain assumption: MAP optimization can recover the seed from partial action channels.
Method Hyperparameters
All main experiments use latent injection strength β=1.0. In Equation (1), any β∈[0,1] preserves...