SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From
Pith reviewed 2026-05-18 11:57 UTC · model grok-4.3
The pith
Random initialization seeds create detectable prediction biases in LLMs that persist through pretraining and enable lineage verification from the start.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose SeedPrints, a method that leverages random initialization biases as persistent, seed-dependent identifiers present even before training begins. We show that untrained models exhibit reproducible prediction biases induced by their initialization seed, and that these weak signals remain statistically detectable throughout training, enabling high-confidence lineage verification. Unlike prior techniques that fail during early pretraining or degrade under distribution shifts, SeedPrints remains effective across all training stages, from initialization to large-scale pretraining and downstream adaptation.
What carries the argument
SeedPrints, which extracts persistent, seed-specific prediction biases from model outputs as an intrinsic fingerprint that exists from initialization onward.
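As a toy illustration of the idea (not the paper's extraction procedure), the sketch below builds a tiny random "model" from a seed, averages its logits over a fixed probe set to obtain a per-token bias vector, and compares vectors by Pearson correlation. The names `init_weights`, `bias_vector`, and `perturb`, the model shape, and the use of additive Gaussian noise as a stand-in for training are all assumptions made for illustration.

```python
import math
import random

def init_weights(seed, vocab=50, dim=8):
    """Toy 'model': an embedding table and an output projection from a seeded RNG."""
    rng = random.Random(seed)
    emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(vocab)]
    out = [[rng.gauss(0, 1) for _ in range(vocab)] for _ in range(dim)]
    return emb, out

def bias_vector(weights, probe_tokens):
    """Mean logit assigned to each vocabulary item over a fixed probe set."""
    emb, out = weights
    dim, vocab = len(out), len(out[0])
    totals = [0.0] * vocab
    for t in probe_tokens:
        for j in range(vocab):
            totals[j] += sum(emb[t][k] * out[k][j] for k in range(dim))
    return [x / len(probe_tokens) for x in totals]

def pearson(a, b):
    """Pearson correlation between two equal-length vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def perturb(weights, scale, seed):
    """Crude stand-in for training (an assumption): add Gaussian noise to every weight."""
    rng = random.Random(seed)
    emb, out = weights
    return ([[w + rng.gauss(0, scale) for w in row] for row in emb],
            [[w + rng.gauss(0, scale) for w in row] for row in out])

probes = list(range(20))
base = bias_vector(init_weights(seed=1), probes)
same = bias_vector(init_weights(seed=1), probes)    # same seed, rebuilt from scratch
other = bias_vector(init_weights(seed=2), probes)   # different seed
descendant = bias_vector(perturb(init_weights(seed=1), 0.05, seed=99), probes)

print("same seed reproduces bias:", base == same)
print("corr(base, descendant):", round(pearson(base, descendant), 3))
print("corr(base, other seed):", round(pearson(base, other), 3))
```

In this toy setting the same seed reproduces the bias vector exactly, and a lightly perturbed copy stays far more correlated with its origin than a model built from another seed. Whether real training behaves like small additive noise is exactly what the paper's experiments have to establish.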
If this is right
- A model can be traced back to its exact initialization seed even after full pretraining and fine-tuning.
- Lineage verification becomes possible at every stage, including the earliest phases of training.
- The method stays reliable when models encounter new domains or undergo parameter changes.
- High-confidence attribution works for both LLaMA-style and Qwen-style architectures.
Where Pith is reading between the lines
- If the biases truly survive arbitrary training, future releases of open models could include seed-based provenance tags as standard practice.
- The persistence of early signals suggests that training dynamics preserve more information about starting conditions than is usually assumed.
- Testing whether these biases can be deliberately amplified or suppressed would clarify how robust the fingerprint remains under adversarial fine-tuning.
Load-bearing premise
The reproducible prediction biases induced by the initialization seed remain statistically detectable and persistent throughout prolonged training, domain shifts, and parameter modifications.
What would settle it
A demonstration that, after enough training steps on the same data, the output distributions or bias patterns from two different initialization seeds become statistically indistinguishable would falsify the central claim.
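One way such a check could be run, sketched under assumptions: treat each model's bias pattern as a vector over the vocabulary and use a permutation test to ask whether the observed correlation between two vectors exceeds what shuffled token alignments produce. The function names and the choice of Pearson correlation are illustrative and are not taken from the paper.

```python
import math
import random

def correlation(a, b):
    """Pearson correlation between two equal-length vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def permutation_p_value(a, b, n_perm=2000, seed=0):
    """One-sided p-value: share of shuffles of b that correlate with a
    at least as strongly as the unshuffled b does."""
    rng = random.Random(seed)
    observed = correlation(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if correlation(a, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Synthetic vectors for illustration only.
aligned = [float(i) for i in range(30)]
p_match = permutation_p_value(aligned, [2 * x + 1 for x in aligned])
p_mismatch = permutation_p_value(aligned, list(reversed(aligned)))
print(p_match, p_mismatch)
```

A consistently large p-value for a model and its true initialization after long training would count as evidence against the persistence claim, though a single non-rejection would not by itself establish indistinguishability.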
Original abstract
Fingerprinting Large Language Models (LLMs) is essential for provenance verification and model attribution. Existing fingerprinting methods are primarily evaluated after fine-tuning, where models have already acquired stable signatures from training data, optimization dynamics, or hyperparameters. However, most of a model's capacity and knowledge are acquired during pretraining rather than downstream fine-tuning, making large-scale pretraining a more fundamental regime for lineage verification. We show that existing fingerprinting methods become unreliable in this regime, as they rely on post-hoc signatures that only emerge after substantial training. This limitation contradicts the classical Galton notion of a fingerprint as an intrinsic and persistent identity. In contrast, we propose a stronger and more intrinsic notion of LLM fingerprinting: SeedPrints, a method that leverages random initialization biases as persistent, seed-dependent identifiers present even before training begins. We show that untrained models exhibit reproducible prediction biases induced by their initialization seed, and that these weak signals remain statistically detectable throughout training, enabling high-confidence lineage verification. Unlike prior techniques that fail during early pretraining or degrade under distribution shifts, SeedPrints remains effective across all training stages, from initialization to large-scale pretraining and downstream adaptation. Experiments on LLaMA-style and Qwen-style models demonstrate seed-level distinguishability and enable birth-to-lifecycle identity verification. Evaluations on large-scale pretraining trajectories and real-world fingerprinting benchmarks further confirm its robustness under prolonged training, domain shifts, and parameter modifications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes SeedPrints, a fingerprinting method for LLMs that exploits reproducible next-token prediction biases induced by the random initialization seed. These biases are claimed to be present even in untrained models and to remain statistically detectable across initialization, large-scale pretraining (billions of tokens), domain shifts, and downstream adaptation on LLaMA-style and Qwen-style models, enabling high-confidence birth-to-lifecycle lineage verification. This is positioned as superior to prior post-hoc fingerprinting techniques that only emerge after substantial training or degrade under shifts.
Significance. If the persistence of seed-specific biases holds under prolonged training, SeedPrints would represent a meaningful advance in model provenance and attribution by supplying an intrinsic, pre-training identifier aligned with the classical Galton notion of fingerprints. This addresses a gap in verifying lineage during the pretraining regime where most model capacity is acquired, with potential implications for security, copyright, and regulatory compliance in LLM deployment.
major comments (2)
- [Experiments on large-scale pretraining trajectories] The central claim that seed-dependent biases remain statistically detectable after prolonged pretraining requires explicit quantitative tracking of the distinguishability metric (e.g., seed-classification accuracy or output-distribution KL divergence) across training checkpoints. The abstract asserts robustness from initialization to large-scale pretraining, but without reported decay curves or values at intermediate steps (e.g., after 10B vs. 100B tokens), the load-bearing persistence assumption is not fully substantiated.
- [Evaluations on large-scale pretraining trajectories and real-world benchmarks] §4 (or equivalent experimental section): the evaluations on domain shifts and parameter modifications must include concrete details on the shifts tested, number of seeds/models evaluated, and statistical tests (with error bars or p-values) to support the high-confidence verification claim. The current description leaves the strength of seed-level distinguishability under distribution shifts difficult to assess.
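The checkpoint tracking requested in the first major comment could look roughly like the following sketch, which computes the KL divergence between two models' next-token distributions on a shared probe at each saved training step. The checkpoint dictionaries and their logits are synthetic placeholders, not results from the paper, and `kl_by_checkpoint` is a hypothetical helper name.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def kl_by_checkpoint(run_a, run_b):
    """Map each training step present in both runs to KL(softmax(a) || softmax(b))."""
    steps = sorted(set(run_a) & set(run_b))
    return {s: kl_divergence(softmax(run_a[s]), softmax(run_b[s])) for s in steps}

# Placeholder logits over a 4-token vocabulary, keyed by training step.
seed_a = {0: [0.2, -0.1, 0.0, 0.1], 1000: [1.0, 0.5, -0.5, 0.0]}
seed_b = {0: [-0.1, 0.2, 0.1, 0.0], 1000: [0.9, 0.6, -0.4, 0.0]}

curve = kl_by_checkpoint(seed_a, seed_b)
for step, value in curve.items():
    print(step, round(value, 5))
```

Plotting such a curve against tokens seen, with one line per seed pair, would show directly whether the signal decays, plateaus, or survives.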
minor comments (2)
- [Method] Clarify the precise mathematical definition of the SeedPrint extraction or bias measurement (e.g., how the reproducible prediction bias is formalized and compared across models).
- [Figures] Ensure all figures reporting distinguishability include axis labels, legends, and confidence intervals for reproducibility.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. The comments highlight opportunities to strengthen the experimental evidence for the persistence of seed-specific biases. We address each major comment below and will incorporate revisions to improve clarity and substantiation.
- Referee: [Experiments on large-scale pretraining trajectories] The central claim that seed-dependent biases remain statistically detectable after prolonged pretraining requires explicit quantitative tracking of the distinguishability metric (e.g., seed-classification accuracy or output-distribution KL divergence) across training checkpoints. The abstract asserts robustness from initialization to large-scale pretraining, but without reported decay curves or values at intermediate steps (e.g., after 10B vs. 100B tokens), the load-bearing persistence assumption is not fully substantiated.
  Authors: We agree that explicit tracking of the distinguishability metric across intermediate checkpoints would provide stronger support for the persistence claim. Our experiments already evaluate SeedPrints at initialization, after full-scale pretraining on hundreds of billions of tokens, and post-adaptation, demonstrating that seed-level distinguishability remains statistically significant without substantial degradation. However, we did not report fine-grained decay curves at specific token milestones such as 10B or 100B. In the revision we will add a dedicated analysis section with checkpointed results, including plots of seed-classification accuracy and output-distribution KL divergence over the course of pretraining.
  Revision: yes
- Referee: [Evaluations on large-scale pretraining trajectories and real-world benchmarks] §4 (or equivalent experimental section): the evaluations on domain shifts and parameter modifications must include concrete details on the shifts tested, number of seeds/models evaluated, and statistical tests (with error bars or p-values) to support the high-confidence verification claim. The current description leaves the strength of seed-level distinguishability under distribution shifts difficult to assess.
  Authors: We appreciate the request for greater specificity. Section 4 already describes evaluations on LLaMA-style and Qwen-style models under domain shifts (including code and math corpora) and parameter modifications, using multiple random seeds per architecture and reporting average accuracies. To make the strength of the results easier to assess, we will revise the section to explicitly enumerate the exact shifts tested, the precise number of seeds and models, and to include error bars together with statistical significance tests (e.g., t-tests with p-values) supporting the high-confidence lineage verification claims.
  Revision: yes
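The response names t-tests without saying which variant. As one possibility (an assumption, not something the authors state), Welch's unequal-variance t statistic for comparing a metric between two groups of runs, for example same-seed and different-seed pairs, together with its approximate degrees of freedom, is:

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}
```

Here $\bar{x}_i$, $s_i^2$, and $n_i$ are the sample mean, sample variance, and number of runs in group $i$. With only a handful of seeds per architecture, $\nu$ is small, so reporting the per-group counts alongside the p-values matters for judging the result.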
Circularity Check
No significant circularity; empirical persistence claims are externally testable
The paper advances an empirical fingerprinting technique that detects reproducible next-token biases induced by random initialization seeds, then demonstrates via experiments that these signals remain detectable after pretraining and adaptation on LLaMA-style and Qwen-style models. No derivation chain, equation, or uniqueness theorem is invoked that reduces the target distinguishability metric to a fitted parameter or self-citation by construction. The load-bearing persistence claim is presented as an experimental result rather than a definitional or self-referential necessity, leaving the method open to falsification on independent training runs and benchmarks.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, theorem absolute_floor_iff_bare_distinguishability
  Tag: unclear. Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "untrained models exhibit reproducible prediction biases induced by their initialization seed, and that these weak signals remain statistically detectable throughout training"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- The Structural Origin of Attention Sink: Variance Discrepancy, Super Neurons, and Dimension Disparity
  Attention sinks arise from variance discrepancy in self-attention value aggregation, amplified by super neurons and first-token dimension disparity, and can be mitigated by head-wise RMSNorm to accelerate pre-training...