arxiv: 2509.26404 · v2 · submitted 2025-09-30 · 💻 cs.CR · cs.AI · cs.CL

SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From

Pith reviewed 2026-05-18 11:57 UTC · model grok-4.3

classification 💻 cs.CR · cs.AI · cs.CL
keywords LLM fingerprinting · model provenance · initialization seed · lineage verification · pretraining · prediction bias · model attribution

The pith

Random initialization seeds create detectable prediction biases in LLMs that persist through pretraining and enable lineage verification from the start.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to establish that random seeds used to initialize language models produce reproducible prediction biases that act as intrinsic identifiers. These biases stay statistically detectable even after massive pretraining, domain shifts, and later adaptations, unlike earlier fingerprinting approaches that only appear reliably after fine-tuning. A sympathetic reader would care because most of a model's knowledge forms during pretraining, so a method that works from initialization onward gives a more fundamental way to verify where a model came from. The work shows these signals remain usable across LLaMA-style and Qwen-style training runs and real-world benchmarks.

Core claim

We propose SeedPrints, a method that leverages random initialization biases as persistent, seed-dependent identifiers present even before training begins. We show that untrained models exhibit reproducible prediction biases induced by their initialization seed, and that these weak signals remain statistically detectable throughout training, enabling high-confidence lineage verification. Unlike prior techniques that fail during early pretraining or degrade under distribution shifts, SeedPrints remains effective across all training stages, from initialization to large-scale pretraining and downstream adaptation.

What carries the argument

SeedPrints, which extracts persistent, seed-specific prediction biases from model outputs as an intrinsic fingerprint that exists from initialization onward.
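
To make the idea of a seed-specific prediction bias concrete, here is a toy sketch. It is not the paper's extraction procedure: the "model" is just a random embedding table and output projection, and the names `init_model`, `argmax_counts`, and `probe_seed` are hypothetical. It only illustrates that a fixed seed yields a reproducible, non-uniform top-1 token profile, while a different seed yields a different one.

```python
import random

def init_model(seed, vocab=50, dim=8):
    """Draw a toy 'untrained model': an embedding table and an output projection."""
    rng = random.Random(seed)
    embed = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab)]
    unembed = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab)]
    return embed, unembed

def argmax_counts(seed, n_probes=500, vocab=50, dim=8, probe_seed=1234):
    """Count how often each token is the top prediction over uniform random probes."""
    embed, unembed = init_model(seed, vocab, dim)
    probes = random.Random(probe_seed)  # identical probe inputs for every model
    counts = [0] * vocab
    for _ in range(n_probes):
        hidden = embed[probes.randrange(vocab)]
        logits = [sum(w * h for w, h in zip(row, hidden)) for row in unembed]
        counts[logits.index(max(logits))] += 1
    return counts

profile_a = argmax_counts(seed=0)
profile_a_again = argmax_counts(seed=0)
profile_b = argmax_counts(seed=1)

print("same seed reproduces the profile:", profile_a == profile_a_again)
print("different seed changes the profile:", profile_a != profile_b)
print("tokens never predicted under seed 0:", profile_a.count(0))
```

The probe inputs are held fixed across models so that any difference between profiles comes from the initialization seed alone.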

If this is right

  • A model can be traced back to its exact initialization seed even after full pretraining and fine-tuning.
  • Lineage verification becomes possible at every stage, including the earliest phases of training.
  • The method stays reliable when models encounter new domains or undergo parameter changes.
  • High-confidence attribution works for both LLaMA-style and Qwen-style architectures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the biases truly survive arbitrary training, future releases of open models could include seed-based provenance tags as standard practice.
  • The persistence of early signals suggests that training dynamics preserve more information about starting conditions than is usually assumed.
  • Testing whether these biases can be deliberately amplified or suppressed would clarify how robust the fingerprint remains under adversarial fine-tuning.

Load-bearing premise

The reproducible prediction biases induced by the initialization seed remain statistically detectable and persistent throughout prolonged training, domain shifts, and parameter modifications.

What would settle it

A demonstration that, after enough training steps on the same data, the output distributions or bias patterns from two different initialization seeds become statistically indistinguishable would falsify the central claim.
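
One way to phrase "statistically indistinguishable" operationally is a permutation test on the correlation between two models' bias vectors. The sketch below is a generic test chosen for illustration, not the statistic used in the paper; the vectors `ancestor`, `descendant`, and `unrelated` are synthetic stand-ins.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(x, y, n_perm=999, seed=0):
    """One-sided p-value: share of shuffled pairings that match or beat the observed correlation."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(x, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p > 0

rng = random.Random(42)
ancestor = [rng.gauss(0.0, 1.0) for _ in range(60)]
descendant = [a + rng.gauss(0.0, 0.3) for a in ancestor]  # keeps most of the ancestor's bias
unrelated = [rng.gauss(0.0, 1.0) for _ in range(60)]

print("descendant p-value:", permutation_p_value(ancestor, descendant))
print("unrelated p-value:", permutation_p_value(ancestor, unrelated))
```

Under this framing, the falsifying outcome described above would show up as p-values for true descendants drifting toward those of unrelated models as training proceeds.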

Figures

Figures reproduced from arXiv: 2509.26404 by Haonan Wang, Kenji Kawaguchi, Siquan Li, Tianyang Hu, Yao Tong.

Figure 1: Initialization-born token bias persists through training. Upper Left: With uniform random inputs, a randomly initialized LLaMA-2–style model assigns highly non-uniform next-token probabilities, concentrating on a subset of tokens. Lower Left: An 80/20 coverage pattern: roughly 20% of tokens are selected as the next token for 80% of inputs. Upper Right: During training, these within-set preferences remain align…
Figure 2: Fingerprint verifies lineage at every checkpoint (p-values < 0.01).
Figure 3: Fingerprint verifies lineage at every checkpoint (p…
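
The 80/20 coverage pattern in the Figure 1 caption can be read as a simple statistic over top-1 token counts. The helper below is a hypothetical sketch of that reading (the function name and the integer `percent` parameter are assumptions, and the counts are made up); it reports the smallest fraction of tokens whose selections cover a given share of inputs.

```python
def coverage_fraction(counts, percent=80):
    """Smallest fraction of tokens whose top-1 selections cover `percent`% of inputs."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one selection")
    covered = 0
    # Greedily take the most frequently selected tokens first.
    for used, count in enumerate(sorted(counts, reverse=True), start=1):
        covered += count
        if covered * 100 >= percent * total:  # integer comparison avoids float rounding
            return used / len(counts)
    return 1.0

# Made-up counts: two of five tokens account for 80 of 100 selections.
toy_counts = [50, 30, 10, 5, 5]
print(coverage_fraction(toy_counts))
```

A value near 0.2 at `percent=80` would correspond to the pattern the caption describes.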
Original abstract

Fingerprinting Large Language Models (LLMs) is essential for provenance verification and model attribution. Existing fingerprinting methods are primarily evaluated after fine-tuning, where models have already acquired stable signatures from training data, optimization dynamics, or hyperparameters. However, most of a model's capacity and knowledge are acquired during pretraining rather than downstream fine-tuning, making large-scale pretraining a more fundamental regime for lineage verification. We show that existing fingerprinting methods become unreliable in this regime, as they rely on post-hoc signatures that only emerge after substantial training. This limitation contradicts the classical Galton notion of a fingerprint as an intrinsic and persistent identity. In contrast, we propose a stronger and more intrinsic notion of LLM fingerprinting: SeedPrints, a method that leverages random initialization biases as persistent, seed-dependent identifiers present even before training begins. We show that untrained models exhibit reproducible prediction biases induced by their initialization seed, and that these weak signals remain statistically detectable throughout training, enabling high-confidence lineage verification. Unlike prior techniques that fail during early pretraining or degrade under distribution shifts, SeedPrints remains effective across all training stages, from initialization to large-scale pretraining and downstream adaptation. Experiments on LLaMA-style and Qwen-style models demonstrate seed-level distinguishability and enable birth-to-lifecycle identity verification. Evaluations on large-scale pretraining trajectories and real-world fingerprinting benchmarks further confirm its robustness under prolonged training, domain shifts, and parameter modifications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes SeedPrints, a fingerprinting method for LLMs that exploits reproducible next-token prediction biases induced by the random initialization seed. These biases are claimed to be present even in untrained models and to remain statistically detectable across initialization, large-scale pretraining (billions of tokens), domain shifts, and downstream adaptation on LLaMA-style and Qwen-style models, enabling high-confidence birth-to-lifecycle lineage verification. This is positioned as superior to prior post-hoc fingerprinting techniques that only emerge after substantial training or degrade under shifts.

Significance. If the persistence of seed-specific biases holds under prolonged training, SeedPrints would represent a meaningful advance in model provenance and attribution by supplying an intrinsic, pre-training identifier aligned with the classical Galton notion of fingerprints. This addresses a gap in verifying lineage during the pretraining regime where most model capacity is acquired, with potential implications for security, copyright, and regulatory compliance in LLM deployment.

major comments (2)
  1. [Experiments on large-scale pretraining trajectories] The central claim that seed-dependent biases remain statistically detectable after prolonged pretraining requires explicit quantitative tracking of the distinguishability metric (e.g., seed-classification accuracy or output-distribution KL divergence) across training checkpoints. The abstract asserts robustness from initialization to large-scale pretraining, but without reported decay curves or values at intermediate steps (e.g., after 10B vs. 100B tokens), the load-bearing persistence assumption is not fully substantiated.
  2. [Evaluations on large-scale pretraining trajectories and real-world benchmarks] §4 (or equivalent experimental section): the evaluations on domain shifts and parameter modifications must include concrete details on the shifts tested, number of seeds/models evaluated, and statistical tests (with error bars or p-values) to support the high-confidence verification claim. The current description leaves the strength of seed-level distinguishability under distribution shifts difficult to assess.
minor comments (2)
  1. [Method] Clarify the precise mathematical definition of the SeedPrint extraction or bias measurement (e.g., how the reproducible prediction bias is formalized and compared across models).
  2. [Figures] Ensure all figures reporting distinguishability include axis labels, legends, and confidence intervals for reproducibility.
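
The checkpoint tracking requested in major comment 1 could be sketched as follows. This is an illustration of one of the metrics the referee names (output-distribution KL divergence), not the paper's evaluation code; the checkpoint names and probabilities are invented.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                return math.inf  # q assigns zero mass where p does not
            total += pi * math.log(pi / qi)
    return total

# Hypothetical next-token distributions at three checkpoints (made-up numbers).
reference = [0.50, 0.30, 0.15, 0.05]
checkpoints = {
    "step_0": [0.50, 0.30, 0.15, 0.05],
    "step_1000": [0.45, 0.30, 0.18, 0.07],
    "step_10000": [0.30, 0.30, 0.25, 0.15],
}
for name, dist in checkpoints.items():
    print(name, round(kl_divergence(reference, dist), 4))
```

Plotting such a value against training tokens, for both same-seed and different-seed pairs, would give the kind of decay curve the comment asks for.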

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. The comments highlight opportunities to strengthen the experimental evidence for the persistence of seed-specific biases. We address each major comment below and will incorporate revisions to improve clarity and substantiation.

Point-by-point responses
  1. Referee: [Experiments on large-scale pretraining trajectories] The central claim that seed-dependent biases remain statistically detectable after prolonged pretraining requires explicit quantitative tracking of the distinguishability metric (e.g., seed-classification accuracy or output-distribution KL divergence) across training checkpoints. The abstract asserts robustness from initialization to large-scale pretraining, but without reported decay curves or values at intermediate steps (e.g., after 10B vs. 100B tokens), the load-bearing persistence assumption is not fully substantiated.

    Authors: We agree that explicit tracking of the distinguishability metric across intermediate checkpoints would provide stronger support for the persistence claim. Our experiments already evaluate SeedPrints at initialization, after full-scale pretraining on hundreds of billions of tokens, and post-adaptation, demonstrating that seed-level distinguishability remains statistically significant without substantial degradation. However, we did not report fine-grained decay curves at specific token milestones such as 10B or 100B. In the revision we will add a dedicated analysis section with checkpointed results, including plots of seed-classification accuracy and output-distribution KL divergence over the course of pretraining. revision: yes

  2. Referee: [Evaluations on large-scale pretraining trajectories and real-world benchmarks] §4 (or equivalent experimental section): the evaluations on domain shifts and parameter modifications must include concrete details on the shifts tested, number of seeds/models evaluated, and statistical tests (with error bars or p-values) to support the high-confidence verification claim. The current description leaves the strength of seed-level distinguishability under distribution shifts difficult to assess.

    Authors: We appreciate the request for greater specificity. Section 4 already describes evaluations on LLaMA-style and Qwen-style models under domain shifts (including code and math corpora) and parameter modifications, using multiple random seeds per architecture and reporting average accuracies. To make the strength of the results easier to assess, we will revise the section to explicitly enumerate the exact shifts tested, the precise number of seeds and models, and to include error bars together with statistical significance tests (e.g., t-tests with p-values) supporting the high-confidence lineage verification claims. revision: yes
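
As a rough sketch of the error bars and significance statistics promised in the second response, the snippet below computes a mean with its standard error across seeds and a Welch t statistic between two groups. The scores are made-up placeholders, the function names are hypothetical, and nothing here reflects the paper's actual numbers or test choice.

```python
import math
import statistics

def mean_and_stderr(values):
    """Sample mean and standard error of the mean (needs at least two values)."""
    mean = statistics.mean(values)
    stderr = statistics.stdev(values) / math.sqrt(len(values))
    return mean, stderr

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    mean_a, se_a = mean_and_stderr(a)
    mean_b, se_b = mean_and_stderr(b)
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Made-up per-seed similarity scores, for illustration only.
same_lineage = [0.91, 0.88, 0.93, 0.90, 0.89]
different_lineage = [0.52, 0.47, 0.55, 0.50, 0.49]

for label, scores in [("same", same_lineage), ("different", different_lineage)]:
    mean, stderr = mean_and_stderr(scores)
    print(f"{label}: {mean:.3f} +/- {stderr:.3f}")
print("Welch t:", round(welch_t(same_lineage, different_lineage), 2))
```

Turning the t statistic into a p-value requires the Student t distribution, which the standard library does not provide; `scipy.stats.ttest_ind` with `equal_var=False` is one commonly used option.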

Circularity Check

0 steps flagged

No significant circularity; empirical persistence claims are externally testable

The paper advances an empirical fingerprinting technique that detects reproducible next-token biases induced by random initialization seeds, then demonstrates via experiments that these signals remain detectable after pretraining and adaptation on LLaMA-style and Qwen-style models. No derivation chain, equation, or uniqueness theorem is invoked that reduces the target distinguishability metric to a fitted parameter or self-citation by construction. The load-bearing persistence claim is presented as an experimental result rather than a definitional or self-referential necessity, leaving the method open to falsification on independent training runs and benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the persistence of seed-induced biases as an intrinsic property; no free parameters, axioms, or invented entities are explicitly introduced or quantified in the abstract.

pith-pipeline@v0.9.0 · 5803 in / 1201 out tokens · 29908 ms · 2026-05-18T11:57:57.920168+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Structural Origin of Attention Sink: Variance Discrepancy, Super Neurons, and Dimension Disparity

    cs.LG 2026-05 unverdicted novelty 6.0

    Attention sinks arise from variance discrepancy in self-attention value aggregation, amplified by super neurons and first-token dimension disparity, and can be mitigated by head-wise RMSNorm to accelerate pre-training...
