De-Anonymization at Scale via Tournament-Style Attribution
Pith reviewed 2026-05-16 13:11 UTC · model grok-4.3
The pith
Large language models can link anonymous texts to their authors even from pools of tens of thousands of candidates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
De-Anonymization at Scale (DAS) attributes authorship in three steps: it randomly partitions a large candidate corpus into fixed-size groups, prompts an LLM to select from each group the text most likely written by the same author as a given query, and iteratively narrows the surviving candidates. Dense-retrieval prefiltering shrinks the initial search space, and majority-voting aggregation across independent runs improves ranking precision. Together these enable recovery of same-author texts from pools of tens of thousands, with accuracy well above chance on anonymized review data.
What carries the argument
Tournament-style sequential partitioning and LLM querying that iteratively narrow authorship candidates from large pools, augmented by dense-retrieval prefiltering and majority-voting aggregation.
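A minimal sketch of the narrowing loop, under stated assumptions: `pick_same_author` is a hypothetical stand-in for the LLM call (here a toy word-overlap scorer so the example runs offline), and the group size, the one-survivor-per-group rule, and the stopping condition are illustrative choices, not settings taken from the paper.

```python
import random

def pick_same_author(query, group):
    """Hypothetical stand-in for the LLM prompt: returns the group member
    sharing the most words with the query. A real attack would call a model."""
    q = set(query.lower().split())
    return max(group, key=lambda text: len(q & set(text.lower().split())))

def tournament(query, candidates, group_size=4, top_k=1, seed=0):
    """Randomly partition survivors into fixed-size groups, keep one winner
    per group, and repeat until at most top_k candidates remain."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so each round shrinks the pool")
    rng = random.Random(seed)
    survivors = list(candidates)
    while len(survivors) > top_k:
        rng.shuffle(survivors)
        groups = [survivors[i:i + group_size]
                  for i in range(0, len(survivors), group_size)]
        survivors = [pick_same_author(query, g) for g in groups]
    return survivors
```

Because each round keeps one text per group, the pool shrinks by roughly a factor of `group_size` per round, which is what lets the procedure reach pools far larger than a single prompt could hold.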
If this is right
- Anonymous platforms such as double-blind review systems face a concrete risk of author identification from large candidate sets.
- The method scales authorship attribution beyond what direct pairwise comparison could handle while improving accuracy on standard benchmarks like Enron emails.
- Dense retrieval plus tournament querying together make LLM-based de-anonymization practical at real-world sizes.
- Majority voting over independent runs reduces sensitivity to individual model mistakes in the selection steps.
- The approach works on anonymized review data, showing the vulnerability is not limited to artificial or easy cases.
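The voting step mentioned in the list above can be illustrated with a simple count. This is a sketch only: the summary does not give the exact aggregation rule, so the one-vote-per-run scheme, the tie-break by id, and the candidate ids below are all assumptions.

```python
from collections import Counter

def aggregate_runs(runs, top_k=3):
    """Rank candidate ids by how many independent runs selected them.
    Ties are broken by candidate id so the output is deterministic."""
    votes = Counter()
    for selected in runs:
        votes.update(set(selected))  # one vote per candidate per run
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [cand for cand, _ in ranked[:top_k]]

# Four hypothetical runs, each returning its two surviving candidates.
runs = [["a17", "b02"], ["a17", "c40"], ["d09", "a17"], ["b02", "c40"]]
print(aggregate_runs(runs, top_k=2))  # ['a17', 'b02']
```

A candidate that wins only because of one unlucky partition or one model mistake collects a single vote, while a consistent match accumulates votes across runs.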
Where Pith is reading between the lines
- Similar risks could appear on any platform that publishes many anonymous texts from overlapping author pools if the texts remain accessible.
- Testing the method on texts that authors have intentionally altered to hide style would clarify how robust the signals are to user countermeasures.
- The scalability gain suggests the technique could be applied to much larger public corpora, such as social-media archives, once retrieval costs are managed.
- Future platform defenses might need to combine text obfuscation with limits on how many candidate texts an attacker can access at once.
Load-bearing premise
Large language models can reliably detect authorship signals in anonymized texts even when topic, content, or deliberate style changes might mislead them.
What would settle it
Apply the same DAS procedure to a corpus of texts that have been deliberately rewritten by their authors to erase stylistic markers and check whether top-k accuracy falls to chance levels.
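For reference, "chance level" in such a test has a concrete value. The sketch below computes the probability that a random top-k list contains at least one same-author text; the pool size and k in the example are illustrative, not figures from the paper.

```python
from math import comb

def chance_top_k(pool_size, k, same_author=1):
    """Probability that k candidates drawn uniformly at random (without
    replacement) include at least one of the same-author texts."""
    misses = comb(pool_size - same_author, k)
    return 1 - misses / comb(pool_size, k)

# With a single same-author text this reduces to k / pool_size.
print(chance_top_k(10_000, 10))  # about 0.001
```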
Original abstract
As LLMs rapidly advance and enter real-world use, their privacy implications are increasingly important. We study an authorship de-anonymization threat: using LLMs to link anonymous documents to their authors, potentially compromising settings such as double-blind peer review. We propose De-Anonymization at Scale (DAS), a large language model-based method for attributing authorship among tens of thousands of candidate texts. DAS uses a sequential progression strategy: it randomly partitions the candidate corpus into fixed-size groups, prompts an LLM to select the text most likely written by the same author as a query text, and iteratively re-queries the surviving candidates to produce a ranked top-k list. To make this practical at scale, DAS adds a dense-retrieval prefilter to shrink the search space and a majority-voting style aggregation over multiple independent runs to improve robustness and ranking precision. Experiments on anonymized review data show DAS can recover same-author texts from pools of tens of thousands with accuracy well above chance, demonstrating a realistic privacy risk for anonymous platforms. On standard authorship benchmarks (Enron emails and blog posts), DAS also improves both accuracy and scalability over prior approaches, highlighting a new LLM-enabled de-anonymization vulnerability.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes De-Anonymization at Scale (DAS), an LLM-based authorship attribution method for large candidate pools (tens of thousands of texts). DAS applies a dense-retrieval prefilter, randomly partitions the remaining candidates into groups, sequentially queries an LLM to select the most likely same-author match, and aggregates rankings over multiple runs. The paper reports recovery of same-author texts well above chance on anonymized review data, and improved accuracy and scalability over prior methods on the Enron email and blog-post benchmarks.
Significance. If the results hold after controlling for content leakage, the work would demonstrate a scalable, realistic de-anonymization threat to anonymous platforms such as double-blind review, extending LLM capabilities into a new privacy attack vector with practical implications for system design.
major comments (1)
- [Abstract / Experiments on anonymized review data] Abstract and experimental description: no topic-matched or subfield-controlled baselines are mentioned for the review-data experiments. Same-author documents in academic review pools are likely to share research topics, so sequential LLM selection could succeed via semantic similarity rather than stylistic authorship signals; this directly affects whether the above-chance accuracy supports the claimed de-anonymization capability.
minor comments (1)
- [Method] The description of the majority-voting aggregation and the exact prompting templates would benefit from explicit pseudocode or a numbered algorithmic listing to allow reproduction.
Simulated Author's Rebuttal
We thank the referee for their thoughtful review and constructive feedback. We address the major comment on the review-data experiments below and will revise the manuscript accordingly to strengthen the claims.
point-by-point responses
- Referee: Abstract and experimental description: no topic-matched or subfield-controlled baselines are mentioned for the review-data experiments. Same-author documents in academic review pools are likely to share research topics, so sequential LLM selection could succeed via semantic similarity rather than stylistic authorship signals; this directly affects whether the above-chance accuracy supports the claimed de-anonymization capability.
Authors: We agree that isolating stylistic authorship signals from topical similarity is essential for validating the de-anonymization claim. The review-data experiments were designed to reflect realistic double-blind review settings, where same-author documents frequently share subfields. To address this directly, we will add a content-only baseline that uses only the dense-retrieval prefilter (without the LLM tournament or majority voting) and report its accuracy against full DAS on the same pools. We will also include a comparison to a simple topic-similarity oracle if subfield metadata can be recovered from the anonymized data. These additions will quantify how much the LLM-based attribution improves over pure semantic matching and clarify the contribution of authorship signals.
Revision: yes
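The content-only baseline proposed in this response amounts to ranking candidates by embedding similarity alone. A rough sketch, assuming embeddings have already been produced by some encoder (the function names are hypothetical, and nothing here reflects the paper's actual retrieval model):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieval_only_top_k(query_vec, candidate_vecs, k):
    """Content-only baseline: rank candidate ids purely by cosine similarity
    to the query embedding, with no LLM selection step."""
    ranked = sorted(candidate_vecs,
                    key=lambda cid: cosine(query_vec, candidate_vecs[cid]),
                    reverse=True)
    return ranked[:k]

def top_k_accuracy(predictions, truths):
    """Fraction of queries whose true same-author id is in the predicted list."""
    hits = sum(truth in preds for preds, truth in zip(predictions, truths))
    return hits / len(truths)
```

Reporting `top_k_accuracy` for this baseline next to full DAS on the same pools would show how much of the above-chance result the tournament adds beyond semantic matching.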
Circularity Check
Empirical algorithm with no circular derivation chain
full rationale
The paper presents DAS as a practical algorithmic procedure (random partitioning into groups, LLM-based selection of same-author candidates, dense-retrieval prefilter, and majority-voting aggregation) and evaluates it via experiments on anonymized review data plus standard benchmarks. No equations, self-definitions, or load-bearing self-citations are shown that would reduce any claimed prediction or result to its own inputs by construction. The central claims rest on reported experimental accuracies rather than any mathematical reduction or renamed known result.
Axiom & Free-Parameter Ledger
free parameters (2)
- group size
- number of independent runs
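These two parameters largely determine query cost. The count below is a back-of-the-envelope sketch that assumes one LLM call per group and exactly one survivor per group in each round; the paper's actual schedule may differ, and the pool sizes in the example are illustrative.

```python
import math

def llm_calls(pool_size, group_size, runs=1):
    """Total LLM calls to narrow a pool to one candidate, repeated over
    `runs` independent tournaments (one call per group per round)."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    calls, remaining = 0, pool_size
    while remaining > 1:
        groups = math.ceil(remaining / group_size)
        calls += groups
        remaining = groups  # one survivor per group
    return calls * runs

print(llm_calls(1000, 10))          # 111  (100 + 10 + 1)
print(llm_calls(1000, 10, runs=5))  # 555
```

Under these assumptions the cost grows roughly linearly with pool size and with the number of runs, so a prefilter that shrinks the pool reduces cost in direct proportion.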
axioms (1)
- domain assumption: LLMs can detect authorship signals in text even after anonymization